Bug #72861 fgetcsv get line error
Submitted: 2016-08-17 03:11 UTC Modified: 2018-09-14 08:35 UTC
Votes:3
Avg. Score:5.0 ± 0.0
Reproduced:3 of 3 (100.0%)
Same Version:1 (33.3%)
Same OS:3 (100.0%)
From: zhouliweb at sina dot com Assigned:
Status: Open Package: Filesystem function related
PHP Version: 7.0.9 OS: Win
Private report: No CVE-ID: None
 [2016-08-17 03:11 UTC] zhouliweb at sina dot com
Description:
------------
When the CSV file contains Japanese text, fgetcsv() splits the lines incorrectly. See the test script for details.

Test script:
---------------
$filename = "a.csv";
$file = fopen($filename, "r");
while(!feof($file) && $data = fgetcsv($file)) {
    echo json_encode($data) . "\n";
}
fclose($file);

a.csv looks like this (3 lines; the first line ends with a Unix LF, the others with a Windows CR+LF):
"a
ましょう。"
b
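
The mixed line endings are easy to lose when copying the sample by hand. The following is a minimal sketch for recreating a.csv byte for byte. It assumes the file is UTF-8 encoded, which the report does not state, and it uses `\u{...}` escapes (available since PHP 7.0) so the result does not depend on the encoding of the script itself.

```php
<?php
// Assumption: a.csv is UTF-8 encoded; the report does not say so explicitly.
$bytes = "\"a\n"                                              // line 1: opening quote, "a", Unix LF
       . "\u{307e}\u{3057}\u{3087}\u{3046}\u{3002}\"\r\n"     // line 2: Japanese text, closing quote, CR+LF
       . "b\r\n";                                             // line 3: "b", CR+LF

// Write the sample next to the test script.
file_put_contents("a.csv", $bytes);
```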



Expected result:
----------------
["a\n\u307e\u3057\u3087\u3046\u3002"]
["b"]

Actual result:
--------------
["a\n\u307e\u3057\u3087\u3046\u3002\"\r\nb\r\n"]

History

 [2018-09-13 16:56 UTC] cmb@php.net
-Status: Open
+Status: Feedback
-Assigned To:
+Assigned To: cmb
 [2018-09-13 16:56 UTC] cmb@php.net
I can't reproduce this.  Is a.csv actually UTF-8 encoded? Are you
using a UTF-8 locale (LC_CTYPE)?
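
Both questions can be checked from PHP itself. The following is a rough sketch, assuming the mbstring extension is loaded; the helper name `describe_csv_environment` and its return shape are made up for illustration.

```php
<?php
// Hypothetical diagnostic helper; the name and return shape are illustrative.
// Assumes the mbstring extension is available.
function describe_csv_environment(string $path): array
{
    $contents = file_get_contents($path);

    return [
        // true if the whole file is well-formed UTF-8
        'file_is_utf8' => $contents !== false && mb_check_encoding($contents, 'UTF-8'),
        // passing "0" returns the current LC_CTYPE setting without changing it
        'lc_ctype'     => setlocale(LC_CTYPE, '0'),
    ];
}
```

Calling `var_dump(describe_csv_environment('a.csv'));` would then show whether the file is valid UTF-8 and which LC_CTYPE locale is active.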
 [2018-09-14 01:25 UTC] zhouliweb at sina dot com
I tried it just now (on Windows with PHP 7.0.9 and on CentOS with PHP 7.1.12); the bug still exists.
orz
 [2018-09-14 08:35 UTC] cmb@php.net
-Status: Feedback
+Status: Open
-Assigned To: cmb
+Assigned To:
 [2018-09-14 08:35 UTC] cmb@php.net
I *think* that fgetcsv() works as expected here, but there is some
character encoding issue.  It's important to note that fgetcsv()
is locale (LC_CTYPE) aware, so the character encoding of a.csv and
LC_CTYPE have to match (or at least be compatible), *and* LC_CTYPE
has to be compatible with ASCII with regard to the delimiter, the
enclosure, the escape character and the line endings (all of these
are hard coded, unless overridden).
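
To illustrate the point about matching LC_CTYPE to the file's encoding, here is a rough sketch that tries to select a UTF-8 locale before reading. The helper name `read_csv_utf8` and the locale names `en_US.UTF-8` and `C.UTF-8` are assumptions; available locale names differ between systems, and nothing in this thread confirms that this resolves the reporter's case.

```php
<?php
// Hypothetical helper, not part of PHP. Tries to switch LC_CTYPE to a UTF-8
// locale, reads every record with fgetcsv(), then restores the old locale.
// The default locale names are assumptions and vary by operating system.
function read_csv_utf8(string $path, array $locales = ['en_US.UTF-8', 'C.UTF-8']): array
{
    $file = fopen($path, 'r');
    if ($file === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $previous = setlocale(LC_CTYPE, '0');       // query the current setting
    $applied  = setlocale(LC_CTYPE, $locales);  // false if no candidate is installed

    $rows = [];
    // Delimiter, enclosure and escape are the documented defaults, spelled out.
    while (($data = fgetcsv($file, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
    }
    fclose($file);

    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);         // restore the original locale
    }

    return ['locale' => $applied, 'rows' => $rows];
}
```

If `locale` comes back as `false`, none of the candidate locales was available and the rows were parsed under the original LC_CTYPE setting, so the output may still show the behavior from the report.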

I guess we'd need a mb_fgetcsv() and mb_fputcsv() to solve all the
related issues.
 
Last updated: Mon Dec 09 08:01:27 2024 UTC