Bug #72861 fgetcsv get line error
Submitted: 2016-08-17 03:11 UTC
Modified: 2018-09-14 08:35 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:2 (100.0%)
From: zhouliweb at sina dot com
Assigned:
Status: Open
Package: Filesystem function related
PHP Version: 7.0.9
OS: Win
Private report: No
CVE-ID: None

 [2016-08-17 03:11 UTC] zhouliweb at sina dot com
Description:
------------
When the CSV file contains Japanese text, fgetcsv() splits the lines incorrectly. See the test script for details.

Test script:
---------------
$filename = "a.csv";
$file = fopen($filename, "r");
while(!feof($file) && $data = fgetcsv($file)) {
    echo json_encode($data) . "\n";
}
fclose($file);

a.csv looks like this (3 lines; the first line ends with a Unix LF, the others end with a Windows CR+LF):
"a
ましょう。"
b
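
The byte layout described above can be reproduced with a short script. This is a minimal sketch: it assumes the file is UTF-8 encoded, which the report does not state, and `build_sample_csv()` is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: returns the contents of a.csv as described in the
// report. Assumption: UTF-8 encoding (not confirmed by the reporter).
function build_sample_csv(): string
{
    // "\u{...}" escapes keep this script independent of its own encoding;
    // they spell the five characters of the Japanese text in the report.
    $japanese = "\u{307E}\u{3057}\u{3087}\u{3046}\u{3002}";

    return "\"a\n"                  // line 1: opening quote, "a", Unix LF
         . $japanese . "\"\r\n"     // line 2: text, closing quote, CR+LF
         . "b\r\n";                 // line 3: "b", CR+LF
}

// Write the sample next to the test script.
file_put_contents('a.csv', build_sample_csv());
```

Running the test script against a file generated this way removes any doubt about which line endings and encoding are in play.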



Expected result:
----------------
["a\n\u307e\u3057\u3087\u3046\u3002"]
["b"]

Actual result:
--------------
["a\n\u307e\u3057\u3087\u3046\u3002\"\r\nb\r\n"]

History

 [2018-09-13 16:56 UTC] cmb@php.net
-Status: Open
+Status: Feedback
-Assigned To:
+Assigned To: cmb
 [2018-09-13 16:56 UTC] cmb@php.net
I can't reproduce this.  Is a.csv actually UTF-8 encoded? Are you
using a UTF-8 locale (LC_CTYPE)?
 [2018-09-14 01:25 UTC] zhouliweb at sina dot com
I tried it just now (on Windows with PHP 7.0.9 and on CentOS with PHP 7.1.12); the bug still exists.
orz
 [2018-09-14 08:35 UTC] cmb@php.net
-Status: Feedback
+Status: Open
-Assigned To: cmb
+Assigned To:
 [2018-09-14 08:35 UTC] cmb@php.net
I *think* that fgetcsv() works as expected here, but there is some
character encoding issue.  It's important to note that fgetcsv()
is locale (LC_CTYPE) aware, so the character encoding of a.csv and
LC_CTYPE have to match (or at least be compatible), *and* LC_CTYPE
has to be compatible with ASCII with regard to the delimiter, the
enclosure, the escape character and the line endings (all of these
are hard coded, unless overridden).

I guess we'd need a mb_fgetcsv() and mb_fputcsv() to solve all the
related issues.
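
To check whether the locale is the cause, one could select a UTF-8 LC_CTYPE before reading the file. The following is only a sketch, not a confirmed fix: `read_csv_with_utf8_ctype()` is a hypothetical helper, and the candidate locale names are assumptions that may not be installed on a given system (setlocale() returns false in that case).

```php
<?php
// Hypothetical helper: try to switch LC_CTYPE to a UTF-8 locale, read every
// row with fgetcsv(), then restore the previous locale.
function read_csv_with_utf8_ctype(string $path): array
{
    // Passing "0" only queries the current setting without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    // Assumed locale names; the first one available is used.
    // $selected is false if none of them exists on this system.
    $selected = setlocale(LC_CTYPE, 'C.UTF-8', 'en_US.UTF-8', 'ja_JP.UTF-8');

    $file = fopen($path, 'r');
    if ($file === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = [];
    // Delimiter, enclosure and escape are passed explicitly; all three are
    // ASCII characters, as the comment above requires.
    while (($data = fgetcsv($file, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
    }
    fclose($file);

    // Restore the locale that was active before the call.
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);
    }

    return ['locale' => $selected, 'rows' => $rows];
}
```

Calling `read_csv_with_utf8_ctype('a.csv')` and inspecting the `locale` entry shows which locale, if any, was actually applied, so the result can be compared against a run under the default locale.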
 
Last updated: Wed Oct 17 16:01:25 2018 UTC