Bug #72861 fgetcsv get line error
Submitted: 2016-08-17 03:11 UTC Modified: 2018-09-14 08:35 UTC
Votes:3
Avg. Score:5.0 ± 0.0
Reproduced:3 of 3 (100.0%)
Same Version:1 (33.3%)
Same OS:3 (100.0%)
From: zhouliweb at sina dot com Assigned:
Status: Open Package: Filesystem function related
PHP Version: 7.0.9 OS: Win
Private report: No CVE-ID: None
 [2016-08-17 03:11 UTC] zhouliweb at sina dot com
Description:
------------
When the CSV contains Japanese text, fgetcsv() splits the lines incorrectly. See the test script for details.

Test script:
---------------
$filename = "a.csv";
$file = fopen($filename, "r");
while(!feof($file) && $data = fgetcsv($file)) {
    echo json_encode($data) . "\n";
}
fclose($file);

a.csv looks like this (3 lines; the first line ends with a Unix LF, the others with Windows CR+LF):
"a
ましょう。"
b
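
Because the mixed line endings matter for reproduction, the sample file can be generated byte for byte instead of typed by hand. This is a minimal sketch, assuming the file is UTF-8 encoded (the report does not state the encoding) and that this script itself is saved as UTF-8:

```php
<?php
// Build a.csv with the exact line endings described in the report:
// line 1 ends with LF, lines 2 and 3 end with CR+LF.
// Assumption: UTF-8 encoding; the original file's encoding is not confirmed.
$filename = "a.csv";
$bytes = "\"a\n"               // opening quote, "a", Unix LF
       . "ましょう。\"\r\n"     // Japanese text, closing quote, Windows CR+LF
       . "b\r\n";              // second record, Windows CR+LF

file_put_contents($filename, $bytes);

// Hex dump so the line endings (0a vs 0d0a) can be checked by eye.
echo bin2hex($bytes), "\n";
```

Running the test script against a file produced this way removes any doubt about what the editor or transfer tool did to the line endings.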



Expected result:
----------------
["a\n\u307e\u3057\u3087\u3046\u3002"]
["b"]

Actual result:
--------------
["a\n\u307e\u3057\u3087\u3046\u3002\"\r\nb\r\n"]

 [2018-09-13 16:56 UTC] cmb@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: cmb
 [2018-09-13 16:56 UTC] cmb@php.net
I can't reproduce this.  Is a.csv actually UTF-8 encoded? Are you
using a UTF-8 locale (LC_CTYPE)?
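
Both questions can be answered with a short check. This is only a sketch: the helper name `is_valid_utf8` is hypothetical, and it relies on the common idiom that `preg_match()` with the `u` modifier fails on invalid UTF-8 input:

```php
<?php
// Hypothetical helper: true if $bytes is well-formed UTF-8.
// With the "u" modifier, preg_match() returns false on invalid UTF-8,
// and the empty pattern matches any valid string.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Passing "0" queries the current LC_CTYPE setting without changing it.
echo "LC_CTYPE: ", setlocale(LC_CTYPE, "0"), "\n";

$filename = "a.csv"; // file name taken from the report
if (is_file($filename)) {
    $ok = is_valid_utf8(file_get_contents($filename));
    echo $filename, $ok ? " is valid UTF-8\n" : " is NOT valid UTF-8\n";
}
```

A file that passes this check may still be in another encoding that happens to be valid UTF-8, so a negative result is more informative than a positive one.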
 [2018-09-14 01:25 UTC] zhouliweb at sina dot com
I tried it just now (on Windows with PHP 7.0.9 and on CentOS with PHP 7.1.12); the bug still exists.
orz
 [2018-09-14 08:35 UTC] cmb@php.net
-Status: Feedback +Status: Open -Assigned To: cmb +Assigned To:
 [2018-09-14 08:35 UTC] cmb@php.net
I *think* that fgetcsv() works as expected here, but there is some
character encoding issue.  It's important to note that fgetcsv()
is locale (LC_CTYPE) aware, so the character encoding of a.csv and
LC_CTYPE have to match (or at least be compatible), *and* LC_CTYPE
has to be compatible with ASCII with regard to the delimiter, the
enclosure, the escape character and the line endings (all of these
are hard-coded unless overridden).

I guess we'd need a mb_fgetcsv() and mb_fputcsv() to solve all the
related issues.
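
Until something like that exists, the matching requirement described above can be sketched as follows. The function name `read_csv_with_locale` and the locale names in the usage note are assumptions for illustration; which locales are installed differs per system, and the thread does not confirm that this changes the result for the file in this report:

```php
<?php
// Hypothetical helper: read a CSV file while LC_CTYPE is temporarily set
// to one of the candidate locales, then restore the previous setting.
function read_csv_with_locale(string $path, array $locales): array
{
    $previous = setlocale(LC_CTYPE, "0");      // "0" only queries the setting
    $applied  = setlocale(LC_CTYPE, $locales); // first available one, or false

    $rows = [];
    $handle = fopen($path, "r");
    if ($handle !== false) {
        // Delimiter, enclosure and escape character passed explicitly.
        while (($data = fgetcsv($handle, 0, ",", "\"", "\\")) !== false) {
            $rows[] = $data;
        }
        fclose($handle);
    }

    setlocale(LC_CTYPE, $previous);            // restore the earlier locale
    return ["locale" => $applied, "rows" => $rows];
}
```

A call such as `read_csv_with_locale("a.csv", ["en_US.UTF-8", "C.UTF-8"])` returns the parsed rows together with the locale that was actually applied; a `false` in the `locale` entry means none of the candidates was available and the rows were parsed under the previous setting.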
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Oct 27 16:01:27 2024 UTC