Bug #53822 fgetcsv parsing error
Submitted: 2011-01-23 15:01 UTC Modified: 2011-01-27 01:27 UTC
Votes:3
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: withskyto at naver dot com Assigned:
Status: Not a bug Package: Filesystem function related
PHP Version: 5.2.17 OS: freebsd
Private report: No CVE-ID: None
 [2011-01-23 15:01 UTC] withskyto at naver dot com
Description:
------------
I saved a CSV file from MS Office Excel.

The file looks like this:

A,B,C,D
AAA,"BB,B","CCC,'\C,,CCC","D,DDD"
"AA""AA","BB"",BBB""B","CC\""CC,,C""",DDD


fgetcsv seems to parse the file incorrectly when a cell exported from Excel contains an escape sequence (a backslash followed by a doubled quote).



Test script:
---------------
$fp = fopen('test3.csv', 'r');
while ($arr = fgetcsv($fp, 10000, ',', '"')) {
  print_r($arr);
}

Expected result:
----------------
Array
(
    [0] => A
    [1] => B
    [2] => C
    [3] => D
)
Array
(
    [0] => AAA
    [1] => BB,B
    [2] => CCC,'\C,,CCC
    [3] => D,DDD
)
Array
(
    [0] => AA"AA
    [1] => BB",BBB"B
    [2] => CC\"CC,,C"
    [3] => DDD
)

Actual result:
--------------
Array
(
    [0] => A
    [1] => B
    [2] => C
    [3] => D
)
Array
(
    [0] => AAA
    [1] => BB,B
    [2] => CCC,'\C,,CCC
    [3] => D,DDD
)
Array
(
    [0] => AA"AA
    [1] => BB",BBB"B
    [2] => CC\"CC
    [3] =>
    [4] => C"""
    [5] => DDD
)


History

 [2011-01-23 15:09 UTC] withskyto at naver dot com
I checked this file.
Both OpenOffice.org Calc and MS Office Excel seem to parse it properly.
 [2011-01-23 15:49 UTC] withskyto at naver dot com
-Status: Open +Status: Closed -Package: Streams related +Package: Filesystem function related
 [2011-01-23 15:49 UTC] withskyto at naver dot com
change Package
 [2011-01-23 15:51 UTC] withskyto at naver dot com
-Status: Closed +Status: Open
 [2011-01-23 15:51 UTC] withskyto at naver dot com
Change Status
 [2011-01-24 03:12 UTC] withskyto at naver dot com
I saved a CSV file from MS Office Excel.

The file looks like this:

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
"0-2","sample sample 2","10000","10000","10000","1","A","2000-12-15 10:46:22.802144","2000-12-15 11:46:22.801114","1000","10","C","I","가","나","T"


Test script:
---------------
setlocale(LC_CTYPE, "ko_KR.eucKR");
$fp = fopen('test.csv', 'r');
while ($arr = fgetcsv($fp, 10000, ',', '"')) {
  print_r($arr);
}

Expected result:
----------------

Array
(
    [0] => 0
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 4
    [5] => 5
    [6] => 6
    [7] => 7
    [8] => 8
    [9] => 9
    [10] => 10
    [11] => 11
    [12] => 12
    [13] => 13
    [14] => 14
    [15] => 15
)
Array
(
    [0] => 0-2
    [1] => sample sample 2
    [2] => 10000
    [3] => 10000
    [4] => 10000
    [5] => 1
    [6] => A
    [7] => 2000-12-15 10:46:22.802144
    [8] => 2000-12-15 11:46:22.801114
    [9] => 1000
    [10] => 10
    [11] => C
    [12] => I
    [13] => 가
    [14] => 나
    [15] => T
)


Actual result:
--------------


Array
(
    [0] => 0
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 4
    [5] => 5
    [6] => 6
    [7] => 7
    [8] => 8
    [9] => 9
    [10] => 10
    [11] => 11
    [12] => 12
    [13] => 13
    [14] => 14
    [15] => 15
)
Array
(
    [0] => 0-2
    [1] => sample sample 2
    [2] => 10000
    [3] => 10000
    [4] => 10000
    [5] => 1
    [6] => A
    [7] => 2000-12-15 10:46:22.802144
    [8] => 2000-12-15 11:46:22.801114
    [9] => 1000
    [10] => 10
    [11] => C
    [12] => I
    [13] => 가",나"
    [14] => T
)
 [2011-01-24 03:27 UTC] withskyto at naver dot com
I solved the locale problem as shown below, but the escape character problem is still not solved.

I saved a CSV file from MS Office Excel.

The file looks like this:

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
"0-2","sample sample 2","10000","10000","10000","1","A","2000-12-15 10:46:22.802144","2000-12-15 11:46:22.801114","1000","10","C","I","가","나","T"


Test script:
---------------
setlocale(LC_CTYPE, "ko_KR.UTF-8"); // use ko_KR.UTF-8 instead of ko_KR.eucKR
$fp = fopen('test.csv', 'r');
while ($arr = fgetcsv($fp, 10000, ',', '"')) {
  print_r($arr);
}

Expected result:
----------------

Array
(
    [0] => 0
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 4
    [5] => 5
    [6] => 6
    [7] => 7
    [8] => 8
    [9] => 9
    [10] => 10
    [11] => 11
    [12] => 12
    [13] => 13
    [14] => 14
    [15] => 15
)
Array
(
    [0] => 0-2
    [1] => sample sample 2
    [2] => 10000
    [3] => 10000
    [4] => 10000
    [5] => 1
    [6] => A
    [7] => 2000-12-15 10:46:22.802144
    [8] => 2000-12-15 11:46:22.801114
    [9] => 1000
    [10] => 10
    [11] => C
    [12] => I
    [13] => 가나다
    [14] => 나다라
    [15] => T
)


Actual result:
--------------

Array
(
    [0] => 0
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 4
    [5] => 5
    [6] => 6
    [7] => 7
    [8] => 8
    [9] => 9
    [10] => 10
    [11] => 11
    [12] => 12
    [13] => 13
    [14] => 14
    [15] => 15
)
Array
(
    [0] => 0-2
    [1] => sample sample 2
    [2] => 10000
    [3] => 10000
    [4] => 10000
    [5] => 1
    [6] => A
    [7] => 2000-12-15 10:46:22.802144
    [8] => 2000-12-15 11:46:22.801114
    [9] => 1000
    [10] => 10
    [11] => C
    [12] => I
    [13] => 가나다
    [14] => 나다라
    [15] => T
)
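
Since the fix above depends on LC_CTYPE actually being switched, it may help to check the return value of setlocale(), which is false when none of the requested locales is installed. The following is only a sketch: the candidate list is hypothetical, and locale names vary between operating systems.

```php
<?php
// Hypothetical candidate list; setlocale() tries each entry in order
// and returns the name of the first locale it could set, or false.
$candidates = array('ko_KR.UTF-8', 'ko_KR.utf8', 'en_US.UTF-8', 'C.UTF-8');
$chosen = setlocale(LC_CTYPE, $candidates);

if ($chosen === false) {
    // LC_CTYPE was left unchanged, so fgetcsv() keeps using the old locale.
    echo "None of the requested locales is available\n";
} else {
    echo "LC_CTYPE is now {$chosen}\n";
}
```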
 [2011-01-27 01:27 UTC] cataphract@php.net
-Status: Open +Status: Bogus
 [2011-01-27 01:27 UTC] cataphract@php.net
Use the last parameter to specify " as the escape character.

<?php
$str = <<<EOD
A,B,C,D
AAA,"BB,B","CCC,'\C,,CCC","D,DDD"
"AA""AA","BB"",BBB""B","CC\""CC,,C""",DDD
EOD;
fwrite($fp = fopen('php://temp', 'r+'), $str);
fseek($fp, 0, SEEK_SET);
while ($arr = fgetcsv($fp, 10000, ',', '"', '"')) {
  print_r($arr);
}

gives the expected result.
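
To make the difference visible on a single row, here is a minimal sketch that parses the third line of the sample file twice. It assumes PHP 5.3 or later, where str_getcsv() is available, and passes the backslash explicitly to stand in for the default escape character used in the original test script.

```php
<?php
// Third row of the sample file from this report. A single-quoted literal
// keeps the backslash and the double quotes exactly as Excel wrote them.
$line = '"AA""AA","BB"",BBB""B","CC\""CC,,C""",DDD';

// Backslash as the escape character (the historical default).
$withBackslash = str_getcsv($line, ',', '"', '\\');

// Double quote as both enclosure and escape character, as suggested above.
$withQuote = str_getcsv($line, ',', '"', '"');

echo count($withBackslash), " fields with backslash as escape\n";
echo count($withQuote), " fields with double quote as escape\n";
print_r($withQuote);
```

With the backslash as escape character, the quote that follows the backslash is kept literally, so the next quote closes the field early and the row splits into six fields, as in the "Actual result" above. With the double quote as escape character, only doubled quotes are treated specially, which matches how Excel writes the file. On PHP 7.4 and later, an empty string is also accepted as the escape argument to switch the escape mechanism off.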
 