Bug #29719 fgetcsv - double quotes issue
Submitted: 2004-08-17 14:25 UTC
Modified: 2005-09-09 09:14 UTC
Votes:5
Avg. Score:5.0 ± 0.0
Reproduced:5 of 5 (100.0%)
Same Version:2 (40.0%)
Same OS:2 (40.0%)
From: tjerk dot meesters at gmail dot com
Assigned:
Status: Closed
Package: Filesystem function related
PHP Version: 4.3.9RC2
OS: Linux-2.4
Private report: No
CVE-ID: None
 [2004-08-17 14:25 UTC] tjerk dot meesters at gmail dot com
Description:
------------
fgetcsv() works fine in PHP 4.3.4; however, as of 4.3.8 its handling of escaped string qualifiers (doubled enclosure characters) has changed.

With single-line data, the leading escaped string qualifier is dropped from the field.
With multi-line data, the last escaped string qualifier is dropped from the field.

For the example code, use the following data:

-------------------------
CSV DATA (test.csv)
-------------------------
test;test spaced string;"test; with delimeter";"""test with inline double quotes""";"test
with
newlines";"""test
with
newlines and double quotes"""


Reproduce code:
---------------
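<?php

$f = fopen('test.csv', 'rb');
// Read records until fgetcsv() returns false (end of file or error);
// a quoted field may span several physical lines.
while (($s = fgetcsv($f, 1000, ';', '"')) !== false) {
    print_r($s);
}
fclose($f);

?>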


Expected result:
----------------
Array
(
    [0] => test
    [1] => test spaced string
    [2] => test; with delimeter
    [3] => "test with inline double quotes"
    [4] => test
with
newlines
    [5] => "test
with
newlines and double quotes"
)


Actual result:
--------------
Array
(
    [0] => test
    [1] => test spaced string
    [2] => test; with delimeter
    [3] => test with inline double quotes"
    [4] => test
with
newlines
    [5] => "test
with
newlines and double quotes
)


History

 [2004-08-17 16:10 UTC] iliaa@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 [2004-08-18 23:08 UTC] t dot meesters at triptic dot nl
Result is now:
Array
(
    [0] => test
    [1] => test spaced string
    [2] => test; with delimeter
    [3] => test with inline double quotes"
    [4] => test
with
newlines
    [5] => "test
with
newlines and double quotes"
)

Although the test with newlines and double quotes works fine now, the problem still lies with test 3: the inline double quote.
 [2004-08-18 23:15 UTC] iliaa@php.net
With latest CVS I get the correct output of: 
Array 
( 
    [0] => test 
    [1] => test spaced string 
    [2] => test; with delimeter 
    [3] => "test with inline 
double quotes" 
    [4] => test 
with 
newlines 
    [5] => "test 
with 
newlines and double quotes" 
) 
 
 [2004-08-18 23:35 UTC] t dot meesters at triptic dot nl
Oops, I think the line wrapping caused an error in my initial input: please note that test #3 should be:

"""test with inline double quotes""" (on one line)

Sorry for the inconvenience.
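
One way to rule out wrapping problems is to generate the test file from a script, so that field 3 is guaranteed to sit on a single line. The following is only a sketch: the temporary path and the variable names are assumptions made for illustration, and the escape argument is passed explicitly, which assumes a PHP version whose fgetcsv() accepts it.

```php
<?php
// Build the CSV data from the report in code so no mail client or form
// can re-wrap it. Field 3 ("""test with inline double quotes""") stays
// on one line; fields 4 and 5 contain intentional newlines.
$data = 'test;test spaced string;"test; with delimeter";'
      . '"""test with inline double quotes""";"test' . "\n"
      . "with\n"
      . 'newlines";"""test' . "\n"
      . "with\n"
      . 'newlines and double quotes"""' . "\n";

// Hypothetical location; the report itself uses test.csv in the working directory.
$path = sys_get_temp_dir() . '/test.csv';
file_put_contents($path, $data);

$f = fopen($path, 'rb');
// One logical record, even though it spans five physical lines.
$row = fgetcsv($f, 1000, ';', '"', '\\');
fclose($f);
unlink($path);

print_r($row);
```

On a build without the bug, the record should contain six fields, with the doubled quotes in fields 3 and 5 reduced to single literal quotes at both ends.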
 [2004-08-20 02:40 UTC] t dot meesters at triptic dot nl
I meant that the problem is still there ;-) I've been browsing through the code and finally came up with the following patch:

*** file.c.orig Fri Aug 20 02:30:27 2004
--- file.c      Fri Aug 20 02:30:37 2004
***************
*** 2391,2399 ****
                if ((p = memchr(p2, delimiter, (e - p2)))) {
                        p2 = s;
                        s = p + 1;
-                       if (*p2 == enclosure) {
-                               p2++;
-                       }

                        /* copy data to buffer */
                        buf2 = erealloc(buf2, buf2_len + (p - p2) + 1);
--- 2391,2396 ----

After p2 is set to s, checking whether the first character is an enclosure and skipping it doesn't seem like a good idea, since a field may legitimately start with two consecutive double quotes. Incrementing p2 effectively removes the first double quote, so the trim_enclosed() function regards the remaining double quote as garbage and ignores it.

The patch has been tested and passes the above mentioned tests.
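
For reference, the unescaping rule that an enclosed field is expected to follow can be sketched in PHP. This is not the code in file.c; unescape_enclosed_field() is a hypothetical helper written only to illustrate the rule: remove exactly one enclosure from each end, then collapse each doubled enclosure into a literal one.

```php
<?php
// Hypothetical helper, not PHP's internal implementation: it only
// illustrates how an enclosed CSV field is expected to be unescaped.
function unescape_enclosed_field(string $raw, string $enclosure = '"'): string
{
    $len = strlen($raw);
    // Not wrapped in a pair of enclosures: return the field unchanged.
    if ($len < 2 || $raw[0] !== $enclosure || $raw[$len - 1] !== $enclosure) {
        return $raw;
    }
    // Strip exactly one enclosure from each end...
    $inner = substr($raw, 1, -1);
    // ...then turn each doubled enclosure into a single literal one.
    return str_replace($enclosure . $enclosure, $enclosure, $inner);
}

// Prints "testing" including the surrounding double quotes.
echo unescape_enclosed_field('"""testing"""'), "\n";
```

The point of the sketch is that only one outer enclosure may be consumed per side; consuming a second leading quote before unescaping leaves an unmatched quote behind.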
 [2004-09-02 22:57 UTC] tjerk dot meesters at gmail dot com
I'd like to remind you that this issue is still not resolved. It gives the wrong result when dealing with a line like the one below:

"""testing""";

The above is returned as:
testing"
(empty field)
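
A quick check for this single-line case, as a sketch that assumes a PHP version providing str_getcsv() (which parses a string with the same rules instead of reading from a file handle); the variable names are illustrative only.

```php
<?php
// The problematic line from the comment above: one quoted field that
// starts and ends with an escaped double quote, followed by a delimiter.
$line = '"""testing""";';

// Delimiter ';', enclosure '"', escape character passed explicitly.
$fields = str_getcsv($line, ';', '"', '\\');

var_export($fields);
echo "\n";
```

On a build without the bug, the first field should be "testing" with both literal double quotes intact, followed by an empty second field.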
 [2005-09-02 09:33 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip


 [2005-09-09 08:41 UTC] tjerk dot meesters at gmail dot com
It's confirmed as fixed in 5.1.0-dev / build 2600. Thanks!
 [2005-09-09 09:00 UTC] derick@php.net
I'm not sure what you mean by "build 2600", but closing this.
 [2005-09-09 09:14 UTC] tjerk dot meesters at gmail dot com
Build 2600 refers to my own machine ... oops ;-)
 