Bug #42358 fgetcsv bug on TSV file if a field start by a quote
Submitted: 2007-08-21 09:37 UTC Modified: 2007-08-21 13:35 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: guillaume dot lecanu at noovea dot fr Assigned:
Status: Not a bug Package: *General Issues
PHP Version: 5.2.3 OS: linux ubuntu feisty
Private report: No CVE-ID: None
 [2007-08-21 09:37 UTC] guillaume dot lecanu at noovea dot fr
Description:
------------
In a TSV (Tab Separated Values) file where the values are not all enclosed in quotes, if a value starts with a quote, that value is read incorrectly: all following lines are swallowed into this field (see my example).

(I can't upgrade to 5.2.3 to test, but this bug was present in PHP 4.3.10-16 and is still present in PHP 5.2.1.)

Reproduce code:
---------------
<?php
    $file = "fgetcsv_bug.txt";
    $handle = fopen($file, "r");
    if (!$handle) {
        die("File {$file} not found !");
    }
    $nb_rows = 0;
    while (($row = fgetcsv($handle, 100, "\t")) !== FALSE) {
        echo "<br>ROW ".( ++$nb_rows )." : ";
        var_export($row);
    }
?>

fgetcsv_bug.txt is a simple file whose values are tab-separated:
A1  A2  A3  A4
B1  "B2 B3  B4
C1  C2  C3  C4
D1  D2  D3  D4


Expected result:
----------------
ROW 1 : array ( 0 => 'A1', 1 => 'A2', 2 => 'A3', 3 => 'A4', )
ROW 2 : array ( 0 => 'B1', 1 => '"B2', 2 => 'B3', 3 => 'B4', )
ROW 3 : array ( 0 => 'C1', 1 => 'C2', 2 => 'C3', 3 => 'C4', )
ROW 4 : array ( 0 => 'D1', 1 => 'D2', 2 => 'D3', 3 => 'D4', )

Actual result:
--------------
ROW 1 : array ( 0 => 'A1', 1 => 'A2', 2 => 'A3', 3 => 'A4', )
ROW 2 : array ( 0 => 'B1', 1 => 'B2 B3 B4 C1 C2 C3 C4 D1 D2 D3 D4 ', )

 [2007-08-21 10:08 UTC] jani@php.net
So in your case: use the 4th parameter like this:

while (($row = fgetcsv($handle, 100, "\t", "\t")) !== FALSE) {

With that I get your expected result. (enclosure defaults to ")
 [2007-08-21 11:55 UTC] guillaume dot lecanu at noovea dot fr
Hi Jani, 

Thanks for the trick. It works around the quote bug, but it introduces another bug with empty values.

If I remove the value C2 from my file:
A1<tab>A2<tab>A3<tab>A4
B1<tab>"B2<tab>B3<tab>B4
C1<tab><tab>C3<tab>C4
D1<tab>D2<tab>D3<tab>D4

We currently get this result:
ROW 1 : array ( 0 => 'A1', 1 => 'A2', 2 => 'A3', 3 => 'A4', )
ROW 2 : array ( 0 => 'B1', 1 => '"B2', 2 => 'B3', 3 => 'B4', )
ROW 3 : array ( 0 => 'C1', 1 => 'C3C4', )
ROW 4 : array ( 0 => 'D1', 1 => 'D2', 2 => 'D3', 3 => 'D4', )

And the expected result would be:

ROW 1 : array ( 0 => 'A1', 1 => 'A2', 2 => 'A3', 3 => 'A4', )
ROW 2 : array ( 0 => 'B1', 1 => '"B2', 2 => 'B3', 3 => 'B4', )
ROW 3 : array ( 0 => 'C1', 1 => '', 2 => 'C3', 3 => 'C4', )
ROW 4 : array ( 0 => 'D1', 1 => 'D2', 2 => 'D3', 3 => 'D4', )

The same bug occurs with a space in the C2 cell (C1<tab> <tab>C3<tab>C4).

Any ideas?
 [2007-08-21 12:24 UTC] guillaume dot lecanu at noovea dot fr
I just found a trick that works in my case:
Pass a character that never appears in the data as the enclosure (the 4th parameter), for example:
fgetcsv($handle, 100, "\t", chr(13))

This trick works for this case, but I think a fix should be made in the PHP sources.
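
A fuller sketch of this workaround, assuming the data never contains a carriage return. The in-memory stream stands in for fgetcsv_bug.txt and, like the variable names, is only there for illustration; the escape character is passed explicitly, which newer PHP versions expect:

```php
<?php
// Sample data from this report, with the C2 cell left empty.
$tsv = "A1\tA2\tA3\tA4\n"
     . "B1\t\"B2\tB3\tB4\n"
     . "C1\t\tC3\tC4\n"
     . "D1\tD2\tD3\tD4\n";

// Illustration only: an in-memory stream replaces the file on disk.
$handle = fopen('php://memory', 'w+');
fwrite($handle, $tsv);
rewind($handle);

$rows = array();
// The enclosure is chr(13), assumed never to occur in the data, so the
// double quote in "B2 is read as an ordinary character.
// A length of 0 means the line length is not limited.
while (($row = fgetcsv($handle, 0, "\t", chr(13), "\\")) !== false) {
    $rows[] = $row;
}
fclose($handle);

var_export($rows);
```

The choice of chr(13) is arbitrary; any single character that is guaranteed to be absent from the file should serve the same purpose.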
 [2007-08-21 13:35 UTC] jani@php.net
Not really, just provide consistent CSV files. Magic really isn't the way.
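
One way to read "consistent" here is a file in which any field containing the enclosure character is itself enclosed, with the inner quote doubled. A minimal sketch, assuming the file can be produced with fputcsv() in the first place (the in-memory stream and the variable names are illustrative and not part of the report):

```php
<?php
$rows = array(
    array('A1', 'A2', 'A3', 'A4'),
    array('B1', '"B2', 'B3', 'B4'),   // value that starts with a quote
    array('C1', '', 'C3', 'C4'),      // empty value
);

// Illustration only: write to an in-memory stream instead of a file.
$handle = fopen('php://memory', 'w+');
foreach ($rows as $fields) {
    // fputcsv() encloses a field that contains the enclosure character
    // and doubles the quote inside it.
    fputcsv($handle, $fields, "\t", '"', "\\");
}
rewind($handle);

// Read the data back with the same separator, enclosure and escape.
$read = array();
while (($row = fgetcsv($handle, 0, "\t", '"', "\\")) !== false) {
    $read[] = $row;
}
fclose($handle);

var_export($read === $rows);
```

When the writer and the reader agree on the separator and the enclosure like this, the default enclosure can stay in place and no unused-character trick is needed.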
 