Bug #55507 fgetcsv() handles invalid characters inconsistently
Submitted: 2011-08-25 12:46 UTC Modified: 2016-07-21 16:12 UTC
Votes:20
Avg. Score:4.3 ± 0.8
Reproduced:19 of 19 (100.0%)
Same Version:16 (84.2%)
Same OS:14 (73.7%)
From: gtisza at gmail dot com
Assigned: cmb
Status: Closed
Package: Filesystem function related
PHP Version: Irrelevant
OS: Linux
Private report: No
CVE-ID: None
 [2011-08-25 12:46 UTC] gtisza at gmail dot com
Description:
------------
fgetcsv() throws away the first character of a field if it is invalid in the current locale, but ignores invalid characters which are not at the beginning of a field. The inconsistent behavior makes it hard to locate the source of the bug; it should either throw all invalid characters away, or none of them (IMO the second is much better).


(This is a duplicate of bug 45356, but that one has been closed as "no feedback", and apparently mere mortals are not allowed to reopen it, even if they do provide the feedback...)

Test script:
---------------
<?php

setlocale(LC_ALL,'C');
$utfchar = chr(0xC3).chr(0x89); // U+00C9 (É) in UTF-8

$csv = $utfchar."x".$utfchar."x\n";

file_put_contents('test.csv', $csv);
$file = fopen('test.csv', 'r');
$data = fgetcsv($file);

for ($i = 0; $i < strlen($data[0]); $i++) {
    echo dechex(ord($data[0][$i])).' ';
}
echo "\n";
unlink('test.csv');

// expected: c3 89 78 c3 89 78 - "ÉxÉx"
// actual: 78 c3 89 78 - "xÉx"

?>
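
The same check can be run without a temporary file. The following is a minimal sketch, not part of the original report: `first_field_hex()` is a hypothetical helper name, and it assumes a `php://memory` stream exercises the same fgetcsv() code path as a file on disk.

```php
<?php
// Hypothetical helper (not from the report): parse one CSV line from an
// in-memory stream and return the first field as space-separated hex bytes.
function first_field_hex($line)
{
    $stream = fopen('php://memory', 'w+');
    fwrite($stream, $line);
    rewind($stream);
    // Length 0 means "no line length limit"; the remaining arguments are the defaults.
    $row = fgetcsv($stream, 0, ',', '"', '\\');
    fclose($stream);
    return implode(' ', str_split(bin2hex($row[0]), 2));
}

setlocale(LC_ALL, 'C');
$utfchar = "\xC3\x89"; // U+00C9 (É) encoded as UTF-8
echo first_field_hex($utfchar . 'x' . $utfchar . "x\n"), "\n";
```

Per the report, the expected bytes are `c3 89 78 c3 89 78`, while an affected build prints `78 c3 89 78`.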


History

 [2011-10-22 09:33 UTC] r dot fiedler at ovm dot de
It works correctly in the Windows versions;
I got the error under PHP 5.2.6-1+lenny13.
 [2012-05-06 16:47 UTC] dll at fugro dot no
I had a similar problem with the Norwegian letter "Ø": when it was the first letter of a field, it was simply missing after fgetcsv() read the file. The following
workaround, which uses explode(), worked for me.
dee ell ell at fugro dot no

// Read the text file into a variable
$txt = read_txtfile("test.txt");

// Explode the content into an array of $nr rows
$rowArr = explode("\n", $txt);
$nr = count($rowArr);

for ($r = 0; $r < $nr; $r++) {
    // Turn "a;b" into "'a','b'" (note: the values are not escaped)
    $insert_data = "'" . str_replace(";", "','", $rowArr[$r]) . "'";

    // Insert each row into the DB table "test"
    $query_string = "INSERT INTO test (name, name2)"
                  . " VALUES (" . $insert_data . ")";
    $result_id = mysql_query($query_string, $my_conn)
        or die("display_db_query:" . mysql_error());
}

if ($result_id == 1) {
    echo $nr . " rows transferred<br />\n";
}

function read_txtfile($infile) {
    // Read text data from a file into a variable
    $fo = fopen($infile, "r");
    $txt = fread($fo, filesize($infile));
    fclose($fo);
    return $txt;
}

===================================================================
If there is a need to access each data column in the row before transferring, these can easily be accessed by exploding each row once more in an inner loop.
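
The inner explode mentioned above could look roughly like the following. This is only a sketch: `split_rows()` is a hypothetical helper, and it assumes ";" as the separator and no quoted or multi-line fields, which plain explode() cannot handle.

```php
<?php
// Hypothetical helper: split raw text into rows, then each row into columns.
// Assumes ';' as the separator and no quoted or multi-line fields.
function split_rows($txt, $separator = ';')
{
    $rows = array();
    foreach (explode("\n", $txt) as $line) {
        $line = rtrim($line, "\r");   // tolerate Windows line endings
        if ($line === '') {
            continue;                 // skip blank lines, e.g. after a trailing newline
        }
        $rows[] = explode($separator, $line);
    }
    return $rows;
}

// Each row is now an array of columns that can be checked before inserting.
foreach (split_rows("Øystein;Ødegård\nÅse;Ærø\n") as $columns) {
    echo implode(' | ', $columns), "\n";
}
```

Because explode() works on raw bytes and ignores the locale, a leading "Ø" is kept as-is.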
 [2013-08-02 01:04 UTC] yohgaki@php.net
Related to #65368
 [2016-07-21 16:12 UTC] cmb@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cmb
 [2016-07-21 16:12 UTC] cmb@php.net
This bug appears to have been fixed as of PHP 5.3.7, see
<https://3v4l.org/984HF>.
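
Code that still has to run on older installations could guard on the version mentioned above. This is a hedged sketch: `fgetcsv_looks_fixed()` is a hypothetical name, and the 5.3.7 threshold is taken from this comment ("appears to have been fixed"), not from a changelog entry.

```php
<?php
// Hypothetical helper: true if the given PHP version is at or above the
// release in which this comment says the bug appears to be fixed.
function fgetcsv_looks_fixed($version)
{
    return version_compare($version, '5.3.7', '>=');
}

if (fgetcsv_looks_fixed(PHP_VERSION)) {
    echo "fgetcsv() should keep leading multibyte characters\n";
} else {
    echo "consider an explode()-based fallback\n";
}
```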
 