Bug #1443 Function doesn't follow csv spec
Submitted: 1999-05-23 23:13 UTC Modified: 1999-06-20 11:49 UTC
From: C dot Just at its dot uq dot edu dot au Assigned: rasmus (profile)
Status: Closed Package: Misbehaving function
PHP Version: 3.0.8 OS: All
Private report: No CVE-ID: None
 [1999-05-23 23:13 UTC] C dot Just at its dot uq dot edu dot au
The fgetcsv() function, though excellent, doesn't follow the CSV spec.
A CSV file should allow newlines within a quoted field.
I have written a working solution in PHP code but don't feel confident enough to convert it to C code :)
Here is a test CSV file:
--------------------------------
this,34,23,"this has ""quotes"" 
and 
enters"
field,45,3,no_quotes
"this, has a comma",45,34,"comma and , some ""quotes"""
"THIS_ENDS_wITH_A_QUOTE""","45""",87,"COUPLE, OF , COMMAS AND QUOTES"""
",commafirst",89,7,"""quotes first"
normal,"""9",3,
--------------------------------
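
The quoting rules this test file exercises can be sketched from the writing side. This is only an illustration under assumptions: `csv_quote_field()` and `csv_build_record()` are hypothetical helpers, not PHP built-ins, and the type declarations assume PHP 7 or later. A field is quoted when it contains a comma, a quote or a line break, and a literal quote is written as two quotes.

```php
<?php
// Hypothetical helper (not part of PHP): quote one field using the
// rules the test file above exercises.
function csv_quote_field(string $field): string {
    // Quoting is only needed when the field contains a comma,
    // a double quote or a line break.
    if (strpbrk($field, ",\"\r\n") === false) {
        return $field;
    }
    // A literal quote inside a quoted field is written as two quotes.
    return '"' . str_replace('"', '""', $field) . '"';
}

// Hypothetical helper: join the quoted fields into one record.
function csv_build_record(array $fields): string {
    return implode(',', array_map('csv_quote_field', $fields)) . "\n";
}

// Rebuild the first two records of the test file from their field values.
echo csv_build_record(['this', '34', '23', "this has \"quotes\" \nand \nenters"]);
echo csv_build_record(['field', '45', '3', 'no_quotes']);
```

A reader that follows the spec has to undo exactly these two steps, which is why a newline inside quotes cannot be treated as the end of the record.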

Here is the working code.
----------------------------------------------------------------------------
function fgetcsvnew($filepointer){

  $midfield = 0;            // 1 WHILE INSIDE A QUOTED FIELD
  $buffer = "";             // CHARACTERS OF THE FIELD BEING READ
  $resultarray = array();   // COMPLETED FIELDS OF THIS RECORD
  $endofrecord = 0;
  while (!$endofrecord && ($currentline = fgets($filepointer)) !== false){
    // INCASE FORMAT COMING FROM A WINDOWS MACHINE
    $currentline = str_replace("\r\n","\n",$currentline);
    $length = strlen($currentline);

    // LOOP THRU CURRENT LINE
    for($j=0; $j<$length; $j++){
      // CHECK TO SEE IF THE CURRENT POSITION IS WITHIN A QUOTED FIELD
      switch($midfield){
        case 0:
          // THIS MARKS THE BEGINNING OF A QUOTED FIELD
          if ($currentline[$j] == "\""){
            $midfield = 1;
          // THIS MARKS THE END OF THE CURRENT FIELD, WHICH MAY BE EMPTY
          }elseif($currentline[$j] == ","){
            $resultarray[] = $buffer;
            $buffer = "";
          // THIS MARKS THE END OF THE CURRENT RECORD
          }elseif($currentline[$j] == "\n"){
            $resultarray[] = $buffer;
            $buffer = "";
            $endofrecord = 1;
          }else{
          // STILL WITHIN AN UNQUOTED FIELD SO STORE IN BUFFER
            $buffer .= $currentline[$j];
          }
          break;
        case 1:
          // CHECKS FOR THE FIRST QUOTE
          if ($currentline[$j] == "\""){
            // CHECKS FOR THE SECOND ESCAPED QUOTE
            if ($j+1 < $length && $currentline[$j+1] == "\""){
              // ADD THE ESCAPED QUOTE TO THE BUFFER
              $buffer .= "\"";
              $j++;  // SKIP LINE POINTER FORWARD ONE.
            // THIS MARKS THE END OF A QUOTED FIELD. THE NEXT COMMA OR NEWLINE STORES IT.
            }else{
              $midfield = 0;
            }
          // STILL WITHIN A QUOTED FIELD SO STORE IN BUFFER
          }else{
            $buffer .= $currentline[$j];
          }
          break;
        }
    }
  }
  // THE LAST RECORD MAY NOT END WITH A NEWLINE, SO STORE WHAT IS LEFT IN THE BUFFER
  if (!$endofrecord && ($buffer !== "" || count($resultarray) > 0)){
    $resultarray[] = $buffer;
  }
  if (count($resultarray) > 0){
    return($resultarray);
  }else{
    return(-1);
  }
}
// test for the function above
	
  $fp = fopen("test.csv", "r");
  $k = 0;
  while(is_array($test = fgetcsvnew($fp)) && ($k++<10)){
    echo "---$k--------------------------------------------\n";
    for ($i=0; $i<count($test);$i++){
      echo "**\n".$test[$i]."\n";
    }
  }

--------------------------------------------------------------

History

 [1999-05-23 23:35 UTC] rasmus at cvs dot php dot net
Report sent to Nick Talbott, who wrote the fgetcsv() function; either he will fix it, or I will at some point.
 [1999-06-20 11:49 UTC] sas at cvs dot php dot net
That was committed. Thanks for the code.
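
Whether a given PHP build handles the test data from the report can be checked with a short script against the built-in fgetcsv(). The following is a sketch under assumptions: it expects a recent PHP release (5.1 or later, where a length of 0 means no line-length limit), it uses an in-memory stream instead of test.csv, and it includes only three of the records from the report.

```php
<?php
// Three records taken from the test file in the report, including the
// one whose quoted field contains newlines.
$csv = "this,34,23,\"this has \"\"quotes\"\" \nand \nenters\"\n"
     . "field,45,3,no_quotes\n"
     . "normal,\"\"\"9\",3,\n";

// An in-memory stream keeps the sketch independent of files on disk.
$fp = fopen('php://memory', 'w+');
fwrite($fp, $csv);
rewind($fp);

$records = [];
// Length 0 = no line-length limit; delimiter, enclosure and escape
// characters are passed explicitly rather than relying on defaults.
while (($fields = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
    $records[] = $fields;
}
fclose($fp);

// Show how many records were read and the fields of each one.
echo count($records), " records\n";
foreach ($records as $record) {
    var_dump($record);
}
```

If the multi-line field is handled per the spec, the first record comes back as four fields and the line breaks stay inside the fourth.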
 