Bug #33847 fgetcsv incorrectly treats backslash
Submitted: 2005-07-25 01:00 UTC Modified: 2005-07-29 01:05 UTC
From: vasilyev at math dot uchicago dot edu Assigned:
Status: Not a bug Package: Filesystem function related
PHP Version: 5.0.4 OS: OS X (irrelevant)
Private report: No CVE-ID: None
 [2005-07-25 01:00 UTC] vasilyev at math dot uchicago dot edu
Description:
------------
This has already been raised in several bug reports, all of which were dismissed as bogus. Nevertheless, I believe there is a serious problem with the way fgetcsv() treats the backslash.

Example:

"a\","b"

produces

Array
(
    [0] => a\",b"
)

iliaa@php.net says that this is expected behavior, since the backslash is an escape character.

Well, if this were true then

"a\"b","c"

would give

Array
(
    [0] => a"b
    [1] => c
)

while in fact you get

Array
(
    [0] => a\"b
    [1] => c
)

Another scenario: what do you do if you want a backslash at the end of a field (and, say, the field contains commas, so it has to be quoted)? The natural answer is to escape the backslash:


"a\\","b"

but this would produce

Array
(
    [0] => a\\
    [1] => b
)

It seems that the only thing a backslash does is stop fgetcsv() from treating the following quote as an enclosure mark, without the backslash itself being stripped. This is not escaping.
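
The behavior described above can be modeled with a small toy parser. This is a sketch in Python, not PHP's actual implementation: `parse_line` is a hypothetical helper written only to reproduce the three results quoted in this report, and its handling of doubled quotes inside a quoted field is an assumption borrowed from common CSV practice.

```python
def parse_line(line, delimiter=",", enclosure='"', escape="\\"):
    """Toy model of the behavior reported here; NOT PHP's real code."""
    fields, field = [], []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        nxt = line[i + 1] if i + 1 < len(line) else None
        if in_quotes:
            if ch == escape and nxt is not None:
                # The backslash shields the next character from being read
                # as an enclosure, but the backslash itself is kept.
                field.append(ch + nxt)
                i += 2
                continue
            if ch == enclosure:
                if nxt == enclosure:
                    # Assumption: a doubled quote stands for one literal quote.
                    field.append(enclosure)
                    i += 2
                    continue
                in_quotes = False  # closing enclosure
            else:
                field.append(ch)
        elif ch == delimiter:
            fields.append("".join(field))
            field = []
        elif ch == enclosure and not field:
            in_quotes = True  # opening enclosure at the start of a field
        else:
            field.append(ch)
        i += 1
    fields.append("".join(field))
    return fields


print(parse_line(r'"a\","b"'))    # one field:  a\",b"
print(parse_line(r'"a\"b","c"'))  # two fields: a\"b and c
print(parse_line(r'"a\\","b"'))   # two fields: a\\ and b
```

In this model the backslash only suppresses the enclosure check and is copied through unchanged, which is why none of the three inputs ever loses a backslash.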

There are two ways this can be fixed:

1. Make the backslash a real escape character. This would make fgetcsv() deviate even further from the widespread understanding of the CSV format.

2. Treat the backslash like any other character.

I would prefer the second choice.
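
For comparison, the two options can be sketched with Python's standard `csv` module, used here purely as a stand-in for another CSV parser and not as a statement about PHP. The `parse` wrapper is a hypothetical helper; `escapechar` is a real parameter of `csv.reader`, and leaving it as `None` (the default) corresponds to option 2.

```python
import csv


def parse(line, escapechar=None):
    """Parse one CSV line; escapechar=None means backslash is ordinary."""
    return next(csv.reader([line], escapechar=escapechar))


# Option 2: the backslash is just another character, so the quote after it
# closes the field.
print(parse(r'"a\","b"'))                     # two fields: a\ and b

# Option 1: the backslash is a true escape character and is stripped.
print(parse(r'"a\"b","c"', escapechar="\\"))  # two fields: a"b and c
print(parse(r'"a\\","b"', escapechar="\\"))   # two fields: a\ and b
```

Under option 2 the first example from this report splits into two fields, and under option 1 an escaped quote or backslash comes back without its escaping backslash.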



 [2005-07-25 04:18 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The current behaviour is based on how various CSV parsers work; any deviation from that would result in a BC break. Therefore, this functionality is not going to change.
 [2005-07-29 01:05 UTC] vasilyev at math dot uchicago dot edu
I understand your concern that changing the behavior might break BC. However, leaving it unchanged makes fgetcsv() incompatible with other CSV applications (Gnumeric, for example). I suggest introducing a php.ini option that would make fgetcsv() work the old way.