Bug #43225 fputcsv incorrectly handles cells ending in \ followed by "
Submitted: 2007-11-09 14:59 UTC Modified: 2018-09-13 12:35 UTC
Votes:45
Avg. Score:4.4 ± 0.9
Reproduced:40 of 40 (100.0%)
Same Version:7 (17.5%)
Same OS:9 (22.5%)
From: ed at bronto dot com Assigned: cmb (profile)
Status: Duplicate Package: Filesystem function related
PHP Version: 5.2.4 OS: Centos
Private report: No CVE-ID: None
 [2007-11-09 14:59 UTC] ed at bronto dot com
Description:
------------
Using fputcsv to output a cell that ends with a \ followed by a double quote (") causes it to skip escaping that quote.  Oddly, fgetcsv is able to parse the result correctly.  Unlike fgetcsv, I assume fputcsv follows RFC 4180 and uses " as the escape character.

Reproduce code:
---------------
$row = array();
$row[] = 'a\\"';
$row[] = 'bbb';

$fp = fopen('test.csv', 'w+');
fputcsv($fp, $row);
fclose($fp);






Expected result:
----------------
expected output: "a\""",bbb

Actual result:
--------------
actual output: "a\"",bbb
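
The report can also be reproduced without writing test.csv. A minimal sketch, assuming a `php://memory` stream and the escape character passed explicitly as `\` (the value the report relies on as the default); the `csv_line` helper name is hypothetical:

```php
<?php
// Hypothetical helper: write one row with fputcsv() and return the raw line.
function csv_line(array $row, string $escape = '\\'): string
{
    $fp = fopen('php://memory', 'w+');
    // Every argument is passed explicitly so the call does not rely on defaults.
    fputcsv($fp, $row, ',', '"', $escape);
    rewind($fp);
    $line = stream_get_contents($fp);
    fclose($fp);
    return $line;
}

$row = array('a\\"', 'bbb');   // the first cell is: a\"
echo csv_line($row);           // compare with the "Actual result" line above
```

Comparing the echoed line with the expected and actual results above shows whether a given PHP build still leaves the quote after the backslash undoubled.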

History

 [2008-04-17 01:00 UTC] dan at expireddomain dot com
Same problem on windows XP PHP version 5.2.5 on cells that contain a \ followed by double quotes (")
 [2009-01-19 12:54 UTC] magicaltux@php.net
This bug is the same as bug #38918 and bug #38929.

* fputcsv() does escape values (replacing " with "", for example)
* It seems that fgetcsv() accepts two incompatible unescaping methods

Reproduced:
php > $fp = fopen('php://temp', 'r');
php > fputcsv($fp, array('foo', 'bar\\', 'baz'));
php > rewind($fp);
php > echo fgets($fp);
foo,"bar\",baz
php > rewind($fp);
php > var_dump(fgetcsv($fp));
array(2) {
  [0]=>
  string(3) "foo"
  [1]=>
  string(10) "bar\",baz
"
}
php > echo PHP_VERSION;
5.2.6-pl7-gentoo
php > 

I believe this problem is due to the fact that fgetcsv() accepts two escaping methods. An extra argument to fgetcsv() could (maybe?) fix this (and the extra argument could be added to fputcsv() too).
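
For illustration, here is how the line from the transcript parses once the reader is told to use only the doubled-quote method. This is a sketch that assumes PHP 7.4 or later, where the CSV functions accept an empty string as the escape argument; it uses `str_getcsv()` so no stream is needed:

```php
<?php
// The line produced in the transcript above: foo,"bar\",baz
$line = 'foo,"bar\\",baz';

// Empty escape argument: the backslash is treated as an ordinary character
// (assumes PHP 7.4+, where '' is accepted for this parameter).
$fields = str_getcsv($line, ',', '"', '');

var_dump($fields);
```

With the escape mechanism disabled, the closing quote after the backslash ends the second field, so the row comes back as three fields instead of two.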
 [2009-10-09 19:57 UTC] mbest at icontact dot com
magicaltux@php.net is wrong.  This bug is not about fgetcsv but about fputcsv.  fputcsv should always escape a double quote as two double quotes, but it doesn't do so if the field contains \" (a backslash followed by a double quote).  This will mess up the CSV output such that it cannot be imported into Excel or other such programs.
 [2011-04-08 20:46 UTC] jani@php.net
-Package: Feature/Change Request +Package: Filesystem function related
 [2013-01-15 07:05 UTC] aharvey@php.net
I've found the cause of this while writing tests for PR 197.

php_fputcsv(), while iterating over the fields to be output, has this fairly odd "escaped" concept — once escape_char (which is hardcoded \ at present) is seen, escaping stops until the next enclosure (" by default) is seen. It doesn't matter whether it's the following character or not.

After that, escape_char is then ignored anyway, and the enclosure is used as the "escape" character.

This came in via https://github.com/php/php-src/commit/af0adbed3963cdee1bfaf5e3a74b029d2b92c8b7 seven years ago to make the use of enclosures optional — the feature in general is good, but this is definitely an issue in the implementation.
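
The flag behaviour described in this comment can be modelled in a few lines of PHP. This is only a sketch of the description above, not a transcription of php_fputcsv(); the function name `model_quote_field` and its parameters are hypothetical:

```php
<?php
// Hypothetical model of the quoting loop as described in the comment above.
function model_quote_field(string $field, string $enclosure = '"', string $escape = '\\'): string
{
    $out = $enclosure;
    $escaped = false;
    $len = strlen($field);
    for ($i = 0; $i < $len; $i++) {
        $ch = $field[$i];
        if ($ch === $escape) {
            $escaped = true;          // escape_char seen: stop doubling
        } elseif ($ch === $enclosure) {
            if (!$escaped) {
                $out .= $enclosure;   // normal case: double the enclosure
            }
            $escaped = false;         // the flag clears only at an enclosure
        }
        $out .= $ch;
    }
    return $out . $enclosure;
}

echo model_quote_field('a\\"'), "\n";       // field value: a\"
echo model_quote_field('say "hi"'), "\n";   // no backslash involved
```

In this model the quote after a backslash is emitted once, while quotes in a field without a backslash are doubled as usual.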
 [2013-01-15 07:05 UTC] aharvey@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: aharvey
 [2013-01-15 07:35 UTC] aharvey@php.net
-Status: Assigned +Status: Closed -Type: Feature/Change Request +Type: Bug
 [2013-01-15 09:43 UTC] aharvey@php.net
Sadly, that was a classic case of fixing one thing and breaking another. Reopening, and unassigning myself.
 [2013-01-15 09:43 UTC] aharvey@php.net
-Status: Closed +Status: Re-Opened -Assigned To: aharvey +Assigned To:
 [2014-08-21 23:55 UTC] maris dot radu+phpnet at gmail dot com
Just so people aren't confused: aharvey@php.net said it's fixed in "5.3.22 and 5.4.12", but in fact I tested in 5.4.31 and it's not fixed.

Looking at the patch commit, it seems it's tagged with php-5.6.0beta4, so I guess it will only be available in 5.6.
 [2014-09-17 12:46 UTC] lbarnaud@php.net
To summarize the related bugs: fputcsv() seems to be using inconsistent and broken escaping of the enclosing char (`"`):

 - `"` preceded by `\` is not escaped

 - `"` not preceded by `\` is escaped by doubling it (e.g. `"` becomes `""`), which makes the escaping method inconsistent

 - `\` itself is not escaped, which generates invalid CSV if a field ends with `\`: `"foo bar\",baz`

 - With the input `\\"`, the `"` is still considered to be escaped

Due to this combination of bugs, it is impossible to parse the CSV generated by the following call:

    fputcsv(STDOUT, ['foo\"bar', 'foo""bar', 'foo bar\\']);

Output is:

    "foo\"bar","foo""""bar","foo bar\"

Trying to parse this with a parser using the doubled-char escaping method will break on the first field.

Trying to parse this with a parser using the backslash escaping method will break on the 2nd and 3rd fields.

Trying to parse this with a parser allowing both methods will break on the 3rd field. Without the 3rd field, parsing this CSV document would result in loss of information (some `\` or `"` from the original input would be lost).
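
For comparison, a writer that uses only the doubled-char method is short to write by hand. The following is a minimal sketch under that assumption; the function names `rfc4180_field` and `rfc4180_line` are hypothetical, and the CRLF line ending follows RFC 4180 rather than fputcsv():

```php
<?php
// Hypothetical helper: encode one field using only quote doubling.
function rfc4180_field(string $value): string
{
    // Quote only fields that contain a delimiter, a quote, or a line break.
    if (strpbrk($value, ",\"\r\n") === false) {
        return $value;
    }
    return '"' . str_replace('"', '""', $value) . '"';
}

// Hypothetical helper: encode a whole row, terminated with CRLF.
function rfc4180_line(array $fields): string
{
    return implode(',', array_map('rfc4180_field', $fields)) . "\r\n";
}

echo rfc4180_line(['foo\"bar', 'foo""bar', 'foo bar\\']);
```

Because the backslash has no special meaning here, a parser that uses the doubled-char method can read all three fields back without losing any `\` or `"`.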
 [2015-07-25 22:14 UTC] marc at ermshaus dot org
The issues with PHP’s CSV functions also seem to be exploitable.

<?php

$handle = fopen('php://memory', 'w+b');

fputcsv($handle, [
    'foo bar\\',    # 1
    'baz quz',      # 2
    'x',            # 3
    'y',            # 4
    'z',            # 5
    'foo\\\\",bar'  # 6
]);

rewind($handle);

var_dump(fgetcsv($handle));

// array(6) {
//   [0]=>  string(18) "foo bar\",baz quz""    # 1
//   [1]=>  string(1)  "x"                     # 2  (was # 3)
//   [2]=>  string(1)  "y"                     # 3  (was # 4)
//   [3]=>  string(1)  "z"                     # 4  (was # 5)
//   [4]=>  string(5)  "foo\\"                 # 5
//   [5]=>  string(4)  "bar""                  # 6
// }

3v4l.org: http://3v4l.org/LTnC1
 [2015-09-16 04:33 UTC] plentysu at kkbox dot com
One way to `disable` the escape mechanism: fputcsv($handle, $array, ',', '"', "\0")
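
A sketch of that workaround applied to the row from the original report. It assumes that PHP 7.4 and later accept an empty string as the escape argument, and falls back to the "\0" trick on older versions:

```php
<?php
// '' disables the escape mechanism on PHP 7.4+; "\0" is the older workaround.
$escape = PHP_VERSION_ID >= 70400 ? '' : "\0";

$fp = fopen('php://memory', 'w+');
fputcsv($fp, array('a\\"', 'bbb'), ',', '"', $escape);   // first cell is: a\"
rewind($fp);
$line = stream_get_contents($fp);
fclose($fp);

echo $line;   // compare with the "Expected result" of the original report
```

The "\0" variant only works as long as the data itself contains no NUL bytes.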
 [2017-04-25 09:38 UTC] enumag at gmail dot com
Still not fixed after almost 10 years? The thing is this can result in formula injection.

http://blog.securelayer7.net/how-to-perform-csv-excel-macro-injection/

For example I have this string (it should go into one cell):

=,test'\","",=cmd|' /C calc'!A0"

According to the article, I sanitize it by escaping the = at the beginning with an apostrophe.

'=,test'\","",=cmd|' /C calc'!A0"

Because of this bug, however, this string can still cause an injection: the bug splits the string into multiple cells, and the second = is not escaped.
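
A sketch of how the sanitized cell can be kept in one piece by combining the apostrophe prefix with a disabled escape mechanism. The helper name `neutralize_formula` is hypothetical, and the set of trigger characters (=, +, -, @) is an assumption; the comment above only mentions =:

```php
<?php
// Hypothetical helper: prefix cells that a spreadsheet might treat as a formula.
function neutralize_formula(string $cell): string
{
    return preg_match('/^[=+\-@]/', $cell) === 1 ? "'" . $cell : $cell;
}

// The string from the comment above, kept verbatim in a nowdoc.
$cell = <<<'TXT'
=,test'\","",=cmd|' /C calc'!A0"
TXT;

// '' disables the escape mechanism on PHP 7.4+; "\0" is the older workaround.
$escape = PHP_VERSION_ID >= 70400 ? '' : "\0";

$fp = fopen('php://memory', 'w+');
fputcsv($fp, array(neutralize_formula($cell), 'next'), ',', '"', $escape);
rewind($fp);
$read = fgetcsv($fp, 0, ',', '"', $escape);
fclose($fp);

var_dump(count($read));   // number of cells read back
```

With the escape mechanism disabled, every `"` in the cell is doubled on output, so the row reads back as the same two cells and the second = stays inside the first one.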
 [2017-09-11 23:53 UTC] cmb@php.net
> Still not fixed after almost 10 years?

It is not that simple. We first have to understand why fputcsv() and
friends support an escape character, which is an unrecognized
concept for RFC 4180; however, this RFC is informal and never made
it to a standard, so an escape character may make perfect sense.
But how exactly is it supposed to work?

Anyhow, "disabling" the escape character as plentysu already said,
is sufficient to get the desired result, see
<https://3v4l.org/eW2Mt>. This should be possible in a more
intuitive way, though, see request #51496.
 [2018-09-13 12:35 UTC] cmb@php.net
-Status: Re-Opened +Status: Duplicate -Assigned To: +Assigned To: cmb
 [2018-09-13 12:35 UTC] cmb@php.net
Well, this behavior has been there for so many years that we can't
change it easily for BC reasons (and frankly, the sense of *this*
escaping escapes me).  Therefore I'm closing this ticket as
duplicate of request #38301 and request #51496, respectively.
 