Bug #80256 file_get_contents strip first line with chunked encoding redirect
Submitted: 2020-10-19 11:16 UTC Modified: 2020-10-20 13:06 UTC
From: code at luigifab dot fr Assigned: nikic (profile)
Status: Closed Package: Streams related
PHP Version: 8.0.0RC2 OS: Debian with Docker
Private report: No CVE-ID: None
 [2020-10-19 11:16 UTC] code at luigifab dot fr
Description:
------------
I am trying to read an export of a Google Sheet with file_get_contents.
With PHP 8, the first line of the export is missing from the returned data.

Test script:
---------------
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);

// Fetch the published sheet as TSV; the server answers with a redirect first.
$data = file_get_contents('https://docs.google.com/spreadsheets/d/e/2PACX-1vTqS3j4Wd-Bt7Zb52eJiQed_NilvKo0wGdw8noL4vhFOPsUeV9O6EN8odni6YepDGicYApcJ4Zy5opv/pub?gid=1790927668&single=true&output=tsv');
echo 'version:   ',PHP_VERSION,"\n";
echo 'mb_strlen: ',mb_strlen($data),"\n";
// Show only the first 50 bytes of the response body.
echo substr(print_r($data, true), 0, 50),"\n";


Expected result:
----------------
The correct output, as produced by PHP 7.2 to 7.4:

php7.2 /var/www/filegetcontents.php
version:   7.2.34-4+0~20201018.51+debian10~1.gbpc553f7
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

php7.3 /var/www/filegetcontents.php
version:   7.3.23-4+0~20201018.71+debian10~1.gbpfc8934
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

php7.4 /var/www/filegetcontents.php
version:   7.4.11
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

(I have replaced the tab characters with \t)

Actual result:
--------------
php8.0 /var/www/filegetcontents.php 
version:   8.0.0RC2
mb_strlen: 186973
\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t
\t540 cases do

(I have replaced the tab characters with \t)

History

 [2020-10-19 11:47 UTC] cmb@php.net
-Status: Open
+Status: Verified
-Package: *General Issues
+Package: Streams related
 [2020-10-20 11:02 UTC] nikic@php.net
Seems to be related to the switch to HTTP 1.1. I get the old result when forcing HTTP 1.0 using

    $ctx = stream_context_create(['http' => ['protocol_version' => '1.0']]);
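For illustration, a fuller sketch of this workaround. The helper names `http10_context` and `fetch_http10` are hypothetical and not part of the report. The assumption is that a server should not use chunked transfer encoding when the request is made as HTTP/1.0, so the dechunk filter never comes into play.

```php
<?php
// Hypothetical helper: build a stream context that forces HTTP/1.0.
function http10_context()
{
    return stream_context_create([
        'http' => ['protocol_version' => '1.0'],
    ]);
}

// Hypothetical helper: fetch a URL using the HTTP/1.0 context.
// Returns the body as a string, or false on failure.
function fetch_http10(string $url)
{
    return file_get_contents($url, false, http10_context());
}
```

Calling `fetch_http10()` with the URL from the test script should then go through the HTTP/1.0 path described above. This only works around the problem and does not fix it.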
 [2020-10-20 11:12 UTC] nikic@php.net
Presumably related to the use of chunked transfer encoding in some way.
 [2020-10-20 12:59 UTC] nikic@php.net
I believe the problem is that we first receive a redirect with chunked transfer encoding, and then an actual response with chunked transfer encoding, and end up applying the dechunk filter twice due to that.
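To illustrate the hypothesis, here is a minimal sketch that applies PHP's built-in `dechunk` stream filter once and then twice to the same chunked data. It is not the php-src code path for HTTP redirects. The helpers `chunk_encode` and `read_dechunked` are hypothetical, and the sample body is made up to resemble the export in the report.

```php
<?php
// Hypothetical helper: encode a body with chunked transfer encoding.
function chunk_encode(string $body, int $size = 16): string
{
    $out = '';
    foreach (str_split($body, $size) as $piece) {
        // Each chunk: hex length, CRLF, data, CRLF.
        $out .= dechex(strlen($piece)) . "\r\n" . $piece . "\r\n";
    }
    // A zero-length chunk terminates the body.
    return $out . "0\r\n\r\n";
}

// Hypothetical helper: read $wire through the "dechunk" filter,
// appended $times times to the same stream.
function read_dechunked(string $wire, int $times): string
{
    $fp = fopen('php://memory', 'w+');
    fwrite($fp, $wire);
    rewind($fp);
    for ($i = 0; $i < $times; $i++) {
        stream_filter_append($fp, 'dechunk', STREAM_FILTER_READ);
    }
    $out = stream_get_contents($fp);
    fclose($fp);
    return $out;
}

// Made-up sample data, loosely modelled on the export in the report.
$body  = "config\ten-US\tfr-FR\n\t540 cases\n";
$wire  = chunk_encode($body);
$once  = read_dechunked($wire, 1);
$twice = read_dechunked($wire, 2);

var_dump($once === $body);   // one filter: body recovered intact
var_dump($twice === $body);  // two filters: body no longer matches
```

If I read the filter correctly, the second pass sees the leading `c` of `config` as a hexadecimal chunk size and discards the rest of the first line as a chunk extension. That would be consistent with the missing first line in the report. The exact corrupted output depends on the body contents.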
 [2020-10-20 13:06 UTC] nikic@php.net
-Summary: file_get_contents strip first line
+Summary: file_get_contents strip first line with chunked encoding redirect
-Status: Verified
+Status: Assigned
-Assigned To:
+Assigned To: nikic
 [2020-10-20 13:36 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=1c157d3fa2e102ccf375ec4c1ddca8770208dae7
Log: Fixed bug #80256
 [2020-10-20 13:36 UTC] nikic@php.net
-Status: Assigned
+Status: Closed
 