Bug #80256 file_get_contents strip first line with chunked encoding redirect
Submitted: 2020-10-19 11:16 UTC
Modified: 2020-10-20 13:06 UTC
From: code at luigifab dot fr
Assigned: nikic
Status: Closed
Package: Streams related
PHP Version: 8.0.0RC2
OS: Debian with Docker
Private report: No
CVE-ID: None
 [2020-10-19 11:16 UTC] code at luigifab dot fr
Description:
------------
I am trying to read a Google Sheets export with file_get_contents.
With PHP 8, the first line of the export is missing from the returned data.

Test script:
---------------
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);

$data = file_get_contents('https://docs.google.com/spreadsheets/d/e/2PACX-1vTqS3j4Wd-Bt7Zb52eJiQed_NilvKo0wGdw8noL4vhFOPsUeV9O6EN8odni6YepDGicYApcJ4Zy5opv/pub?gid=1790927668&single=true&output=tsv');
echo 'version:   ',PHP_VERSION,"\n";
echo 'mb_strlen: ',mb_strlen($data),"\n";
echo substr(print_r($data, true), 0, 50),"\n";


Expected result:
----------------
The truth:

php7.2 /var/www/filegetcontents.php
version:   7.2.34-4+0~20201018.51+debian10~1.gbpc553f7
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

php7.3 /var/www/filegetcontents.php
version:   7.3.23-4+0~20201018.71+debian10~1.gbpfc8934
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

php7.4 /var/www/filegetcontents.php
version:   7.4.11
mb_strlen: 187473
config\ten-US (english/Simple)\t\tfr-FR (français/Fr

(I have replaced the tabulation character with \t)

Actual result:
--------------
php8.0 /var/www/filegetcontents.php 
version:   8.0.0RC2
mb_strlen: 186973
\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t
\t540 cases do

(I have replaced the tabulation character with \t)

 [2020-10-19 11:47 UTC] cmb@php.net
-Status: Open
+Status: Verified
-Package: *General Issues
+Package: Streams related
 [2020-10-20 11:02 UTC] nikic@php.net
Seems to be related to the switch to HTTP 1.1. I get the old result when forcing HTTP 1.0 using

    $ctx = stream_context_create(['http' => ['protocol_version' => '1.0']]);
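For anyone who needs the workaround until the fix ships, here is a minimal sketch of passing that context to file_get_contents. The helper names http10_context and fetch_with_http10 are made up for illustration, and $url stands for whatever address you are fetching.

```php
<?php
// Hypothetical helper: build a stream context that forces HTTP/1.0,
// so the server should not answer with chunked transfer encoding.
function http10_context()
{
    return stream_context_create([
        'http' => ['protocol_version' => '1.0'],
    ]);
}

// Hypothetical helper: fetch $url through the HTTP/1.0 context.
// Returns the body as a string, or false on failure.
function fetch_with_http10(string $url)
{
    // Third argument is the context; second is use_include_path.
    return file_get_contents($url, false, http10_context());
}
```

Call it as fetch_with_http10($url) in place of the plain file_get_contents($url) from the test script.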
 [2020-10-20 11:12 UTC] nikic@php.net
Presumably related to the use of chunked transfer encoding in some way.
 [2020-10-20 12:59 UTC] nikic@php.net
I believe the problem is that we first receive a redirect with chunked transfer encoding, and then an actual response with chunked transfer encoding, and end up applying the dechunk filter twice due to that.
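A rough way to see why a second pass of the dechunk filter is harmful, outside of the HTTP wrapper: the sketch below runs the built-in dechunk stream filter once and then twice over a hand-made chunked body. The helpers chunk_encode and dechunk_once and the sample body are assumptions for illustration only; this is not the code path the HTTP wrapper uses.

```php
<?php
// Hypothetical helper: wrap $body in a single HTTP chunk plus the terminating chunk.
function chunk_encode(string $body): string
{
    return dechex(strlen($body)) . "\r\n" . $body . "\r\n0\r\n\r\n";
}

// Hypothetical helper: run $data through PHP's built-in "dechunk" read filter once.
function dechunk_once(string $data): string
{
    $fp = fopen('php://memory', 'w+');
    fwrite($fp, $data);
    rewind($fp);
    stream_filter_append($fp, 'dechunk', STREAM_FILTER_READ);
    $out = stream_get_contents($fp);
    fclose($fp);
    return $out;
}

// Sample body (made up). Like the real export, it starts with "c",
// which a chunk parser can read as a hexadecimal chunk size.
$body  = "config\ten-US\nsecond row\nthird row\n";

$once  = dechunk_once(chunk_encode($body)); // decoded the way it should be
$twice = dechunk_once($once);               // already-decoded data decoded again

var_dump($once === $body);   // single pass restores the body
var_dump($twice === $body);  // second pass does not leave it intact
var_dump($twice);
```

The second pass treats the start of the real payload as a chunk-size line, which is consistent with the first line going missing in the report.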
 [2020-10-20 13:06 UTC] nikic@php.net
-Summary: file_get_contents strip first line
+Summary: file_get_contents strip first line with chunked encoding redirect
-Status: Verified
+Status: Assigned
-Assigned To:
+Assigned To: nikic
 [2020-10-20 13:36 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=1c157d3fa2e102ccf375ec4c1ddca8770208dae7
Log: Fixed bug #80256
 [2020-10-20 13:36 UTC] nikic@php.net
-Status: Assigned
+Status: Closed
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Nov 21 11:01:29 2024 UTC