Bug #80931 file_get_contents() hangs with HTTP/1.1 if server doesn't close connection
Submitted: 2021-04-03 14:08 UTC Modified: 2021-05-11 13:30 UTC
Votes:14
Avg. Score:4.6 ± 0.5
Reproduced:12 of 13 (92.3%)
Same Version:8 (66.7%)
Same OS:10 (83.3%)
From: gilperon at gmail dot com Assigned:
Status: Verified Package: Streams related
PHP Version: 7.4 OS: any
Private report: No CVE-ID: None
 [2021-04-03 14:08 UTC] gilperon at gmail dot com
Description:
------------
In every version of PHP before 8 this bug didn't happen (I've been using it since 2010), but on PHP 8 the code below hangs indefinitely. Tested on Windows 7, Windows 10, CentOS 8 and CentOS 7. The code just tries to access an external API and, so far, I have only managed to trigger the bug with the specific URL below (from the API).

<?php

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml");

echo $response;

?>

BUT if I add a timeout, the code above works just fine:

<?php

$context = stream_context_create(array(

	"http" => array(
	
		"timeout" => 2
		
	)

));

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml",0,$context);

echo $response;

?>

I must say that using CURL works just fine, without any timeout, on any version of PHP. So this is a `file_get_contents` problem and the only way I am finding to solve this problem in the short term is using CURL.

Test script:
---------------
<?php

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml");

echo $response;

?>

Expected result:
----------------
Response should be printed.

Actual result:
--------------
The code hangs.

Patches

Pull Requests

Pull requests:

History

 [2021-04-07 02:27 UTC] carusogabriel@php.net
This bug is reproducible in macOS (v10.15.7) as well. And we can test it with a smaller script:

```
<?php

$context = stream_context_create(['http' => ['timeout' => 2]]);

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);
```
 [2021-04-07 02:39 UTC] carusogabriel@php.net
-Summary: FILE_GET_CONTENTS HANGS ON PHP 8
+Summary: file_get_contents() hangs on PHP 8
-Package: *General Issues
+Package: Streams related
 [2021-04-07 02:39 UTC] carusogabriel@php.net
Correction: an even smaller script (without the timeout) shows it not working in PHP 8.0:

```
<?php

$context = stream_context_create();

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);
```

@gilperon What is the value of your `default_socket_timeout` php.ini entry? That might explain why it is taking forever: with the default of `60`, I waited and the script did finish locally :)
 [2021-04-07 13:16 UTC] gilperon at gmail dot com
@carusogabriel@php.net the value of my `default_socket_timeout` is 60. I changed it to 5 seconds, restarted my server, and then I executed your code below:


<?php

$context = stream_context_create();

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);

?>


It always echoes nothing (blank string) after 5 seconds. However, the URL used above should return:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<Servicos></Servicos>

So setting `default_socket_timeout` to a smaller value does not work because the script still lasts all the time configured in `default_socket_timeout` and returns nothing.
 [2021-04-07 15:37 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2021-04-07 15:37 UTC] cmb@php.net
I can confirm the reported behavior.  The request is sent, the
response headers are read, but for some reason select(2) times out
under PHP 8.0, but not under PHP 7.4.  After a few timeouts, the
response body is finally read, though.
 [2021-04-07 16:54 UTC] danack@php.net
Somewhat astoundingly, the difference appears to come from the remote server.

PHP7
recvfrom(3, "HTTP/1.1 200 OK\r\ncache-control: private\r\ncontent-type: text/xml; charset=iso-8859-1\r\nexpires: Wed, 07 Apr 2021 16:38:47 GMT\r\nserver: Microsoft-IIS/10.0\r\nx-aspnet-version: 4.0.30319\r\nset-cookie: ASP.NET_SessionId=1et22nqgou4jk0ecrns5opu3; path=/; HttpOnly; SameSite=Lax\r\nx-powered-by: ASP.NET\r\ndate: Wed, 07 Apr 2021 16:38:46 GMT\r\ncontent-length: 464\r\nconnection: close\r\n\r\n<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n<Servicos><cServico>

PHP8
recvfrom(3, "HTTP/1.1 200 OK\r\ncache-control: private\r\ncontent-type: text/xml; charset=iso-8859-1\r\nexpires: Wed, 07 Apr 2021 16:40:02 GMT\r\nserver: Microsoft-IIS/10.0\r\nx-aspnet-version: 4.0.30319\r\nset-cookie: ASP.NET_SessionId=4wf1fkvdhyzzz1czsdoqdd01; path=/; HttpOnly; SameSite=Lax\r\nx-powered-by: ASP.NET\r\ndate: Wed, 07 Apr 2021 16:40:02 GMT\r\ncontent-length: 464\r\n\r\n<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n<Servicos><cServico>

The response under PHP 7 includes a "connection: close" header, the response under PHP 8 doesn't.

I can't currently see any difference in the way the connection is being made that would cause that difference.
 [2021-04-07 16:57 UTC] nikic@php.net
PHP 8 uses HTTP 1.1 instead of HTTP 1.0 by default, maybe that's related?
 [2021-04-07 16:59 UTC] danack@php.net
Oh, the difference is PHP 7 set HTTP/1.0, PHP 8 does HTTP/1.1
 
"GET /calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml HTTP/1.0\r\nHost: ws.correios.com.br\r\nConnection: close\r\n\r\n"

"GET /calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml HTTP/1.1\r\nHost: ws.correios.com.br\r\nConnection: close\r\n\r\n"
 [2021-04-07 17:21 UTC] gilperon at gmail dot com
I just want to point out that Correios (http://correios.com.br/), whose API I am using and which is the one that makes `file_get_contents` not work as expected, is the largest shipping organization in Brazil (96% market share), and it's owned by the Brazilian government. I am pretty sure many, many devs use this API with PHP, and I am probably one of the few to face this problem because many devs here in Brazil are probably still using older versions of PHP. But I am pretty sure it will start bothering many other devs as they update their PHP to 8.x and start seeing their code breaking. Just a side note in case you think I am using a "special URL" only created to make this problem happen (the URL I provided is the official government endpoint of their webhook).
 [2021-04-07 20:11 UTC] danack@php.net
Setting the protocol back to 1.0 with the protocol version option appears to make it work on 8. I'll need to actually inspect the packets to have a deeper look.


> But I am pretty sure it will start bothering many other devs as they
> update their PHP to 8.x and start seeing their code breaking.

To set your expectations: if this is a bug on the remote server, then it's unlikely we would write a hack around it. People can either use a workaround in their code, stay on PHP 7, or ask the person who owns that site to fix it. It's not feasible to put workarounds in PHP core for all the buggy servers out there.



<?php

$context = stream_context_create(array(
    "http" => array(
        "protocol_version" => "1.0"
    )
));

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml", 0, $context);

echo "response length is " . strlen($response) . "\n";
 [2021-04-07 20:43 UTC] gilperon at gmail dot com
danack@php.net Respectfully, manually setting the protocol to 1.0 feels more like a hack to me than anything else. 99% of the time, devs have no control over the API response; they cannot fix a server's misconfiguration just so their code works. Devs expect file_get_contents to simply work. Curl works just fine with this URL while file_get_contents is buggy. I don't expect Curl to have lots of hacks, and I am pretty sure Curl has a clean way of handling a server that doesn't close the connection, and so should PHP. No hacks, I agree with you; hacks are terrible and will probably break something in the future.
 [2021-04-08 12:23 UTC] cmb@php.net
-PHP Version: 8.0.3
+PHP Version: 7.4
 [2021-04-08 12:23 UTC] cmb@php.net
First, this is not a regression in PHP 8.0, but rather a general
issue with HTTP/1.1.  It seems to me the problem is that the
server does *not* close the connection right away after having
sent the response under HTTP/1.1, which appears to be legit
behavior.  After having received the full response, our HTTP stream
implementation still tries to select(2) the sole readfd, but the
server won't send more data, so the timeout occurs.

FWIW, if I add a Connection:keep-alive header to the context
options, I can reproduce the behavior of the server locally.
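
As a rough illustration of the alternative (stopping once the announced body length has arrived rather than waiting for the peer to close), here is a minimal userland sketch. `read_response_by_length()` is a hypothetical helper, not anything in php-src, and it assumes the response carries a Content-Length header and no Transfer-Encoding.

```php
<?php

/**
 * Read one HTTP response from an already-open stream and stop as soon as
 * the body is complete according to Content-Length, instead of waiting
 * for the peer to close the connection.
 *
 * Hypothetical helper for illustration only.
 *
 * @param resource $stream positioned at the status line
 * @return array{headers: string[], body: string}
 */
function read_response_by_length($stream): array
{
    $headers = [];
    $length = null;

    // The header section ends at the first empty line.
    while (($line = fgets($stream)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            break;
        }
        $headers[] = $line;
        // Header field names are case-insensitive.
        if (preg_match('/^content-length:\s*(\d+)\s*$/i', $line, $m)) {
            $length = (int) $m[1];
        }
    }

    if ($length === null) {
        throw new RuntimeException('No Content-Length: this sketch cannot tell where the body ends.');
    }

    // Read exactly $length bytes; fread() may return fewer bytes per call.
    $body = '';
    while (strlen($body) < $length) {
        $chunk = fread($stream, $length - strlen($body));
        if ($chunk === false || $chunk === '') {
            break; // EOF or timeout before the body was complete
        }
        $body .= $chunk;
    }

    return ['headers' => $headers, 'body' => $body];
}
```

The same function could be pointed at a socket opened with `fsockopen()` after writing the request by hand; it returns once the body is complete, whether or not the server keeps the connection open.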
 [2021-04-08 12:57 UTC] danack@php.net
Okay, so looking at the packets, what is happening, from the response on is:

# http 1_0 protocol
6. server sends response.
7. php acks 6.
8. server sends finack.
9. php sends finack.
10. servers acks 9.


# http 1_1 protocol
6. server sends response.
7. php acks 6.
8. php sends finack.
9. server acks 8.
10. server sends finack.
11. php acks 10.

That all looks correct, but the difference is that for http 1.1 the client is initiating the connection close. In http 1.0 the server is initiating the connection close.

All the packets look okay, according to https://gitlab.com/wireshark/wireshark/-/wikis/TCP-4-times-close

The problem seems to be that for whatever reason, after sending the last ack, PHP is sitting around doing nothing. btw it does time out after 2 * 60 seconds, which probably confirms the socket is in the appropriate TIME-WAIT status.
 [2021-04-08 13:43 UTC] rowan dot collins at gmail dot com
Note that the PHP client code always sends a "Connection: Close" header in the request, for both HTTP/1.0 and HTTP/1.1 requests: https://heap.space/xref/php-src/ext/standard/http_fopen_wrapper.c?r=5787f91c#570

For some reason, the server appears to only be honouring that for HTTP/1.0 requests, which makes no sense, because it's an HTTP/1.1 feature.
 [2021-04-08 18:00 UTC] rowan dot collins at gmail dot com
OK, I think I have figured out what's happening here:

* If you send the server an HTTP/1.1 request with the header "Connection: Close", it acknowledges with "connection: close"; if you send "Connection: close" (as PHP does), it does not acknowledge it, and presumably defaults to Connection: Keep-Alive
* RFC 7230 clearly states that "Connection options are case-insensitive." so this is definitely a bug in the server. https://tools.ietf.org/html/rfc7230#section-6.1
* A local and up to date IIS server does not exhibit the bug.
* The server at ws.correios.com.br is probably running an old version of IIS. The headers include "x-aspnet-version: 4.0.30319" which was released sometime between 2010 and 2012
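
For reference, a case-insensitive check of the Connection options might look like the sketch below. `has_connection_option()` is a hypothetical helper name; it takes raw header lines and treats the field value as a comma-separated list, as RFC 7230 describes.

```php
<?php

/**
 * Return true if any Connection header line lists the given option.
 * Both the field name and the option are compared case-insensitively.
 *
 * Hypothetical helper for illustration only.
 *
 * @param string[] $headerLines raw lines such as "Connection: Close"
 */
function has_connection_option(array $headerLines, string $option): bool
{
    foreach ($headerLines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2 || strcasecmp(trim($parts[0]), 'Connection') !== 0) {
            continue; // status line or some other header
        }
        // The field value is a comma-separated list of options.
        foreach (explode(',', $parts[1]) as $token) {
            if (strcasecmp(trim($token), $option) === 0) {
                return true;
            }
        }
    }
    return false;
}
```

With this comparison, "Connection: close" and "Connection: Close" are treated identically, which is the behaviour the RFC asks of the server.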
 [2021-04-08 18:20 UTC] rowan dot collins at gmail dot com
To confirm the below is enough to make this particular server respond in a way that PHP handles OK:


diff --git a/ext/standard/http_fopen_wrapper.c b/ext/standard/http_fopen_wrapper.c
index da822d9160..7fdfa9448c 100644
--- a/ext/standard/http_fopen_wrapper.c
+++ b/ext/standard/http_fopen_wrapper.c
@@ -574,7 +574,7 @@ finish:
         * HTTP/1.0 to avoid issues when the server respond with a HTTP/1.1
         * keep-alive response, which is the preferred response type. */
        if ((have_header & HTTP_HEADER_CONNECTION) == 0) {
-               smart_str_appends(&req_buf, "Connection: close\r\n");
+               smart_str_appends(&req_buf, "Connection: Close\r\n");
        }

        if (context &&



However, the correct solution is probably to make the PHP implementation forcefully close connections when the server defaults to Keep-Alive behaviour.
 [2021-04-08 18:34 UTC] gilperon at gmail dot com
I think you guys could check how CURL does it, because it handles this "buggy correios server" well; it does not hang.

Also, could you please take a look at https://stackoverflow.com/questions/34864179/prevent-php-http-wrapper-from-waiting-for-close-of-persistent-connection ? This looks like a similar problem from 5 years ago that I just found while talking with people on a Discord server.
 [2021-04-09 13:11 UTC] kelunik@php.net
Simple reproduce script:

php -r '$server = stream_socket_server("tcp://127.0.0.1:8080"); while ($client = stream_socket_accept($server)) { fwrite($client, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\n\r\n"); }'

php -r 'file_get_contents("http://127.0.0.1:8080/");'
 [2021-04-09 13:14 UTC] kelunik@php.net
-Assigned To:
+Assigned To: kelunik
 [2021-04-09 13:15 UTC] kelunik@php.net
-Summary: file_get_contents() hangs on PHP 8
+Summary: file_get_contents() hangs with HTTP/1.1 if server doesn't close connection
 [2021-04-09 14:24 UTC] kelunik@php.net
-Assigned To: kelunik
+Assigned To:
 [2021-04-09 14:25 UTC] kelunik@php.net
https://gist.github.com/kelunik/c82ce751c1c203806b10ef7326f3e56a fixes it for chunked encoding, but still fails with a content-length or if auto_decode = false.
 [2021-04-09 17:06 UTC] cmb@php.net
-Assigned To:
+Assigned To: cmb
 [2021-04-14 03:45 UTC] twosee@php.net
In my opinion, this is a PHP design problem for all versions.
`file_get_contents("http://*")` (the PHP http stream wrapper) always depends on the behavior of the server. It always expects recv to return 0 and uses this to detect the end of the response.
But if the peer does not close the connection, it will wait for data forever.
In this case, even though we set "Connection: close" when the protocol version is 1.1, the server still treats it as a persistent connection. The server has implementation problems of its own, but it did expose the problem in PHP.
So we can reproduce this problem on almost all websites:

<?php
$context = stream_context_create(['http' => ['protocol_version' => 1.1, 'header' => ['Connection: keep-alive']]]);
echo file_get_contents("http://www.baidu.com", 0, $context); // largest search engine in China
// hang...
 [2021-04-14 08:27 UTC] cmb@php.net
> In my opinion, this is a PHP design problem for all versions.

I agree.  I'm working on a possible fix[1].

[1] <https://github.com/php/php-src/compare/master...cmb69:cmb/80931>
 [2021-04-14 09:34 UTC] twosee@php.net
Nice patch, I am not familiar with the filter part... it seems that my patch[1] is too crude...

[1] <https://github.com/php/php-src/compare/master...twose:bug80931>
 [2021-04-14 10:01 UTC] cmb@php.net
On a quick look, it seems to me that your patch would not work for
Transfer-encoding:chunked since it uses `file_size` which is 0 in
that case.  A chunked body ends with an empty chunk (0\r\n).

The idea of using filters to solve the issue (suggested by Rowan
Tommins) makes it possible to cater to that (regardless of
auto_decode), at the cost of a slight inefficiency (due to the need
to copy the buckets).
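
To make the chunked case concrete, here is a minimal userland sketch of detecting the end of a chunked body by its terminating zero-size chunk. It is not the filter-based approach of the patch, only an illustration of the framing; `read_chunked_body()` is a hypothetical helper.

```php
<?php

/**
 * Decode a chunked body from a stream and return as soon as the
 * terminating zero-size chunk (plus any trailer fields) has been consumed.
 *
 * Hypothetical helper for illustration only.
 *
 * @param resource $stream positioned at the first chunk-size line
 */
function read_chunked_body($stream): string
{
    $body = '';
    while (($line = fgets($stream)) !== false) {
        // Chunk-size line: hex size, optionally followed by ";extension".
        $sizeHex = trim(explode(';', $line, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            throw new RuntimeException('Malformed chunk size line');
        }
        $size = (int) hexdec($sizeHex);

        if ($size === 0) {
            // Last chunk: skip optional trailer fields up to the empty line.
            while (($trailer = fgets($stream)) !== false
                && rtrim($trailer, "\r\n") !== '') {
                // trailer fields are ignored in this sketch
            }
            return $body;
        }

        // Read exactly $size bytes of chunk data; fread() may return less.
        $remaining = $size;
        while ($remaining > 0) {
            $data = fread($stream, $remaining);
            if ($data === false || $data === '') {
                throw new RuntimeException('Stream ended inside a chunk');
            }
            $body .= $data;
            $remaining -= strlen($data);
        }
        fgets($stream); // consume the CRLF that follows the chunk data
    }
    throw new RuntimeException('Stream ended before the last chunk');
}
```

Because the function returns right after the empty chunk, it never needs the peer to close the connection, and anything the server sends afterwards stays unread on the stream.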
 [2021-04-16 17:04 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #80931: HTTP stream hangs if server doesn't close connection
On GitHub:  https://github.com/php/php-src/pull/6874
Patch:      https://github.com/php/php-src/pull/6874.patch
 [2021-05-11 13:30 UTC] cmb@php.net
-Assigned To: cmb
+Assigned To:
 [2022-12-02 08:51 UTC] arjan at avoid dot org
Seems this (or a very similar) issue is back in PHP8.1.12 and 8.1.13.
Super simple test case:

<?php
print file_get_contents('https://www.nrc.nl/');
?>

It seems impossible to access the Dutch NRC website (national newspaper).
This takes a very long time to run. It does eventually return.
Accessing the same website from the cmdline via wget or curl is super fast.
I've tested this on multiple PHP instances around the world, they all hang when running on PHP8.1.12 or 8.1.13. 
This same script ran fine (fast) on PHP 7.4.
 [2022-12-13 12:58 UTC] jamie at hazaar dot io
I have noticed this also occurring in versions of PHP including 7.1.x and 8.1.x when used from inside a Docker container. It appears that, regardless of the protocol used, Docker alters the request to use HTTP/1.1. This is particularly problematic when the client has requested HTTP/1.0 with a "Connection: close" header, because the server will not see this and so will not close the connection as requested.

I now have a bunch of projects that I'm trying to port to Docker containers for development and deployment that simply do not work because of this bug thanks to both PHP and Docker not playing nice together.

If anyone reads this and is able to confirm that PHP from inside Docker can not use `file_get_contents($url)`, please respond here.
 [2023-01-09 15:14 UTC] dominik+phpnet at komail dot ch
I can confirm Jamie's issue: file_get_contents() hangs indefinitely if it's calling a URL on the web from a Docker container. This started to occur when we updated to the latest Docker version. I'm not sure in which version exactly, though.

Currently using php8.1 and docker4.15
 [2023-01-12 18:06 UTC] tgross at m-s dot de
I can confirm this problem when using php inside a docker 4.15.0 container.

try these docker commands (In my case, I call them inside WSL2 using Docker Desktop):
This one works well (it returns fast):

> docker run --rm php:7.3 php -d default_socket_timeout=10 -r "file_get_contents('https://github.com/');"

while this one hits the timeout (in this case, 10 seconds):

> docker run --rm php:7.3 php -d default_socket_timeout=10 -r "file_get_contents('https://www.google.com/');"

It's basically the same with PHP 7.4, 8.1, 8.2. However, with PHP 8.1 and 8.2, the command reaches twice the timeout - so it takes 20 seconds instead of 10!

Not only file_get_contents() is affected by this problem, but also SoapClient::__construct(), for example.
 [2023-08-10 16:52 UTC] r dot hessel dot git at gmail dot com
Same issue, oddly only on Linux. 
Have 2 devs using docker on WSL2 and 1 using Linux Mint, and a UAT environment on Jelastic, all with the same docker container. 

The windows WSL2 users have no issue, but the linux mint environment does, and so does the Jelastic UAT.

Tested with these docker containers:
jelastic/nginxphp:1.22.1-php-8.1.14
jelastic/nginxphp:1.24.0-php-8.1.21


Details on the Linux Mint version:
Linux Mint 21.2 (ubuntu 22.04.2 based)
Docker version 24.0.5, build ced0996
 [2024-01-15 12:29 UTC] gilperon at gmail dot com
Is anyone going to fix this? This bug has been around for years, and it's amazing that we still can't trust the timeout to work with file_get_contents.
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Oct 13 17:01:27 2024 UTC