Bug #80931 file_get_contents() hangs with HTTP/1.1 if server doesn't close connection
Submitted: 2021-04-03 14:08 UTC
Modified: 2021-05-11 13:30 UTC
From: gilperon at gmail dot com
Assigned:
Status: Verified
Package: Streams related
PHP Version: 7.4
OS: any
Private report: No
CVE-ID: None

 [2021-04-03 14:08 UTC] gilperon at gmail dot com
Description:
------------
In any version of PHP before 8 this bug did not happen (I have been using it since 2010), but on PHP 8 the code below hangs indefinitely. Tested on Windows 7, Windows 10, CentOS 8 and CentOS 7. The code simply calls an external API, and so far I have only managed to trigger the bug with the specific API URL below.

<?php

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml");

echo $response;

?>

BUT if I add a timeout, the code above works just fine:

<?php

$context = stream_context_create(array(

	"http" => array(
	
		"timeout" => 2
		
	)

));

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml",0,$context);

echo $response;

?>

I must say that using CURL works just fine, without any timeout, on any version of PHP. So this is a `file_get_contents` problem and the only way I am finding to solve this problem in the short term is using CURL.
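
For reference, the cURL route mentioned above could look roughly like the sketch below. The helper name `fetch_with_curl` and the 10-second default are assumptions made for illustration (the report says cURL works even without a timeout); only standard cURL extension calls are used.

```php
<?php

/**
 * Fetch a URL with the cURL extension instead of the http:// stream wrapper.
 * Hypothetical helper: the name and the 10-second default are assumptions.
 */
function fetch_with_curl(string $url, int $timeoutSeconds = 10): string
{
    $handle = curl_init($url);
    if ($handle === false) {
        throw new RuntimeException("Could not initialise cURL for $url");
    }

    curl_setopt_array($handle, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_TIMEOUT        => $timeoutSeconds, // upper bound for the whole transfer
    ]);

    $body  = curl_exec($handle);
    $error = curl_error($handle);
    // The handle is released automatically when it goes out of scope.

    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }

    return $body;
}
```

Calling `echo fetch_with_curl($url);` with the URL from the script above would then replace the `file_get_contents()` call.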

Test script:
---------------
<?php

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml");

echo $response;

?>

Expected result:
----------------
Response should be printed.

Actual result:
--------------
The code hangs.

 [2021-04-07 02:27 UTC] carusogabriel@php.net
This bug is reproducible in macOS (v10.15.7) as well. And we can test it with a smaller script:

```
<?php

$context = stream_context_create(['http' => ['timeout' => 2]]);

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);
```
 [2021-04-07 02:39 UTC] carusogabriel@php.net
-Summary: FILE_GET_CONTENTS HANGS ON PHP 8
+Summary: file_get_contents() hangs on PHP 8
-Package: *General Issues
+Package: Streams related
 [2021-04-07 02:39 UTC] carusogabriel@php.net
Actually, an even smaller script (without the timeout) also fails in PHP 8.0:

```
<?php

$context = stream_context_create();

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);
```

@gilperon What is the value of your `default_socket_timeout` php.ini entry? That might explain why it is taking forever: with the default of `60`, I waited and the script executed locally :)
 [2021-04-07 13:16 UTC] gilperon at gmail dot com
@carusogabriel@php.net the value of my `default_socket_timeout` is 60. I changed it to 5 seconds, restarted my server, and then I executed your code below:


<?php

$context = stream_context_create();

echo file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml", 0, $context);

?>


It always echoes nothing (blank string) after 5 seconds. However, the URL used above should return:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<Servicos></Servicos>

So setting `default_socket_timeout` to a smaller value does not help: the script still runs for the full duration configured in `default_socket_timeout` and returns nothing.
 [2021-04-07 15:37 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2021-04-07 15:37 UTC] cmb@php.net
I can confirm the reported behavior.  The request is sent, the
response headers are read, but for some reason select(2) times out
under PHP 8.0, but not under PHP 7.4.  After a few timeouts, the
response body is finally read, though.
 [2021-04-07 16:54 UTC] danack@php.net
Somewhat astoundingly, the difference appears to come from the remote server.

PHP7
recvfrom(3, "HTTP/1.1 200 OK\r\ncache-control: private\r\ncontent-type: text/xml; charset=iso-8859-1\r\nexpires: Wed, 07 Apr 2021 16:38:47 GMT\r\nserver: Microsoft-IIS/10.0\r\nx-aspnet-version: 4.0.30319\r\nset-cookie: ASP.NET_SessionId=1et22nqgou4jk0ecrns5opu3; path=/; HttpOnly; SameSite=Lax\r\nx-powered-by: ASP.NET\r\ndate: Wed, 07 Apr 2021 16:38:46 GMT\r\ncontent-length: 464\r\nconnection: close\r\n\r\n<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n<Servicos><cServico>

PHP8
recvfrom(3, "HTTP/1.1 200 OK\r\ncache-control: private\r\ncontent-type: text/xml; charset=iso-8859-1\r\nexpires: Wed, 07 Apr 2021 16:40:02 GMT\r\nserver: Microsoft-IIS/10.0\r\nx-aspnet-version: 4.0.30319\r\nset-cookie: ASP.NET_SessionId=4wf1fkvdhyzzz1czsdoqdd01; path=/; HttpOnly; SameSite=Lax\r\nx-powered-by: ASP.NET\r\ndate: Wed, 07 Apr 2021 16:40:02 GMT\r\ncontent-length: 464\r\n\r\n<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>\n<Servicos><cServico>

The response under PHP 7 includes a "connection: close" header; the response under PHP 8 doesn't.

I can't currently see any difference in the way the connection is being made that would cause that difference.
 [2021-04-07 16:57 UTC] nikic@php.net
PHP 8 uses HTTP 1.1 instead of HTTP 1.0 by default, maybe that's related?
 [2021-04-07 16:59 UTC] danack@php.net
Oh, the difference is PHP 7 set HTTP/1.0, PHP 8 does HTTP/1.1
 
"GET /calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml HTTP/1.0\r\nHost: ws.correios.com.br\r\nConnection: close\r\n\r\n"

"GET /calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml HTTP/1.1\r\nHost: ws.correios.com.br\r\nConnection: close\r\n\r\n"
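
The two requests above differ only in the version token on the request line. A small sketch that rebuilds them makes this explicit; `build_get_request` is a hypothetical helper written for this comparison, not a PHP function.

```php
<?php

/**
 * Build the raw GET request shown in the traces above.
 * Hypothetical helper: only the HTTP version differs between the two cases.
 */
function build_get_request(string $host, string $path, string $httpVersion): string
{
    return "GET $path HTTP/$httpVersion\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

$path = '/calculador/CalcPrecoPrazo.aspx?StrRetorno=xml';
$php7 = build_get_request('ws.correios.com.br', $path, '1.0');
$php8 = build_get_request('ws.correios.com.br', $path, '1.1');
```

Writing either string to a socket opened with `stream_socket_client("tcp://ws.correios.com.br:80")` should allow the server's responses to be compared outside the http wrapper.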
 [2021-04-07 17:21 UTC] gilperon at gmail dot com
I just want to point out that Correios (http://correios.com.br/), whose API I am using and which is the one that makes `file_get_contents` not work as expected, is the largest shipping organization in Brazil (96% market share), and it is owned by the Brazilian government. I am pretty sure many devs use this API with PHP, and I am probably one of the few to face this problem because many devs here in Brazil are still using older versions of PHP. But I am pretty sure it will start bothering many other devs as they update their PHP to 8.x and start seeing their code breaking. Just a side note in case you think I am using a "special URL" created only to make this problem happen: the URL I provided is the official government endpoint of their webhook.
 [2021-04-07 20:11 UTC] danack@php.net
Setting the protocol back to 1.0 with the protocol version option appears to make it work on 8. I'll need to actually inspect the packets to have a deeper look.


> But I am pretty sure it will start bothering many other devs as they
> update their PHP to 8.x and start seeing their code breaking.

To set your expectations: if this is a bug on the remote server, then it's unlikely we would write a hack around it. People can either use a workaround in their code, stay on PHP 7, or ask the person who owns that site to fix it. It's not feasible to put workarounds in PHP core for all buggy servers out there.



<?php

$context = stream_context_create(array(
    "http" => array(
        "protocol_version" => "1.0"
    )
));

$response = file_get_contents("http://ws.correios.com.br/calculador/CalcPrecoPrazo.aspx?nCdEmpresa=&sDsSenha=&sCepOrigem=11661690&sCepDestino=88070-480&nVlPeso=1&nCdFormato=1&nVlComprimento=25&nVlAltura=3&nVlLargura=25&sCdMaoPropria=N&nVlValorDeclarado=0&sCdAvisoRecebimento=N&nCdServico=04014&nVlDiametro=0&StrRetorno=xml", 0, $context);

echo "response length is " . strlen($response) . "\n";
 [2021-04-07 20:43 UTC] gilperon at gmail dot com
danack@php.net Respectfully, manually setting the protocol to 1.0 feels more like a hack to me than anything else. 99% of the time, devs don't have control over the API response; they cannot fix a server's misconfiguration just so their code works. Devs expect file_get_contents to simply work. Curl works just fine with this URL while file_get_contents is buggy. I don't expect Curl to have lots of hacks, and I am pretty sure Curl has a clean way of handling a server that does not close the connection, and so should PHP. No hacks, I agree with you; hacks are terrible and will probably break something in the future.
 [2021-04-08 12:23 UTC] cmb@php.net
-PHP Version: 8.0.3
+PHP Version: 7.4
 [2021-04-08 12:23 UTC] cmb@php.net
First, this is not a regression in PHP 8.0, but rather a general
issue with HTTP/1.1.  It seems to me the problem is that the
server does *not* close the connection right away after having
sent the response under HTTP/1.1, which appears to be legitimate
behavior.  After having received the full response, our HTTP stream
implementation still tries to select(2) the sole readfd, but the
server won't send more data, so the timeout occurs.

FWIW, if I add a Connection:keep-alive header to the context
options, I can reproduce the behavior of the server locally.
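
As a userland illustration of the point above, a caller can stop reading once the announced number of bytes has arrived instead of waiting for the server to close the connection. This is only a sketch under assumptions: both helper names are hypothetical, the 5-second timeout is arbitrary, and it handles only responses that carry a Content-Length header (not chunked ones).

```php
<?php

/**
 * Return the Content-Length announced in a list of raw header lines,
 * or null if none is present. Header names are matched case-insensitively.
 */
function content_length_from_headers(array $headerLines): ?int
{
    $length = null;
    foreach ($headerLines as $line) {
        if (is_string($line) && preg_match('/^content-length:\s*(\d+)\s*$/i', $line, $match)) {
            $length = (int) $match[1]; // keep the last one (final response after redirects)
        }
    }
    return $length;
}

/**
 * Read an HTTP body without waiting for the server to close the connection.
 * Falls back to reading until EOF or timeout when no Content-Length is known.
 */
function read_http_body(string $url, float $timeoutSeconds = 5.0): string
{
    $context = stream_context_create(['http' => ['timeout' => $timeoutSeconds]]);
    $stream  = fopen($url, 'r', false, $context);
    if ($stream === false) {
        throw new RuntimeException("Could not open $url");
    }

    // For http:// streams, wrapper_data holds the response header lines.
    $headers  = stream_get_meta_data($stream)['wrapper_data'] ?? [];
    $expected = content_length_from_headers(is_array($headers) ? $headers : []);

    $body = '';
    while (!feof($stream) && ($expected === null || strlen($body) < $expected)) {
        $want  = $expected === null ? 8192 : min(8192, $expected - strlen($body));
        $chunk = fread($stream, $want);
        if ($chunk === false || $chunk === '') {
            break; // timeout or read error
        }
        $body .= $chunk;
    }

    fclose($stream); // close from the client side once the body is complete
    return $body;
}
```

The design choice here is to trust the Content-Length header as the end-of-body marker, which is what lets the client close the connection itself.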
 [2021-04-08 12:57 UTC] danack@php.net
Okay, so looking at the packets, what is happening, from the response on is:

# http 1_0 protocol
6. server sends response.
7. php acks 6.
8. server sends finack.
9. php sends finack.
10. server acks 9.


# http 1_1 protocol
6. server sends response.
7. php acks 6.
8. php sends finack.
9. server acks 8.
10. server sends finack.
11. php acks 10.

That all looks correct, but the difference is that for HTTP/1.1 the client is initiating the connection close. In HTTP/1.0 the server is initiating the connection close.

All the packets look okay, according to https://gitlab.com/wireshark/wireshark/-/wikis/TCP-4-times-close

The problem seems to be that for whatever reason, after sending the last ack, PHP is sitting around doing nothing. btw it does time out after 2 * 60 seconds, which probably confirms the socket is in the appropriate TIME-WAIT status.
 [2021-04-08 13:43 UTC] rowan dot collins at gmail dot com
Note that the PHP client code always sends a "Connection: Close" header in the request, for both HTTP/1.0 and HTTP/1.1 requests: https://heap.space/xref/php-src/ext/standard/http_fopen_wrapper.c?r=5787f91c#570

For some reason, the server appears to only be honouring that for HTTP/1.0 requests, which makes no sense, because it's an HTTP/1.1 feature.
 [2021-04-08 18:00 UTC] rowan dot collins at gmail dot com
OK, I think I have figured out what's happening here:

* If you send the server an HTTP/1.1 request with the header "Connection: Close", it acknowledges with "connection: close"; if you send "Connection: close" (as PHP does), it does not acknowledge it, and presumably defaults to Connection: Keep-Alive
* RFC 7230 clearly states that "Connection options are case-insensitive." so this is definitely a bug in the server. https://tools.ietf.org/html/rfc7230#section-6.1
* A local and up to date IIS server does not exhibit the bug.
* The server at ws.correios.com.br is probably running an old version of IIS. The headers include "x-aspnet-version: 4.0.30319" which was released sometime between 2010 and 2012
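
A conforming comparison of connection options, as required by the RFC passage cited above, can be sketched as follows. The function name `connection_requests_close` is an assumption made for illustration; this is not code from IIS or php-src.

```php
<?php

/**
 * Check whether a Connection header value contains the "close" option.
 * The value is a comma-separated token list; tokens compare case-insensitively.
 */
function connection_requests_close(string $headerValue): bool
{
    foreach (explode(',', $headerValue) as $token) {
        if (strcasecmp(trim($token), 'close') === 0) {
            return true;
        }
    }
    return false;
}

var_dump(connection_requests_close('close')); // bool(true)
var_dump(connection_requests_close('Close')); // bool(true)
```

A server that behaves like this treats "Connection: close" and "Connection: Close" identically, which is what the remote server appears not to do.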
 [2021-04-08 18:20 UTC] rowan dot collins at gmail dot com
To confirm, the change below is enough to make this particular server respond in a way that PHP handles OK:


diff --git a/ext/standard/http_fopen_wrapper.c b/ext/standard/http_fopen_wrapper.c
index da822d9160..7fdfa9448c 100644
--- a/ext/standard/http_fopen_wrapper.c
+++ b/ext/standard/http_fopen_wrapper.c
@@ -574,7 +574,7 @@ finish:
         * HTTP/1.0 to avoid issues when the server respond with a HTTP/1.1
         * keep-alive response, which is the preferred response type. */
        if ((have_header & HTTP_HEADER_CONNECTION) == 0) {
-               smart_str_appends(&req_buf, "Connection: close\r\n");
+               smart_str_appends(&req_buf, "Connection: Close\r\n");
        }

        if (context &&



However, the correct solution is probably to make the PHP implementation forcefully close connections when the server defaults to Keep-Alive behaviour.
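
Without patching php-src, the same header can presumably be sent from userland through the `header` context option, since the condition quoted in the diff only appends the default when no Connection header was supplied. This is an untested sketch against that server; the helper name `create_close_context` is an assumption.

```php
<?php

/**
 * Build a stream context that sends "Connection: Close" (capital C),
 * mirroring the one-line change in the diff above.
 * Hypothetical helper name; returns a stream context resource.
 */
function create_close_context()
{
    return stream_context_create([
        'http' => [
            // A user-supplied Connection header replaces the wrapper's default.
            'header' => "Connection: Close\r\n",
        ],
    ]);
}
```

The context would be passed as the third argument, for example `file_get_contents($url, false, create_close_context())`.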
 [2021-04-08 18:34 UTC] gilperon at gmail dot com
I think you guys could check how CURL does it, because it handles this "buggy Correios server" well; it does not hang.

Also, could you please take a look at https://stackoverflow.com/questions/34864179/prevent-php-http-wrapper-from-waiting-for-close-of-persistent-connection ? This looks like a similar problem from 5 years ago that I just found while talking with people on a Discord server.
 [2021-04-09 13:11 UTC] kelunik@php.net
Simple reproduce script:

php -r '$server = stream_socket_server("tcp://127.0.0.1:8080"); while ($client = stream_socket_accept($server)) { fwrite($client, "HTTP/1.1 200 OK\r\ncontent-length: 0\r\n\r\n"); }'

php -r 'file_get_contents("http://127.0.0.1:8080/");'
 [2021-04-09 13:14 UTC] kelunik@php.net
-Assigned To:
+Assigned To: kelunik
 [2021-04-09 13:15 UTC] kelunik@php.net
-Summary: file_get_contents() hangs on PHP 8
+Summary: file_get_contents() hangs with HTTP/1.1 if server doesn't close connection
 [2021-04-09 14:24 UTC] kelunik@php.net
-Assigned To: kelunik
+Assigned To:
 [2021-04-09 14:25 UTC] kelunik@php.net
https://gist.github.com/kelunik/c82ce751c1c203806b10ef7326f3e56a fixes it for chunked encoding, but still fails with a content-length or if auto_decode = false.
 [2021-04-09 17:06 UTC] cmb@php.net
-Assigned To:
+Assigned To: cmb
 [2021-04-14 03:45 UTC] twosee@php.net
In my opinion, this is a PHP design problem for all versions.
`file_get_contents("http://*")` (the PHP http stream wrapper) always depends on the behavior of the server. It always expects recv to return 0 and uses this to detect the end of the response.
But if the peer does not close the connection, it will wait for data forever.
In this case, even though we send "Connection: close" with protocol version 1.1, the server still treats the connection as persistent. The server has implementation problems of its own, but it did expose the problem in PHP.
So we can reproduce this problem on almost all websites:

<?php
$context = stream_context_create(['http' => ['protocol_version' => 1.1, 'header' => ['Connection: keep-alive']]]);
echo file_get_contents("http://www.baidu.com", 0, $context); // largest search engine in China
// hang...
 [2021-04-14 08:27 UTC] cmb@php.net
> In my opinion, this is a PHP design problem for all versions.

I agree.  I'm working on a possible fix[1].

[1] <https://github.com/php/php-src/compare/master...cmb69:cmb/80931>
 [2021-04-14 09:34 UTC] twosee@php.net
Nice patch, I am not familiar with the filter part... it seems that my patch[1] is too crude...

[1] <https://github.com/php/php-src/compare/master...twose:bug80931>
 [2021-04-14 10:01 UTC] cmb@php.net
On a quick look, it seems to me that your patch would not work for
Transfer-encoding:chunked since it uses `file_size` which is 0 in
that case.  A chunked body ends with an empty chunk (0\r\n).

The idea of using filters to solve the issue (suggested by Rowan
Tommins) makes it possible to cater to that (regardless of auto_decode), at
the cost of a slight inefficiency (due to the need to copy the
buckets).
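
To illustrate the chunked case described above, the sketch below decodes a chunked body held in memory and reports whether the terminating zero-size chunk was seen. It is not the filter from the proposed patch: `decode_chunked` is a hypothetical helper, and it ignores chunk extensions and trailers.

```php
<?php

/**
 * Decode a chunked HTTP body held in memory.
 * Returns [decoded body, true if the terminating zero-size chunk was seen].
 * Simplifications: chunk extensions are ignored, trailers are not parsed.
 */
function decode_chunked(string $raw): array
{
    $body   = '';
    $offset = 0;

    while (($lineEnd = strpos($raw, "\r\n", $offset)) !== false) {
        $sizeLine = substr($raw, $offset, $lineEnd - $offset);
        $sizeHex  = trim(explode(';', $sizeLine, 2)[0]); // drop chunk extensions
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            break; // malformed size line
        }
        $size   = (int) hexdec($sizeHex);
        $offset = $lineEnd + 2;

        if ($size === 0) {
            return [$body, true]; // last chunk: the body is complete
        }
        if (strlen($raw) < $offset + $size + 2) {
            break; // chunk data or its trailing CRLF not fully received yet
        }
        $body   .= substr($raw, $offset, $size);
        $offset += $size + 2; // skip chunk data and its trailing CRLF
    }

    return [$body, false];
}
```

A reader built on this idea can stop as soon as the second element is `true`, whereas a length taken from `file_size` gives no such signal for chunked responses.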
 [2021-04-16 17:04 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #80931: HTTP stream hangs if server doesn't close connection
On GitHub:  https://github.com/php/php-src/pull/6874
Patch:      https://github.com/php/php-src/pull/6874.patch
 [2021-05-11 13:30 UTC] cmb@php.net
-Assigned To: cmb
+Assigned To:
 
Last updated: Sun Aug 01 15:01:24 2021 UTC