Bug #43098 file_get_contents() freezes (probably caused by fopen())
Submitted: 2007-10-24 18:00 UTC
Modified: 2013-02-18 00:33 UTC
Votes: 13
Avg. Score: 4.2 ± 1.4
Reproduced: 11 of 12 (91.7%)
Same Version: 1 (9.1%)
Same OS: 5 (45.5%)
From: harvie at email dot cz
Assigned:
Status: No Feedback
Package: HTTP related
PHP Version: 5.2.4
OS: Linux (Debian Etch) - php5-cli
Private report: No
CVE-ID: None
 [2007-10-24 18:00 UTC] harvie at email dot cz
Description:
------------
I have written a spider/crawler to build a web search engine as a school project.

So... I have a small problem:
I am using file_get_contents() (I've tried fopen() too...).
The crawler works great, but sometimes it freezes. I traced which function freezes, and it's file_get_contents()...

So I googled and found the default_socket_timeout setting. I set it to 1, but the script still sometimes freezes and never recovers.

I've made this example so you can see that it freezes after a few iterations. I have supplied a URL that causes my crawler to freeze (I'm not sure why...):


Reproduce code:
---------------
#!/usr/bin/php
<?php

/* Run and wait for a while; this can totally stop the script at a dead point... */

ini_set('default_socket_timeout',1);
set_time_limit(0);
//$url='http://ad.doubleclick.net/click';
$url='http://w.moreover.com/';
while(1) {
    @file_get_contents($url, false, null, 0, 10000);
    echo "#";
}

?>

Expected result:
----------------
The script downloads the file from the specified URL a few times, then freezes and never recovers...
(The same happens when a different URL is used each time, but it takes longer...)


Actual result:
--------------
harvie-ntb:/home/harvie/Desktop/crawler# ./bugshow.php
#1#2#3#4#5#6#7#8#9#10#11#12#13#14#15#16#17

There it freezes for eternity. (I thought that with ini_set('default_socket_timeout',1); the script would continue after 1 second if the request failed, but the whole script stops. I tried waiting a really long time...)


 [2007-10-24 18:16 UTC] harvie at email dot cz
I have run the script under the strace debugger (this traces the interpreter's system calls, not the PHP code, of course), if you are interested:

# strace ./bugshow.php
execve("./emails.php", ["./emails.php"], [/* 29 vars */]) = 0
uname({sys="Linux", node="harvie-ntb", ...}) = 0
brk(0)                                  = 0x854c000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fa8000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=67381, ...}) = 0
mmap2(NULL, 67381, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f97000
close(3)                                = 0

...A lot of irrelevant stuff...

connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.2.1")}, 28) = 0
fcntl64(3, F_GETFL)                     = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
gettimeofday({1193256681, 718238}, NULL) = 0
poll([{fd=3, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
send(3, "\6?\1\0\0\1\0\0\0\0\0\0\1w\10moreover\3com\0\0\1\0\1", 32, 0) = 32
poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
ioctl(3, FIONREAD, [56])                = 0
recvfrom(3, "\6?\201\200\0\1\0\1\0\0\0\0\1w\10moreover\3com\0\0\1\0"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.2.1")}, [16]) = 56
close(3)                                = 0
gettimeofday({1193256681, 768001}, NULL) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
fcntl64(3, F_GETFL)                     = 0x2 (flags O_RDWR)
fcntl64(3, F_SETFL, O_RDWR|O_NONBLOCK)  = 0
connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("170.224.8.50")}, 16) = -1 EINPROGRESS (Operation now in progress)
poll([{fd=3, events=POLLIN|POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 1000) = 1
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
fcntl64(3, F_SETFL, O_RDWR)             = 0
send(3, "GET / HTTP/1.0\r\n", 16, 0)    = 16
send(3, "Host: w.moreover.com\r\n", 22, 0) = 22
send(3, "\r\n", 2, 0)                   = 2
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 1000) = 1
recv(3, "HTTP/1.1 200 OK\r\nDate: Wed, 24 O"..., 8192, 0) = 524
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN}], 1, 1000) = 1
recv(3, "s, online news, current awarenes"..., 8192, 0) = 524
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 1000) = 0
poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0
...This repeats a few times a second...
 [2007-10-24 19:02 UTC] harvie at email dot cz
I tried running this on PHP 4 CLI (MS Windows Server 2003).
It returned these errors. Maybe this error is handled differently in PHP 5, and that causes the hang...

c:/>bugshow.php
#0#1#2#3#4#5#6#7#8#9#10#11#12#13#14#15
Warning: file_get_contents(res:///PHP/http:\\w.moreover.com\): failed to open stream: No such file or directory in D:\bug.php on line 12
#16#17#18#19#20#21#22#23#24#25#26#27#28#29#30#31#32#33#34#35#36#37#38#39#40#41#4
2#43#44#45#46#47
Warning: file_get_contents(res:///PHP/http:\\w.moreover.com\): failed to open stream: No such file or directory in D:\bug.php on line 12
#48#49#50
 [2007-10-25 11:57 UTC] jani@php.net
Are you sure it's not just your network connection freezing? For example, some kind of firewall stopping you from connecting to one site too many times in too short a time? (FYI: your script works fine for me, I stopped it after 10 minutes..)

 [2007-10-25 17:21 UTC] harvie at email dot cz
To jani@php.net: It's possible, but in that case default_socket_timeout should close the socket and let the script continue with the next URL (the crawler also freezes with many different URLs). Or does default_socket_timeout not apply here?

It's true that my router sucks, but I don't see any reason why PHP should fail at the first connectivity problem in a way that ends in a total script freeze. I thought that is why we have a socket timeout option. Or not?
 [2007-10-26 08:16 UTC] jani@php.net
It won't matter what you put in timeout if you wrap everything in a while(1) loop. Of course it just sits there, try adding some error checking there..

 [2007-10-26 10:40 UTC] harvie at email dot cz
jani@php.net: That is what I am saying: the script never reaches the next iteration. If I want to do some kind of error check, it will never be executed, because the whole program stops at file_get_contents() and executes nothing after this call. That's the problem. This function never returns anything.

With this timeout, the script should print '#' at least once a second.

But default_socket_timeout stops waiting for the connection, whereas in this case file_get_contents() is already downloading when the server drops the connection because of network instability or service overload.

Isn't there some kind of timeout that will stop waiting on a broken connection after a specified time?
 [2007-10-26 10:49 UTC] harvie at email dot cz
Yeah! That's what I need... Something like a default_poll_timeout setting...
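
One way to approximate such a timeout in userland is to read the stream manually and enforce a wall-clock deadline across all reads. The following is a minimal sketch, not a confirmed fix for this report. readWithDeadline() is a hypothetical helper name, and the sketch assumes a blocking, socket-based stream that supports stream_set_timeout(). It only bounds the read phase; DNS lookup, connecting, and reading the HTTP response headers happen before the helper is called and are outside its scope.

```php
<?php
/**
 * Hypothetical helper: read up to $maxBytes from $stream, giving up once
 * $deadline seconds of wall-clock time have elapsed in total.
 *
 * Returns array(string $data, bool $timedOut).
 */
function readWithDeadline($stream, $maxBytes, $deadline)
{
    $data  = '';
    $start = microtime(true);

    while (strlen($data) < $maxBytes) {
        $remaining = $deadline - (microtime(true) - $start);
        if ($remaining <= 0) {
            return array($data, true);
        }

        // The per-read timeout never exceeds what is left of the overall budget.
        $sec  = (int) floor($remaining);
        $usec = (int) (($remaining - $sec) * 1000000);
        stream_set_timeout($stream, $sec, $usec);

        $chunk = fread($stream, min(8192, $maxBytes - strlen($data)));
        $meta  = stream_get_meta_data($stream);

        if (!empty($meta['timed_out'])) {
            // The peer stalled: return what arrived so far and flag the timeout.
            return array($data . (string) $chunk, true);
        }
        if ($chunk === false || $chunk === '') {
            break; // end of stream or read error
        }
        $data .= $chunk;
    }

    return array($data, false);
}
```

With an HTTP stream this might be used as `$fp = @fopen($url, 'r');` followed by `list($body, $timedOut) = readWithDeadline($fp, 10000, 1.0);`, so the loop in the reproduce script could move on to the next iteration when `$timedOut` is true.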
 [2010-03-02 21:49 UTC] dyekimov at uniplat dot ru
So, has anyone found a way out of this problem?

Is it a known bug?
 [2010-11-24 09:19 UTC] jani@php.net
-Status: Open
+Status: Feedback
-Type: Feature/Change Request
+Type: Bug
-Package: Feature/Change Request
+Package: HTTP related
 [2010-11-24 09:19 UTC] jani@php.net
Does it happen with PHP 5.3.3? I still can not reproduce it.
 [2013-02-18 00:33 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.
 [2014-09-05 17:21 UTC] f dot eeprom at gmail dot com
It still happens. PHP 5.3.10 here.

I have a script that polls data from an online API every 10 seconds. After some days of normal operation, it freezes at file_get_contents() even with the timeout parameter set to 10 in the http options. It often happens after/during connection problems with the server.

    protected function retrieveJSON($URL) {
        $opts = array('http' =>
            array(
                'method' => 'GET',
                'timeout' => 10,
            )
        );
        $context = stream_context_create($opts);
        $feed = file_get_contents($URL, false, $context); // <--- FREEZE
        $json = json_decode($feed, true);
        return $json;
    }
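
The snippet above passes whatever file_get_contents() returns straight to json_decode(), so a request that does fail or time out is indistinguishable from an empty response. The following minimal sketch separates the two cases. It does not address the hang itself, and retrieveJsonOrNull() and its parameters are hypothetical names introduced for illustration.

```php
<?php
/**
 * Hypothetical variant of retrieveJSON(): returns the decoded data,
 * or null when the request fails or the body is not valid JSON.
 * (A response whose JSON value is literally null also yields null.)
 */
function retrieveJsonOrNull($url, $timeoutSeconds = 10)
{
    $context = stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            'timeout' => $timeoutSeconds, // read timeout, in seconds
        ),
    ));

    // @ suppresses the warning; failure is reported via the return value.
    $feed = @file_get_contents($url, false, $context);
    if ($feed === false) {
        return null; // transport failure or timeout
    }

    $json = json_decode($feed, true);
    if ($json === null && json_last_error() !== JSON_ERROR_NONE) {
        return null; // body was not valid JSON
    }

    return $json;
}
```

A polling loop can then skip the current cycle when the function returns null instead of continuing with missing data.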
 [2019-09-28 23:06 UTC] sijanecantonluka at gmail dot com
I am facing the same bug under
Linux 5.2.0-2-amd64 #1 SMP Debian 5.2.9-2 (2019-08-21) x86_64 GNU/Linux

I am using PHP 7.3.9 installed from my operating system's repository (apt). My script hangs (CLI) or reaches the max execution time (FPM) when I execute

file_get_contents("http://example.example/");

Local files can be accessed and, more importantly, localhost can be accessed (meaning the call does not hang). Because I use curl for Internet requests, I didn't notice this until I upgraded from 7.2 to 7.3. I was unable to use Composer, because it uses file_get_contents. After a minute of querying it responded with

failed to open stream: Connection timed out

This sounds like a non-PHP-related bug, but since curl works, I am questioning this...
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 21 14:01:32 2024 UTC