Bug #51056 fread() on blocking stream will block even if data is available
Submitted: 2010-02-16 10:34 UTC Modified: 2014-06-21 09:59 UTC
Votes:19
Avg. Score:3.7 ± 1.1
Reproduced:9 of 11 (81.8%)
Same Version:2 (22.2%)
Same OS:3 (33.3%)
From: magicaltux@php.net Assigned: cataphract
Status: Analyzed Package: Streams related
PHP Version: 5.5.0 alpha OS: Linux Gentoo 2.6.32
Private report: No CVE-ID:
 [2010-02-16 10:34 UTC] magicaltux@php.net
Description:
------------
On a blocking stream, a call to fread() will return even if the passed 
buffer size has not been reached.

A call to fread() should return immediately if there is data pending to 
be read (buffered by php). Instead of that, php will call poll() on 
the stream to wait for more data to arrive, then will return the 
previously read data and the new data.

Suggestion: if fread() is called on a blocking stream that already 
contains data, PHP should call poll() with a 0 timeout, read any newly 
available data and return immediately.
If no data is currently in PHP's internal buffer, current behaviour 
can be kept.

(it is also possible to skip the poll() part completely and directly 
return any pending data without checking if the real stream has 
anything, but I believe that it might not be as logical, a call to 
fread() should read)

Reproduce code:
---------------
<?php

echo 'Testing PHP version: '.phpversion()."\n";

$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

$pid = pcntl_fork();

if ($pid == -1) die("Failed to fork\n");

if ($pid > 0) {
  // parent
  fclose($pair[0]);
  while(!feof($pair[1])) {
    $start = microtime(true);
    $data = fread($pair[1], 256);
    printf("fread took %01.2fms to read %d bytes\n", (microtime(true)-$start)*1000, strlen($data));
  }
  exit;
}

// child
fclose($pair[1]);
while(!feof($pair[0])) {
  fwrite($pair[0], "Hello 1\n"); // 8 bytes
  usleep(5000);
  fwrite($pair[0], str_repeat('a', 300)."\n"); // 301 bytes
  sleep(1);
}


Expected result:
----------------
Testing PHP version: 5.3.1
fread took 0.09ms to read 8 bytes
fread took 5.08ms to read 256 bytes
fread took 0.00ms to read 45 bytes
fread took 1000.10ms to read 8 bytes
fread took 5.04ms to read 256 bytes
fread took 0.00ms to read 45 bytes
fread took 1000.10ms to read 8 bytes
fread took 5.04ms to read 256 bytes
(etc)

Actual result:
--------------
Testing PHP version: 5.3.1
fread took 0.09ms to read 8 bytes
fread took 5.08ms to read 256 bytes
fread took 1000.10ms to read 53 bytes
fread took 5.04ms to read 256 bytes
fread took 1000.10ms to read 53 bytes
fread took 5.04ms to read 256 bytes
(etc)

Patches

51056-3.phpt.txt (last revision 2010-03-12 13:46 UTC) by arnaud dot lb at gmail dot com
51056-2.phpt.txt (last revision 2010-03-11 01:05 UTC) by arnaud dot lb at gmail dot com
51056.phpt.txt (last revision 2010-03-11 01:04 UTC) by arnaud dot lb at gmail dot com


History

 [2010-02-16 10:56 UTC] jani@php.net
Isn't this the same as (or related to) bug #50856? Does it happen with PHP_5_2? And I'd guess you have tested the latest PHP_5_3 as well?
 [2010-02-16 11:11 UTC] magicaltux@php.net
This report looks a bit like bug #50856 (about non-blocking mode), but 
it seems to be related to a different part of the streams api (non 
blocking mode, fopen wrapper for http, while I'm testing on sockets in 
blocking mode).

I'm about to test with vanilla PHP_5_2 and PHP_5_3 from svn (once 
compilation completes). In the meantime I could reproduce the problem 
on PHP 5.2.12 (gentoo patched version).
 [2010-02-16 12:00 UTC] magicaltux@php.net
Confirmed with PHP_5_3

Testing PHP version: 5.3.3-dev
fread took 0.07ms to read 8 bytes
fread took 5.06ms to read 256 bytes
fread took 1000.10ms to read 53 bytes
fread took 5.03ms to read 256 bytes
fread took 1000.11ms to read 53 bytes
fread took 5.04ms to read 256 bytes
fread took 1000.10ms to read 53 bytes

I'll need a bit more time for PHP_5_2 as flex-2.5.4 is becoming more 
difficult to find.
 [2010-02-16 12:19 UTC] felipe@php.net
Testing PHP version: 5.2.13RC3-dev
fread took 0.04ms to read 8 bytes
fread took 4.88ms to read 256 bytes
fread took 1000.04ms to read 53 bytes
fread took 4.96ms to read 256 bytes
fread took 1000.06ms to read 53 bytes
fread took 4.97ms to read 256 bytes
fread took 1000.06ms to read 53 bytes
(etc)
 [2010-02-16 13:06 UTC] magicaltux@php.net
I tried to switch to non-blocking mode. This solves this issue with 
most sockets, except for SSL sockets when transmitting a lot of data.

This bug is blocking in my case (socket communication transmitting a 
lot of data).
 [2010-02-17 05:39 UTC] magicaltux@php.net

 [2010-02-17 16:00 UTC] jani@php.net
btw. If you really want someone to do something about this, post the patch to internals@lists.php.net as well. :)
 [2010-03-11 02:05 UTC] lbarnaud@php.net
Hi,

I made a test case for this ( 51056.phpt.txt )

fread() in C has exactly the same behavior, it will block if you try to read more bytes than available.

Your patch correctly avoids this, however it introduces another issue: fread() will return less data than asked for, even if enough data is available ( 51056-2.phpt.txt ).
 [2010-03-11 02:21 UTC] magicaltux@php.net
Hi,

I know about fread() returning less data than asked for, however I could not 
modify this behaviour without passing some kind of value to lower-level read 
operation, which will call poll() if socket is blocking.
When data is already available in the buffer, some information should be passed to the 
lower-level read() to let it know it should not block.

The only non-intrusive way to fix this would be to temporarily switch the socket 
to non-blocking mode if data was found in PHP's buffer.

Considering that any application handling data from the network should handle cases where 
received data is not complete, I believe it was best to return immediately if 
data is found and let the application call fread() again, rather than trying to 
work around this problem with a dirty solution like temporarily switching to 
non-blocking mode.
Another solution would be to add an argument to the internal read call ("do not 
block"); however, it would change the internal stream API and would 
require the argument to be handled in each stream wrapper.
 [2010-03-11 16:51 UTC] lbarnaud@php.net
-Status: Open +Status: Feedback
 [2010-03-11 16:51 UTC] lbarnaud@php.net
> When data is already available in buffer, an information should be passed to the lower-level read() to let it know it should not block.

This will block anyway when the buffer is empty, and you won't be able to know when it is empty, so you can't rely on this (sometimes it will block, sometimes not).

Also, some applications may rely on the blocking and will break if it is changed. This behavior exists since at least PHP 5.1.

> Considering any application handling data from network should handle cases when received data is not complete

As this is not the normal case, I would suggest introducing some timeout handling (this is what applications such as Apache do, I guess), or fixing what prevents you from using non-blocking I/O with SSL streams instead.
 [2010-03-11 20:26 UTC] magicaltux@php.net
> This will block anyway when the buffer is empty and you won't be able to know 
when it is empty, so you can't rely on this (sometimes it will block, sometimes 
not).

PHP always calls poll() before read, so it knows if there is nothing to read. 
stream_select() will return the socket as "ready" if there is data pending in 
php buffer (even if there's no data on the socket), just so we can read it.

> Also, some applications may rely on the blocking and will break if it is 
changed. This behavior exists since at least PHP 5.1.

fread() manual explicitly warns about this:

When reading from anything that is not a regular local file, such as streams 
returned when reading remote files or from popen() and fsockopen(), reading will 
stop after a packet is available. This means that you should collect the data 
together in chunks as shown in the examples below.

On the contrary, using blocking streams together with stream_select() may lead 
to async program blocking because stream_select() saw there was pending data, 
but a new packet will not arrive anytime soon.

> As this is not the normal case I would suggest to introduce some timeout 
handling (this is what applications like e.g. Apache does, I guess), or fixing 
what prevents you from using non blocking i/o with SSL streams instead.

It is the normal case to receive less than expected data as documented on the 
php manual.
Apache (or any correctly coded networking app) does not use timeouts (except to 
detect dead clients); instead it uses read(), which is reliable (i.e. it does not hang 
when there is data that can be returned).

By the way I have looked at what causes the problem I have with SSL streams, and 
it could be worked around by switching the stream between blocking mode and 
non-blocking mode depending on the situation, however I would prefer to avoid 
that (and it doesn't change the fact that fread() does not comply with what is 
expected from it, both from read() syscall behaviour and php's manual)
 [2010-03-11 21:23 UTC] lbarnaud@php.net
> Apache [...] uses timeouts [...] to detect dead clients

This is what I meant :) (and I thought you meant this too: "application handling data from network should handle cases when received data is not complete")

Dead clients, or situations like this are not the "normal case", and sometimes this can be handled with timeouts.

If you are in situations where this is the normal case, one solution is to use non blocking streams.

The following code does exactly what you are asking for (if there is something to read, return it; else, block) :

stream_set_blocking(..., 0);
while (stream_select(...)) {
  $data = fread(...);
}

If it does not work with SSL streams, then the SSL handling should be fixed instead.
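
A runnable version of the loop above might look like the following sketch. It assumes a Unix-like system where stream_socket_pair() is available; the helper name drain_ready(), the 1-second timeout and the 8192-byte chunk size are illustrative choices, not part of the original comment.

```php
<?php
// Sketch: read whatever is available from a non-blocking stream, waiting in
// stream_select() rather than inside fread(). drain_ready() is a hypothetical helper.
function drain_ready($stream, $timeoutSec = 1)
{
    $data = '';
    while (true) {
        $r = array($stream);   // stream_select() overwrites its arrays, so rebuild them
        $w = null;
        $e = null;
        $n = stream_select($r, $w, $e, $timeoutSec, 0);
        if ($n === false || $n === 0) {
            break;             // error or timeout: nothing more to read for now
        }
        $chunk = fread($stream, 8192);
        if ($chunk === false || $chunk === '') {
            break;             // EOF or read error; a stream at EOF also selects as readable
        }
        $data .= $chunk;
    }
    return $data;
}

$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($pair[1], false);
fwrite($pair[0], "Hello 1\n");                   // 8 bytes
fwrite($pair[0], str_repeat('a', 300) . "\n");   // 301 bytes
fclose($pair[0]);              // closing the writer lets the loop end on EOF, not on the timeout

$received = drain_ready($pair[1]);
printf("read %d bytes\n", strlen($received));
```

The process sleeps inside stream_select() while nothing is pending, so the loop does not spin.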
 [2010-03-11 21:39 UTC] magicaltux@php.net
I still believe fread() should not hang when it has data it can return. The C 
counterpart doesn't, and the manual says it doesn't.

Regarding test 51056-2.phpt.txt the manual explicitly says that this *can 
happen* on anything else than files (read warning in example #3 on 
http://php.net/fread )

While I understand your concern for people who might be relying on current bogus 
behaviour I find this very unlikely considering network streams are subject to 
lags and different kinds of behaviour due to the large amount of tcp 
implementations on internet.

In the worst case, the manual explicitly warns against relying on fread() 
returning as many bytes as requested, and says buffering must be used.
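
The buffering the manual asks for can be sketched as a small loop that keeps calling fread() until the requested number of bytes has arrived. read_exactly() is a hypothetical helper name, and the sketch assumes a blocking stream on a system where stream_socket_pair() is available.

```php
<?php
// Sketch: collect exactly $length bytes from a blocking stream, tolerating short reads.
// read_exactly() is a hypothetical helper, not a PHP built-in.
function read_exactly($stream, $length)
{
    $buffer = '';
    while (strlen($buffer) < $length && !feof($stream)) {
        $chunk = fread($stream, $length - strlen($buffer));
        if ($chunk === false || $chunk === '') {
            break;             // EOF, timeout or read error: return what was collected
        }
        $buffer .= $chunk;
    }
    return $buffer;            // shorter than $length only on EOF, timeout or error
}

$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($pair[0], "Hello 1\n");                   // 8 bytes
fwrite($pair[0], str_repeat('a', 300) . "\n");   // 301 bytes
fclose($pair[0]);

$first = read_exactly($pair[1], 8);
$rest  = read_exactly($pair[1], 400);            // only 301 bytes remain before EOF
printf("%d + %d bytes\n", strlen($first), strlen($rest));
```

A caller written this way does not depend on how many bytes a single fread() call happens to return.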
 [2010-03-11 22:03 UTC] lbarnaud@php.net
-Type: Bug +Type: Documentation Problem
 [2010-03-11 22:03 UTC] lbarnaud@php.net
> I still believe fread() should not hang when it has data it can return.

It has followed fread() behavior for years and I believe it should not change.

> The C counterpart doesn't

C's fread() does :)

> and the manual says it doesn't.

The manual looks wrong on this point, "reading will stop after a packet is available" is never true, whatever packet means.

fread() (both PHP's and C's) returns less data than asked only on EOF or errors.

The only reliable way of doing non-blocking i/o is still to use non-blocking streams ;-)
 [2010-03-12 03:12 UTC] magicaltux@php.net
So, it is normal for PHP's fread() to return immediately when less data than asked for is available, unless this data arrived while a previous call to fread() was in progress and there was too much data?

Let me just state that this doesn't make sense.

I tested stdc's fread() and could confirm that its behaviour is consistent: it will only return when it has collected the data it needed, when EOF is reached or when an error occurs.

It seems that PHP's php_stream_read() is closer to the read() syscall than to stdc's fread(), except for this one specific behaviour.

> It has followed fread() behavior for years and I believe it should not change.

I believe the problem comes from the new streams api which is an attempt to make the socket api obsolete. In fact stream functions (including fread()) behave the same way the old socket counterpart did when passed a socket.

The correct behaviour (as defined by common sense, and confirmed by PHP 4.4.9) :

Testing PHP version: 4.4.9
socket_read took 0.06ms to read 8 bytes
socket_read took 5.08ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.08ms to read 8 bytes
socket_read took 5.06ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.07ms to read 8 bytes
socket_read took 5.05ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.08ms to read 8 bytes

Testing with PHP 5.1.0 (first version containing stream_socket_pair()) exhibits a change of behaviour due to the new stream api.

Both tests 51056.phpt and 51056-2.phpt pass on PHP 4.4.9.

By the way, using non-blocking mode makes no sense with the provided example. It would just make the program use 100% CPU. For example, a PHP program reading an email from a POP3 server might lock up because of this bug in blocking mode. If the end of the email is reached while a read is in progress and a new read is called, it will block until the server closes the connection (expected behaviour = return remaining data).

As a PHP sockets programmer (I believe my experience when it comes to PHP and sockets is not negligible) I say once more that *this* behaviour of fread() is not consistent. fread() in blocking mode should either block until it has enough bytes or return as soon as some bytes are available. Blocking should not depend on when data has arrived.
 [2010-03-12 14:53 UTC] lbarnaud@php.net
I see your point in wanting read() behavior. Whether to implement fread() or read() semantics is arguable. However, the specific behavior you are asking for is not reliable for several reasons, and IMHO (I may be wrong) you want this behavior for bad reasons. Let me explain:

> By the way using nonblocking mode makes no sense with provided example. It would just make the program use 100% cpu.

That is the reason you give for not wanting non-blocking streams, but if you use stream_select() you will never end up using 100% CPU: your PHP process will just wait idly in stream_select() and consume no CPU at all.

Example :

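stream_set_blocking($stream, 0);
$r = array($stream); $w = $e = null; /* stream_select() takes its arrays by reference */
while (stream_select($r, $w, $e, $sec, $usec)) { /* block until $stream is readable or the timeout expires */
  $data = fread($stream, 8192); /* read all available data, up to 8192 bytes. Returns only 1 byte if only 1 byte is available and never blocks. */
  if ($data === '' && feof($stream)) break; /* a stream at EOF is also reported as readable */
  $r = array($stream); /* stream_select() overwrites $r, so rebuild it before the next call */
}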


> If end of email is reached while a read is in progress and a new read is called, it will block until the server closes connections

With your patch (or with the read behavior you want) it will still block. And it will block randomly, in an unpredictable manner.

Please see the following example :

Say the buffer has 250 bytes in it.
fread(100) -> buffer.length-=100, buffer.length == 150
fread(100) -> buffer.length-=100, buffer.length == 50
fread(100) -> with your patch it would return the last 50 available bytes

Now this other example with a buffer with only 200 bytes in it :

Say the buffer has 200 bytes in it.
fread(100) -> buffer.length-=100, buffer.length == 100
fread(100) -> buffer.length-=100, buffer.length == 0
fread(100) -> buffer is empty, this blocks, and you can't control this (you don't control the buffer, and don't know anything about it in a PHP script)

Please see 51056-3.phpt.

With the current behavior it will block too, but in a predictable manner.
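
The two scenarios can be replayed with a small simulation. It models only the byte count of PHP's internal stream buffer, which a script cannot actually inspect; simulate_reads() and its outcome labels are illustrative names, and 'short' describes the patched behaviour under discussion, not what current PHP does.

```php
<?php
// Sketch: model how a series of fread($n) calls drains an internal buffer.
// Each call is labelled by how the proposed patch would serve it:
//   'full'   - enough buffered bytes, returns immediately
//   'short'  - some buffered bytes, the patch returns them without blocking
//   'blocks' - buffer empty, the call has to wait on the socket
function simulate_reads($buffered, array $requests)
{
    $outcomes = array();
    foreach ($requests as $n) {
        if ($buffered >= $n) {
            $outcomes[] = 'full';
            $buffered -= $n;
        } elseif ($buffered > 0) {
            $outcomes[] = 'short';
            $buffered = 0;
        } else {
            $outcomes[] = 'blocks';
        }
    }
    return $outcomes;
}

echo implode(', ', simulate_reads(250, array(100, 100, 100))), "\n"; // full, full, short
echo implode(', ', simulate_reads(200, array(100, 100, 100))), "\n"; // full, full, blocks
```

The same three calls end differently depending only on how many bytes happened to be buffered, which is the unpredictability described above.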
 [2010-09-28 01:55 UTC] cataphract@php.net
-Status: Feedback +Status: Re-Opened
 [2010-09-28 01:55 UTC] cataphract@php.net
The bug reporter is right on this issue.

As it stands, it is completely unpredictable whether an fread call will block.

The point of putting a stream into the readfds set of stream_select is to know whether a call to fread will block or not. stream_select has an emulation feature that returns the stream if the stream buffer has data, even if there's no more data to be read on the socket. See bug #52602.

As it stands, if there's data in the stream buffer but not on the socket and the user asks for more data than what's in the buffer, the call to fread will block, even though stream_select returned the stream. The usual (and documented) semantics of select() would mean a call wouldn't block:

«The streams listed in the read array will be watched to see if characters become available for reading (more precisely, to see if a read will not block - in particular, a stream resource is also ready on end-of-file, in which case an fread() will return a zero length string).»

Therefore, it's the current behavior that's unpredictable. We can never know whether a call to fread will block.

The solution could be:

* Return whatever there's on the buffer, even if it's less than what was asked.
* Try to fill the rest of the buffer only with non-blocking reads.
* Call the real select (no emulation like in stream_select) and fill the rest of the buffer only with the pending data.

It would be unwise, however, to introduce this behavior change for non-sockets. The problem is that many scripts assume that if they ask for n bytes, they will receive n bytes, except in the last call.

For sockets, where the problem is more serious (since we never know when a packet will arrive, we can block for a long time), this would be a minor BC break, because in that case fread never reads from the socket more than one packet at a time, and since the application can't know whether the data it expects will arrive in one or several packets, it already has to deal with variable-length fread returns. Indeed, the documentation for fread says:

«When reading from anything that is not a regular local file, such as streams returned when reading remote files or from popen() and fsockopen(), reading will stop after a packet is available. This means that you should collect the data together in chunks as shown in the examples below.»
 [2010-09-28 02:07 UTC] cataphract@php.net
A small correction: it's not that fread "never reads from the socket more than one packet at a time", as I and the manual say. It's that it makes only one call to recv().

If we're blocking waiting for data and a packet arrives, then recv() will return only the contents of that packet ("The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested."). However, if several packets have been received since the last call to fread, recv() will return the most data it can, possibly several packets.

But this is a minor documentation issue and not very relevant in this discussion.
 [2010-10-22 15:49 UTC] kalle@php.net
-Package: Streams related +Package: Documentation problem
 [2013-01-17 17:56 UTC] philip@php.net
-Status: Re-Opened +Status: Analyzed -Package: Documentation problem +Package: Streams related -PHP Version: 5.3.1 +PHP Version: 5.5.0 alpha -Assigned To: +Assigned To: cataphract
 [2014-06-21 09:59 UTC] sobak@php.net
-Type: Documentation Problem +Type: Bug
 