Bug #80384 filter buffers entire read until file closed
Submitted: 2020-11-20 04:14 UTC Modified: 2020-12-23 12:44 UTC
From: adamjseitz at gmail dot com Assigned: cmb (profile)
Status: Closed Package: Filter related
PHP Version: 7.4.12 OS: Debian Buster on WSL2
Private report: No CVE-ID: None
 [2020-11-20 04:14 UTC] adamjseitz at gmail dot com
"zlib.inflate" filters appear to buffer all previously-read data, as exhibited by the attached code, until the file is closed.

It is unclear from my testing if the data being kept in memory is the original or inflated copy.

Test script:
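<?php
// Generate "data.gz" file with:
// dd if=/dev/urandom of=data count=$((8*1024*1024)) iflag=count_bytes; gzip data

// Print the current memory usage in MB, prefixed with a label.
function PrintMem($message) { echo $message, ": ", (memory_get_usage() / 1024 / 1024) . " MB\n"; }

$fp = fopen("data.gz", 'rb');
// window => 31 makes zlib.inflate expect gzip framing.
$filter = stream_filter_append($fp, "zlib.inflate", STREAM_FILTER_READ, array( 'window' => 31 ));

PrintMem("before read");
// Read to EOF in small blocks, discarding the data (block size is an assumption).
while (!feof($fp)) {
    fread($fp, 8192);
}
PrintMem("after read");

fclose($fp);
PrintMem("after close");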

Expected result:
I expect that very little memory is used.

It should be possible to read small blocks at a time from very large gzipped files without buffering everything that has been read so far, but that does not seem to be the case.

Actual result:
OUTPUT from the above script:

before read: 0.44554138183594 MB
after read: 8.4377517700195 MB
after close: 0.37459564208984 MB
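For anyone who wants to try this without dd and gzip, here is a minimal self-contained sketch of the same measurement. It assumes the zlib extension is loaded. The 1 MiB payload, the temporary file and the 8192-byte block size are choices made for this sketch, not taken from the report. The memory figures it prints will depend on the PHP version.

```php
<?php
// Assumption: the zlib extension is loaded (needed for gzencode and zlib.inflate).
// Build a small gzip file from incompressible data so no external tools are needed.
$size = 1024 * 1024;                        // 1 MiB payload, smaller than the report's 8 MiB
$path = tempnam(sys_get_temp_dir(), 'gz');  // scratch file, removed at the end
file_put_contents($path, gzencode(random_bytes($size)));

$fp = fopen($path, 'rb');
// window => 31 selects gzip framing, as in the report's script.
stream_filter_append($fp, 'zlib.inflate', STREAM_FILTER_READ, ['window' => 31]);

$total = 0;
$before = memory_get_usage();
while (!feof($fp)) {
    $block = fread($fp, 8192);              // each block is discarded after counting
    if ($block === false) {
        break;
    }
    $total += strlen($block);
}
$afterRead = memory_get_usage();
fclose($fp);
$afterClose = memory_get_usage();
unlink($path);

printf("read %d bytes\n", $total);
printf("growth while open: %.2f MB\n", ($afterRead - $before) / 1048576);
printf("growth after close: %.2f MB\n", ($afterClose - $before) / 1048576);
```

If the read buffer retains everything read so far, "growth while open" should be roughly the payload size. If it does not, the figure should stay near the block size.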


 [2020-11-20 15:05 UTC]
-Summary: zlib.inflate filter buffers entire read until file closed
+Summary: filter buffers entire read until file closed
-Status: Open
+Status: Verified
-Package: Zlib related
+Package: Filter related
 [2020-11-20 15:05 UTC]
This is not particularly related to zlib.inflate, but rather a
general issue for streams which have any filter attached.
php_stream_fill_read_buffer()[1] only reads a single chunk if no
filter is attached, but the full size if a filter is attached.
Frankly, it's not clear to me why we're looping[2] inside that function.

[1] <>
[2] <>
 [2020-11-21 22:19 UTC] adamjseitz at gmail dot com
The following pull request has been associated:

Patch Name: Fix #80384: limit read buffer size
On GitHub:
 [2020-11-21 23:04 UTC] adamjseitz at gmail dot com
Thanks for pointing me to the right code.

The loop appears to be necessary in the case that the filter chain consumes the full chunk, but yields no data. Some callers expect that _php_stream_fill_read_buffer will always add data to the read buffer in successful cases [1].

I added a PR to break out of the loop once stream->chunk_size has been read out of the filters.

[1] <>
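The control flow described in this comment can be modelled in userland PHP. This is only a toy model, not the C source of _php_stream_fill_read_buffer. The names fillReadBuffer, $source and $filter, and the chunk size of 4, are hypothetical and exist only for illustration. It assumes the loop stops once either the requested size or one chunk's worth of filtered data is in the buffer.

```php
<?php
// Toy model only: not the C source of _php_stream_fill_read_buffer.
// fillReadBuffer(), $source, $filter and $chunkSize are hypothetical.

/**
 * Pull chunks from $source through $filter until the buffer holds at least
 * min($wanted, $chunkSize) bytes or the source is exhausted.
 * Returns the number of chunks consumed from the source.
 */
function fillReadBuffer(array &$source, callable $filter, string &$buffer, int $wanted, int $chunkSize): int
{
    $chunksRead = 0;
    $target = min($wanted, $chunkSize);   // cap modelled on the PR description
    while ($source && strlen($buffer) < $target) {
        $chunk = array_shift($source);
        $chunksRead++;
        // A filter may swallow a whole chunk and emit nothing; keep looping then,
        // so that a successful fill still adds data to the buffer.
        $buffer .= $filter($chunk);
    }
    return $chunksRead;
}

// Hypothetical filter: emits nothing for the first chunk (for example a header),
// then passes data through unchanged.
$seen = 0;
$filter = function (string $chunk) use (&$seen): string {
    return $seen++ === 0 ? '' : $chunk;
};

$source = ['HDR!', 'aaaa', 'bbbb', 'cccc'];
$buffer = '';
$chunksRead = fillReadBuffer($source, $filter, $buffer, 1000, 4);

echo "chunks consumed: $chunksRead, buffered: '$buffer'\n";
```

Without the cap, a request for 1000 bytes would drain every remaining chunk into the buffer. With it, the loop still continues past the chunk that yielded no data, but stops as soon as one chunk's worth is available.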
 [2020-12-23 12:44 UTC]
-Assigned To:
+Assigned To: cmb
 [2020-12-23 12:55 UTC]
-Status: Verified
+Status: Closed
 [2020-12-23 12:55 UTC]
Automatic comment on behalf of
Log: Fix #80384: limit read buffer size