php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #46026 bz2.decompress/zlib.inflate filter tries to decompress after end of stream
Submitted: 2008-09-08 22:58 UTC Modified: 2009-02-03 19:03 UTC
From: Keisial at gmail dot com Assigned: cellog (profile)
Status: Closed Package: Bzip2 Related
PHP Version: 5.2CVS-2008-09-08 (snap) OS: *
Private report: No CVE-ID: None
 [2008-09-08 22:58 UTC] Keisial at gmail dot com
Description:
------------
If the input is larger than the bzip2 stream, bz2_decompress_filter tries to continue decompressing after it has received a BZ_STREAM_END, so bzlib returns BZ_SEQUENCE_ERROR and the stream filter finalises with PSFS_ERR_FATAL, getting no output.

Reproduce code:
---------------
bz2Decompress.php:
<?php
 $f = fopen($argv[1], "rb");
 stream_filter_append($f, "bzip2.decompress");
 fpassthru($f);
 fclose($f);
?>

$ echo 'Hello world' > hello.txt
$ bzip2 hello.txt
$ php bz2Decompress.php hello.txt.bz2
Hello world
$ echo '!' >> hello.txt.bz2
$ php bz2Decompress.php hello.txt.bz2
/* No output */

Expected result:
----------------
At least the correctly read data should be returned.

Two things can be done on receiving a BZ_STREAM_END:
-Setting an EOF flag and remove any further incoming data (allow to continue reading if the filter is removed?).
-Reinitialising the decompressing library (approach of the bzip2 utility) so concatenated bzip2 streams can be unbzipped as the concatenation of its output.


Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2008-09-19 16:47 UTC] Keisial at gmail dot com
Ok, I have implemented a fix.

Patch: http://pastebin.com/f24d58100
Patch ignoring whitespaces: http://pastebin.com/fb084a7b

When adding a filter to a stream, "bzip2.decompress", $parameters only indicated if bzlib should use the alternative decompression algorithm which uses less memory but at the cost of decompressing more slowly, either as a boolean or as an array("small"=>value). I have extended the array syntax to also allow a 'concatenated' parameter.

By default, bzip2.decompress filter won't process any data after it has reached and end-of-stream. However, if you provide array("concatenated"=>true) it will continue reading as if it has another bzip2 stream concatenated, just as bzip2 does.

In the future, it would be nice to be able to apply a bzip2.decompress, read the data, remove the filter, and read non-bzip data appended, but the stream_filter system would need to be extended to support that.

Additionally, if there is a decompression error, it will show a warning. Also, bzip2 internal buffers will now be allocated later and freed sooner.
 [2008-10-11 17:12 UTC] cellog@php.net
this same bug applies to zlib stream filter, and also happens simply when trying to read from a filtered stream that contains a limited amount of compressed data (as in a zip archive with a bz2-compressed file), as php_stream_fill_read_buffer reads in maxlen data, triggering the problem you found.  I'll try to get this fixed up and put in a few tests
 [2008-10-11 18:25 UTC] cellog@php.net
In the future, please follow 2 important elements of the coding standards:

1) use tabs instead of spaces
2) always use {} around if/else blocks

I had to waste 15 minutes just getting the patch formatted correctly before I could even begin testing it.  I don't like wasting time.
 [2008-10-11 19:08 UTC] cellog@php.net
changing summary to reflect addition of zlib problem
 [2008-10-11 19:15 UTC] cellog@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.

fixed in PHP 5.2, 5.3, 6.0.  Concatenation was only added in PHP 5.3 and 6.0, as new features cannot be added to PHP 5.2
 [2008-10-11 21:54 UTC] Keisial at gmail dot com
Wow, thank you very much. :-)

Sorry about the tabs. My editor messed them (I know, I know, real programmers use butterflies...).

However, consider doing this:
if (SUCCESS == zend_hash_find(HASH_OF(filterparams), "concatenated", sizeof("concatenated"), (void **) &tmpzval) ) {
-    SEPARATE_ZVAL(tmpzval);
-    convert_to_boolean_ex(tmpzval);
     data->expect_concatenated = Z_LVAL_PP(tmpzval);
-    zval_ptr_dtor(tmpzval);
     tmpzval = NULL;
}

There were memory corruption problems with the previous patch, and zval_ptr_dtor seemed to be the source (thanks to rrichards and pajoye for their help on this on irc). Also, the other zval_ptr_dtor below (present there since first version by Sara) would be a "good possible candidate for another possible point of mem corruption".
 [2009-02-03 19:03 UTC] cellog@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.

re-fixed in 5.3 and HEAD
 
PHP Copyright © 2001-2025 The PHP Group
All rights reserved.
Last updated: Wed Jan 22 11:01:28 2025 UTC