Bug #71396 zlib.deflate may generate unexpected ZLIB Compress data depending on a parameter
Submitted: 2016-01-17 10:58 UTC Modified: 2016-01-19 12:59 UTC
From: salsi at icosaedro dot it Assigned:
Status: Closed Package: Streams related
PHP Version: master-Git-2016-01-17 (Git) OS:
Private report: No CVE-ID: None
 [2016-01-17 10:58 UTC] salsi at icosaedro dot it
Description:
------------
The zlib.deflate stream filter may generate 2 completely different streams of compressed data depending on the presence of the 'window' parameter, possibly producing data in an unexpected format.

To be more specific:

- If the 'window' parameter is missing, the resulting compressed stream is fully compliant with RFC 1951 DEFLATE and can be decompressed successfully with the gzinflate() function (gzuncompress() would trigger E_WARNING instead).

- Conversely, if the 'window' parameter is set, the resulting compressed stream is compliant with RFC 1950 ZLIB Compress and can be successfully decompressed with the gzuncompress() function (gzinflate() would trigger E_WARNING instead). The resulting stream then has the following structure:

     ZLIB_COMPRESS = HEADER(2bytes) DEFLATE ADLER32(4bytes)

Note that the DEFLATE data is embedded inside the ZLIB Compress stream, so the 2 formats are related but not interchangeable: a decoder for one rejects the other.
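The structure above can be checked by hand. The following is only a minimal sketch, assuming the zlib extension is loaded; the variable names are illustrative and not part of the original test script. It splits the output of gzcompress() into the three parts and verifies each one separately.

```php
<?php
// Sketch: take apart the ZLIB Compress stream that gzcompress() produces.
$plain = "abc";
$zlib  = gzcompress($plain);

$header  = substr($zlib, 0, 2);   // CMF and FLG bytes
$body    = substr($zlib, 2, -4);  // raw DEFLATE payload (RFC 1951)
$trailer = substr($zlib, -4);     // Adler-32 of the plain data, big-endian

// RFC 1950 2.2: CMF*256 + FLG is a multiple of 31, and CM (low nibble) is 8.
$cmf = ord($header[0]);
$flg = ord($header[1]);
$headerOk = (($cmf * 256 + $flg) % 31 === 0) && (($cmf & 0x0f) === 8);

// The payload alone is a valid DEFLATE stream for gzinflate().
$bodyOk = (gzinflate($body) === $plain);

// The trailer matches the Adler-32 checksum of the uncompressed data.
$adlerOk = (bin2hex($trailer) === hash('adler32', $plain));

var_dump($headerOk, $bodyOk, $adlerOk);
```

For the "abc" input used in this report, the header bytes are 0x78 0x9C, that is 30876 = 31 * 996, and the trailer %02M%01%27 seen in the output below is the Adler-32 value 0x024D0127.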

The following script tests all the cases with any combination of the parameters. The resulting compressed data are checked for compliance with the specifications (see comments in the isZlibCompress() function).

The page http://php.net/manual/it/filters.compression.php does not make clear what is happening here or what the zlib.deflate filter is supposed to do. In any case, since there are 2 sets of functions for 2 quite different compressed formats, there should also be 2 stream filters with different names. For example (it's only an idea):

- deflate.compress and deflate.uncompress implementing RFC 1951 (DEFLATE).

- zlib.compress and zlib.uncompress implementing RFC 1950 (ZLIB Compress).

See also bug #68556 about the zlib.inflate filter.

Test script:
---------------
<?php

/*
 * The zlib.deflate filter writes DEFLATE (RFC 1951) if there is no
 * 'window' parameter, and writes ZLIB Compress (RFC 1950) if that
 * parameter is set. This should either be documented or, better, be
 * split into 2 different filters for the 2 different formats.
 * Tested on: PHP 5.6.3 and PHP 7.1.0-dev from git 2016-01-17.
 * See also: zlib.compress fails on empty data,
 * https://bugs.php.net/bug.php?id=71395
 */

// Set a safe test environment:
error_reporting(-1);
// maps errors to ErrorException:
function my_error_handler($errno, $message)
{ throw new ErrorException($message); }
set_error_handler("my_error_handler");

/**
 * Detect if the passed data is a possible ZLIB Compress stream
 * of bytes.
 * References: RFC1950 2.2, see check bits field.
 * @param string $data Random bytes to test. If fewer than 7,
 * always returns FALSE.
 * @return boolean TRUE if the $data is a possible ZLIB Compress
 * stream, FALSE means it is certainly NOT a ZLIB Compress stream
 * or fewer than 7 bytes were passed.
 */
function isZlibCompress($data) {
	// HEADER (2), DEFLATE (1+), ADLER32 (4) >= 7 bytes:
	if( strlen($data) < 7 )
		return FALSE;
	// First 2 bytes, read as a big-endian integer, must be a multiple of 31:
	$CMF = ord($data[0]);
	$FLG = ord($data[1]);
	if( ($CMF * 256 + $FLG) % 31 != 0 )
		return FALSE;
	// Compression method = 8 (deflate), CINFO (window size code) <= 7:
	$CM = $CMF & 0xf;
	$CINFO = $CMF >> 4;
	if( !( $CM == 8 && $CINFO <= 7) )
		return FALSE;
	return TRUE;
}


/**
 * Test the zlib.deflate filter. The plain data are written to file
 * using the zlib.deflate stream filter and the specified
 * parameters, then the file is read back and compared with the
 * output of the gzdeflate() and gzcompress() counterparts. Also
 * tries to detect the actual encoding generated
 * by the filter.
 * Reference: {@link http://php.net/manual/it/filters.compression.php}
 * @param string $plain Plain data.
 * @param int[string] $params Compression parameters: level, window, memory.
 * @throws ErrorException
 */
function testZlibFilter($plain, $params = array()) {
	echo "\nTesting data: ", rawurlencode($plain), ", params = ", var_export($params), ":\n";
	$fn = "test.deflate";
	
	// Compress with zlib.deflate filter:
	$f = fopen($fn, "wb");
	stream_filter_append($f, 'zlib.deflate', STREAM_FILTER_WRITE, $params);
	fwrite($f, $plain);
	fclose($f);
	
	// Read back the compressed file:
	$compressed_with_filter = file_get_contents($fn);
	echo "   compressed with zlib.deflate: ", rawurlencode($compressed_with_filter), "\n";
	
	// Detect actual algo used and compare with compression functions:
	$is_zlib_compress = isZlibCompress($compressed_with_filter);
	if( $is_zlib_compress ){
		echo "   detected ZLIB COMPRESS data\n";
		// Compare with gzcompress():
		$compressed_with_gzcompress = gzcompress($plain);
		echo "   compressed with gzcompress(): ", rawurlencode($compressed_with_gzcompress), "\n";
		if( $compressed_with_gzcompress !== $compressed_with_filter )
			echo "   ERROR: compressed differ!\n";
		
		// Check decompression of the ZLIB Compress data:
		$plain2 = gzuncompress($compressed_with_filter);
		if( $plain !== $plain2 )
			echo "   ERROR: decompression: ", rawurlencode($plain2), "\n";
		
	} else {
		echo "   certainly NOT ZLIB Compress data, assuming DEFLATE data\n";
		$compressed_with_gzdeflate = gzdeflate($plain);
		echo "   compressed with gzdeflate(): ", rawurlencode($compressed_with_gzdeflate), "\n";
		if( $compressed_with_gzdeflate !== $compressed_with_filter )
			echo "   ERROR: compressed differ!\n";
		
		// Check decompression of the DEFLATE data:
		$plain2 = gzinflate($compressed_with_filter);
		if( $plain !== $plain2 )
			echo "   ERROR: decompression: ", rawurlencode($plain2), "\n";
	}
	
}

// Testing the "abc" string with all the possible combinations of parameters,
// and testing whether the resulting compressed file is ZLIB Compress or not.
// It seems that the presence of the 'window' param triggers ZLIB COMPRESS,
// its absence triggers DEFLATE:
testZlibFilter("abc", array(                                            )); // DEFLATE
testZlibFilter("abc", array('level' => -1                               )); // DEFLATE
testZlibFilter("abc", array(                               'memory' => 9)); // DEFLATE
testZlibFilter("abc", array('level' => -1,                 'memory' => 9)); // DEFLATE
testZlibFilter("abc", array('level' => -1, 'window' => 15, 'memory' => 9)); // ZLIB COMPRESS
testZlibFilter("abc", array(               'window' => 15               )); // ZLIB COMPRESS
testZlibFilter("abc", array('level' => -1, 'window' => 15               )); // ZLIB COMPRESS
testZlibFilter("abc", array(               'window' => 15, 'memory' => 9)); // ZLIB COMPRESS

//testZlibFilter(""); // <-- crashes on PHP 7.1, see bug 71395
?>

Expected result:
----------------
(the detected format should always be the same, either DEFLATE or ZLIB Compress, depending on the exact meaning of the zlib.deflate filter).

Actual result:
--------------
Testing data: abc, params = array (
):
   compressed with zlib.deflate: KLJ%06%00
   certainly NOT ZLIB Compress data, assuming DEFLATE data
   compressed with gzdeflate(): KLJ%06%00

Testing data: abc, params = array (
  'level' => -1,
):
   compressed with zlib.deflate: KLJ%06%00
   certainly NOT ZLIB Compress data, assuming DEFLATE data
   compressed with gzdeflate(): KLJ%06%00

Testing data: abc, params = array (
  'memory' => 9,
):
   compressed with zlib.deflate: KLJ%06%00
   certainly NOT ZLIB Compress data, assuming DEFLATE data
   compressed with gzdeflate(): KLJ%06%00

Testing data: abc, params = array (
  'level' => -1,
  'memory' => 9,
):
   compressed with zlib.deflate: KLJ%06%00
   certainly NOT ZLIB Compress data, assuming DEFLATE data
   compressed with gzdeflate(): KLJ%06%00

Testing data: abc, params = array (
  'level' => -1,
  'window' => 15,
  'memory' => 9,
):
   compressed with zlib.deflate: x%9CKLJ%06%00%02M%01%27
   detected ZLIB COMPRESS data
   compressed with gzcompress(): x%9CKLJ%06%00%02M%01%27

Testing data: abc, params = array (
  'window' => 15,
):
   compressed with zlib.deflate: x%9CKLJ%06%00%02M%01%27
   detected ZLIB COMPRESS data
   compressed with gzcompress(): x%9CKLJ%06%00%02M%01%27

Testing data: abc, params = array (
  'level' => -1,
  'window' => 15,
):
   compressed with zlib.deflate: x%9CKLJ%06%00%02M%01%27
   detected ZLIB COMPRESS data
   compressed with gzcompress(): x%9CKLJ%06%00%02M%01%27

Testing data: abc, params = array (
  'window' => 15,
  'memory' => 9,
):
   compressed with zlib.deflate: x%9CKLJ%06%00%02M%01%27
   detected ZLIB COMPRESS data
   compressed with gzcompress(): x%9CKLJ%06%00%02M%01%27



Patches

xiaoX (last revision 2016-01-18 06:38 UTC by 66622 at qq dot com)

Pull Requests

History

 [2016-01-18 01:05 UTC] salsi at icosaedro dot it
-Summary: zlib.compress may generate unexpected ZLIB Compress data depending on a paramer
+Summary: zlib.deflate may generate unexpected ZLIB Compress data depending on a paramer
 [2016-01-18 01:05 UTC] salsi at icosaedro dot it
Typos in summary and text: "zlib.compress" --> "zlib.deflate".
The script is ok.
 [2016-01-19 12:54 UTC] salsi at icosaedro dot it
Found the solution myself. The zlib.deflate/.inflate filters implement *all* the algorithms of the zlib library, and may generate any of the 3 formats depending on the value of the 'window' parameter:

-15..-8: DEFLATE (RFC 1951)

8..15: ZLIB (RFC 1950)

8+16..15+16: GZIP (RFC 1952)

The default value of the window parameter is -15 (and not 15 as the manual page currently states, see bug #68556). For the same reason, example no. 1 on that page does not work.

See bug #68556 for more details.

So the final answer to the question raised in this ticket is: the zlib.deflate/.inflate filters can actually generate several formats, which the manual page does not currently explain.
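The mapping between the 'window' value and the output format can be observed with a small script. This is only a sketch under the assumptions stated in this comment; deflateWithWindow() and detectFormat() are hypothetical helper names introduced here for illustration, not part of PHP.

```php
<?php
// Hypothetical helper: compress $plain through the zlib.deflate filter
// with the given 'window' value and return the resulting bytes.
function deflateWithWindow(string $plain, int $window): string {
    $f = fopen('php://temp', 'w+b');
    $filter = stream_filter_append($f, 'zlib.deflate', STREAM_FILTER_WRITE,
        array('window' => $window));
    fwrite($f, $plain);
    stream_filter_remove($filter); // flushes the data still buffered in the filter
    rewind($f);
    $out = stream_get_contents($f);
    fclose($f);
    return $out;
}

// Hypothetical helper: guess the container format from the first 2 bytes.
function detectFormat(string $data): string {
    if (strncmp($data, "\x1f\x8b", 2) === 0)
        return 'GZIP';                       // gzip magic number
    if (strlen($data) >= 2) {
        $cmf = ord($data[0]);
        $flg = ord($data[1]);
        if (($cmf * 256 + $flg) % 31 === 0 && ($cmf & 0x0f) === 8 && ($cmf >> 4) <= 7)
            return 'ZLIB';                   // plausible RFC 1950 header
    }
    return 'DEFLATE (assumed)';              // raw DEFLATE has no header to check
}

// One value from each of the 3 ranges listed above:
foreach (array(-15, 15, 15 + 16) as $window) {
    $data = deflateWithWindow("abc", $window);
    echo "window = $window: ", detectFormat($data), ", ", rawurlencode($data), "\n";
}
```

Each result should then be readable only by the matching decoder: gzinflate() for the negative range, gzuncompress() for 8..15, and gzdecode() for the values with 16 added.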
 [2016-01-19 12:59 UTC] salsi at icosaedro dot it
-Status: Open
+Status: Closed
 [2016-01-19 12:59 UTC] salsi at icosaedro dot it
The internals developers should give more feedback to the documentation maintainers, or write the specifications themselves.
 