php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #19507 incorrect parsing of POST/GET variables
Submitted: 2002-09-19 16:45 UTC Modified: 2002-09-21 06:53 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: mike at featureprice dot com Assigned:
Status: Closed Package: mbstring related
PHP Version: 4.2.3 OS:
Private report: No CVE-ID: None
 [2002-09-19 16:45 UTC] mike at featureprice dot com
When compiling php with the mbstring extension, parsing of POST and GET variables which contain url-encoded keys is not done correctly. this especially effects array input variables in GET/POST request. for example:

let's say we have a page only containing:
<?php phpinfo(); ?>

invoking this page like info.php?tld[]=.com&tld[]=.name&tld[]=.us
shows the correct result:
_GET["tld"] = Array
(
    [0] => .com
    [1] => .name
    [2] => .us
)

but the browser usually encodes [ with %5B and ] with %5D in the url, so let's try that:
info.php?tld%5B%5D=.com&tld%5B%5D=.name&tld%5B%5D=.us
shows an incorrect result:
_GET["tld"] = Array
(
    [0] => 
    [1] => e
    [2] => .us
)

To be frankly :) I already tracked down the problem by reviewing the source code of PHP and noticed that it was introduced specifically in 4.2.3. Around line 1030 of mbstring.c, the code which parses the URL variables was, it looks like, cleaned up a bit. it was in 4.2.2:

	while (var && n < num) {
		val = strchr(var, '=');
		if (val) { /* have a value */
			*val++ = '\0';
			val_list[n] = var;
			len_list[n] = php_url_decode(var, strlen(var));
			n++;
			val_list[n] = val;
			len_list[n] = php_url_decode(val, strlen(val));
		} else {
			val_list[n] = var;
			len_list[n] = php_url_decode(var, strlen(var));
			n++;
			val_list[n] = NULL;
			len_list[n] = 0;
		}
		n++;
		var = php_strtok_r(NULL, separator, &strtok_buf);
	}
	num = n;

and in 4.2.3 changed to:

	/* split and decode the query */
	n = 0;
	strtok_buf = NULL;
	var = php_strtok_r(res, separator, &strtok_buf);
	while (var)  {
		val = strchr(var, '=');
		val_list[n] = var;
		len_list[n] = php_url_decode(var, strlen(var));
		n++;
		if (val) { /* have a value */
			*val++ = '\0';
			val_list[n] = val;
			len_list[n] = php_url_decode(val, strlen(val));
		} else {
			val_list[n] = "";
			len_list[n] = 0;
		}
		n++;
		var = php_strtok_r(NULL, separator, &strtok_buf);
	}
	num = n; /* make sure to process initilized vars only */

the problem with the code change is that the equal sign is not overwritten with the string terminator \0 before php_url_decode is called on the GET/POST variable name, so the GET/POST variable value is urldecoded in the same step. but the code following the first php_url_decode assumes that the equal sign is still on the same place, which is not the case if the key contained characters that were url-encoded (had characters in the %## syntax), because the string gets smaller and the equal sign moves in this case.
one url-encoded character causes the string to get two characters smaller, so we can explain our example from the top:
[ and ] are url-encoded, two characters, for each of them the equal sign (and with it the GET/POST variable value) moves two characters. so, it moves four characters at total:

_GET["tld"] = Array
(
    [0] =>                // .com are four characters, after four disappear to the left, none remains
    [1] => e              // .name are five characters, after four disappear to the left, only the last remains
    [2] => .us            // php_url_decode does place the end-of line character of the decoded string just before the dot, so .us does not get shortened
)

Maybe it's better to rewind this code part to what it was in 4.2.2. just a suggestion, i don't want tell anyone how to do their work :)
You can also use the patch below. It is against PHP 4.2.3.
I already verified that the patch works, it fixes the problem.

--- php-4.2.3/ext/mbstring/mbstring.c.org	2002-09-19 22:32:22.000000000 +0200
+++ php-4.2.3/ext/mbstring/mbstring.c	2002-09-19 22:29:32.000000000 +0200
@@ -1031,15 +1031,18 @@
 	var = php_strtok_r(res, separator, &strtok_buf);
 	while (var)  {
 		val = strchr(var, '=');
-		val_list[n] = var;
-		len_list[n] = php_url_decode(var, strlen(var));
-		n++;
 		if (val) { /* have a value */
 			*val++ = '\0';
+			val_list[n] = var;
+			len_list[n] = php_url_decode(var, strlen(var));
+			n++;
 			val_list[n] = val;
 			len_list[n] = php_url_decode(val, strlen(val));
 		} else {
-			val_list[n] = "";
+			val_list[n] = var;
+			len_list[n] = php_url_decode(var, strlen(var));
+			n++;
+			val_list[n] = "";
 			len_list[n] = 0;
 		}
 		n++;

btw, my configure line is (if you still want to know it):
'./configure' '--with-apxs=/usr/local/apache/bin/apxs' '--prefix=/usr/local/php' '--disable-debug' '--with-gettext' '--with-pear' '--enable-safe-mode' '--enable-mbstring' '--enable-mbstr-enc-trans' '--with-zlib' '--enable-bcmath' '--with-bz2' '--enable-calendar' '--enable-ctype' '--with-curl' '--enable-exif' '--enable-ftp' '--with-gd' '--enable-gd-native-ttf' '--with-ttf' '--with-t1lib' '--with-gmp' '--with-hyperwave' '--enable-mailparse' '--with-mcrypt' '--with-mhash' '--with-mysql' '--with-pcre-regex' '--enable-posix' '--enable-trans-sid' '--enable-sockets' '--with-regex=php'

Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2002-09-21 04:59 UTC] nohn@php.net
Could not verify this with CVS-HEAD on Linux 2.4.18
 [2002-09-21 06:53 UTC] wez@php.net
This bug has been fixed in CVS.

In case this was a PHP problem, snapshots of the sources are packaged
every three hours; this change will be in the next snapshot. You can
grab the snapshot at http://snaps.php.net/.
 
In case this was a documentation problem, the fix will show up soon at
http://www.php.net/manual/.

In case this was a PHP.net website problem, the change will show
up on the PHP.net site and on the mirror sites in short time.
 
Thank you for the report, and for helping us make PHP better.

Try a tarball from snaps.php.net
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 21 12:01:31 2024 UTC