php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #49374 [PATCH] serialization adds random references
Submitted: 2009-08-26 14:01 UTC Modified: 2020-01-03 10:21 UTC
Votes:5
Avg. Score:4.0 ± 0.9
Reproduced:5 of 5 (100.0%)
Same Version:3 (60.0%)
Same OS:4 (80.0%)
From: wmeler at wp-sa dot pl Assigned: nikic (profile)
Status: Closed Package: Strings related
PHP Version: 5.*, 6 OS: *
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If this is not your bug, you can add a comment by following this link.
If this is your bug, but you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: wmeler at wp-sa dot pl
New email:
PHP Version: OS:

 

 [2009-08-26 14:01 UTC] wmeler at wp-sa dot pl
Description:
------------
Serialization relies on perfect hashing (without collisions) of variables in ext/standard/var.c - php_add_var_hash. Collision result in random reference to previously serialized variable. 
It is possible to happen because hash function used for objects is not perfect one - for two objects of different classes it is possible to get the same hash result.
I've just fixed the same problem in 4.x where collisions were more frequent because of use of HANDLE_NUMERIC in zend_hash_add and zend_hash_next_index_insert.
Problem is extremely hard to reproduce and debug because of pointer value sensitivity, while easy to fix. Instead of single smart_str_print_long(hash) we could use concatenation - two calls - smart_str_print_long(Z_OBJCE_P(var);smart_str_print_long(Z_OBJ_HANDLE_P(var)) - or even faster version with binary memcpy of two pointers without 'O' prefix. If you wan't to I can provide this simple patch for 5.3.0.


Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2009-08-26 15:03 UTC] jani@php.net
How about an example script to reproduce this and the patch? Why do we need to ask for the patch separately anyway? Just show the patch or don't. Don't "ask to ask".
 [2009-08-27 06:42 UTC] wmeler at wp-sa dot pl
I can't provide you simple script to reproduce the problem, as it depends on everything what is on heap - environment variables, compilation, memory manager etc.
Why didn't I send following patch? because your bug reporting form has no such input ... You also state: "Please do not insert any huge text here. Keep the description as short as possible. If we want to get a strace, we will ask for it separately."
Feel free to change bug reporting form. I hope that patch will fit in comment...

--- ext/standard/var.c.orig	2009-08-27 09:18:53.628753000 +0200
+++ ext/standard/var.c	2009-08-27 09:39:06.932120000 +0200
@@ -470,25 +470,23 @@
 static inline int php_add_var_hash(HashTable *var_hash, zval *var, void *var_old TSRMLS_DC) /* {{{ */
 {
 	ulong var_no;
-	char id[32], *p;
+	char id[32];
 	register int len;
 
-	/* relies on "(long)" being a perfect hash function for data pointers,
-	 * however the actual identity of an object has had to be determined
+	/* pointers to variables stored in binary hash keys
+	 * the actual identity of an object has had to be determined
 	 * by its object handle and the class entry since 5.0. */
 	if ((Z_TYPE_P(var) == IS_OBJECT) && Z_OBJ_HT_P(var)->get_class_entry) {
-		p = smart_str_print_long(id + sizeof(id) - 1,
-				(((size_t)Z_OBJCE_P(var) << 5)
-				| ((size_t)Z_OBJCE_P(var) >> (sizeof(long) * 8 - 5)))
-				+ (long) Z_OBJ_HANDLE_P(var));
-		*(--p) = 'O';
-		len = id + sizeof(id) - 1 - p;
+		zend_class_entry *ce = Z_OBJCE_P(var); 
+		memcpy(id,&ce,sizeof(void *));
+		memcpy(id+sizeof(void*),&(Z_OBJ_HANDLE_P(var)),sizeof(zend_object_handle));
+		len = sizeof(void*)+sizeof(zend_object_handle);
 	} else {
-		p = smart_str_print_long(id + sizeof(id) - 1, (long) var);
-		len = id + sizeof(id) - 1 - p;
+		memcpy(id,&var,sizeof(void *));
+		len = sizeof(void *);
 	}
 
-	if (var_old && zend_hash_find(var_hash, p, len, var_old) == SUCCESS) {
+	if (var_old && zend_hash_find(var_hash, id, len, var_old) == SUCCESS) {
 		if (!Z_ISREF_P(var)) {
 			/* we still need to bump up the counter, since non-refs will
 			 * be counted separately by unserializer */
@@ -500,7 +498,7 @@
 
 	/* +1 because otherwise hash will think we are trying to store NULL pointer */
 	var_no = zend_hash_num_elements(var_hash) + 1;
-	zend_hash_add(var_hash, p, len, &var_no, sizeof(var_no), NULL);
+	zend_hash_add(var_hash, id, len, &var_no, sizeof(var_no), NULL);
 	return SUCCESS;
 }
 /* }}} */
 [2009-12-29 19:31 UTC] jani@php.net
Never heard such thing as putting the patch as file somewhere where it can be downloaded from? Don't be so arrogant, we might even commit your patches sometime.
 [2010-04-01 14:43 UTC] grzegorz dot drozd at esky dot pl
Hello.

It seems to me that I just encountered this error. By implementing a new service which includes a series (more than 70) objects each of which contains an array of objects I noticed the appearance of random objects from another part of the data.
The data structure is as follows:

List
 item 1
 item 2
 item 3
   subitem 1
   subitem 2
   subitem 3
      element *
      element *
      element *

Objects marked with an asterisk appear in other places in this structure with a semi random frequency. Serialized session has a size of about 2-3 MB.

All the elements are objects with very similar name:
List_item_subitem_element or List_item_subitem etc.

There are about 70 instances of item with 2-10 instances of subitem and 2-4 instances of element.
 [2014-08-08 02:02 UTC] stephen at heyday dot co dot nz
We ran into this issue yesterday using PHP 5.5.15 with mod_php under OSX 10.9. The issue was consistently repeatable in our application environment with a serialize call involving only three objects, I've tried to produce a standalone test case for this, but as pointed out by the reporter it's extremely difficult demonstrate from PHP code as it depends on the memory addresses of the zvals being serialized lining up perfectly.

Since this issue can occur by chance and corrupt data (though rarely), we are now using the igbinary extension for serialization. The issue described above disappears when igbinary is enabled, and returns when igbinary is disabled.
 [2020-01-03 10:21 UTC] nikic@php.net
-Status: Open +Status: Closed -Assigned To: +Assigned To: nikic
 [2020-01-03 10:21 UTC] nikic@php.net
This issue has been fixed in (IIRC) PHP 7.0, so closing here.

Link to the relevant code: https://github.com/php/php-src/blob/7ba18b0b392f668617f4cbb9e6129da831819bb4/ext/standard/var.c#L670
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Apr 18 08:02:42 2024 UTC