|
php.net | support | documentation | report a bug | advanced search | search howto | statistics | random bug | login |
[2009-08-26 14:01 UTC] wmeler at wp-sa dot pl
Description: ------------ Serialization relies on perfect hashing (without collisions) of variables in ext/standard/var.c - php_add_var_hash. Collision result in random reference to previously serialized variable. It is possible to happen because hash function used for objects is not perfect one - for two objects of different classes it is possible to get the same hash result. I've just fixed the same problem in 4.x where collisions were more frequent because of use of HANDLE_NUMERIC in zend_hash_add and zend_hash_next_index_insert. Problem is extremely hard to reproduce and debug because of pointer value sensitivity, while easy to fix. Instead of single smart_str_print_long(hash) we could use concatenation - two calls - smart_str_print_long(Z_OBJCE_P(var);smart_str_print_long(Z_OBJ_HANDLE_P(var)) - or even faster version with binary memcpy of two pointers without 'O' prefix. If you wan't to I can provide this simple patch for 5.3.0. PatchesPull RequestsHistoryAllCommentsChangesGit/SVN commits
|
|||||||||||||||||||||||||||||||||||||
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Fri Oct 24 10:00:01 2025 UTC |
I can't provide you simple script to reproduce the problem, as it depends on everything what is on heap - environment variables, compilation, memory manager etc. Why didn't I send following patch? because your bug reporting form has no such input ... You also state: "Please do not insert any huge text here. Keep the description as short as possible. If we want to get a strace, we will ask for it separately." Feel free to change bug reporting form. I hope that patch will fit in comment... --- ext/standard/var.c.orig 2009-08-27 09:18:53.628753000 +0200 +++ ext/standard/var.c 2009-08-27 09:39:06.932120000 +0200 @@ -470,25 +470,23 @@ static inline int php_add_var_hash(HashTable *var_hash, zval *var, void *var_old TSRMLS_DC) /* {{{ */ { ulong var_no; - char id[32], *p; + char id[32]; register int len; - /* relies on "(long)" being a perfect hash function for data pointers, - * however the actual identity of an object has had to be determined + /* pointers to variables stored in binary hash keys + * the actual identity of an object has had to be determined * by its object handle and the class entry since 5.0. */ if ((Z_TYPE_P(var) == IS_OBJECT) && Z_OBJ_HT_P(var)->get_class_entry) { - p = smart_str_print_long(id + sizeof(id) - 1, - (((size_t)Z_OBJCE_P(var) << 5) - | ((size_t)Z_OBJCE_P(var) >> (sizeof(long) * 8 - 5))) - + (long) Z_OBJ_HANDLE_P(var)); - *(--p) = 'O'; - len = id + sizeof(id) - 1 - p; + zend_class_entry *ce = Z_OBJCE_P(var); + memcpy(id,&ce,sizeof(void *)); + memcpy(id+sizeof(void*),&(Z_OBJ_HANDLE_P(var)),sizeof(zend_object_handle)); + len = sizeof(void*)+sizeof(zend_object_handle); } else { - p = smart_str_print_long(id + sizeof(id) - 1, (long) var); - len = id + sizeof(id) - 1 - p; + memcpy(id,&var,sizeof(void *)); + len = sizeof(void *); } - if (var_old && zend_hash_find(var_hash, p, len, var_old) == SUCCESS) { + if (var_old && zend_hash_find(var_hash, id, len, var_old) == SUCCESS) { if (!Z_ISREF_P(var)) { /* we still need to bump up the counter, since non-refs will * be counted separately by unserializer */ @@ -500,7 +498,7 @@ /* +1 because otherwise hash will think we are trying to store NULL pointer */ var_no = zend_hash_num_elements(var_hash) + 1; - zend_hash_add(var_hash, p, len, &var_no, sizeof(var_no), NULL); + zend_hash_add(var_hash, id, len, &var_no, sizeof(var_no), NULL); return SUCCESS; } /* }}} */Hello. It seems to me that I just encountered this error. By implementing a new service which includes a series (more than 70) objects each of which contains an array of objects I noticed the appearance of random objects from another part of the data. The data structure is as follows: List item 1 item 2 item 3 subitem 1 subitem 2 subitem 3 element * element * element * Objects marked with an asterisk appear in other places in this structure with a semi random frequency. Serialized session has a size of about 2-3 MB. All the elements are objects with very similar name: List_item_subitem_element or List_item_subitem etc. There are about 70 instances of item with 2-10 instances of subitem and 2-4 instances of element.