php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #63180 Corruption of hash tables (bugfix attached)
Submitted: 2012-09-28 16:18 UTC Modified: 2012-10-26 16:54 UTC
From: vesselin dot atanasov at gmail dot com Assigned: dmitry (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: 5.4Git-2012-09-28 (Git) OS: Any
Private report: No CVE-ID: None
View Add Comment Developer Edit
Anyone can comment on a bug. Have a simpler test case? Does it work for you on a different platform? Let us know!
Just going to say 'Me too!'? Don't clutter the database with that please !
Your email address:
MUST BE VALID
Solve the problem:
39 - 21 = ?
Subscribe to this entry?

 
 [2012-09-28 16:18 UTC] vesselin dot atanasov at gmail dot com
Description:
------------
PHP 5.4 has a bug in the handling if interned (literal) strings.

Each time an HTTP request terminates, the interned strings are restored to the snapshot that was taken at the end of the startup sequence, which
effectively removes any interned strings that have been added after the snapshot has been made.

Usually when a hash item is added, the code in zend_hash.c allocates extra space after the end of the Bucket structure, copies the key there and then
points Bucket::arKey to the key copy. However when dealing with interned hash keys the code tries to optimize the algorithm by not allocating extra
space after the end of the Bucket structure, but just pointing Bucket::arKey to the passed arKey parameter.

	if (IS_INTERNED(arKey)) {
		p = (Bucket *) pemalloc(sizeof(Bucket), ht->persistent);
		if (!p) {
			return FAILURE;
		}
		p->arKey = arKey;
	} else {
		p = (Bucket *) pemalloc(sizeof(Bucket) + nKeyLength, ht->persistent);
		if (!p) {
			return FAILURE;
		}
		p->arKey = (const char*)(p + 1);
		memcpy((char*)p->arKey, arKey, nKeyLength);
	}

The problem happens when a persistent hash table gets an interned key as a parameter. What happens is that the table and its bucket are persistent, i.e.
they remain even after the current HTTP request has been terminated, but the bucket key is removed from the interned keys table and its memory will be
reused by other interned keys upon the next request. This leads to corruption of the array keys in the persistent hash table.

One such case is with the PCRE cache. It is initialized in:

php_pcre.c:PHP_GINIT_FUNCTION(pcre)

by the following code:

zend_hash_init(&pcre_globals->pcre_cache, 0, NULL, php_free_pcre_cache, 1);

The last parameter (1) means that it is a persistent hash table. However the code in
php_pcre.c:pcre_get_compiled_regex_cache(char *regex, int regex_len TSRMLS_DC)
just passes the regex parameter to zend_hash_update:

zend_hash_update(&PCRE_G(pcre_cache), regex, regex_len+1, (void *)&new_entry,
					sizeof(pcre_cache_entry), (void**)&pce);

Given that in most cases the regex parameter is created from string literals in the compiled code, this means that in most cases we end up with
interned, non-persistent keys in the persistent PCRE cache table. So when the next HTTP request comes and we create different interned strings
they will overwrite the previous ones in the PCRE cache table.

The suggested solution to this bug is to modify the code in zend_hash.c and change it in such a way that copying of keys is skipped only when the
key is interned and the hash table is persistent. When either the key is non-interned or the hash table is persistent, the key is copied after the
Bucket structure, so that its maximum lifetime will be the same as the lifetime of the hash table.

I am attaching a patch which uses the suggested solution. I tested this patch and it solves the problems with PCRE cache corruption that we observed in our PHP code. The patch is for PHP 5.4.5 but it also applies cleanly to the latest git version of PHP 5.4


Patches

interned_keys.patch (last revision 2012-09-28 16:20 UTC by vesselin dot atanasov at gmail dot com)

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2012-09-28 17:43 UTC] pajoye@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: dmitry
 [2012-10-01 07:08 UTC] dmitry@php.net
In general, this is the right observation and the right fix.

However, this fix won't allow interned strings usage in EG(function_table), EG(class_table) and EG(zend_constants). This is going to lead to larger memory consumption and slower execution.

I think, that might be better to workaround the problem in PCRE extension itself. I know, it's not "sexy" and some other places might need to be fixed as well. :( May be you can think about better solution?
 [2012-10-01 11:39 UTC] vesselin dot atanasov at gmail dot com
I took a look at the EG(function_table), EG(class_table), EG(zend_constants) and indeed they are persistent.

What were the reasons for making these tables persistent? Was it just for performance reasons?
 [2012-10-01 11:49 UTC] dmitry@php.net
These tables store internal entities, that are never changed from request to request, and also entities created during compilation and execution, that are destroyed at end of request.
 [2012-10-01 12:23 UTC] vesselin dot atanasov at gmail dot com
One option would be to store interned pointers without copying when one of the two is true:

1. Either the hash table is non-persistent.
2. Or when the hash table is persistent and (the interned string is below the interned snapshot top or there is no snapshot top defined at all).

What do you think of this approach?
 [2012-10-26 16:49 UTC] dmitry@php.net
Automatic comment on behalf of dmitry@zend.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=68b229ea73b5e975951b5ad02ffb315ec60fca1e
Log: Fixed bug #63180 (Corruption of hash tables)
 [2012-10-26 16:49 UTC] dmitry@php.net
-Status: Assigned +Status: Closed
 [2012-10-26 16:54 UTC] dmitry@php.net
-Status: Closed +Status: Assigned
 [2012-10-26 16:54 UTC] dmitry@php.net
Your proposal will work, but it will slowdown every HashTable update operation. However these checks are really necessary only for very few cases. I prefer to fix the problem with a hack. I've just committed the fix.
 [2012-10-26 16:54 UTC] dmitry@php.net
-Status: Assigned +Status: Closed
 [2012-10-26 16:54 UTC] dmitry@php.net
The fix for this bug has been committed.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.


 [2012-11-30 11:44 UTC] alex at bakboord dot be
Can someone describe symptoms of this bug?
 [2014-10-07 23:21 UTC] stas@php.net
Automatic comment on behalf of dmitry@zend.com
Revision: http://git.php.net/?p=php-src-security.git;a=commit;h=68b229ea73b5e975951b5ad02ffb315ec60fca1e
Log: Fixed bug #63180 (Corruption of hash tables)
 [2014-10-07 23:32 UTC] stas@php.net
Automatic comment on behalf of dmitry@zend.com
Revision: http://git.php.net/?p=php-src-security.git;a=commit;h=68b229ea73b5e975951b5ad02ffb315ec60fca1e
Log: Fixed bug #63180 (Corruption of hash tables)
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue Mar 19 03:01:29 2024 UTC