php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #71115 data race on auto globals names and internal functions names refcount
Submitted: 2015-12-14 11:50 UTC Modified: 2017-03-29 12:21 UTC
Votes:4
Avg. Score:4.5 ± 0.9
Reproduced:3 of 4 (75.0%)
Same Version:2 (66.7%)
Same OS:0 (0.0%)
From: bobby dot mihalca at touchtech dot ro Assigned: ab (profile)
Status: Closed Package: Reproducible crash
PHP Version: 7.0.0 OS: Any
Private report: No CVE-ID: None
 [2015-12-14 11:50 UTC] bobby dot mihalca at touchtech dot ro
Description:
------------
There is a data race on string refcount that could/will cause any multithreaded sapi to segfault.
This happens because during request startup ts_resource, php_request_startup, etc will copy auto_globals, internal functions, etc from php to request.
Since the name is an interned string thus a zend_string* if ZTS enabled, the refcount will be used, but the refcount is not atomic and we have a data race resulting in segfaults.

Ex while copy _SERVER auto global name (refcount=1):
thread 1 load refcount = 1
thread 2 load refount = 1
thread 1 inc refcount = 2
thread 2 inc refcount = 2
thread 1 store refount 
thread 2 store refount
refcount=2 instead of 3
thread 1 release string, refcount = 1
thread 2 release string, refount = 0, free string
thread 3 start, segfault due to accessing freed memory

A quick workaround for this problem is to mark all the names as interned after startup and remove the flag before shutdown.
This will prevent reference counting and thus data race, premature release and segfaults.
I use the following code for my startup/shutdown:
static int startup(sapi_module_struct * sapi_module)
{
	if (php_module_startup(sapi_module, NULL, 0) == SUCCESS) {
		zend_auto_global *global;
		zend_internal_function *function;
		zend_class_entry* class;
		
		ZEND_HASH_FOREACH_PTR(CG(auto_globals), global) {
			GC_FLAGS(global->name) |= IS_STR_INTERNED;
			zend_string_hash_val(global->name);
		} ZEND_HASH_FOREACH_END();
		
		ZEND_HASH_FOREACH_PTR(CG(function_table), function) {
			GC_FLAGS(function->function_name) |= IS_STR_INTERNED;
			zend_string_hash_val(function->function_name);
		} ZEND_HASH_FOREACH_END();
		
		ZEND_HASH_FOREACH_PTR(CG(class_table), class) {
			zend_property_info *property;
			GC_FLAGS(class->name) |= IS_STR_INTERNED;
			zend_string_hash_val(class->name);
			ZEND_HASH_FOREACH_PTR(&class->properties_info, property) {
				GC_FLAGS(property->name) |= IS_STR_INTERNED;
				zend_string_hash_val(property->name);
				if (property->doc_comment) {
					GC_FLAGS(property->doc_comment) |= IS_STR_INTERNED;
				}
			} ZEND_HASH_FOREACH_END();
			
		} ZEND_HASH_FOREACH_END();
		
		return SUCCESS;
	}
	
	return FAILURE;
}

int shutdown(sapi_module_struct *sapi_globals)
{
	zend_auto_global *global;
	zend_internal_function *function;
	zend_class_entry* class;

	ZEND_HASH_FOREACH_PTR(CG(auto_globals), global) {
		GC_FLAGS(global->name) &= ~IS_STR_INTERNED;
	} ZEND_HASH_FOREACH_END();

	ZEND_HASH_FOREACH_PTR(CG(function_table), function) {
		GC_FLAGS(function->function_name) &= ~IS_STR_INTERNED;
	} ZEND_HASH_FOREACH_END();

	ZEND_HASH_FOREACH_PTR(CG(class_table), class) {
		zend_property_info *property;
		GC_FLAGS(class->name) &= ~IS_STR_INTERNED;
		ZEND_HASH_FOREACH_PTR(&class->properties_info, property) {
			GC_FLAGS(property->name) &= ~IS_STR_INTERNED;
			if (property->doc_comment) {
				GC_FLAGS(property->doc_comment) &= ~IS_STR_INTERNED;
			}
		} ZEND_HASH_FOREACH_END();

	} ZEND_HASH_FOREACH_END();
	
	php_module_shutdown();
	return SUCCESS;
}

Note that zend_string_hash_val is not necessary but will prevent data race on hash (witch is probably harmless anyway)
 



Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2015-12-14 13:18 UTC] krakjoe@php.net
This is definitely a problem ...

pthreads uses critical sections around startup/shutdown of user threads to work around the problem ... I'd like to remove those sections ...

I'm not sure what the proper fix should look like though, disabling refcounts on those strings will hide internal programmer error (leaks).
 [2015-12-14 14:14 UTC] bobby dot mihalca at touchtech dot ro
Exclusive locking the sapi startup was also my first attempt.
Unfortunately valgring's helgrind would still detect a data race in zend_lookup_class_ex at ZVAL_STR_COPY(&fcall_info.function_name, EG(autoload_func)->common.function_name);
While exclusive locking during startup stop always segfaulting, the data race is still there and could segfault, is just less probable.

IMHO my fix is quite correct because:
1. names are already persistent allocated and HAVE TO live as long as the sapi
2. AFAIK php uses copy on write for strings, so even is a name is modified a new string will be created
3. names don't change, _SERVER, _GET, _POST, fopen, fclose all keep their name
4. since interned flag is removed php_module_shutdown would free the strings and everything should be ok (assuming that request are active at sapi shutdown, witch i guess is true)

However i think my sample does not cover dynamic loaded extensions, for my use case all modules are static linked, so it works well.

A nice bonus of this fix is a increase from ~35k req/sec to ~50k req/sec due to lower startup cost.
Since all the names are interned even under ZTS it skips hundreds of (pointless) emalloc to copy them and just uses existing interned zend_string*

If i did not miss somthing, a proper fix could integrate the changes in php_module_startup/php_module_shutdown and dynamic module loading/unloading.
 [2015-12-14 14:20 UTC] laruence@php.net
-Assigned To: +Assigned To: ab
 [2015-12-14 14:20 UTC] laruence@php.net
welting, what do you think? Make them interned seems okey,but not sure if there is similar issues out of this?
 [2015-12-14 16:53 UTC] ab@php.net
-Status: Assigned +Status: Feedback
 [2015-12-14 16:53 UTC] ab@php.net
@bobby, could you please show the code reproducing the crash? It is so, that such global names are allocated persistently and this is addressed in the GC info. So they should not be freed by GC and persist all the process life time, while not sure it will go through it.

Currently the TS SAPI I work quite often is Apache, and neither Windows nor Linux show any crashes or issues with dozens of threads. Of course it is not bug free. But exactly for this reason @bobby I'd ask you to please show a piece of code to reproduce and debug the behavior you describe. What SAPI is it? I can see a couple of places that could be improved, but without some real reproducer it might be just a theory of mine.

Thanks.
 [2015-12-14 17:51 UTC] bobby dot mihalca at touchtech dot ro
It's a custom sapi based on a civetweb and while i can't publicly share the code, i can send you a copy.
Just need a bit of time to rip it out of the project.
 [2015-12-14 19:20 UTC] krakjoe@php.net
Persistent strings still use a refcount, they are just allocated using system memory rather than zend heap. 

There is nothing about using system memory that means they should actually persist.
 [2015-12-14 20:28 UTC] bobby dot mihalca at touchtech dot ro
@ab sent you the sources for the sapi

@krakjoe
being persistent allocated just allows my workaround to work as they are not freed at request shutdown.

Now "_GET" is copied to a new persistent string as part php_module_startup in php_startup_auto_globals
 zend_register_auto_global(zend_string_init("_GET", sizeof("_GET")-1, 1), 0,php_auto_globals_create_get);
then each request will copy it in compiler_globals_ctor
  zend_hash_copy(compiler_globals->auto_globals, global_auto_globals_table, auto_global_copy_ctor);
When 2 threads/requests do the copy at the same time you have the described data race and segfault

Sice auto globals are allocated by sapi, they live as long as the sapi and there is no need to copy them on each request, we can safely mark them as interned.
Before module shutdown i remove the interned flag and module shutdown will free the memory.
The same goes to internal function names, classes,etc
For dynamic extensions/dl() the making/unmarking should be part of module load/unload.

Since ZVAL_STR_COPY (as in zend_lookup_class_ex) will increase refcount for non interned strings i see no viable solution other then marking them interned.
I was also thinking about making the refcount atomic but that will have performance implications. Besides would break ABI
 [2015-12-14 21:26 UTC] ab@php.net
Here are some points:

- Persistent allocation is the promise of the API, it is not only about zend_string.
- Marking interned in TS build is a hack, no interned strings are implemented for thread safe builds.
- Multiple threads reading read only global data is not of an issue. 
- Same mechanics does exist in PHP5, disregarding zend_string.

Anyway, we should debug and evaluate the issue. It can have to do with a bug in PHP same way with a bug in the given SAPI implementation. Anyway one should be careful before making any conclusions.

@bobby, thanks for sending. I'll arrive by debugging on Friday. If @krakjoe has time/mood before, please share it with him as well. He is the author of the ext/pthreads and is quite experienced in the matter.

Thanks.
 [2015-12-15 06:31 UTC] krakjoe@php.net
I think bobby knows it's a hack, he's saying the result is correct, and I agree with him. 

A persistent string should not be reference counted, since it can lead to breaking the API's promise and freeing the string early, which is the problem here.

The problem is that the solution used is a hack to get the desired result, but I'm not sure that we can do the same thing internally because I can't see an opportunity to call the final zend_string_free on all strings allocated persistently. 

What we may need to do is store persistent strings in some truly global (safe) structure and destroy them at process shutdown. Then change zend_string_copy to omit increment, and zend_string_release to omit decrement, if & IS_STR_PERSISTENT, and change every use of GC_REFCOUNT(str)++|-- to use the API.

I haven't thought about it for very long, but some of that seems to make sense ... I'll keep thinking about it ...
 [2015-12-15 10:39 UTC] bobby dot mihalca at touchtech dot ro
@krakjoe
Yes is a hack, it only works because the following stars are aligned :)
 * strings are already persistent
 * no interned implemented for TS, (hopefully) nothing to mess up 
 * refcount not used for interned strings

My assumption is that since the names are init during module startup, there should already be code to release them during module shutdown, else we would have a leak now ?
Also i'm assuming no request can be running at module shutdown so nobody other then module holds a pointer.

I should have not brought up the workaround, i was trying to help but only managed to distract and confuse with it.
 [2015-12-15 11:04 UTC] bobby dot mihalca at touchtech dot ro
@ab
- Multiple threads reading read only global data is not of an issue. 
- Same mechanics does exist in PHP5, disregarding zend_string.
Yes but,
php 5 uses const char *, copying a const char * is read only operation
php 7 uses zend_string*, copying a zend_string* is NOT read only as writes refcount.
This is the problem, it copies global zend_string* data concurrently causing data race, causing refcount==0 instead of >=1, causing string free, causing crash.
Global data zend_string* needs to be read only, no refcount without exclusive lock.
Wish it was this clear on initial bug report :)
 [2015-12-15 12:04 UTC] krakjoe@php.net
I have forwarded bobby and anatol a poc solution ... one that breaks the rest of PHP, but I think we can do it with small changes contained in zend_string.* without breaking ABI or API.
 [2015-12-27 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2015-12-27 23:10 UTC] yohgaki@php.net
-Status: No Feedback +Status: Open
 [2017-03-09 13:56 UTC] ab@php.net
@bobby, the interned strings support for TS is in master now. Please check.

Thanks.
 [2017-03-29 12:16 UTC] ab@php.net
-Status: Assigned +Status: Feedback
 [2017-03-29 12:16 UTC] ab@php.net
Please try using this snapshot:

  http://snaps.php.net/php-trunk-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2017-03-29 12:21 UTC] ab@php.net
-Status: Feedback +Status: Closed
 [2017-03-29 12:21 UTC] ab@php.net
The fix for this bug has been committed.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.

Actually, can be closed.
 
PHP Copyright © 2001-2017 The PHP Group
All rights reserved.
Last updated: Sun Nov 19 01:31:42 2017 UTC