php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #31749 deadlock due to libc mutexes not released when timeout signal is delivered
Submitted: 2005-01-29 01:57 UTC Modified: 2016-06-07 11:40 UTC
Votes:20
Avg. Score:4.0 ± 1.2
Reproduced:12 of 12 (100.0%)
Same Version:9 (75.0%)
Same OS:8 (66.7%)
From: phpbug2 at mailinator dot com Assigned: dmitry (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: 5CVS, 4CVS (2005-06-19) OS: Linux 2.4.28, 2.4.21-SMP
Private report: No CVE-ID: None
View Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
If you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: phpbug2 at mailinator dot com
New email:
PHP Version: OS:

 

 [2005-01-29 01:57 UTC] phpbug2 at mailinator dot com
Description:
------------
A certain script (phpWebSite) a client of mine is running causes the Apache child to deadlock after processing is complete. The output is sent to the client however so to a normal visitor it appears to be working fine, but on the server a child deadlocks with every hit to the script.

Srv	PID	Acc	M	CPU 	SS	Req	Conn	Child	Slot	Client	VHost	Request
26-0	24911	1/75/75	W 	1.29	15390	0	0.3	0.73	0.73 	66.249.xxx.xxx	xxxxxxxx.xxx	GET /index.php?module=calendar&calendar%5Bview%5D=month&mo

The request sticks in the "W"riting state until Apache is hupped at which point it moves to "G"racefully finishing and will remain in that state indefinitely. Closing Apache results in a "Child xxxx did not cleanly exit, sending SIGKILL" message in the error_log. No errors are logged prior to the child deadlocking. The hit that causes the deadlock does NOT get logged to the Apache access_log.

php.ini is pretty much the stock 5.0.3-recommended. No 3rd party modules such as Zend or Turck are present. Configure string:

'./configure' '--with-imap=/usr' '--with-expat-dir=/usr' '--enable-sockets' '--with-zlib-dir=/usr' '--with-mysql=/usr' '--with-mysql-sock=/var/lib/mysql/mysql.sock' '--with-mcrypt=/usr' '--with-jpeg-dir=/usr' '--with-png-dir=/usr' '--with-freetype-dir=/usr' '--with-gettext=/usr' '--enable-dba' '--with-db4=/usr' '--with-gdbm=/usr' '--with-curl=/usr' '--enable-exif' '--with-ttf=/usr' '--with-apxs=/www/bin/apxs' '--disable-cgi' '--disable-ipv6' '--enable-gd-native-ttf' '--enable-calendar' '--with-zlib=/usr' '--with-gd' '--with-gettext' '--disable-posix' '--with-pgsql=/usr/local/pgsql' '--enable-memory-limit' '--with-pspell=/usr' '--with-mysqli' '--with-mm' '--with-xsl'

None of the binaries are built with -g which I am afraid seems to limit the debugging output. -fomit-frame-pointer however is NOT used. Please inform if this is a problem and I will rebuild.

Reproduce code:
---------------
phpWebSite
http://phpwebsite.appstate.edu/

The "calendar" module seems to be the one that causes it, however I am unsure as to whether any events need to be added in certain places etc as the calendar is not maintained by myself.

I believe the phpWebSite script needed a few minor changes in order to conform to the new OO model in PHP5. If you require an exact copy of the installed version and/or data please let me know and I will ask my client.

Expected result:
----------------
Not to deadlock.

Actual result:
--------------
#0  0x001cf666 in sigsuspend () from /lib/libc.so.6
#1  0x00c50c28 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0x00c52ef0 in __pthread_alt_lock () from /lib/libpthread.so.0
#3  0x00c4f170 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0x00219fda in free () from /lib/libc.so.6
#5  0x008844f9 in php_error_cb () from /www/libexec/libphp5.so
#6  0x008c2724 in zend_error () from /www/libexec/libphp5.so
#7  0x008b8476 in zend_timeout () from /www/libexec/libphp5.so
#8  0x00c54d0d in __pthread_sighandler () from /lib/libpthread.so.0
#9  <signal handler called>
#10 0x00c530f1 in __pthread_alt_unlock () from /lib/libpthread.so.0
#11 0x00c4f42e in pthread_mutex_unlock () from /lib/libpthread.so.0
#12 0x00219fd3 in free () from /lib/libc.so.6
#13 0x0023fcbd in __tzfile_read () from /lib/libc.so.6
#14 0x0023e3ac in tzset_internal () from /lib/libc.so.6
#15 0x0023ef08 in tzset () from /lib/libc.so.6
#16 0x0081b65f in php_putenv_destructor () from /www/libexec/libphp5.so
#17 0x008c93b4 in zend_hash_del_key_or_index () from /www/libexec/libphp5.so
#18 0x0081c828 in zif_putenv () from /www/libexec/libphp5.so
#19 0x008f951c in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#20 0x008f9851 in zend_do_fcall_handler () from /www/libexec/libphp5.so
#21 0x008e76ec in execute () from /www/libexec/libphp5.so
#22 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#23 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#24 0x008e76ec in execute () from /www/libexec/libphp5.so
#25 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#26 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#27 0x008e76ec in execute () from /www/libexec/libphp5.so
#28 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#29 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#30 0x008e76ec in execute () from /www/libexec/libphp5.so
#31 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#32 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#33 0x008e76ec in execute () from /www/libexec/libphp5.so
#34 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#35 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#36 0x008e76ec in execute () from /www/libexec/libphp5.so
#37 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#38 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#39 0x008e76ec in execute () from /www/libexec/libphp5.so
#40 0x008f8fe3 in zend_do_fcall_common_helper () from /www/libexec/libphp5.so
#41 0x008f972f in zend_do_fcall_by_name_handler () from /www/libexec/libphp5.so
#42 0x008e76ec in execute () from /www/libexec/libphp5.so
#43 0x008fe5c9 in zend_include_or_eval_handler () from /www/libexec/libphp5.so
#44 0x008e76ec in execute () from /www/libexec/libphp5.so
#45 0x008c2cad in zend_execute_scripts () from /www/libexec/libphp5.so
#46 0x008861f5 in php_execute_script () from /www/libexec/libphp5.so
#47 0x00000000 in ?? ()
#48 0x00000003 in ?? ()
#49 0x00000000 in ?? ()
#50 0xbffff310 in ?? ()
#51 0x00000000 in ?? ()
#52 0x00000002 in ?? ()
#53 0x002137bd in _IO_new_file_fopen () from /lib/libc.so.6


Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2005-02-03 12:10 UTC] phpbug2 at mailinator dot com
I believe this is the buggy code. It is setting a signal handler for SIGPROF that immediately dumps a timeout error message when the signal is received. However if the signal is delivered whilst PHP is processing a libc call to time(), a pthread mutex within libc is not released, causing a deadlock when Apache tries to call time() for the output.

A fix would be to have PHP set some flag such as "processing time exceeded" for the request and to then abort the request at the next available opporunity from within PHP instead of jumping out and leaving libc in a potentially unstable or dangerous state. Obviously this reduces the effectiveness of the maximum execution timeout as a blocking libc call could cause the timeout to be greatly exceeded, but I see no other safe way of handling this without leaving libc in a potentially dangerous state. Perhaps two signals could be used, the first one to break execution if possible at the next available opportunity and then the current "hard" break if no response within 15 seconds or something.

ZEND_API void zend_timeout(int dummy)
{
        TSRMLS_FETCH();

        if (zend_on_timeout) {
                zend_on_timeout(EG(timeout_seconds) TSRMLS_CC);
        }

        zend_error(E_ERROR, "Maximum execution time of %d second%s exceeded",
                          EG(timeout_seconds), EG(timeout_seconds) == 1 ? "" : "s");
}

...
                t_r.it_value.tv_sec = seconds;
                t_r.it_value.tv_usec = t_r.it_interval.tv_sec = t_r.it_interval.tv_usec = 0;

                setitimer(ITIMER_PROF, &t_r, NULL);
                signal(SIGPROF, zend_timeout);
                sigemptyset(&sigset);
                sigaddset(&sigset, SIGPROF);
                sigprocmask(SIG_UNBLOCK, &sigset, NULL);
        }
 [2005-02-19 22:28 UTC] phpbug2 at mailinator dot com
I've created a patch against 5.0.3 to implement my "soft timeout" idea above. It will try to soft-break execution after the timeout period elapses and then hard-break after another timeout period.

There seem to be a lot of calls to the zend_set_timeout and zend_unset_timeout functions that I'm not too sure are helping the situation. Here's what happens using the CLI PHP and printing whenever certain functions are hit:

$ sapi/cli/php /www/htdocs/time.php
zend_unset_timeout()
zend_set_timeout (0)
zend_set_timeout (0)
zend_set_timeout (0)
zend_unset_timeout()
zend_set_timeout (5)
Array 2005 2217696508 Jan-01-1998 Jan 01 1998 05:00:00
Array 2005 2217696508 Jan-01-1998 Jan 01 1998 05:00:00
Array 2005 2217696510 Jan-01-1998 Jan 01 1998 05:00:00
Array 2005 2217696512 Jan-01-1998 Jan 01 1998 05:00:00
Array 2005 2217696516 Jan-01-1998 Jan 01 1998 05:00:00
Array 2005 2217696516 Jan-01-1998 Jan 01 1998 05:00:00
zend_soft_timeout()
zend_set_timeout (5)
zend_timeout()
zend_set_timeout (5)

Fatal error: Maximum execution time of 5 seconds exceeded in /www/htdocs/time.php on line 10
zend_unset_timeout()
zend_set_timeout (30)
zend_unset_timeout()

And the patch to 5.0.3 to implement the "zend_soft_timeout" that "works for me" in that I'm not getting any more deadlocked Apache children: http://www.r1hosting.net/zend_timeout.patch
 [2005-02-20 15:55 UTC] wez@php.net
Andi, please take a look.  I remember some talk in the past about going this route, but I don't recall the arguments against it.
Similarly, I'm not sure that this is really a problem for PHP to fix; given that the time() function is officially not thread-safe, it seems wrong for the libc to mutex internally.
 [2005-12-24 02:46 UTC] sniper@php.net
Dmitry, can you check this out please?
 [2008-07-21 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2010-06-08 13:27 UTC] michael at makeweb dot no
I have the same problem, reproduced on php 4.4.9 and 5.3.0.

Basically pseudocode to reproduce is set_time_limit(1); do { ... } until time passed = 1 second.

This will create a race condition where sometimes a libc lock will prevent zend_timeout to process properly.

Script to reproduce on 5.3.0: ($limit is not 1, because php seems to not break accurately on time - this value must be adjusted so the error "PHP Fatal error:  Maximum execution time of 1 second exceeded in Unknown on line 0" frequently appears. The script should fail within 100 runs (lock up).

<?php
  echo "Trying to break php...\n";
  set_time_limit(1);
  $limit = 1.235;
  $start = array_sum(explode(' ',$x = microtime()));
  $a = array();
  for ($cnt = 0; $cnt < 10; $cnt++) {
    for ($cntb = 0; $cntb < 100; $cntb++) {
      for ($cntc = 0; $cntc < 100; $cntc++) {
        $end = array_sum(explode(' ',$y = microtime()));
        $a[$cnt][$cntb][$cntc] = md5($cnt.$cntb.$cntc);
        if ($end >= $start+$limit) break;   
      }
      if ($end >= $start+$limit) break;   
    }
    if ($end >= $start+$limit) break;   
  }
  echo "$x - $y (".($end-$start).")\n";
?>

run with: while [ true ]; do php break.php; done

Though this scenario is unlikely, there are other cases where it can happen very often, however I fail to come up with a short example of it. The reason for generating a multidimensional array is that in php4, the hang is very likely to happen within zend_shutdown, as the resources are freed (hangs within 1-10 cycles)

Keep in mind error logging has to be enabled (this is when there's calls for libc "get time" which causes the locking issue within zend_timeout)

pstack for php 5.3.0:

# pstack 1565
#0  0x00319402 in __kernel_vsyscall ()
#1  0x001dede3 in __lll_lock_wait_private () from /lib/libc.so.6
#2  0x0016e6f6 in _L_lock_15330 () from /lib/libc.so.6
#3  0x0016dbb4 in free () from /lib/libc.so.6
#4  0x0012067c in setlocale () from /lib/libc.so.6
#5  0x08084586 in seek_to_tz_position ()
#6  0x08084664 in timelib_timezone_id_is_valid ()
#7  0x0806575c in guess_timezone ()
#8  0x080657a5 in get_timezone_info ()
#9  0x08067f35 in php_format_date ()
#10 0x082ab96e in php_log_err ()
#11 0x082abe17 in php_error_cb ()
#12 0x082f9fe8 in zend_error_noreturn ()
#13 0x082ec745 in zend_timeout ()
#14 <signal handler called>
#15 0x00169c49 in _int_free () from /lib/libc.so.6
#16 0x0016dbc0 in free () from /lib/libc.so.6
#17 0x082ac6df in php_conv_fp ()
#18 0x082ad409 in strx_printv ()
#19 0x082ad804 in ap_php_snprintf ()
#20 0x082565f2 in _php_gettimeofday ()
#21 0x08342509 in zend_do_fcall_common_helper_SPEC ()
#22 0x08319f3d in execute ()
#23 0x082f8e47 in zend_execute_scripts ()
#24 0x082a92be in php_execute_script ()
#25 0x08378457 in main ()

Any suggestions how to patch php4 be greatly appreciated!

Michael
 [2010-06-08 13:34 UTC] mkoppanen@php.net
-Status: No Feedback +Status: Re-Opened
 [2011-11-16 10:22 UTC] tstarling@php.net
The suggested patch is an improvement, but it still calls zend_timeout() when the "hard timeout" is reached, which will usually deadlock if a libc mutex such as the one for malloc() is held. The only safe way to handle this situation is to kill the process without calling any non-signal-safe functions, e.g. by calling abort(). If you return control to Apache using longjmp() (like what zend_bailout() does), Apache will deadlock instead. I haven't found any robust way to display an error message, it would look the same as a segfault to the user.
 [2011-11-17 01:34 UTC] tstarling@php.net
I added a workaround to our custom error handler:

https://www.mediawiki.org/wiki/Special:Code/MediaWiki/103433
 [2012-02-14 10:10 UTC] jpauli@php.net
Interesting as well as weird bug.
I have nothing to say about it that would move things forward but that signal 
handling have been rewritten in PHP5.4. It might handle such cases, see with 
contributors.

You should have a look at http://lxr.php.net/xref/PHP_5_4/Zend/zend_signal.c and 
try to reproduce the bug on a 5.4 basis.
 [2012-02-14 12:15 UTC] jpauli@php.net
Seems like there are problems with 5.4 zend signals.
http://marc.info/?l=php-internals&m=132921501729587

Also, see related #61067
 [2012-02-14 16:29 UTC] rasmus@php.net
For Apache1 longjmp() problems, I added the "exit_on_timeout" ini switch a long 
time ago. It is a crude but effective way to deal with this.
 [2016-06-07 11:40 UTC] dmitry@php.net
-Status: Re-Opened +Status: Closed
 [2016-06-07 11:40 UTC] dmitry@php.net
This is finally fixed in PHP-7.1 with "safe timeout handling.
 
PHP Copyright © 2001-2025 The PHP Group
All rights reserved.
Last updated: Sun Jan 05 05:01:28 2025 UTC