Bug #46025 zend_bailout can deadlock APC
Submitted: 2008-09-08 15:23 UTC
Modified: 2013-02-18 00:33 UTC
Votes: 22
Avg. Score: 4.4 ± 1.0
Reproduced: 19 of 19 (100.0%)
Same Version: 5 (26.3%)
Same OS: 7 (36.8%)
From: askalski at gmail dot com
Assigned: gopalv
Status: No Feedback
Package: Reproducible crash
PHP Version: 5.2.6
OS: redhat
Private report: No
CVE-ID: None
 [2008-09-08 15:23 UTC] askalski at gmail dot com
Description:
------------
A zend_bailout (longjmp) is allowed while HANDLE_BLOCK_INTERRUPTIONS is in effect.  When this happens while APC has its shared memory segment locked, it results in corruption of the segment and deadlocking of the mutex.  An Apache restart is required to get things moving again.

Tested with PHP 5.2.6 and 4.4.8 with APC 3.0.19 using pthread mutexes.

In our particular case, this is happening when a script hits the max_execution_time timeout during an include().

Although APC is involved, I am submitting this as a PHP bug because the fix (zend_bailout / HANDLE_BLOCK_INTERRUPTIONS) is completely PHP-side.
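
The failure mode described above can be illustrated with a small stand-alone C sketch. This is not PHP or APC source: `segment_locked` is a hypothetical stand-in for APC's shared-memory mutex, and `simulated_timeout()` stands in for the max_execution_time handler calling zend_bailout. It only shows that a longjmp out of a critical section skips the unlock.

```c
#include <setjmp.h>

static jmp_buf bailout_env;   /* stand-in for the executor's bailout target */
static int segment_locked;    /* stand-in for APC's shared-memory mutex */

/* Simulates the timeout firing in the middle of a critical section. */
static void simulated_timeout(void)
{
    longjmp(bailout_env, 1);
}

/* Simulates a cache write that is interrupted before it can unlock. */
static void cache_store(void)
{
    segment_locked = 1;       /* lock the segment */
    simulated_timeout();      /* the bailout jumps away from here */
    segment_locked = 0;       /* unlock: never reached */
}

/* Returns 1 if the lock is still held after the bailout. */
int lock_leaked_after_bailout(void)
{
    segment_locked = 0;
    if (setjmp(bailout_env) == 0) {
        cache_store();
    }
    return segment_locked;
}
```

Calling `lock_leaked_after_bailout()` reports that the simulated lock is still held. With a real process-shared mutex, every later request would then block on it.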


Reproduce code:
---------------
<?php

header('Content-Type: text/plain');

echo "Fetching value from APC...\n";
flush();
apc_fetch('deadlock');

echo "Attempting to deadlock APC with max_execution_time...\n";
flush();
ini_set('max_execution_time', 1);
for (;;) apc_store('deadlock', 1);

?>


Expected result:
----------------
Defer the zend_bailout until HANDLE_UNBLOCK_INTERRUPTIONS is called.


Actual result:
--------------
Deadlock of the entire web server, requiring an Apache restart.


Patches

php-5.3.2-apc_deadlock.patch (last revision 2010-05-31 04:50 UTC by askalski at gmail dot com)
php-5.1.6-apc_deadlock.patch (last revision 2010-05-31 04:50 UTC by askalski at gmail dot com)
php-5.2.13-apc_deadlock.patch (last revision 2010-05-31 04:49 UTC by askalski at gmail dot com)
patch-5.3.2-apc_deadlock.patch (last revision 2010-05-31 04:48 UTC by askalski at gmail dot com)

History

 [2008-09-08 20:50 UTC] askalski at gmail dot com
To assist with implementing a fix:

I wrote up a local fix that uses two executor globals:

    /* HANDLE_BLOCK_INTERRUPTIONS nesting depth */
    zend_uint blocking_interruptions;
    /* true if a bailout was deferred while interruptions were blocked */
    zend_bool deferred_bailout;

In my testing, I quickly realized that APC in conjunction with Zend was making nested calls to HANDLE_BLOCK_INTERRUPTIONS(), so to keep from unblocking prematurely, it was necessary to track nesting depth.

Example from my debugging:

    Block 0 /tmp/APC-3.0.19/php_apc.c:559
    Block 1 /tmp/php-5.2.6/Zend/zend_alloc.c:1876
    Unblock 1 /tmp/php-5.2.6/Zend/zend_alloc.c:1913
    Unblock 0 /tmp/APC-3.0.19/php_apc.c:592
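
To make the nesting problem concrete, here is a minimal C sketch that replays the first three steps of the trace above. The names `block_state`, `block_interruptions` and `unblock_interruptions` are hypothetical and are not taken from the patch. It tracks both a depth counter and a naive on/off flag so the two can be compared after the inner unblock.

```c
/* Two ways of remembering "interruptions are blocked". */
struct block_state {
    unsigned depth;  /* nesting depth, in the spirit of blocking_interruptions */
    int flag;        /* naive alternative: a single on/off flag */
};

static struct block_state st;

static void block_interruptions(void)
{
    st.depth++;
    st.flag = 1;
}

static void unblock_interruptions(void)
{
    if (st.depth > 0) {
        st.depth--;
    }
    st.flag = 0;     /* the flag cannot tell inner from outer unblocks */
}

/* Replays Block 0, Block 1, Unblock 1 and returns the state at that point. */
struct block_state state_after_inner_unblock(void)
{
    st.depth = 0;
    st.flag = 0;
    block_interruptions();    /* Block 0: outer caller (APC in the trace) */
    block_interruptions();    /* Block 1: nested caller (allocator in the trace) */
    unblock_interruptions();  /* Unblock 1: inner section ends */
    return st;                /* the outer section is still in progress */
}
```

After the inner unblock the counter is still 1, so the outer section remains protected, while the flag has already dropped to 0.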
    
My updated macros:

    #define HANDLE_BLOCK_INTERRUPTIONS() \
        if (!EG(blocking_interruptions)++) { \
            if (zend_block_interruptions) { zend_block_interruptions(); } \
        }
    #define HANDLE_UNBLOCK_INTERRUPTIONS() \
        if (EG(blocking_interruptions) && !--EG(blocking_interruptions)) { \
            if (zend_unblock_interruptions) { zend_unblock_interruptions(); } \
            if (EG(deferred_bailout)) { zend_bailout(); } \
        }

And my mod to _zend_bailout:

    if (EG(blocking_interruptions))
    {
            EG(deferred_bailout) = 1;
            return;
    }
    EG(deferred_bailout) = 0;
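
For readers who want to see the deferral idea run outside of PHP, the following is a rough stand-alone C simulation of the macros and the _zend_bailout change above. It is a sketch under assumptions, not the patch itself: plain globals replace the executor globals, `segment_locked` is a hypothetical stand-in for the APC lock, and the timeout is simulated by calling `bailout()` directly.

```c
#include <setjmp.h>

static jmp_buf bailout_env;              /* stand-in for the bailout target */
static unsigned blocking_interruptions;  /* nesting depth */
static int deferred_bailout;             /* set when a bailout was postponed */
static int segment_locked;               /* hypothetical stand-in for the APC lock */

static void bailout(void)
{
    if (blocking_interruptions) {
        deferred_bailout = 1;            /* remember it and jump later */
        return;
    }
    deferred_bailout = 0;
    longjmp(bailout_env, 1);
}

static void block_interruptions(void)
{
    blocking_interruptions++;
}

static void unblock_interruptions(void)
{
    if (blocking_interruptions && --blocking_interruptions == 0
            && deferred_bailout) {
        bailout();                       /* outermost unblock: jump now */
    }
}

static void cache_store(void)
{
    block_interruptions();
    segment_locked = 1;
    bailout();                           /* simulated timeout in the critical section */
    segment_locked = 0;                  /* reached, because the bailout was deferred */
    unblock_interruptions();             /* the deferred bailout fires here */
}

/* Returns 1 if a bailout happened; reports whether the lock is still held. */
int run_store_with_timeout(int *lock_still_held)
{
    blocking_interruptions = 0;
    deferred_bailout = 0;
    segment_locked = 0;
    if (setjmp(bailout_env) == 0) {
        cache_store();
        *lock_still_held = segment_locked;
        return 0;
    }
    *lock_still_held = segment_locked;
    return 1;
}
```

In this simulation the bailout still happens, but only after the simulated lock has been released.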
 [2008-09-08 21:16 UTC] jani@php.net
Can you reproduce this with the latest CVS checkout of PHP_5_2 (and preferably PHP_5_3)?
 [2008-09-08 23:56 UTC] askalski at gmail dot com
Reproduced with latest checkouts from both the PHP_5_2 and PHP_5_3 tags.

X-Powered-By: PHP/5.2.7-dev
X-Powered-By: PHP/5.3.0alpha3-dev
 [2008-09-09 00:46 UTC] scottmac@php.net
This is essentially what http://wiki.php.net/rfc/zendsignals is for; it was considered for PHP 5.3 but has been deferred for the moment.
 [2009-02-15 00:01 UTC] dan at archlinux dot org
Any progress here? This is definitely reproducible, and we have seen it on archlinux.org about once every 3 or 4 days; it kind of stinks. Is there any current workaround besides increasing the timeout value and hoping for the best?

Running:
Apache 2.2.11
PHP 5.2.8
APC 3.0.19
 [2009-02-27 20:09 UTC] pierre at archlinux dot de
This problem is still reproducible with version 5.2.9. The CGI version is affected, too (as expected).
 [2009-10-21 18:01 UTC] shire@php.net
Lucas Nealan and I are working on fixing some items in the signals patch, and would like to push forward on getting it integrated into core. Would you be willing to try the patch for PHP to see if it corrects your problem? It would be of great use to know that it fixes your specific issue. If so, let me know which specific version of PHP you want a patch against and I'll make sure we get you the latest one that cleanly patches against it.
 [2009-10-21 18:42 UTC] askalski at gmail dot com
We are using a modified 5.2 in production, so a patch against 5.2.11 (the latest release in that series) would be good.

Thanks,

Andy
 [2010-05-31 06:54 UTC] askalski at gmail dot com
I uploaded patches against the latest 5.1, 5.2, and 5.3 versions of PHP, for sites with production issues that can't afford to wait years for an upstream fix.
 [2010-06-24 18:52 UTC] askalski at gmail dot com
A note about the above patches: They work with the stable 3.0.19 release of APC, but not the beta 3.1.3p1. In the beta version, compilation was moved inside a HANDLE_BLOCK_INTERRUPTIONS/HANDLE_UNBLOCK_INTERRUPTIONS block, so the zend_bailout deferral is no longer safe. For example, a syntax error in the script will result in a partially compiled opcode array being cached in APC. I don't yet have an alternate solution.
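
The caveat above can be sketched in the same simulated style. This is an illustration under assumptions, not APC or Zend code: `compile_script()` is a hypothetical compiler that fails partway through, and `cached_ops` stands in for what ends up in the cache. The error path was written on the assumption that the bailout never returns, so once the bailout is deferred, execution falls through and the partial result is stored.

```c
#include <setjmp.h>

#define TOTAL_OPS       4
#define SYNTAX_ERROR_AT 2   /* hypothetical: compilation fails at this op */

static jmp_buf bailout_env;
static unsigned blocking_interruptions;
static int deferred_bailout;
static int cached_ops;      /* ops stored in the simulated cache, -1 = none */

static void bailout(void)
{
    if (blocking_interruptions) {
        deferred_bailout = 1;
        return;
    }
    deferred_bailout = 0;
    longjmp(bailout_env, 1);
}

static void block_interruptions(void)
{
    blocking_interruptions++;
}

static void unblock_interruptions(void)
{
    if (blocking_interruptions && --blocking_interruptions == 0
            && deferred_bailout) {
        bailout();
    }
}

static int compile_script(void)
{
    int emitted = 0;
    for (int i = 0; i < TOTAL_OPS; i++) {
        if (i == SYNTAX_ERROR_AT) {
            bailout();        /* compile error: written assuming no return */
            return emitted;   /* reached only because the bailout was deferred */
        }
        emitted++;
    }
    return emitted;
}

static void compile_and_cache(void)
{
    block_interruptions();
    cached_ops = compile_script();  /* stores the partial result */
    unblock_interruptions();        /* the deferred bailout fires too late */
}

/* Returns how many ops were cached by a compile that hit a syntax error. */
int cached_ops_after_failed_compile(void)
{
    blocking_interruptions = 0;
    deferred_bailout = 0;
    cached_ops = -1;
    if (setjmp(bailout_env) == 0) {
        compile_and_cache();
    }
    return cached_ops;
}
```

Here the simulated cache ends up holding 2 of the 4 ops even though the compile failed, which mirrors the partially compiled opcode array described above.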
 [2010-11-07 21:08 UTC] felipe@php.net
-Status: Suspended
+Status: Feedback
-Assigned To:
+Assigned To: gopalv
 [2010-11-07 21:08 UTC] felipe@php.net
Gopal, has this issue already been fixed?
 [2011-08-17 21:57 UTC] pierre at archlinux dot de
This issue is still reproducible using php-fpm and PHP 5.3.6 with APC 3.1.9.
 [2012-05-18 14:55 UTC] zhangjiayin99 at gmail dot com
This issue is still reproducible using php-fpm and PHP 5.3.10 with APC 3.1.9.
 [2012-07-03 18:09 UTC] askalski at gmail dot com
Recent versions of APC will also require the patch from bug #59281 in conjunction with the PHP-side patch on this ticket.
 [2013-02-18 00:33 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.
 [2013-03-14 02:13 UTC] hardos at gmail dot com
Any updates on this bug? We faced a similar issue in production with PHP 5.3.10 and different versions of APC (3.0.19 and 3.1.9). We could take in and try out the patches provided, but a review of the patches or feedback on the bug would be great. Is there any reason they haven't been taken in?
 [2013-03-14 03:32 UTC] askalski at gmail dot com
I haven't tested against PHP 5.4 yet; I wonder if some of the lack of attention can be attributed to the planned bundling of Zend Optimizer+ with 5.5.  The patch works (at least on the versions I've tested), but one drawback of the patch as written is that it breaks API compatibility.  Modules need to be rebuilt against the updated Zend/zend.h header file to honor the new HANDLE_BLOCK_INTERRUPTIONS behavior.