|
php.net | support | documentation | report a bug | advanced search | search howto | statistics | random bug | login |
[2007-04-08 16:25 UTC] wnxwhitewolf at warriornation dot net
Description:
------------
I'm not sure if this bug is a duplicate of a previous APC bug, but I'll explain my findings.
Several conditions must be met:
1) Script TTL (and User TTL if cache is used for user entries) > 0 (or non-fcntl locks used)
2) Any locking mechanism (tested with fnctl and sem locks and spinlocks on two 2.6 kernel machines on both php 5.1.6 and 5.2.0)
3) The cache must be filled
Then, once the above conditions are met APC locks the cache and attempts to clear it out but fails. Please note however, that when using any other type of locking than fcntl a TTL of 0 does /not/ fix the issue, although if the default fcntl locks are used a TTL of 0 will clear the cache once it's full as expected.
Errors like the following then appear in the error_log of apache once the apache thread that APC is using to clear out the cache times out (or fails):
[Sat Apr 7 23:00:31 2007] [apc-error] Stuck spinlock (0xb41c0000) detected (spinlocks)
[Thu Apr 5 00:31:55 2007] [apc-error] apc_fcntl_lock failed: Bad file descriptor (fcntl)
Software versions (also on another server with php5.2 and apache 2.0.49, both are apache 2):
PHP 5.1.6 (cli) (built: Mar 7 2007 12:48:25)
Copyright (c) 1997-2006 The PHP Group
Zend Engine v2.1.0, Copyright (c) 1998-2006 Zend Technologies
apache2 -V
Server version: Apache/2.0.55
Server built: Sep 27 2006 16:52:14
Server's Module Magic Number: 20020903:11
Architecture: 32-bit
I'll watch the ticket and provide more information if needed. (If you want me to gdb, give me specific instructions on exactly how you want it)
Reproduce code:
---------------
<?php
$testfiledata = file_get_contents('testfile'); //some file less than the max value for apc storage
for($i=0;$i<96;$i++) { //some value that will fill the cache, but I think at least one file needs to be cached for this bug to reproduce
apc_store('test'.uniqid(), $testfiledata);
}
?>
Expected result:
----------------
I expect APC to loop through the cache and empty out the cache of all expired scripts, and if that fails and the data can't be added, apc_store() should fail.
Actual result:
--------------
APC enters into an infinite loop (eventually dies, apache2 timeout?) and creates the issue defined in the description.
PatchesPull RequestsHistoryAllCommentsChangesGit/SVN commits
|
|||||||||||||||||||||||||||
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Sat Oct 25 11:00:01 2025 UTC |
I assume we saw this too a few minutes ago, first time its happened in a while. We were using 5.2.2-dev and APC from CVS. I grabbed a backtrace and it looked as follows: #0 0x008b07a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2 No symbol table info available. #1 0x00989a41 in ___newselect_nocancel () from /lib/tls/libc.so.6 No symbol table info available. #2 0x0028e80d in apc_spin_pg_usleep (microsec=401000) at /www/src/APC-cvs/pgsql_s_lock.c:114 delay = {tv_sec = 0, tv_usec = 21000} #3 0x0028e888 in apc_spin_s_lock (lock=0xafcc402c "\001\001", file=0x291733 "/www/src/APC-cvs/apc_spin.c", line=38) at /www/src/APC-cvs/pgsql_s_lock.c:213 spins = Variable "spins" is not available. I then checked the apache error logs and it was full of: [Sun Apr 15 21:04:03 2007] [apc-error] Stuck spinlock (0xafcc402c) detectedI have the exact same issue with PHP 5.2.4 and APC 3.0.14 While investigating APC source, i found issue in apc_cache_expunge which is still left in current CVS version. If cache ttl is set, expunge will walk through the cache entries to identify the ones to be cleared (i.e thoses having ttl reached) but it will loop for ever. After 30s (PHP's max_execution_time), request causing the expunge is cancelled but lock is left and no further PHP requests can be served. Code extract with line numbers: 413 for (i = 0; i < cache->num_slots; i++) { 414 p = &cache->slots[i]; 415 while(*p) { 416 /* 417 * For the user cache we look at the individual entry ttl values 418 * and if not set fall back to the default ttl for the user cache 419 */ 420 if((*p)->value->type == APC_CACHE_ENTRY_USER) { 421 if((*p)->value->data.user.ttl) { 422 if((*p)->creation_time + (*p)->value->data.user.ttl < t) { 423 remove_slot(cache, p); 424 continue; 425 } 426 } else if(cache->ttl) { 427 if((*p)->creation_time + cache->ttl < t) { 428 remove_slot(cache, p); 429 continue; 430 } 431 } 432 } else if((*p)->access_time < (t - cache->ttl)) { 433 remove_slot(cache, p); 434 continue; 435 } 436 p = &(*p)->next; 437 } 438 } Pb is that expunge will has a continue statement after each remove_slot: => this will force the code to loop again within the while statement => since the while statement condition hasn't changed (p isn't altered), it will loop forever Workaround: set ttls to 0 meanwhile it is fixed.remove_slot() does change *p though: static void remove_slot(apc_cache_t* cache, slot_t** slot) { slot_t* dead = *slot; *slot = (*slot)->next;Shouldn't it be instead the followings ? static void remove_slot(apc_cache_t* cache, slot_t** slot) { slot_t* dead = *slot; *slot = &(*slot)->next; instead of *slot = (*slot)->next; alike it is done in expunge function: p = &(*p)->next;