Bug #56494 APC breaks PHP output after extended periods
Submitted: 2005-08-16 15:41 UTC
Modified: 2006-02-16 18:34 UTC
From: david at edeca dot net
Assigned:
Status: Closed
Package: APC (PECL)
PHP Version: Irrelevant
OS: Gentoo
Private report: No
CVE-ID: None
 [2005-08-16 15:41 UTC] david at edeca dot net
Description:
------------
I have a server using PHP 4.4.0 (not available in the 'PHP Version' dropdown!) and Apache 2.  APC 3.0.6 seems to work fine for a while and the apc.php status page shows this.  For a number of hours the cache will slowly fill up and be fairly unfragmented.  However, very suddenly the cache becomes very fragmented and soon PHP will stop serving files.  Apache will serve plain HTML or even CGI/Perl but PHP output does not appear.

This issue seems to occur quicker with apc.optimization set to 1, but still occurs with it set to 0.  The configuration we are using is given below.

; APC cache

extension=apc.so
apc.enabled = 1
apc.shm_segments = 4
apc.shm_size = 32
apc.optimization = 0
apc.num_files_hint = 1000
apc.gc_ttl = 300
apc.ttl = 3600
#apc.filters=""
#apc.mmap_file_mask=""
apc.slam_defense="0"
apc.file_update_protection="2"

An strace of the crashed process seems to have the following last line:

fcntl64(109, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}

If any more debugging output is required, please contact me.
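
The strace line above shows the process parked in a blocking F_SETLKW request for a whole-file write lock (start=0, len=0). The following is a minimal Python sketch, not APC's own locking code, of why such a request stalls while another process holds the lock. The temporary lock file and the helper name child_can_lock are hypothetical, and the child uses the non-blocking variant so the demo returns instead of hanging.

```python
import fcntl
import os
import tempfile


def child_can_lock(path):
    """Fork a child that tries a non-blocking exclusive lock on path.

    Returns True if the child obtained the lock, False if another
    process already held it (the case where F_SETLKW would block).
    """
    pid = os.fork()
    if pid == 0:  # child process
        code = 1
        try:
            with open(path, "r+b") as fh:
                # Non-blocking request: fail at once instead of waiting.
                fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            code = 0
        except OSError:
            code = 1
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


# Hypothetical lock file standing in for the file behind descriptor 109.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    lock_path = tmp.name

with open(lock_path, "r+b") as holder:
    fcntl.lockf(holder, fcntl.LOCK_EX)          # parent holds the write lock
    contended = not child_can_lock(lock_path)   # child is refused
free_again = child_can_lock(lock_path)          # lock was released on close
os.unlink(lock_path)
```

If the holder never releases the lock (for example because it crashed or is itself stuck), every later blocking request waits indefinitely, which would look like PHP silently producing no output.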


 [2005-08-16 15:49 UTC] rasmus@php.net
Which Apache2 mpm?

And yes, don't use the optimization for now.  Also, could you set your shm_segments to 1 and try again?  If it still happens, try compiling APC with --enable-apc-sem and test that.
 [2005-08-16 16:16 UTC] david at edeca dot net
Apache is using the prefork MPM and has threading disabled.

I've set APC to use one segment of 32M, but unfortunately this is not big enough for all the code we are caching.

I will monitor it to see if the problem still occurs and report back.

Thanks.
 [2005-08-18 18:33 UTC] david at edeca dot net
I've not had much time to dedicate to this but it still happens even with one segment.

Might the "Fix apc_fetch() memory corruption" change in 3.0.7 be related?  I will upgrade as soon as possible, sooner if this fix might help!
 [2005-08-19 01:15 UTC] rasmus@php.net
Only if you are actually using apc_fetch() somewhere.

A workaround would be to crank your cache size way up.  Try compiling using --enable-apc-mmap and set your cache size to 128M
 [2005-08-20 12:36 UTC] syntrex at despammed dot com
I have installed apache2, php 4.4.0, pecl-apc 3.0.8 
and the apache stops serving php pages after 2 hours or less
I tried to emerge apc with USE="mmap" but I'm not able to have apc with mmap support.
If I set apc.shm_size="128" apache is not able to show a php page
 [2005-08-20 12:38 UTC] syntrex at despammed dot com
sorry, it is pecl apc 3.0.6
 [2005-08-20 12:41 UTC] rasmus@php.net
Forget this emerge thing.  Just build it.
cvs -d:pserver:cvsread@cvs.php.net:/repository login
Password: phpfi
cvs -d:pserver:cvsread@cvs.php.net:/repository co pecl/apc
cd pecl/apc
phpize
./configure --enable-apc-mmap --with-apxs
make
make install
 [2005-08-20 12:41 UTC] david at edeca dot net
The mmap USE flag (which is Gentoo package specific anyway) doesn't make any difference to the ebuild.  I plan on making a bug for this on http://bugs.gentoo.org, with a change for that.

Apache doesn't serve pages if you set apc.shm_size="128" because, at least on my machine, 32M seems to be the limit.  Apache will therefore have started, but the PHP extension won't be working properly.

If you want to fix this quickly, it seems that recompiling APC with --enable-apc-mmap and using a large (e.g. 128M) cache will stop the 2 hour (ish) problems occurring.  I will report back later whether --enable-apc-mmap has fixed my problem, as it seems to have done.
 [2005-08-20 12:59 UTC] syntrex at despammed dot com
Thank you David! I installed APC from CVS with mmap support and set apc.shm_size="128".
The PHP pages are now being delivered.
 [2005-08-20 13:08 UTC] david at edeca dot net
For what it's worth, I've added a modification to the Gentoo ebuild so that it can be built with mmap.  Let's see what the Gentoo developers think of it, hopefully then it'll build more easily.

See the link below if you're interested.

http://bugs.gentoo.org/show_bug.cgi?id=103159
 [2005-08-20 13:24 UTC] david at edeca dot net
I appear to have cursed it by saying it was working fine.  Below is the output from my logs, after I had recompiled with --enable-apc-mmap.

The rough sequence of events is..

Startup:
[Sat Aug 20 10:08:10 2005] [notice] Apache/2.0.54 (Gentoo/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7e PHP/4.4.0 configured -- resuming normal operations

Then one of these:
[Sat Aug 20 10:29:32 2005] [notice] child pid 1116 exit signal Segmentation fault (11)

Now lots of these, all at 11:40:11:
[Sat Aug 20 11:40:11 2005] [apc-warning] GC cache entry '/usr/lib/php/DB/mysql.php' (dev=65027 ino=0) was on gc-list for 918969 seconds

One of these:
[Sat Aug 20 13:02:26 2005] [notice] child pid 14836 exit signal Floating point exception (8)

And lots of these, for about 4 hours:
[Sat Aug 20 13:03:00 2005] [notice] child pid 2726 exit signal Segmentation fault (11)

At about 18:17 PHP stopped any output completely and I had to restart apache in order to get it to work.  Any info given the above?  I'm still using 3.0.6 with --enable-apc-mmap and a 128M cache (which is full and seems to handle about 40 hits / second at the moment).

I'm honestly not sure how it is full, as 5 minutes after a restart I'm using 21M out of 128M and have 264 cached files.  The total number of cached files doesn't seem to go much over 300 even after a few hours, but the cache usage does creep up to 128M slowly.  I don't know enough about APC to guess properly, but is it leaking (or not releasing) memory properly at some point?
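
To put the gc-list warning in perspective, here is a small Python sketch (the regular expression and function name are illustrative, not part of APC) that parses the warning quoted above and compares the reported time on the gc-list with the time since the Apache start logged at 10:08:10. The reported age is far larger than both the uptime and the configured apc.gc_ttl of 300 seconds; what that implies about the entry's recorded timestamp is a guess that the thread does not confirm.

```python
import re
from datetime import datetime, timedelta

GC_LINE = re.compile(
    r"\[(?P<when>[^\]]+)\] \[apc-warning\] GC cache entry '(?P<path>[^']+)' "
    r"\(dev=(?P<dev>\d+) ino=(?P<ino>\d+)\) "
    r"was on gc-list for (?P<secs>\d+) seconds"
)


def parse_gc_warning(line):
    """Return (logged_at, path, time_on_gc_list) or None if no match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    logged_at = datetime.strptime(m.group("when"), "%a %b %d %H:%M:%S %Y")
    return logged_at, m.group("path"), timedelta(seconds=int(m.group("secs")))


# Warning line copied from the log excerpt above.
line = ("[Sat Aug 20 11:40:11 2005] [apc-warning] GC cache entry "
        "'/usr/lib/php/DB/mysql.php' (dev=65027 ino=0) was on gc-list "
        "for 918969 seconds")
logged_at, path, on_list = parse_gc_warning(line)

started = datetime(2005, 8, 20, 10, 8, 10)  # Apache start from the same log
uptime = logged_at - started
print(path, "on gc-list for", on_list, "with uptime", uptime)
```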
 [2005-09-22 04:05 UTC] gopalv82 at yahoo dot com
Doesn't apc_compile.c version 3.26 fix this bug?
 [2005-10-15 14:07 UTC] don at smugmug dot com
I'm seeing this problem on 3.0.8 as well (with Apache2 prefork + PHP 4.3.11).  I've compiled with mmap support and allocated 128MB, but I still see the problem.

apc.php reports 0% fragmentation, so it doesn't appear to be a fragmentation problem.  Somewhere around 25-28MB is used, so it's not running out of cache or something.

It'll run for a few hours and then silently stop serving pages from my application.  It will, however, continue to serve things like apc.php, so PHP is still functioning, just not liking my app.  Apache serves HTML fine, too.

(To answer one of the other questions, 3.0.8 uses apc_compile.c 3.26 and the problem isn't fixed).

If I telnet directly to port 80 and issue commands, the connection is immediately closed:

[root@XXXXX root]# telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
HEAD /photos/26198422-Ti.jpg HTTP/1.1
Host: augustin.smugmug.com

Connection closed by foreign host.

Any ideas?
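
The telnet check above can be scripted so it can run from a monitor. This is a minimal Python sketch using only the standard socket module; the function name head_probe and its parameters are hypothetical. An empty return value corresponds to the "Connection closed by foreign host" case, where the server closes the connection without sending a status line.

```python
import socket


def head_probe(host, port, path, host_header, timeout=5.0):
    """Send a bare HEAD request and return the raw bytes received.

    An empty result means the peer closed the connection without
    sending any response, as in the telnet session above.
    """
    request = (
        f"HEAD {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(request)
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except (ConnectionResetError, BrokenPipeError):
            pass  # abrupt close: report whatever arrived before it
    return b"".join(chunks)
```

A healthy server would return bytes starting with an HTTP status line, so a caller could alert on `not head_probe(...)`.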
 [2005-10-15 14:08 UTC] don at smugmug dot com
I should have mentioned this isn't Gentoo, it's RHEL3.
 [2005-10-15 17:05 UTC] don at smugmug dot com
A little more detail.  When the site stops responding due to whatever this bug is, issuing "apc_clear_cache()" will fix it, no need to restart Apache.

I have yet to be able to tie it down to any specific script or function call, if that's even possible.
 [2005-10-15 17:50 UTC] george at omniti dot com
Your example file name suggests to me that you might be trying
to cache an enormous number of files (one for each image?).
Is that accurate?
 [2005-10-15 18:10 UTC] don at smugmug dot com
No, sorry, I should have thought about that before posting that example.  We do some URL processing to make pretty URLs, but that's actually hitting a PHP script.

It's less than 200 files that are being cached.  I'm happy to provide more details about them if there's something more you'd like to know.

I just finished testing with both --enable-apc-mmap and --enable-mmap-sem, the problem still exists.

I'm about to try 3.0.9-dev from the CVS snapshot I took last night, which has been doing fine on my test server.  At some point I'll probably test against PHP 4.4.0 as well.

FYI, I don't see these problems in test, only on my live site with tons of traffic.  

Just realized I didn't include my full apc config settings.  They are:

extension=apc.so
apc.enabled=1
apc.optimization=0
apc.shm_size=128
apc.shm_segments=1
apc.slam_defense=90
apc.file_update_protection=2
apc.mmap_file_mask="/tmp/apc.XXXXXX"
apc.max_file_size=10M
 [2005-10-15 18:11 UTC] don at smugmug dot com
Woops, that was --enable-apc-sem, not --enable-mmap-sem.  :)
 [2005-10-15 19:09 UTC] don at smugmug dot com
3.0.9-dev didn't solve the problem.  (--enable-apc-mmap and --enable-apc-sem).

I'll try 4.4.0 a little later once I'm sure I'm not breaking anything.
 [2005-10-16 16:21 UTC] don at smugmug dot com
I almost hesitate to post this, since I'm not sure how useful it is, but since more data is better than less, here it is:

When one of my Apache+PHP+APC boxes fails, quite a few other boxes fail at exactly the same time.  Not all, but usually 5 separate servers fail simultaneously.

The interesting thing is that they belong to different clusters.  In other words, no clients are hitting both clusters simultaneously, because they're completely different hostnames & URLs. Nothing is shared, not even the application script directory. Yet APC is freezing a handful of machines simultaneously on both clusters.  Again, not every machine in the cluster, but often more than one.

Seems strange, but is it possible there's some time issue here?
 [2005-10-16 18:33 UTC] don at smugmug dot com
Problem also exists on php-4.4.0 with apc-3.0.9-dev --enable-apc-mmap --enable-apc-sem
 [2005-10-16 18:57 UTC] don at smugmug dot com
The fact that multiple servers from different clusters see the same problem simultaneously has been bugging me all day.  There is one shared characteristic between them, and that's that they all mount their source trees via NFS to the same server.  (They're diskless web cluster boxes)

That server isn't showing any slowdowns or problems, but it's possible that it hiccups once in awhile.  NFS should (and does) resume just fine in such cases, but...  might the cache get screwed up if there's a mild hiccup?

These files haven't changed in days, so it's not an issue with caching new changes or something.  But maybe there's a hitch getting the modification time or something?

Am I barking up the wrong tree?
 [2005-10-17 11:04 UTC] rasmus@php.net
I don't suppose you have a way of testing this on a non-networked filesystem?  It does sound like it could be related to that.  Definitely if APC can't reliably stat a file, like if your NFS statd is buggered, there is going to be some pain.  On a straight stat failure, it should simply not cache.  See the apc_cache_make_file_key() function in apc_cache.c for the spot where the actual stat is done.

Running an instance where APC is compiled with __DEBUG_APC__ defined and having a look at the error_log around the time of the failure might also give me a clue.
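
The dependence on stat() can be illustrated with a short Python sketch. This is not the code of apc_cache_make_file_key(); it only assumes, based on the dev= and ino= fields in the warnings quoted in this report and on the modification-time question raised above, that a cached file is identified by device, inode and mtime. The names make_file_key and is_stale are hypothetical.

```python
import os


def make_file_key(path):
    """Return (device, inode, mtime) for path, or None if stat fails.

    Returning None models "do not cache this file" on a stat failure.
    """
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_dev, st.st_ino, int(st.st_mtime))


def is_stale(cached_key, path):
    """True if the file can no longer be matched to its cached key."""
    return make_file_key(path) != cached_key
```

Under this assumption, a filesystem that briefly reports a different inode or mtime for an unchanged file would make a valid cache entry look stale, which is the kind of NFS hiccup being discussed.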
 [2005-10-17 16:11 UTC] don at smugmug dot com
I don't easily have a way of setting up a disk-based member of this cluster, but I'll see if I can get it done sometime anyway, if we still think it'd help.

Meanwhile, I recompiled with __DEBUG_APC__ defined and get no error messages in the log file (apache's error_log).  All I get is lots of:

[Mon Oct 17 13:00:41 2005] [notice] child pid 17600 exit signal Segmentation fault (11)

until I flush the cache.  I looked over the code for stat() and it looks like it's using Apache's stat() of the file, anyway, and should fire an error if it can't stat, so that doesn't seem to be the issue.

Any other ideas?
 [2005-10-17 16:15 UTC] rasmus@php.net
These are segfaults you don't get without __DEBUG_APC__ defined?  Sounds like it may just be a problem in the debug code if that is the case.  A backtrace from one of these crashes would help.  I still suspect some sort of NFS weirdness.  Do you have any way to replay your logs to a standalone box with a local disk?  And have you checked your system log files for any error messages that might correspond time-wise?
 [2005-10-17 19:01 UTC] david at edeca dot net
Please note that I see this bug without any sort of NFS or network file mounting.  It reproduces itself after a few hours interval, seemingly when the cache has been full a while.

I haven't had time to upgrade and check the new version yet, but it definitely shouldn't be closed because of network file system suspicions.
 [2005-10-17 19:09 UTC] rasmus@php.net
Well, there was a recent cache-full fix.  So please try the latest version.
 [2005-10-17 19:35 UTC] don at smugmug dot com
Sorry, I should have been more clear.  A prior poster on this bug mentioned the seg faults, so I didn't bother to mention it.

I often get the seg faults whenever a particular host starts to go downhill.  Zero seg faults for awhile (variable, sometimes many hours, sometimes just a few minutes) and then a ton of them until I clear the cache.  

There are no unusual error messages in any of the system logs, or anything unusual shown on top (mem, cpu, etc) that would explain this either.

I'm rebuilding Apache so I have access to the debug info.  I'll send along backtraces when I have them.
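
To quantify the "zero seg faults, then a ton of them" pattern, the child-exit notices can be bucketed by hour. A minimal Python sketch follows; the regular expression is written against the notice lines quoted in this report, and the function name is hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

EXIT_LINE = re.compile(
    r"\[(?P<when>[^\]]+)\] \[notice\] child pid (?P<pid>\d+) exit signal "
    r"(?P<name>.+?) \((?P<signo>\d+)\)"
)


def tally_exit_signals(lines):
    """Count child exits per (hour bucket, signal name)."""
    counts = Counter()
    for line in lines:
        m = EXIT_LINE.search(line)
        if m is None:
            continue
        when = datetime.strptime(m.group("when"), "%a %b %d %H:%M:%S %Y")
        counts[(when.strftime("%Y-%m-%d %H:00"), m.group("name"))] += 1
    return counts


# Notice lines quoted in earlier comments of this report.
sample = [
    "[Sat Aug 20 10:29:32 2005] [notice] child pid 1116 exit signal Segmentation fault (11)",
    "[Sat Aug 20 13:02:26 2005] [notice] child pid 14836 exit signal Floating point exception (8)",
    "[Sat Aug 20 13:03:00 2005] [notice] child pid 2726 exit signal Segmentation fault (11)",
    "[Mon Oct 17 13:00:41 2005] [notice] child pid 17600 exit signal Segmentation fault (11)",
]
for bucket, count in sorted(tally_exit_signals(sample).items()):
    print(bucket, count)
```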
 [2005-10-17 20:07 UTC] don at smugmug dot com
I've managed to get two different backtraces.  One appears to occur prior to the other, but that could be a coincidence.  Interestingly, they don't include APC in the backtrace.  Here they are in order:

#0  0x0111fd23 in shutdown_memory_manager (silent=0, clean_cache=0) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:490
490                                     REMOVE_POINTER_FROM_LIST(ptr);
(gdb) bt
#0  0x0111fd23 in shutdown_memory_manager (silent=0, clean_cache=0) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:490
#1  0x01107be9 in php_request_shutdown (dummy=0x0) at /home/onethumb/phpbuilds/php-4.4.0/main/main.c:1008
#2  0x0113bcdc in php_handler (r=0x89385d0) at /home/onethumb/phpbuilds/php-4.4.0/sapi/apache2handler/sapi_apache2.c:572
#3  0x08068625 in ap_run_handler (r=0x89385d0) at /usr/src/redhat/BUILD/httpd-2.0.46/server/config.c:200
#4  0x08068c3f in ap_invoke_handler (r=0x89385d0) at /usr/src/redhat/BUILD/httpd-2.0.46/server/config.c:406
#5  0x08065266 in ap_process_request (r=0x89385d0) at /usr/src/redhat/BUILD/httpd-2.0.46/modules/http/http_request.c:288
#6  0x080608dc in ap_process_http_connection (c=0x8932460) at /usr/src/redhat/BUILD/httpd-2.0.46/modules/http/http_core.c:293
#7  0x08072315 in ap_run_process_connection (c=0x8932460) at /usr/src/redhat/BUILD/httpd-2.0.46/server/connection.c:85
#8  0x08066b01 in child_main (child_num_arg=105) at /usr/src/redhat/BUILD/httpd-2.0.46/server/mpm/prefork/prefork.c:705
#9  0x08066c54 in make_child (s=0x10, slot=82) at /usr/src/redhat/BUILD/httpd-2.0.46/server/mpm/prefork/prefork.c:799
#10 0x08066d76 in startup_children (number_to_start=118) at /usr/src/redhat/BUILD/httpd-2.0.46/server/mpm/prefork/prefork.c:817
#11 0x080675cd in ap_mpm_run (_pconf=0x88270a8, plog=0x8853158, s=0x8828e78)
    at /usr/src/redhat/BUILD/httpd-2.0.46/server/mpm/prefork/prefork.c:1036
#12 0x0806dbcf in main (argc=1, argv=0xbfffa3c4) at /usr/src/redhat/BUILD/httpd-2.0.46/server/main.c:661

and

#0  0x0111f9c2 in _efree (ptr=0xafd73938) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:259
259             REMOVE_POINTER_FROM_LIST(p);
(gdb) bt
#0  0x0111f9c2 in _efree (ptr=0xafd73938) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:259
#1  0x01124ef5 in _zval_ptr_dtor (zval_ptr=0x894e920) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute_API.c:289
#2  0x0112f000 in zend_hash_destroy (ht=0x11cd9cc) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_hash.c:556
#3  0x01124d4e in shutdown_executor () at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute_API.c:184
#4  0x0112b3f3 in zend_deactivate () at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend.c:693
#5  0x01107a95 in php_request_shutdown (dummy=0x0) at /home/onethumb/phpbuilds/php-4.4.0/main/main.c:997
#6  0x0113bcdc in php_handler (r=0x893c5e0) at /home/onethumb/phpbuilds/php-4.4.0/sapi/apache2handler/sapi_apache2.c:572
#7  0x08068625 in ap_run_handler (r=0x893c5e0) at /usr/src/debug/httpd-2.0.46/server/config.c:200
#8  0x08068c3f in ap_invoke_handler (r=0x893c5e0) at /usr/src/debug/httpd-2.0.46/server/config.c:406
#9  0x08065266 in ap_process_request (r=0x893c5e0) at /usr/src/debug/httpd-2.0.46/modules/http/http_request.c:288
#10 0x080608dc in ap_process_http_connection (c=0x8932460) at /usr/src/debug/httpd-2.0.46/modules/http/http_core.c:293
#11 0x08072315 in ap_run_process_connection (c=0x8932460) at /usr/src/debug/httpd-2.0.46/server/connection.c:85
#12 0x08066b01 in child_main (child_num_arg=0) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:705
#13 0x08066c54 in make_child (s=0x18, slot=173) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:799
#14 0x08066d76 in startup_children (number_to_start=27) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:817
#15 0x080675cd in ap_mpm_run (_pconf=0x88270a8, plog=0x8853158, s=0x8828e78)
    at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:1036
#16 0x0806dbcf in main (argc=1, argv=0xbfffa3c4) at /usr/src/debug/httpd-2.0.46/server/main.c:661


Also, we do not see segfaults every time we have a problem. Sometimes the server just stops responding with no segfaults.  It's possible that if I gave it time, the segfaults would occur, but I don't have the luxury of letting it sit in a failed state for long periods of time to see what'll happen.
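
When many core dumps pile up, it helps to reduce each backtrace to its innermost frames so that identical crash sites can be grouped. This Python sketch parses the gdb frame format shown above; the regular expression and function name are illustrative and only cover frames printed on a single line.

```python
import os
import re

FRAME = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) \(.*?\)"
    r"(?: at (?P<file>\S+):(?P<line>\d+))?"
)


def parse_frames(trace_text):
    """Return a list of (function, source file, line) per gdb frame."""
    frames = []
    for raw in trace_text.splitlines():
        m = FRAME.match(raw.strip())
        if m is None:
            continue
        src = os.path.basename(m.group("file")) if m.group("file") else None
        line = int(m.group("line")) if m.group("line") else None
        frames.append((m.group("func"), src, line))
    return frames


# Innermost three frames of the two backtraces above, copied verbatim.
FIRST = """\
#0  0x0111fd23 in shutdown_memory_manager (silent=0, clean_cache=0) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:490
#1  0x01107be9 in php_request_shutdown (dummy=0x0) at /home/onethumb/phpbuilds/php-4.4.0/main/main.c:1008
#2  0x0113bcdc in php_handler (r=0x89385d0) at /home/onethumb/phpbuilds/php-4.4.0/sapi/apache2handler/sapi_apache2.c:572
"""
SECOND = """\
#0  0x0111f9c2 in _efree (ptr=0xafd73938) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_alloc.c:259
#1  0x01124ef5 in _zval_ptr_dtor (zval_ptr=0x894e920) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute_API.c:289
#2  0x0112f000 in zend_hash_destroy (ht=0x11cd9cc) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_hash.c:556
"""
for trace in (FIRST, SECOND):
    print(parse_frames(trace)[0])
```

Both traces end in zend_alloc.c at a REMOVE_POINTER_FROM_LIST line, so grouping by the innermost frame's source file would put them in the same bucket.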
 [2005-10-17 20:24 UTC] don at smugmug dot com
Got a 3rd type of seg fault:

#0  zend_assign_to_variable (result=0xaf6330b0, op1=0x8e6ab84, op2=0xaf6330d0, value=0x0, type=-2147483647, Ts=0xbffeb2a0)
    at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:452
452                                     value->refcount++;
(gdb) bt
#0  zend_assign_to_variable (result=0xaf6330b0, op1=0x8e6ab84, op2=0xaf6330d0, value=0x0, type=-2147483647, Ts=0xbffeb2a0)
    at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:452
#1  0x01102ce2 in execute (op_array=0x8dc4524) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:1390
#2  0x011051d9 in execute (op_array=0x8df1b5c) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:2258
#3  0x01103ee9 in execute (op_array=0x8e56124) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:1716
#4  0x01103ee9 in execute (op_array=0x8de038c) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend_execute.c:1716
#5  0x010f79ff in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /home/onethumb/phpbuilds/php-4.4.0/Zend/zend.c:938
#6  0x010d4ddf in php_execute_script (primary_file=0xbfff97c0) at /home/onethumb/phpbuilds/php-4.4.0/main/main.c:1751
#7  0x01107dac in php_handler (r=0x8d9cf10) at /home/onethumb/phpbuilds/php-4.4.0/sapi/apache2handler/sapi_apache2.c:555
#8  0x08068625 in ap_run_handler (r=0x8d9cf10) at /usr/src/debug/httpd-2.0.46/server/config.c:200
#9  0x08068c3f in ap_invoke_handler (r=0x8d9cf10) at /usr/src/debug/httpd-2.0.46/server/config.c:406
#10 0x080657f5 in ap_internal_redirect (new_uri=0x0, r=0x0) at /usr/src/debug/httpd-2.0.46/modules/http/http_request.c:498
#11 0x0806524c in ap_process_request (r=0x8deb7f0) at /usr/src/debug/httpd-2.0.46/modules/http/http_request.c:301
#12 0x080608dc in ap_process_http_connection (c=0x8d90440) at /usr/src/debug/httpd-2.0.46/modules/http/http_core.c:293
#13 0x08072315 in ap_run_process_connection (c=0x8d90440) at /usr/src/debug/httpd-2.0.46/server/connection.c:85
#14 0x08066b01 in child_main (child_num_arg=0) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:705
#15 0x08066c54 in make_child (s=0x0, slot=14) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:799
#16 0x08066d76 in startup_children (number_to_start=186) at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:817
#17 0x080675cd in ap_mpm_run (_pconf=0x8c850a8, plog=0x8cb1158, s=0x8c86e78)
    at /usr/src/debug/httpd-2.0.46/server/mpm/prefork/prefork.c:1036
#18 0x0806dbcf in main (argc=1, argv=0xbfff9b94) at /usr/src/debug/httpd-2.0.46/server/main.c:661

Still getting plenty of the other two, but just saw a handful of these.

I've never really used gdb, so if I'm doing something wrong or could be doing something better, please let me know.  

Also, do I have to compile APC with something special to get symbols from it, and is that why we're not seeing it in the trace?  Or are we crashing somewhere else and this isn't an APC bug after all?
 [2005-10-18 14:33 UTC] don at smugmug dot com
Well, well.  Doesn't look like my particular situation is an APC bug after all.

I switched to eaccelerator after seeing no mention of APC in those backtraces, and I'm still getting crashes, and the backtraces look the same.  The crashes feel less frequent, but that could be just due to different traffic levels today or something.

Now I really have no idea what to do.  Creating a replay box is nearly impossible given the # of connections (millions) and the size of my database that I'd have to duplicate.

If anyone has any ideas, I'm all ears.
 [2005-10-19 12:40 UTC] don at smugmug dot com
Solved!

The solution doesn't make much sense, to me, but nonetheless, I'm running with PHP-4.4.0 + APC-3.0.8 (--enable-apc-mmap) and it's stable.  :)

The problem turned out to be a single require() and it was a file that was only 37 bytes - one line of PHP code:

<?
$thisHost = "www.smugmug.com";
?>

The file hadn't changed since May 19, 2004.  

In my app, there are a lot of require() and require_once(), especially for all my classes, so I have no idea why this one in particular would cause problems but the others wouldn't.

It was called on every page hit, sometimes more than once, and the site would run for quite awhile before failing.

As soon as I remove that one require() line, the problem vanishes.  I've been stable for 14 hours or something.  It fixes the problem with both eaccelerator and APC.

To add strangeness to strangeness, APC would return to usefulness once I cleared the cache, so somehow I guess APC was caching a corrupt version of the file or something?  I'm still not sure what was going on.

FWIW, Apache was also core dumping a ton without even getting a chance to let me know via error_log.  Hundreds of core dumps would appear without any mention in the log, filling up my disk.

If you want to track down what was going on, I'm happy to help in some way.  David, I hope this somehow helps you, but it's possible we saw similar symptoms for two different problems.
 [2005-10-19 12:44 UTC] rasmus@php.net
Don, that makes very little sense to me.  Do you still have that file?  Were there any strange characters in it or something?  If you still have it, do an: od -c filename
 [2005-10-19 20:13 UTC] don at smugmug dot com
Makes very little sense to me, either, but I was fairly methodical tracking it down, and I still haven't had a crash (24 hours).  Here's the output:

[root@XXX memcache]# od -c /var/www/hostinfo 
0000000   <   ?  \n   $   t   h   i   s   H   o   s   t       =       "
0000020   w   w   w   .   s   m   u   g   m   u   g   .   c   o   m   "
0000040   ;  \n   ?   >  \n
0000045
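
The dump can be cross-checked mechanically. This Python sketch rebuilds the file contents from the od -c output above and confirms that the row offsets match, that the length is 37 bytes (octal 45), and that nothing other than printable ASCII and newlines is present. The helper name od_c_offsets is hypothetical.

```python
# File contents reconstructed from the od -c dump above.
content = b'<?\n$thisHost = "www.smugmug.com";\n?>\n'


def od_c_offsets(data, width=16):
    """Octal offsets od -c prints: one per row, then the total length."""
    offsets = [format(i, "07o") for i in range(0, len(data), width)]
    offsets.append(format(len(data), "07o"))
    return offsets


# Any byte that is neither a newline nor printable ASCII would be suspect.
unexpected = [b for b in content if b != 0x0A and not 0x20 <= b <= 0x7E]

print(od_c_offsets(content), len(content), unexpected)
```

So the file itself contains no hidden or strange characters, which is consistent with the dump.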

FYI, after I'd exhausted everything I knew how to do, I finally went in and put 'trigger_error("some useful error message");' before every big "chunk" of one of my scripts, and then waited for it to crash.  I gradually narrowed it down to which functions.  Eventually, I had a trigger_error() on every other line inside of a single function, and it errored then segfaulted right before 'require("/var/www/hostinfo");' every time.  I removed that line and came up with $thisHost using the hostname of the server in question instead, and now I've been up for 24 hours.

Again, it makes no sense to me either, but that's what's going on.  And it would take quite some time (minutes at most, but usually at least an hour) before it would crash.  I'm happy to give you anything else you might want to know, though.
 [2005-10-20 05:24 UTC] rasmus@php.net
Ok, back to David's problem.  I think Don's problem was unrelated and too bloody weird for me to grasp right now.  David, did you try version 3.0.8?  Is it still happening there?
 [2005-10-24 18:52 UTC] david at edeca dot net
I've upgraded to APC 3.0.8 and enabled it again.  I will 
watch the logs for a day or so and see if it is still 
unstable or not. 
 
Thanks!
 [2005-10-25 13:21 UTC] david at edeca dot net
Unfortunately, it's still running into problems.  This time it managed 14 hours running before loads of these:

[Tue Oct 25 17:17:14 2005] [apc-warning] GC cache entry '/var/www/my.pengus.net/htdocs/mail/giapeto/lib/api.php' (dev=65026 ino=0) was on gc-list for 590429 seconds

And then some of these:

[Tue Oct 25 17:44:14 2005] [notice] child pid 23131 exit signal Segmentation fault (11)

After this, Apache gives up the ghost at serving content.  I did notice that after about 8 hours running the cache was very fragmented.  The figure reported by the statistics page was 50%.

I've got 4 segments of 32M.  At about the 8 hour mark there were around 20k hits and ~500 misses.  Unfortunately I don't have figures for later in the day as it had crashed before I could get to it.

Any suggestions from hereon?
 [2005-10-25 13:26 UTC] rasmus@php.net
Why 4 segments?  Use a single mmap'ed 128M segment.  Much cleaner.
 [2005-10-25 13:45 UTC] david at edeca dot net
As you will see from my previous comments above, I have 
tried a single 128M MMAPed segment but I get exactly the 
same errors with or without it.   
 
From this I can conclude that the problem (whatever it may 
turn out to be) isn't directly related to having 1 or 4 
segments.
 [2005-10-25 13:48 UTC] rasmus@php.net
But that was with an older version I thought.  And it probably isn't directly related, but it certainly doesn't help.
 [2005-10-25 17:06 UTC] david at edeca dot net
With mmap enabled I set one segment of 128M.  This is APC 3.0.8.

After only an hour I see 49% cache fragmentation.  Apache by this point is very slow to respond to requests that involve PHP content.

The cache images are available at:
http://edeca.net/temp/apc.php.png
http://edeca.net/temp/apc2.php.png

Configuration (as reported by apc.php):
apc.cache_by_default	1
apc.enable_cli	0
apc.enabled	1
apc.file_update_protection	2
apc.filters	
apc.gc_ttl	300
apc.max_file_size	1M
apc.mmap_file_mask	
apc.num_files_hint	1000
apc.optimization	0
apc.shm_segments	1
apc.shm_size	128
apc.slam_defense	0
apc.ttl	3600
apc.user_entries_hint	100
apc.user_ttl	0
 [2005-10-25 17:18 UTC] rasmus@php.net
What are you actually doing on that server?  Is the content changing rapidly?  If so, how are you updating files?
 [2005-11-30 12:09 UTC] mark_round at ipcmedia dot com
Just a note to say that I am also seeing this behaviour, albeit in a slightly different form. We didn't see any GC errors in the logs, just sometimes a single segfault an hour or so before the server stops responding (which could be totally unrelated). My system is Apache 1.3.33, PHP 4.4.1, APC 3.0.8 and Solaris 8. My APC settings are as follows :

apc.shm_size=140
apc.num_files_hint=10000
apc.ttl=60
apc.slam_defense=90
apc.file_update_protection=5

APC seems to work fine for hours, or even a day or so - and then suddenly, Apache stops serving pages altogether. A restart of Apache clears things up. I have compiled APC using the --enable-apc-mmap flag.
 [2005-12-02 11:28 UTC] swen at grmblfrz dot de
Well, this rings a bell. See bug #1583, I had the same
symptoms (heavy cache fragmentation), and I have a recipe
to reproduce it. Could this be the same bug?

--Swen
 [2005-12-05 11:30 UTC] mark_round at ipcmedia dot com
Hi Swen,

Thanks for the info. I can confirm that I can reproduce your bug and that your patch fixes the issue - I will install this patched version over the next couple of days and watch what happens. Usually, the server "breaks" within a day or so, so I should have some feedback by the end of the week one way or another.

-Mark
 [2005-12-14 08:02 UTC] mark_round at ipcmedia dot com
Sorry for the delay. I tested your patch from bug 1583, and the latest from CVS, and the same thing happened. I can't retry it again for a while, as it hosed our production server.

All ran fine for a while, then I tried clearing the cache a couple of times. All hell broke loose, and the server started segfaulting on every request. I didn't have the opportunity to investigate further, as this was on a production server - I had to immediately back out APC. I can't recreate this on a dev environment, but it seems only to happen under load on a dual CPU box.
 [2005-12-14 15:34 UTC] gene at pickaprof dot com
I've been watching this thread and trying to avoid a "me too" post.  Just wanted to note that I get the GC cache entry errors on all 3 of our production servers - 1 of them has 2 CPUs and the others have 1, so I don't think it's limited to SMP systems.

It's sporadic for us, one of the servers will throw the error every few days.  Restart Apache and all is well.  It's never happened on our dev server.

Running CentOS 3.6, stock Apache 2.0.46-54 & PHP 4.3.2-26, with APC 3.0.8 --enable-apc-mmap --with-apxs.  APC is using all the default settings.
 [2005-12-23 03:12 UTC] alant at transtelecom dot md
CentOS 4u2, php-4.4.1, apc-3.0.8. I see the same random crashes; after a restart Apache works fine.
Here is backtrace:
#0  zend_assign_to_variable (result=0xafdb6e9c, op1=Variable "op1" is not available.)
    at /usr/src/redhat/BUILD/php-4.4.1/Zend/zend_execute.c:452
#1  0x010e0752 in execute (op_array=0x91d02c0)
    at /usr/src/redhat/BUILD/php-4.4.1/Zend/zend_execute.c:1966
#2  0x010df59c in execute (op_array=0x9248de4)
    at /usr/src/redhat/BUILD/php-4.4.1/Zend/zend_execute.c:1719
#3  0x010df59c in execute (op_array=0x9242354)
    at /usr/src/redhat/BUILD/php-4.4.1/Zend/zend_execute.c:1719
#4  0x010d07b0 in zend_execute_scripts (type=8, retval=0x0, file_count=3)
    at /usr/src/redhat/BUILD/php-4.4.1/Zend/zend.c:938
#5  0x010a611e in php_execute_script (primary_file=0xbfe5f3b0)
    at /usr/src/redhat/BUILD/php-4.4.1/main/main.c:1743
#6  0x010ea442 in php_handler (r=0x924f840)
    at /usr/src/redhat/BUILD/php-4.4.1/sapi/apache2handler/sapi_apache2.c:572
 [2006-01-03 18:47 UTC] blaind at blaind dot net
I experienced the same behavior on PHP 5.1.1 and APC 3.0.8. 
Strace showed the same line:

fcntl64(109, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}

for crashed processes. But this happened only after updating the source code of the (fairly loaded) server by rsync.

Rsync modified ALL the PHP files of the web site (even if they had not changed) and it seemed to create some problems for APC.

As a workaround I updated rsync to do checksumming for the files, so now it modifies only the really updated files.

APC has not crashed since. In conclusion, maybe there is a problem with APC if multiple files are edited / changed simultaneously?
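
The idea behind the workaround, leaving files alone when their contents are unchanged, can be sketched in Python. This is not how rsync is implemented; it is a standalone illustration under the assumption that what matters to the cache is that unchanged files are never rewritten and so keep their modification time. The function names are hypothetical.

```python
import hashlib
import os
import shutil


def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_changed(src_dir, dst_dir):
    """Copy only files whose contents differ; return relative paths copied."""
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        rel_root = os.path.relpath(root, src_dir)
        for name in sorted(files):
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(dst_dir, rel_root, name))
            if os.path.exists(dst) and file_digest(src) == file_digest(dst):
                continue  # identical content: leave the file and its mtime alone
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(os.path.normpath(os.path.join(rel_root, name)))
    return copied
```

A deployment that returns an empty list from `sync_changed` touched nothing, so a stat-based cache has no reason to invalidate anything.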
 [2006-02-16 18:34 UTC] ilia at prohost dot org
This bug has been fixed in CVS.

In case this was a documentation problem, the fix will show up at the
end of next Sunday (CET) on pecl.php.net.

In case this was a pecl.php.net website problem, the change will show
up on the website in short time.
 
Thank you for the report, and for helping us make PECL better.


 [2006-03-29 18:17 UTC] gene at pickaprof dot com
After installing APC 3.0.10 a week ago, I got the GC cache error again today:

[Wed Jan 25 16:24:46 2006] [apc-warning] GC cache entry '<filename>' (dev=2050 ino=0) was on gc-list for 6308298 seconds

System is the same as my 12/14 post
 [2006-03-30 16:10 UTC] gene at pickaprof dot com
oops. in case anyone was paying attention to the date from the log there, should have been:

[Wed Mar 29 16:05:17 2006] [apc-warning] GC cache entry '<filename>' (dev=2050 ino=0) was on gc-list for 6309890 seconds
 [2006-11-11 16:04 UTC] news at smartflyer dot biz
I'm running PHP 4.4.4/Apache 1.3.37

Here's what I'm seeing in my logs:

[apc-warning] GC cache entry '/pathtofile/file.php' (dev=114 ino=0) was on gc-list for 25038030 seconds

I'm a bit of a newb'; I'm hoping I'm just missing something minor. 
I've run across some conflicting information for settings (like setting a ttl will result in fragmentation) so the only settings I'm using right now are
apc.enabled = 1
apc.shm_size = 128
I'm sure I need to set a ttl but am unsure what to believe.
I'm really looking for an educated opinion. 
Any help would be appreciated.
 [2007-01-11 05:22 UTC] jonhohle at gmail dot com
I'm experiencing a similar issue on a production box right 
now.  APC seemed to stop responding after an update to the 
app (cvs up -r current_tag -dCP).  It happened on 2 of 4 
boxes.  Restarting Apache cleared the problem.

Unfortunately, while my first thoughts went to APC as being 
the source of the problem (I had similar experiences with 
APC about a year ago); I restarted apache before thinking 
about checking APC's info.

I'm thinking deployments will just have to require an apache 
restart (or rollback APC).
 [2007-02-09 07:30 UTC] btesanovic at gmail dot com
I have a problem running two servers on the same Windows machine.
I have Apache 1.3 and 2.0, and I must stop one server in order to have APC enabled; it can't run on both servers :(

--message from Apache 1.3
[Fri Feb 09 13:28:57 2007] [apc-error] apc_shm_create: shmget(0, 31457280, 658) failed: No such file or directory. It is possible that the chosen SHM segment size is higher than the operation system allows. Linux has usually a default limit of 32MB per segment.
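
For orientation, the size passed to shmget() in the error above can be converted to megabytes. This is a trivial Python sketch; the variable names are illustrative.

```python
# Size argument taken from the shmget() call in the error message above.
requested_bytes = 31457280
MIB = 1024 * 1024

requested_mib = requested_bytes / MIB
print(requested_mib)  # 30.0
```

The request is 30 MiB, below the 32MB figure the message itself quotes for Linux, so on this Windows machine the size hint in the message may not be the actual cause. The thread does not confirm this either way.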
 