php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #31558 "apachectl graceful" leaks
Submitted: 2005-01-14 19:44 UTC Modified: 2005-06-04 01:00 UTC
Votes:11
Avg. Score:4.5 ± 0.7
Reproduced:8 of 8 (100.0%)
Same Version:6 (75.0%)
Same OS:2 (25.0%)
From: jdw at nearlyfreespeech dot net Assigned:
Status: No Feedback Package: Apache related
PHP Version: 4CVS-2005-01-21 OS: FreeBSD
Private report: No CVE-ID: None
View Add Comment Developer Edit
Anyone can comment on a bug. Have a simpler test case? Does it work for you on a different platform? Let us know!
Just going to say 'Me too!'? Don't clutter the database with that please — but make sure to vote on the bug!
Your email address:
MUST BE VALID
Solve the problem:
42 - 4 = ?
Subscribe to this entry?

 
 [2005-01-14 19:44 UTC] jdw at nearlyfreespeech dot net
Description:
------------
We have seen this error under the following conditions

- FreeBSD 4-STABLE
- Using Apache 1.3.33
- PHP 4.3.10
- multiple servers
- servers have hundreds of megs of available RAM and gigs of free swap
- httpd processes are well within resource limits
- happens with random PHP scripts, even simple pages that don't use much memory
- pages that generate the error may work if immediately reloaded (unchanged with respect to code and data), suggesting that this is not a per-request memory limit being exceeded
- almost always happens *before* headers are sent
- occurs on pages not using gzip or zlib or output compression or anything else that would defer content output
- happens with alloc amounts from 7500 bytes to about 1meg, averaging between 100-300k.
- happens in repetitive "clusters"

Problem persists across "apachectl graceful" but "goes away" (for awhile at least) after "apachectl restart" so the Apache parent process may tie in somehow.

Is there a way I can obtain more helpful debug information about this in a production environment?

Thanks for any help with this!







Reproduce code:
---------------
n/a... virtually any PHP page appears susceptible, which is consistent with the observation that it occurs before headers are sent.


Expected result:
----------------
PHP script should run.

Actual result:
--------------
Example from Apache error log: (all messages appear in immediate succession)
FATAL:  erealloc():  Unable to allocate 7500 bytes
FATAL:  erealloc():  Unable to allocate 7500 bytes
FATAL:  erealloc():  Unable to allocate 7500 bytes
FATAL:  erealloc():  Unable to allocate 7500 bytes
FATAL:  erealloc():  Unable to allocate 7500 bytes
FATAL:  erealloc():  Unable to allocate 7500 bytes


Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2005-01-14 21:39 UTC] tony2001@php.net
What's your PHP configuration? Any Zend modules enabled?
Any chance to replicate the problem with `strace httpd -X`?
The info you've provided is pretty useless ATM as we don't have your configuration and we're unable to reproduce the problem.
 [2005-01-14 23:47 UTC] jdw at nearlyfreespeech dot net
Our PHP config is at:

http://example.nfshost.com/phpinfo.php

If you need anything beyond that, let me know what, and I'll add it here.  I'll put the php.ini at the bottom, along with example per-site overrides.

It's pretty stock.  They only Zend module I've heard of is the optimizer, and I know we don't use that.  The config line is at the top of the phpinfo page.

strace does not appear on these FreeBSD boxes and 'httpd -?' does not show -X as an option, so I may need some guidance as to what you're hoping to accomplish with that to find the equivalent.  If strace is like ktrace, then I have to assume that the output generated by the parent and all child processes would be problematically large.

From the source code, it looks like we can define a flag that will SEGV the httpd when this occurs.  Is it possible that this would generate a useful backtrace?  Would defining this  flag (ZEND_DEBUG, I think) have other dreadful consequences not consistent with running in production?

Thanks for looking at this!

php.ini:

auto_prepend_file = nfsn_umask_hack.php
upload_tmp_dir = /tmp
max_execution_time = 180
browscap = /nfsn/apps/php/lib/browscap.ini
post_max_size = 10M
safe_mode_exec_dir = ""
safe_mode_include_dir = "/nfsn/apps/php/lib/php/";
safe_mode_allowed_env_vars = TZ
safe_mode_gid = true
upload_max_filesize = 10M
sendmail_path = /nfsn/bin/sendmail -t -i

Example from httpd.conf:
  php_admin_value open_basedir (3 dirs) 
  php_admin_value safe_mode 1
  php_admin_value upload_tmp_dir /nfsn/content/example/tmp
  php_admin_value max_execution_time 180

The file referenced in auto_prepend_file has one line:

<?php umask(0); ?>
 [2005-01-15 00:16 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php4-STABLE-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php4-win32-STABLE-latest.zip

First get the stable snapshot.

Configure with --enable-debug (that will actually make PHP more stable in some cases..)

Then see if you get any weird leaks or something in your logs.
And/or crashes -> generate a GDB backtrace and paste it here.

 [2005-01-20 07:04 UTC] jdw at nearlyfreespeech dot net
We have put the -stable code into place on part of our network.  We'll see what happens.  

I observed the following in "make test."

=====================================================================
FAILED TEST SUMMARY
---------------------------------------------------------------------
pspell basic tests (warning: may fail with pspell/aspell < GNU Aspell 0.50.3) [e
xt/pspell/tests/01pspell_basic.phpt]
Bug #31213 (Sideeffects caused by bug #29493) [ext/standard/tests/array/bug31213.phpt]                
Bug #30069 (floats as strings used in calculations do not work) [ext/standard/tests/math/bug30069.phpt]
Bug #27780 (strtotime(+1 xxx) returns a wrong date/time) [ext/standard/tests/time/bug27780.phpt]
=====================================================================

Of these, only the last one also fails on identical hardware /config under 4.3.10.  If this matters for a "LATEST" release, I'll report it separately; not sure of the protocol there.  Otherwise, I'll report back when we see the problem again.
 [2005-01-20 07:22 UTC] jdw at nearlyfreespeech dot net
I enabled memory limits during the rebuild, and we've already gotten one of these:

Allowed memory size of 8388608 bytes exhausted at /tmp/source/php_apache_debug/php4-STABLE-200501171730/main/output.c:230 (tried to allocate 12 bytes)

I don't have enough info to know if this is related or just coincidence.  Is there any way to tell what request it was?

(I want to reiterate that our error happens at random on simple pages that cannot possibly take up 8mbs of RAM to render, lest anyone see this and say "oh they really are out of RAM after all" and close the bug.)

I can change or remove the memory limit if you think this might mask the real problem.
 [2005-01-20 08:16 UTC] derick@php.net
You can ignore the strtotime() test failing, that is a bug on FreeBSD.
 [2005-01-20 14:11 UTC] jdw at nearlyfreespeech dot net
A very large number of them this morning:

FATAL:  erealloc():  Unable to allocate 61440 bytes
   2 FATAL:  erealloc():  Unable to allocate 983040 bytes
FATAL:  erealloc():  Unable to allocate 61440 bytes
   4 FATAL:  erealloc():  Unable to allocate 983040 bytes
FATAL:  erealloc():  Unable to allocate 61440 bytes
   9 FATAL:  erealloc():  Unable to allocate 983040 bytes
FATAL:  erealloc():  Unable to allocate 61440 bytes
FATAL:  erealloc():  Unable to allocate 983040 bytes

To reproduce this, I believe it is important to have apache do a graceful restart a few hundred times.  Could the leak be in parsing PHP config variables out of httpd.conf?

Here's a stack trace:

#0  0x28178a62 in memcpy () from /usr/lib/libc.so.4
#1  0xbfbf948c in ?? ()
#2  0x283702b1 in php_apache_admin_value_handler (cmd=0xbfbf9460, 
    conf=0x280c7d80, arg1=0x8b1e3cc "max_execution_time", arg2=0x8b1e3e4 "180")
    at /tmp/source/php_apache_debug/php4-STABLE-200501171730/sapi/apache/mod_php4.c:780
#3  0x08074f58 in invoke_cmd (cmd=0x285406b0, parms=0xbfbf9460, 
    mconfig=0x280c7d80, args=0xbfbf52c6 "") at http_config.c:826
#4  0x08075872 in ap_handle_command (parms=0xbfbf9460, config=0x85885c4, 
    l=0xbfbf52a0 "php_admin_value max_execution_time 180")
    at http_config.c:1037
#5  0x080758f9 in ap_srm_command_loop (parms=0xbfbf9460, config=0x85885c4)
    at http_config.c:1051
#6  0x08079c86 in virtualhost_section (cmd=0xbfbf9460, dummy=0x80d79cc, 
    arg=0xbfbf73cd "*") at http_core.c:1972
#7  0x08074dd2 in invoke_cmd (cmd=0x80bc700, parms=0xbfbf9460, 
    mconfig=0x80d79cc, args=0xbfbf73cd "*") at http_config.c:796
#8  0x08075872 in ap_handle_command (parms=0xbfbf9460, config=0x80d6d84, 
    l=0xbfbf73c0 "<VirtualHost *") at http_config.c:1037
#9  0x080758f9 in ap_srm_command_loop (parms=0xbfbf9460, config=0x80d6d84)
    at http_config.c:1051
#10 0x0807607c in ap_process_resource_config (s=0x80d6034, 
    fname=0x8588314 "/nfsn/apps/apache/vhosts/ninjaman.conf", p=0x80d600c, 
---Type <return> to continue, or q <return> to quit---
    ptemp=0xca8b00c) at http_config.c:1343
#11 0x0807b0b8 in include_config (cmd=0xbfbfb650, dummy=0x80d79cc, 
    name=0x8588314 "/nfsn/apps/apache/vhosts/ninjaman.conf")
    at http_core.c:2769
#12 0x08074ea8 in invoke_cmd (cmd=0x80bcdf0, parms=0xbfbfb650, 
    mconfig=0x80d79cc, args=0xbfbf95de "") at http_config.c:814
#13 0x08075872 in ap_handle_command (parms=0xbfbfb650, config=0x80d6d84, 
    l=0xbfbf95b0 "Include /nfsn/apps/apache/vhosts/ninjaman.conf")
    at http_config.c:1037
#14 0x080758f9 in ap_srm_command_loop (parms=0xbfbfb650, config=0x80d6d84)
    at http_config.c:1051
#15 0x0807607c in ap_process_resource_config (s=0x80d6034, 
    fname=0x19d831e4 "/nfsn/apps/apache/conf/vhosts.conf", p=0x80d600c, 
    ptemp=0xca8b00c) at http_config.c:1343
#16 0x0807b0b8 in include_config (cmd=0xbfbfd840, dummy=0x80d79cc, 
    name=0x19d831e4 "/nfsn/apps/apache/conf/vhosts.conf") at http_core.c:2769
#17 0x08074ea8 in invoke_cmd (cmd=0x80bcdf0, parms=0xbfbfd840, 
    mconfig=0x80d79cc, args=0xbfbfb7b8 "") at http_config.c:814
#18 0x08075872 in ap_handle_command (parms=0xbfbfd840, config=0x80d6d84, 
    l=0xbfbfb7a0 "include conf/vhosts.conf") at http_config.c:1037
#19 0x080758f9 in ap_srm_command_loop (parms=0xbfbfd840, config=0x80d6d84)
    at http_config.c:1051
#20 0x0807607c in ap_process_resource_config (s=0x80d6034, 
---Type <return> to continue, or q <return> to quit---
    fname=0xb5bb28c "/nfsn/apps/apache/conf/cluster.conf", p=0x80d600c, 
    ptemp=0xca8b00c) at http_config.c:1343
#21 0x0807b0b8 in include_config (cmd=0xbfbffa30, dummy=0x80d79cc, 
    name=0xb5bb28c "/nfsn/apps/apache/conf/cluster.conf") at http_core.c:2769
#22 0x08074ea8 in invoke_cmd (cmd=0x80bcdf0, parms=0xbfbffa30, 
    mconfig=0x80d79cc, args=0xbfbfd9a9 "") at http_config.c:814
#23 0x08075872 in ap_handle_command (parms=0xbfbffa30, config=0x80d6d84, 
    l=0xbfbfd990 "Include conf/cluster.conf") at http_config.c:1037
#24 0x080758f9 in ap_srm_command_loop (parms=0xbfbffa30, config=0x80d6d84)
    at http_config.c:1051
#25 0x0807607c in ap_process_resource_config (s=0x80d6034, 
    fname=0x80cbaa0 "/nfsn/conf/httpd.conf", p=0x80d600c, ptemp=0xca8b00c)
    at http_config.c:1343
#26 0x0807697c in ap_read_config (p=0x80d600c, ptemp=0xca8b00c, 
    confname=0x80cbaa0 "/nfsn/conf/httpd.conf") at http_config.c:1635
#27 0x08080fb5 in standalone_main (argc=1, argv=0xbfbffbb4) at http_main.c:5373
#28 0x080818fe in main (argc=1, argv=0xbfbffbb4) at http_main.c:5767
#29 0x0804fc26 in _start ()
(gdb)
 [2005-01-20 22:37 UTC] iliaa@php.net
Sounds  like your  system may have a limit on the amount of memory a single process can request. When that limit is reached and php fails to allocate memory it terminates with an error that you are seeing.
 [2005-01-20 23:41 UTC] jdw at nearlyfreespeech dot net
As previously indicated, the system limits are not in play.  Or rather, if they are, they shouldn't be.  I've said that a couple of times now in attempt to avoid a quick "you're out of memory" reaction.  Unfortunately, that strategy does not appear to be a winner.

The per-process allocation limits are currently set to 512M.  
That is a very savvy observation that the process size grows to 512M and then allocations fail.  I checked out the size of the core dump file, and it confirms your theory.  Whatever random PHP script, large or small, is running at that point gets hosed (obviously).  Good catch!

But it is a little hasty to jump from that to "bogus."

The "steady state" size of Apache processes on this system should be about 30M.  Where did the other 480M come from?

Further research today has shown that Apache leaks about 1mb of RAM on every "graceful" (SIGUSR1) restart.  If it's something else leaking, fine, but so far the stack trace says it's PHP.

Perhaps instead of rushing to "bogus," you could suggest some ways I could prove or disprove the theory that PHP leaks memory when Apache is reconfigured.  

That would be very helpful, and I would very much appreciate help from any corner with this problem.

(State changed to "open."  If investigation reveals that the leak is not in PHP I'll be the first one to change it to bogus.)

Thanks!
 [2005-01-21 00:36 UTC] iliaa@php.net
Disable PHP from Apache and see if the problem persists. It is possible that one the libraries used by PHP is leaking etc... Right now all we know is that your PHP is failing to allocate memory and you are claiming that PHP leaks. Well, that does not say a whole lot.
 [2005-01-21 01:04 UTC] jdw at nearlyfreespeech dot net
Keep in mind that this is happen in the "master" Apache process, a context in which no PHP scripts are ever executed.  In fact, very little happens other than parsing config files.  That should severely constrain the influence of libraries & extensions.

Apache obviously won't parse its configs with PHP disabled, and our customers will (justifiably) riot if we turn it off. 
There are quite a large number of PHP-specific conf lines in a proportional number of files (the reason I suspect our setup is finding this earlier than the general population), so I'm not sure how feasible your suggestion is.  

But I'm happy to see if I can rig something up on a test server to explore it.  More to follow.

It would be more direct if there were a means to instrument the amount of memory allocated at start vs. the amount freed at restart.

It would also be handy if there's someone around who can point me in the right direction as to where in the PHP Apache SAPI code to find the relevant setup/cleanup code.

Also, I now have a second backtrace, but I am hestitant to post it here because it's even larger than the first one.  It's similar: died on a "php_admin_value open_basedir" instead of "php_admin_value max_execution_time."
 [2005-01-22 06:48 UTC] jdw at nearlyfreespeech dot net
This test has been completed.  I believe this graph of the results pretty much speaks for itself:

http://example.nfshost.com/ApacheGracefulMem.png

Results:
- With PHP enabled, "graceful" leak averages 903,000 bytes
- With PHP disabled, "graceful" leak averages 8,900 bytes
- "No PHP" test memory usage *converges* (leaks are front-loaded and poorly represented by the linear fit -- hard to see because of the scale of the graph)
- "PHP" test memory usage *diverges* (increases linearly until process limits intervene)

Methodology:
- Apache 1.3.33
- PHP version php4-STABLE-200501210130
- same Apache binary, same server, same conditions
- same conf file (with "LoadModule php4_module" "php_admin_value" and "php_value" lines removed for the "No HP" test)
- 200 consecutive "graceful" restarts of Apache process
- considering only "master" process
- no requests issued during test
 [2005-01-22 06:55 UTC] jdw at nearlyfreespeech dot net
I also looked at the source code.  Without a detailed understanding of zend_hash_*, it isn't clear that all the per_dir_info malloc() cases in sapi/apache/mod_php4.c (particularly php_value_handler_ex() ) are being properly freed by php_destroy_per_dir_info().  That's just a WAG, but it might be something to look into as a possible cause.
 [2005-05-17 12:02 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip


 [2005-05-17 14:11 UTC] jdw at nearlyfreespeech dot net
Um, as this is a PHP 4 bug, trying a PHP 5 snapshot would not seem to be appropriate.
 [2005-05-18 00:16 UTC] sniper@php.net
Yes it would. We're currently focused on PHP 5 development, and if you really want stuff get fixed: Try the snapshot.
(if this happens in PHP 5.1-dev too, it's more likely to be fixed also in the PHP_4_3 branch..)

 [2005-05-18 00:49 UTC] jdw at nearlyfreespeech dot net
Interestingly, there is some evidence that this bug may be either mitigated or fixed in 4.3.11.

We have also switched to using more shared modules, so it is also possible that the leak still exists, but is related to a specific extension that is now not being loaded by the "core" apache process (causing the leak not to accrue across generations).  It may be possible for us to investigate that possibility and provide more information to rule it in or out.
 [2005-06-04 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Apr 18 07:01:27 2024 UTC