Bug #32974 Forked socket server segfaults after high connection volume
Submitted: 2005-05-07 10:27 UTC
Modified: 2005-05-07 17:01 UTC
From: jim_keller at centerfuse dot net
Assigned:
Status: Closed
Package: Reproducible crash
PHP Version: 4.3.11
OS: FreeBSD 4.10
Private report: No
CVE-ID: None
 [2005-05-07 10:27 UTC] jim_keller at centerfuse dot net
Description:
------------
This forked socket server works well until I stress test it with a high volume of connections over an extended period (1 hour or more, 20,000-50,000 connections), at which point it segfaults. I had been using socket_select() prior to this implementation, but was seeing segfaults there as well (more frequently), and noticed that the socket_select() segfaults were a known issue as per TODO_SEGFAULTS. I switched over to this approach, which in some ways mimics the behavior of socket_select(), and thought I had the segfaults licked until I started stress testing the daemon. 
There is quite a bit of code to this project, as it is built upon my homegrown PHP component suite (http://www.phpfuse.net), but I've tried to include the more relevant parts. fusepaneld.php is the script that gets executed from the CLI. It instantiates FuseWebTaskDaemon and calls start_daemon(), which is where most of the actual work is done.


Reproduce code:
---------------
(see description for explanation of these two files)

http://jim.centerfuse.net/php_segfaults/fusepaneld.phps
http://jim.centerfuse.net/php_segfaults/FuseWebTaskDaemon.class.phps


Expected result:
----------------
This code is a socket server that uses socket_accept() (with a nonblocking socket) to accept incoming connections from a web script. The web script sends the daemon a message in the form of a serialized object. The message contains authentication information, a function name, and function parameters. The elevated-privilege listening daemon then forks a child, executes the function as requested and returns the result to the front end web application. Everything works fine until the daemon has handled several thousand connections, when it segfaults and dies. 

Actual result:
--------------
php4.3.11 in malloc(): warning: recursive call

Program received signal SIGSEGV, Segmentation fault.
0x811090b in zend_llist_add_element (l=0x8171448, element=0x7fbfbb74) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_llist.c:40
40              tmp->prev = l->tail;
(gdb) bt

#0  0x811090b in zend_llist_add_element (l=0x8171448, element=0x7fbfbb74) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_llist.c:40
#1  0x80745a6 in pcntl_signal_handler (signo=20) at /usr/src/php-4.3.11/php-4.3.11/ext/pcntl/pcntl.c:509
#2  0x7fbfffac in ?? ()
#3  0x283ddb77 in isatty () from /usr/lib/libc.so.4
#4  0x283dddcd in isatty () from /usr/lib/libc.so.4
#5  0x283de4e5 in malloc () from /usr/lib/libc.so.4
#6  0x8109c26 in _emalloc (size=129) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_alloc.c:164
#7  0x8109e4b in _erealloc (ptr=0x0, size=129, allow_failure=0) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_alloc.c:301
#8  0x80f4c9c in xbuf_resize (xbuf=0x7fbfbefc, add=0) at /usr/src/php-4.3.11/php-4.3.11/main/spprintf.c:143
#9  0x80f4cec in xbuf_init (xbuf=0x7fbfbefc, max_len=1024) at /usr/src/php-4.3.11/php-4.3.11/main/spprintf.c:160
#10 0x80f57e2 in vspprintf (pbuf=0x7fbfbf4c, max_len=1024, format=0x8162080 "Use of undefined constant %s - assumed '%s'", 
    ap=0x7fbfbfd0 "L\005\036\bL\005\036\bܿ?\177\214\001_\bL?'\b\003\001\003")
    at /usr/src/php-4.3.11/php-4.3.11/main/spprintf.c:630
#11 0x80f1af9 in php_error_cb (type=8, 
    error_filename=0x822b38c "/usr/local/share/FUSE/support/fusewebtask/FuseWebTaskDaemon.class.php", error_lineno=154, 
    format=0x8162080 "Use of undefined constant %s - assumed '%s'", 
    args=0x7fbfbfd0 "L\005\036\bL\005\036\bܿ?\177\214\001_\bL?'\b\003\001\003") at /usr/src/php-4.3.11/php-4.3.11/main/main.c:599
#12 0x8116def in zend_error (type=8, format=0x8162080 "Use of undefined constant %s - assumed '%s'")
    at /usr/src/php-4.3.11/php-4.3.11/Zend/zend.c:789
#13 0x812ae19 in execute (op_array=0x81ce98c) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_execute.c:2031
#14 0x8128010 in execute (op_array=0x85e3400) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_execute.c:1698
#15 0x8128010 in execute (op_array=0x81cc40c) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_execute.c:1698
#16 0x8117124 in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend.c:926
#17 0x80f3773 in php_execute_script (primary_file=0x7fbffafc) at /usr/src/php-4.3.11/php-4.3.11/main/main.c:1745
#18 0x812e8d9 in main (argc=3, argv=0x7fbffb64) at /usr/src/php-4.3.11/php-4.3.11/sapi/cli/php_cli.c:828

(gdb) frame 13
#13 0x812ae19 in execute (op_array=0x81ce98c) at /usr/src/php-4.3.11/php-4.3.11/Zend/zend_execute.c:2031
2031                                            zend_error(E_NOTICE, "Use of undefined constant %s - assumed '%s'",

(gdb) print (char *)(executor_globals.function_state_ptr->function)->common.function_name
$2 = 0x82707cc "start_daemon"
(gdb) print (char *)executor_globals.active_op_array->function_name
$3 = 0x82707cc "start_daemon"
(gdb) print (char *)executor_globals.active_op_array->filename
$4 = 0x822b38c "/usr/local/share/FUSE/support/fusewebtask/FuseWebTaskDaemon.class.php"

 [2005-05-07 15:00 UTC] wez@php.net
What's happening here is that PHP is in the middle of allocating some memory when SIGCHLD is delivered.  The pcntl signal handler then allocates some memory itself.
The FreeBSD malloc() is not re-entrant, so it emits the "in malloc(): warning: recursive call" message and fails the memory allocation.
Normally, PHP allocates all memory via emalloc(), which will abort the PHP request when memory allocation fails, but for some reason, the person that wrote pcntl decided to use a persistent linked list, which calls malloc() directly.

So, what we've got here is a triple bug:

- pcntl should not malloc inside a signal handler
- pcntl should not use a persistent llist
- zend llist code should check the pemalloc return value
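
To illustrate the third point, here is a minimal sketch of a list append that checks the allocator's return value instead of dereferencing it blindly (the crash at zend_llist.c:40 is the write to tmp->prev after a failed allocation). This is not the Zend code: the node, list, list_append and failing_alloc names are hypothetical, and the allocator is passed in as a parameter only so the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration; these are not the Zend structures. */
typedef struct node {
    struct node *prev;
    struct node *next;
    int value;
} node;

typedef struct {
    node *head;
    node *tail;
    size_t count;
} list;

/* Appends value to the list.
 * Returns 0 on success, -1 if the allocation failed.
 * On failure the list is left exactly as it was. */
int list_append(list *l, int value, void *(*alloc)(size_t))
{
    assert(l != NULL && alloc != NULL);

    node *n = alloc(sizeof *n);
    if (n == NULL) {
        return -1;          /* report the failure instead of writing through NULL */
    }

    n->value = value;
    n->prev = l->tail;
    n->next = NULL;
    if (l->tail != NULL) {
        l->tail->next = n;
    } else {
        l->head = n;
    }
    l->tail = n;
    l->count++;
    return 0;
}

/* Allocator that always fails, standing in for a malloc() that
 * refuses a recursive call. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

With this shape the caller can decide what to do when the allocation is refused. It does not make calling malloc() from a signal handler safe; it only turns the segfault into a reportable error.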


 [2005-05-07 17:01 UTC] wez@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.

Grab the next 4.3.x snapshot from http://snaps.php.net to try it out.

I fixed the problem by pre-allocating records for the signal queue.  There is a limit of 32 pending signals in the current implementation; if more than 32 signals are delivered before the tick handler can dispatch them, those signals will be ignored.
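
For readers who want to see the idea, the following is a rough sketch of a pre-allocated signal queue, not the code committed to CVS. All names (sigq_push, sigq_pop, sigq_install, sigq_dispatch) are made up for illustration. It assumes a single-threaded process and a handler installed with all signals blocked while it runs, so the handler never interrupts itself. The handler only writes into static storage, so it never calls malloc().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define SIGQ_CAPACITY 32                /* the 32-signal limit described above */
#define SIGQ_SLOTS (SIGQ_CAPACITY + 1)  /* one spare slot tells "full" from "empty" */

/* Pre-allocated ring buffer: no heap allocation anywhere. */
static volatile sig_atomic_t sigq_slots[SIGQ_SLOTS];
static volatile sig_atomic_t sigq_head;     /* next slot to read; written by the dispatcher only */
static volatile sig_atomic_t sigq_tail;     /* next slot to write; written by the handler only */
static volatile sig_atomic_t sigq_dropped;  /* signals discarded because the queue was full */

/* Handler side: records signo, or drops it if all 32 records are in use. */
void sigq_push(int signo)
{
    int tail = sigq_tail;
    int next = (tail + 1) % SIGQ_SLOTS;

    if (next == sigq_head) {
        sigq_dropped = sigq_dropped + 1;
        return;
    }
    sigq_slots[tail] = signo;
    sigq_tail = next;
}

/* Dispatcher side: returns the oldest pending signal number, or 0 if none. */
int sigq_pop(void)
{
    int head = sigq_head;

    if (head == sigq_tail) {
        return 0;
    }
    int signo = sigq_slots[head];
    sigq_head = (head + 1) % SIGQ_SLOTS;
    return signo;
}

static void sigq_handler(int signo)
{
    sigq_push(signo);
}

/* Installs the queueing handler for signo with every signal blocked
 * during the handler. Returns 0 on success, -1 on error. */
int sigq_install(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigq_handler;
    sigfillset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from normal (non-signal) context, e.g. a tick handler:
 * runs callback for each pending signal and returns how many were handled. */
int sigq_dispatch(void (*callback)(int))
{
    assert(callback != NULL);

    int handled = 0;
    int signo;
    while ((signo = sigq_pop()) != 0) {
        callback(signo);
        handled++;
    }
    return handled;
}
```

In this sketch a 33rd signal arriving before sigq_dispatch() runs is counted in sigq_dropped and otherwise discarded, which mirrors the trade-off described above: bounded storage in exchange for possibly losing signals under heavy load.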
 