Bug #76930 SIGQUIT with process_control_timeout doesn't properly kill idle children
Submitted: 2018-09-24 21:06 UTC Modified: -
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:1 (50.0%)
From: giovanni at giacobbi dot net Assigned:
Status: Open Package: FPM related
PHP Version: 7.2.10 OS: Linux
Private report: No CVE-ID: None

 [2018-09-24 21:06 UTC] giovanni at giacobbi dot net
Description:
------------
php-fpm spawns a series of workers. These workers can be gracefully stopped by enabling "process_control_timeout", which grants a grace period after SIGQUIT is received so that the ongoing task can complete before the process terminates.

Unfortunately, there is a series of bugs that cause a freshly started daemon to needlessly hang until the timeout expires.

First problem: SA_RESTART flag causes SIGQUIT to be ignored until first request
The master process sets up signal handling in function fpm_signals_init_child(), setting the SA_RESTART flag.
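The SIGQUIT handler sets a flag (in_shutdown) inside fastcgi.c, but this flag is not checked while the worker is blocked inside the accept() syscall: with SA_RESTART, accept() is restarted automatically after the handler returns instead of failing with EINTR.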
This problem is solved by removing the following line from fpm_signals.c:
   act.sa_flags |= SA_RESTART;
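
To illustrate the effect of the flag outside of FPM, here is a minimal sketch that blocks in read() on an empty pipe (a stand-in for accept() on the listening socket) while a helper process sends SIGQUIT. All function names are hypothetical and the in_shutdown variable only mimics the one in fastcgi.c; this is not FPM code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Stand-in for the in_shutdown flag described in the report. */
static volatile sig_atomic_t in_shutdown = 0;

static void on_sigquit(int sig)
{
    (void)sig;
    in_shutdown = 1;
}

static void sleep_ms(long ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/*
 * Blocks in read() on an empty pipe. A helper process sends SIGQUIT after
 * 200 ms and writes one byte 200 ms later.
 * Returns 1 if read() failed with EINTR (the signal woke the caller up),
 * 0 if read() was restarted and returned only once data arrived,
 * -1 on setup failure.
 */
static int read_interrupted_by_sigquit(int use_sa_restart)
{
    struct sigaction act, old;
    int fds[2];
    int saved_errno;
    int result = -1;
    char byte;
    ssize_t n;
    pid_t helper;

    assert(use_sa_restart == 0 || use_sa_restart == 1);

    if (pipe(fds) < 0)
        return -1;

    act.sa_handler = on_sigquit;
    sigemptyset(&act.sa_mask);
    act.sa_flags = use_sa_restart ? SA_RESTART : 0;
    in_shutdown = 0;
    if (sigaction(SIGQUIT, &act, &old) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    helper = fork();
    if (helper == 0) {
        /* Helper: signal the parent first, then unblock it with data. */
        sleep_ms(200);
        kill(getppid(), SIGQUIT);
        sleep_ms(200);
        if (write(fds[1], "x", 1) != 1)
            _exit(1);
        _exit(0);
    }
    if (helper > 0) {
        n = read(fds[0], &byte, 1);
        saved_errno = errno;
        waitpid(helper, NULL, 0);
        result = (n < 0 && saved_errno == EINTR) ? 1 : 0;
    }

    /* Put the previous SIGQUIT disposition back before returning. */
    sigaction(SIGQUIT, &old, NULL);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On Linux, read_interrupted_by_sigquit(1) is expected to return 0: the handler runs and sets in_shutdown, but the caller stays blocked until data arrives. read_interrupted_by_sigquit(0) is expected to return 1, which gives the caller a chance to check the flag.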

Second problem: (bug-in-a-bug) php_request_shutdown() doesn't properly restore the initial worker process state
The "First problem" occurs only before the worker has executed any PHP script, because after that the call to php_request_startup() causes signal handlers to be redefined inside the Zend engine (zend_signal.c), which installs them WITHOUT the SA_RESTART flag.
I believe that, for consistency, the counterpart function php_request_shutdown() should restore the signal handlers as they were before the execution of the PHP script, to "return" to the original state.
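
The general POSIX pattern for this kind of restore is to keep the previous disposition that sigaction() hands back and reinstall it later. The sketch below only illustrates that pattern; the function names are hypothetical and it does not reflect how zend_signal.c is actually structured.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static void worker_handler(int sig)  { (void)sig; }
static void request_handler(int sig) { (void)sig; }

/* Disposition that was active before the request started. */
static struct sigaction saved_sigquit;
static int saved_valid = 0;

/* Hypothetical worker setup: installs a handler WITH SA_RESTART. */
static int install_worker_handler(void)
{
    struct sigaction act;

    act.sa_handler = worker_handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;
    return sigaction(SIGQUIT, &act, NULL);
}

/* Hypothetical request startup: replaces the handler (no SA_RESTART)
 * and remembers what was installed before. */
static int request_startup_signals(void)
{
    struct sigaction act;

    act.sa_handler = request_handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGQUIT, &act, &saved_sigquit) < 0)
        return -1;
    saved_valid = 1;
    return 0;
}

/* Hypothetical request shutdown: puts the remembered disposition back. */
static int request_shutdown_signals(void)
{
    assert(saved_valid); /* startup must have run first */
    saved_valid = 0;
    return sigaction(SIGQUIT, &saved_sigquit, NULL);
}

/* Returns 1 if the current SIGQUIT action has SA_RESTART set, 0 if not,
 * -1 on error. */
static int sigquit_has_sa_restart(void)
{
    struct sigaction cur;

    if (sigaction(SIGQUIT, NULL, &cur) < 0)
        return -1;
    return (cur.sa_flags & SA_RESTART) != 0;
}
```

With this pairing the worker is in the same signal state before the first request and after every later one, so whatever behaviour is chosen for the idle state applies consistently.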

Third problem: Idle clients prevent the worker from terminating gracefully
If a client is holding an open FCGI socket with a worker, this worker won't gracefully terminate even though it is not executing any PHP script.
This behaviour is unneeded and an idle client can be safely dropped.
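
One possible way to drop an idle client is sketched below, assuming the worker waits for the next request with pselect() and keeps SIGQUIT blocked everywhere else, so the flag check and the wait cannot race. The names wait_for_request and install_shutdown_handler are hypothetical, and in_shutdown again only mimics the flag in fastcgi.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Stand-in for the in_shutdown flag described in the report. */
static volatile sig_atomic_t in_shutdown = 0;

static void on_sigquit(int sig)
{
    (void)sig;
    in_shutdown = 1;
}

enum wait_result { WAIT_ERROR = -1, REQUEST_READY = 0, DROPPED_IDLE_CLIENT = 1 };

/* Blocks SIGQUIT for normal execution and installs the handler. */
static int install_shutdown_handler(void)
{
    struct sigaction act;
    sigset_t block;

    sigemptyset(&block);
    sigaddset(&block, SIGQUIT);
    if (sigprocmask(SIG_BLOCK, &block, NULL) < 0)
        return -1;

    act.sa_handler = on_sigquit;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    return sigaction(SIGQUIT, &act, NULL);
}

/*
 * Waits for the first byte of a request on an accepted connection.
 * SIGQUIT can only be delivered while pselect() is waiting. If shutdown
 * was requested and no request has started, the idle connection is closed.
 */
static enum wait_result wait_for_request(int conn_fd)
{
    sigset_t wait_mask;
    fd_set readable;
    int rc;

    assert(conn_fd >= 0 && conn_fd < FD_SETSIZE);

    /* Mask used during the wait: the current mask minus SIGQUIT. */
    if (sigprocmask(SIG_SETMASK, NULL, &wait_mask) < 0)
        return WAIT_ERROR;
    sigdelset(&wait_mask, SIGQUIT);

    for (;;) {
        if (in_shutdown) {
            close(conn_fd); /* nothing in flight, safe to drop */
            return DROPPED_IDLE_CLIENT;
        }
        FD_ZERO(&readable);
        FD_SET(conn_fd, &readable);
        rc = pselect(conn_fd + 1, &readable, NULL, NULL, NULL, &wait_mask);
        if (rc > 0)
            return REQUEST_READY;
        if (rc < 0 && errno != EINTR)
            return WAIT_ERROR;
    }
}
```

A request that has already started would not pass through this wait again until it completes, so only connections with no work in progress are dropped.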


If you are interested in solving the three problems above, I can provide a pull request. I might need some help with the second point to avoid breaking other SAPIs while modifying the Zend internal behaviour.


Test script:
---------------
1) Freshly start php-fpm with some workers and process_control_timeout set to e.g. 60s
2) Issue a SIGQUIT to the parent process when all workers are idle (and before they handled any request)


Expected result:
----------------
All FPM processes should terminate immediately (since they are idle)

Actual result:
--------------
Processes wait until process_control_timeout expires before terminating

 [2018-11-14 17:57 UTC] tasikoglu at webuy dot com
I can reproduce this on versions 7.2.9 through 7.2.11.

Running on Amazon Linux 2 and CentOS 7.

Debug output with process_control_timeout set to 60s on a quiet Vagrant box:

[14-Nov-2018 17:03:32.323116] DEBUG: pid 22501, fpm_pctl_kill_all(), line 159: [pool www] sending signal 3 SIGQUIT to child 28721
[14-Nov-2018 17:03:32.323118] DEBUG: pid 22501, fpm_pctl_kill_all(), line 168: 201 child(ren) still alive
[14-Nov-2018 17:03:32.323123] DEBUG: pid 22501, fpm_event_loop(), line 419: event module triggered 1 events
[14-Nov-2018 17:04:32.324348] DEBUG: pid 22501, fpm_pctl_kill_all(), line 159: [pool wss.cex.au] sending signal 15 SIGTERM to child 28525
 