Bug #67796 php-fpm: repeatedly spinning many many workers
Submitted: 2014-08-06 11:28 UTC Modified: 2018-02-23 21:10 UTC
Votes:16
Avg. Score:4.9 ± 0.5
Reproduced:15 of 15 (100.0%)
Same Version:11 (73.3%)
Same OS:11 (73.3%)
From: kenny at kennynet dot co dot uk Assigned:
Status: Duplicate Package: FPM related
PHP Version: 5.5.15 OS: Debian
Private report: No CVE-ID: None
 [2014-08-06 11:28 UTC] kenny at kennynet dot co dot uk
Description:
------------
After php5-fpm has been running for some time, we start to see log messages of the following form repeating rapidly, many times per second:-

"""
[06-Aug-2014 12:18:46] NOTICE: [pool pool-1] child 10141 started
[06-Aug-2014 12:18:46] NOTICE: [pool pool-1] child 9764 exited with code 0 after 0.567466 seconds from start
"""

(A side effect of this is that the scoreboard is updated with "idle++", but if no work is processed that idle counter is never decremented, so it continues to increase.)

Attaching gdb and following the fork to the child one can observe that the accept() call in fcgi_accept_request() returns -1 with errno=EAGAIN. 

That result is expected from a non-blocking socket but not from a blocking one. Further:-

gdb> print fcntl(listen_socket, 3)
... 2050

Which is O_RDWR | O_NONBLOCK.
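
For anyone repeating the check outside gdb, here is a minimal sketch of the same test using symbolic names instead of the literals. On Linux F_GETFL is 3, and on x86 O_NONBLOCK is 04000 (2048) and O_RDWR is 2, which is where 2050 comes from. The helper names are made up for illustration and are not part of php-fpm.

```c
#include <fcntl.h>

/* Hypothetical helpers, not php-fpm code. */

/* Returns 1 if a raw F_GETFL value has the O_NONBLOCK bit set, else 0. */
int flags_have_nonblock(int flags)
{
    return (flags & O_NONBLOCK) != 0;
}

/* Returns 1 if the access mode in a raw F_GETFL value is O_RDWR, else 0. */
int flags_are_rdwr(int flags)
{
    return (flags & O_ACCMODE) == O_RDWR;
}

/* Returns 1 if fd is in non-blocking mode, 0 if it is blocking,
 * and -1 if fcntl() fails (for example on a bad descriptor). */
int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);   /* same call as fcntl(fd, 3) on Linux */
    if (flags == -1)
        return -1;
    return flags_have_nonblock(flags);
}
```

Calling fd_is_nonblocking() on the listening socket of an affected worker should return 1, matching the gdb output above.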

So it appears that *somehow* the listen_socket is being put into non-blocking mode and hence:-

* The loop around fcgi_accept_request() aborts immediately with exit_status=0.
* The child subsequently exits.
* fpm_children_bury() processes this as restart_child=1
* Round and round it goes, spinning up new children which exit immediately if there is nothing to accept().
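
The first step of that chain can be shown in isolation. The sketch below is not the php-fpm accept loop; it only demonstrates, on a throwaway loopback listener with a kernel-chosen port, that accept() on a non-blocking listening socket with no pending connection returns -1 with errno set to EAGAIN (or EWOULDBLOCK) instead of waiting. The function name is invented for the example.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo, not php-fpm code.
 * Returns the errno reported by accept() on an idle non-blocking listener,
 * 0 if accept() unexpectedly succeeded, or -1 if the setup failed. */
int accept_errno_on_idle_nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a free port */

    int flags = fcntl(fd, F_GETFL);
    if (flags == -1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 1) == -1 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd);
        return -1;
    }

    /* Nobody has connected, so a blocking socket would wait here.
     * A non-blocking one fails immediately instead. */
    int conn = accept(fd, NULL, NULL);
    int err = (conn == -1) ? errno : 0;

    if (conn != -1)
        close(conn);
    close(fd);
    return err;
}
```

A worker that treats this immediate failure as "nothing more to do" exits straight away, which is the respawn loop described above.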

As a test / proof of the above this is fixed by:-

gdb> print fcntl(listen_socket, 4, fcntl(listen_socket, 3) &~ 04000)
... 0

Well, fixed until the next time the socket somehow gets put into non-blocking mode. I've not yet managed to isolate exactly how that is happening.
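
The same reset written as plain C, for reference. This is only a sketch of the gdb expression above with F_GETFL/F_SETFL and O_NONBLOCK in place of the literals 3, 4 and 04000; the function name is made up and it is not proposed as a patch.

```c
#include <fcntl.h>

/* Hypothetical helper, not php-fpm code.
 * Clears O_NONBLOCK on fd and leaves the other file status flags alone.
 * Returns 0 on success and -1 if either fcntl() call fails. */
int clear_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

As with the gdb version, this only undoes the symptom. Whatever set the flag can set it again.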

Test script:
---------------
N/A

Expected result:
----------------
N/A

Actual result:
--------------
N/A

History

 [2014-08-06 13:39 UTC] kenny at kennynet dot co dot uk
Update:-

I can reproduce the problem 100% of the time, and I'm now not sure whether this is a PHP bug, a PHP-FPM bug, or neither.

Within PHP, if you exec(), the process you exec inherits stdin from the FPM worker (i.e. the listen_socket).

If that process then sets O_NONBLOCK on its stdin, this causes the FPM issue described.

You can reproduce with this C file (nonblock.c): http://pastie.org/9450459

Exec'd from a php script called from a fpm worker:-
"""
<?php
header("Content-type: text/plain");
passthru("/path/to/nonblock");
"""

Just refresh the page to cause (and fix) the issue at will.
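
In case the pastie link is unavailable, here is a sketch of the mechanism. It is not the contents of nonblock.c, only an assumption about what such a helper does to its stdin, together with a check of why that reaches the worker: file status flags such as O_NONBLOCK belong to the open file description, which is shared by every descriptor that refers to it, whether duplicated with dup() or inherited across fork()/exec(). Here dup() stands in for that inheritance. Both function names are invented.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helpers, not the original nonblock.c. */

/* Sets O_NONBLOCK on fd. Returns 0 on success, -1 on failure.
 * A helper program would presumably do this to STDIN_FILENO. */
int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Sets O_NONBLOCK through a duplicate of fd, closes the duplicate, then
 * checks the original. Returns 1 if the flag is visible on fd, 0 if not,
 * -1 on error. */
int nonblock_leaks_through_dup(int fd)
{
    int copy = dup(fd);               /* shares fd's open file description */
    if (copy == -1)
        return -1;

    int rc = set_nonblock(copy);
    close(copy);                      /* the "child" goes away ...         */
    if (rc == -1)
        return -1;

    int flags = fcntl(fd, F_GETFL);   /* ... but the flag stays behind     */
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}
```

nonblock_leaks_through_dup() returning 1 on an otherwise blocking descriptor is the same effect the worker sees on its listen_socket after the exec'd process has exited.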
 [2014-08-13 05:02 UTC] 64438136 at qq dot com
Excuse me, how can this problem be solved?
 [2014-08-14 01:11 UTC] stas@php.net
-Assigned To: +Assigned To: fat
 [2014-09-07 06:57 UTC] kenny at kennynet dot co dot uk
This is the same issue as bug #61558.

My use-case was ssh using a ControlMaster file (as described in that bug report).

In this scenario ssh sets stdin to O_NONBLOCK but never removes it.

Without a ControlMaster, ssh does reset (remove) the O_NONBLOCK flag on stdin.
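
One possible way to sidestep this, sketched under the assumption that the exec'd program does not actually need the worker's stdin: give the child its own stdin (here /dev/null) so that any O_NONBLOCK it sets lands on a different open file description. This is an illustration of the idea in C, not a fix in PHP or php-fpm, and the function name is made up. From a PHP script the rough equivalent would be redirecting stdin in the command line passed to the shell.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not php-fpm code.
 * Runs argv[0] with stdin redirected from /dev/null and waits for it.
 * Returns the child's exit status, 127 if the redirect or exec failed
 * in the child, or -1 if fork()/waitpid() failed or the child was
 * killed by a signal. */
int run_with_null_stdin(char *const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: replace the inherited stdin before exec. */
        int devnull = open("/dev/null", O_RDONLY);
        if (devnull == -1 || dup2(devnull, STDIN_FILENO) == -1)
            _exit(127);
        if (devnull != STDIN_FILENO)
            close(devnull);
        execvp(argv[0], argv);
        _exit(127);                   /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this arrangement the child can leave its stdin in any mode it likes without touching the descriptor it would otherwise have inherited.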
 [2017-10-24 07:45 UTC] kalle@php.net
-Status: Assigned +Status: Open -Assigned To: fat +Assigned To:
 [2018-02-23 21:10 UTC] nikic@php.net
-Status: Open +Status: Duplicate
 [2018-02-23 21:10 UTC] nikic@php.net
Another manifestation of bug #73342, marking as duplicate.
 