php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #70185 php-fpm restarts master process in a loop when exec() and using ssh multiplexing
Submitted: 2015-08-03 02:54 UTC Modified: 2018-02-23 16:55 UTC
Votes:20
Avg. Score:4.8 ± 0.5
Reproduced:20 of 20 (100.0%)
Same Version:3 (15.0%)
Same OS:4 (20.0%)
From: kobenews at cox dot net Assigned:
Status: Duplicate Package: FPM related
PHP Version: 5.4.43 OS: CentOS release 6.6 (Final)
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If this is not your bug, you can add a comment by following this link.
If this is your bug, but you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: kobenews at cox dot net
New email:
PHP Version: OS:

 

 [2015-08-03 02:54 UTC] kobenews at cox dot net
Description:
------------
Ok bear with me, this bug is a bit obscure and takes some setup. However, it is consistent, and causes php-fpm to continue to restart the process over and over again using 100% of the CPU on the php-fpm master process.

It only occurs with SSH controlMaster which basically allows SSH connection multiplexing.

Here is an example code:

<?php
    $command = 'ssh -o "ControlMaster=auto" -o "ControlPersist=2m" -S /home/justin/.ssh/sockets/justin.sock justin@1.2.3.4';
    $output = exec($command, $output_array, $return_code);


If you remove the options -o "ControlMaster=auto" -o "ControlPersist=2m" -S /home/justin/.ssh/sockets/justin.sock this works without crashing over and over again.

Test script:
---------------
<?php
    $command = 'ssh -o "ControlMaster=auto" -o "ControlPersist=2m" -S /home/justin/.ssh/sockets/justin.sock justin@1.2.3.4';
    $output = exec($command, $output_array, $return_code);

Expected result:
----------------
php-fpm master process should not restart the process over and over again. Nothing is crashing here.

Actual result:
--------------
php-fpm continues to restart the process over and over again using 100% of the CPU on the php-fpm master process.

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2015-08-03 03:21 UTC] kobenews at cox dot net
Sorry, the script I provided is incorrect. I forgot a command at the end. It should be:

<?php
    $command = 'ssh -o "ControlMaster=auto" -o "ControlPersist=2m" -S /home/justin/.ssh/sockets/justin.sock justin@1.2.3.4 uptime';
    $output = exec($command, $output_array, $return_code);
    die($output);
 [2015-08-07 20:13 UTC] kobenews at cox dot net
-Summary: php-fpm restarts process forever when exec() and using ssh multiplexing +Summary: php-fpm restarts master process in a loop when exec() and using ssh multiplexing
 [2015-08-07 20:13 UTC] kobenews at cox dot net
Updated summary.
 [2015-09-08 20:39 UTC] php at plainview dot se
Finally, someone else with the same bug! I have been dealing with it for years and, until I read this, used a separate php-fastcgi to deal with the constant child spamming.

My previous setup was a Debian 7 with 5.4.4-14+deb7u14.
My current setup is a Debian 8 with 5.6.9+dfsg-0+deb8u1.

Both exhibit the exact same behaviour: when a multiplexed SSH is exec()ed it goes spawn crazy, with children dying left and right:

[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30721 exited with code 0 after 0.011428 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30722 exited with code 0 after 0.011916 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30723 exited with code 0 after 0.011527 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30724 exited with code 0 after 0.012399 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30725 exited with code 0 after 0.011333 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30726 exited with code 0 after 0.010918 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30727 exited with code 0 after 0.010977 seconds from start
[08-Sep-2015 21:32:36] NOTICE: [pool www] child 30728 exited with code 0 after 0.011124 seconds from start

The result is 100% CPU load and a sluggish system.

The source of the problem is, as kobenews says, multiplexing SSH. I do it via a series of bash scripts that are execed by PHP.

Original script:

#!/bin/bash
CONTROLPATH=/var/www/mail_scripts/master-mail_scripts
MAIL_SCRIPTS_COMMAND="ssh -o StrictHostKeyChecking=no -l mail_scripts -o ControlPath=$CONTROLPATH -i /var/www/mail_scripts/mail_scripts TARGET.COM"

# If the socket doesn't exist, create it.
if [ ! -e $CONTROLPATH* ] ; then
    $MAIL_SCRIPTS_COMMAND -fNM
fi

# Execute the requested command.
$MAIL_SCRIPTS_COMMAND /home/mail_scripts/the_command $@


What I did was I moveed -f in front of TARGET.COM and started the master manually, outside of PHP. After that PHP seems fine with running the script without spawnflooding.
 [2015-09-09 02:18 UTC] kobenews at cox dot net
php at plainview dot se: glad I'm not the only one experiencing this bug. Now the trick, is to get a fix.

I am running:

PHP 5.4.43 on CentOS 6.7
 [2016-01-22 08:06 UTC] i dot priladnyh at gmail dot com
Ubuntu 14.04
php5-fpm 5.5.9
Same problem with launch ssh session using ControlMaster. FPM load CPU to 100%.
Problem with FPM solved by launch SSH session with flag -f (background). But then SSH connection multiplexing don`t correctly works.
 [2017-02-24 13:52 UTC] sylvain at com-ocean dot com
Same bug by executing a bash script calling rsync command.
with PHP 5.6.30 and Debian 7.10
 [2017-03-10 20:03 UTC] php at grindau dot com
I am having the same issue! It took me ages to find out whats the reason.
I am running the latest php-7.1.2 under archx64.

Executing this command: 				

shell_exec('/usr/lib/node_modules/csso/bin/csso '.ROOT_PATH.'/tmp/style-critical-unoptimized.css --output '.ROOT_PATH.'/tmp/style-critical.css');

which should just easily grap a normal css file and compress it down causes the master process to restart itself hundred times a second or even more causing 100% cpu load on the server.
 [2017-03-10 23:14 UTC] kobenews at cox dot net
php@grindau dot com: I'd love to get a test case script, without requiring lots of setup and dependencies. Anybody have ideas?

This does seem to indicate a bug though.
 [2017-03-13 14:31 UTC] php at grindau dot com
Well, there is no shell-scripts, this code runs inside of my classes.

I was able to narrow it down to all shell functions like
passthru(), exec(), shell_exec(), system(), etc, it happens with every of them.

i tried to run this cli-module on my arch linux:
https://github.com/css/csso
by executing this
/usr/lib/node_modules/csso/bin/csso infile.css --output outfile.css
the same happened with UglifyCSS: https://github.com/fmarcia/UglifyCSS
but not with UglifyJS: https://github.com/mishoo/UglifyJS

I am also running java jars inside those functions for years without any issue.
Like Yuicompressor: http://yui.github.io/yuicompressor/

Hope that helps.
Regards
Sascha
 [2017-03-18 21:35 UTC] php at grindau dot com
Any news on this?
Is it reproduceable?

While rechering i found some topics about stdin non-blocking behaviour of apps which might cause that issue.

Hope that helps.
 [2017-03-24 09:36 UTC] please at forget dot me dot uk
I have the same problem on php 7.0.17(from 7.0.15) with FreeBSD 11.
restarting php-fpm helps. 
however, I am using perl modules inside nginx to look after node CSS - and CSS files produce cut versions like PHP-fpm so I assume problem is somewhere in the middle of: php-fpm -- nginx -- node ; with perl I have to restart webserver.
 [2017-04-13 08:43 UTC] php at grindau dot com
I will never report a bug again.
This kind of ignorance is just bullshit.
 [2017-04-13 20:22 UTC] merrill at oakland dot edu
We are seeing the same issue, although in our case we aren't doing anything with a socket in SSH. We are just connecting an issuing a command - in fact I can just do "ssh sshserver.whatever -i pathtoidentfile \"sleep 10\"" and php-fpm will keep responding processes, flooding the logs with warnings and causing the machine to bottleneck.
 [2017-04-17 20:21 UTC] jlaustill at gmail dot com
I am running into this same issue on php 5.6.30 running on Ubuntu 16.04.  I'm trying to call minify(node.js library).  Here's the exact call I'm making.

    public function minifyJsFile($source, $dest)
    {
        $result = exec('minify ' . $source . ' > ' . $dest);

        if ($result) {
            return false;
        } else {
            return filemtime($dest);
        }
    }

I have been unable to find a workaround.  But I'm not a php expert by any stretch.  My background is c/c++/C#/javascript mainly.
 [2017-04-20 09:37 UTC] php at grindau dot com
My only working workaround looks as the follows:

				$cwd='/tmp';
				$descriptorspec = array(
				    0 => array("pipe", "r"),
				    1 => array("pipe", "w"),
				    2 => array("file", "/tmp/error-output.txt", "a") );

				$process = proc_open('csso '.ROOT_PATH.'/tmp/style-'.$type.'-uncritical-unoptimized.css '.ROOT_PATH.'/tmp/style-'.$type.'-uncritical.css --comments none', $descriptorspec, $pipes, '/tmp');

				stream_get_contents($pipes[1]);
				fclose($pipes[1]);
 [2017-04-24 22:49 UTC] php at grindau dot com
Applause to the devs!
Absolutely no response, this is a shame.
 [2017-04-25 05:24 UTC] spam2 at rhsoft dot net
I am not a dev but did you realize that pho5 support is over except highly critical security bugs?
 [2017-04-25 10:24 UTC] php at grindau dot com
This happens in the latest stable php 7.1 as well!
 [2017-04-25 11:28 UTC] php at grindau dot com
As everybody could have seen in my post with the timestamp: 2017-03-10 20:03 UTC.
 [2017-04-25 11:35 UTC] spam2 at rhsoft dot net
> This happens in the latest stable php 7.1 as well!
> As everybody could have seen in my post with 
> the timestamp: 2017-03-10 20:03 UTC

then someone should change this metadata

PHP Version: 5.4.43 	
OS: CentOS release 6.6 (Final)
 [2017-04-25 14:04 UTC] merrill at oakland dot edu
This bug does effect 7.x as well

That being said, it's an open source project, with an insurmountable amount of bugs (That's just the way big projects work). Somebody who cares will have to work on it an try to submit a patch.
 [2017-04-25 14:24 UTC] php at grindau dot com
Well, "kobenews at cox dot net" asked for additional informations to reproduce. After that tons of input of me and others were provided but we havent got any feedback so far. I am not one of the devs, so i am not going to change any metadata of which i dont have enough knowledge. I ll leave this up to people who are into it and limit my input on providing feedback.
 [2017-07-21 09:36 UTC] d dot reade at reades dot co dot uk
I'm experiencing a very similar issue. My server is running CentOS 7.3, cPanel 64.0.33, Apache 2.4.27 and PHP 5.6.31 with PHP-FPM.

After much investigation, the following script reproduces my issue:

<?php
exec('/usr/bin/postcss css_from.css --use autoprefixer --autoprefixer.remove "false" --output css_to.css');
echo 'Done';
?>

Only after the script completes does the PHP-FPM process then use 100% CPU and the PHP-FPM error.log file does this:

root@l1vs09rcms [/opt/cpanel/ea-php56/root/usr/var/log/php-fpm]# head error.log -n 128
[21-Jul-2017 10:23:03] NOTICE: fpm is running, pid 30537
[21-Jul-2017 10:23:03] NOTICE: ready to handle connections
[21-Jul-2017 10:23:03] NOTICE: systemd monitor interval set to 10000ms
[21-Jul-2017 10:23:13] WARNING: [pool dev3_reades_local] child 30540 said into stderr: "✔ Finished css_from.css (60ms)"
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30540 exited with code 0 after 1.167822 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30551 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30551 exited with code 0 after 0.007451 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30552 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30552 exited with code 0 after 0.007069 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30553 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30553 exited with code 0 after 0.007391 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30554 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30554 exited with code 0 after 0.007211 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30555 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30555 exited with code 0 after 0.007631 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30556 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30556 exited with code 0 after 0.007972 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30558 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30558 exited with code 0 after 0.007635 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30559 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30559 exited with code 0 after 0.007761 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30560 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30560 exited with code 0 after 0.007557 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30561 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30561 exited with code 0 after 0.007735 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30562 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30562 exited with code 0 after 0.008361 seconds from start
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30563 started
[21-Jul-2017 10:23:13] NOTICE: [pool dev3_reades_local] child 30563 exited with code 0 after 0.007477 seconds from start

The log file is meant to rotate as per the cPanel configuration, but when it's doing this and constantly growing, the rotate threshold does not trigger the rotate and so the log file just keeps getting bigger until the server runs out of disk space and crashes!

It doesn't even matter if I use shell_exec() or exec(), or if I append "> /dev/null 2>&1" to the command thinking the output is the cause. If I append the command so there's no output, the log file doesn't show the warning, but the continuous notices still happen regardless and so the log file just gets bigger.

The workaround for us is to not execute any scripts that contain exec() or shell_exec() and to execute these scripts from the command line instead, which is doable as I wrote all our software, but for others using other pre-written software packages this could be a problem.
 [2017-09-21 18:07 UTC] mike at sagesolutionsinc dot ca
I also have this issue when calling exec()

Im using php 7.0.22 on Ubuntu 16.04


my log file is  getting flooded, and my cpu goes to 100%.

The only solution I have is to restart the php7.0-fpm service which is not ideal.

I could move my exec() into a cronjob, but would like to keep it in PHP preferably.


sudo tail -f /var/log/php7.0-fpm.log
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14682 exited with code 0 after 0.022976 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14685 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14683 exited with code 0 after 0.024077 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14686 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14684 exited with code 0 after 0.024208 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14687 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14685 exited with code 0 after 0.023809 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14688 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14686 exited with code 0 after 0.022311 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14689 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14687 exited with code 0 after 0.023341 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14690 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14688 exited with code 0 after 0.027854 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14691 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14689 exited with code 0 after 0.028554 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14692 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14690 exited with code 0 after 0.027951 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14693 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14691 exited with code 0 after 0.024784 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14694 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14692 exited with code 0 after 0.027203 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14695 started
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14693 exited with code 0 after 0.028104 seconds from start
[21-Sep-2017 14:01:36] NOTICE: [pool www] child 14696 started
....
 [2018-02-16 15:09 UTC] schnederle at futureweb dot at
we got the same issue with Centos 7.4.1708:

Latest PHP Version:

[root@localhost ~]# php -v
PHP 7.2.2 (cli) (built: Feb  1 2018 15:30:38) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.2.2, Copyright (c) 1999-2018, by Zend Technologies

...
[16-Feb-2018 14:48:30] WARNING: [pool www] child 53778 exited on signal 15 (SIGTERM) after 0.046685 seconds from start
[16-Feb-2018 14:48:30] WARNING: [pool www] child 53782 exited on signal 15 (SIGTERM) after 0.045416 seconds from start
[16-Feb-2018 14:48:30] WARNING: [pool www] child 53785 exited on signal 15 (SIGTERM) after 0.045331 seconds from start
[16-Feb-2018 14:48:30] WARNING: [pool www] child 53787 exited on signal 15 (SIGTERM) after 0.044765 seconds from start
...

100% CPU Usage of "php-fpm: master process (/etc/php-fpm.conf)"


[root@localhost ~]# service php-fpm status
Redirecting to /bin/systemctl status php-fpm.service
● php-fpm.service - The PHP FastCGI Process Manager
   Loaded: loaded (/usr/lib/systemd/system/php-fpm.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2018-02-16 14:50:00 CET; 1h 17min ago
 Main PID: 7869 (php-fpm)
   Status: "Processes active: 0, idle: 808593, Requests: 2319, slow: 0, Traffic: 72req/sec"
   CGroup: /system.slice/php-fpm.service
           ├─ 7869 php-fpm: master process (/etc/php-fpm.conf)
           ├─43200 php-fpm: pool www
           ├─43202 php-fpm: pool www
           ├─43205 php-fpm: pool www
           ├─43206 php-fpm: pool www
           └─43207 php-fpm: pool www

Feb 16 14:50:00 localhost.localdomain systemd[1]: Starting The PHP FastCGI Process Manager...
Feb 16 14:50:00 localhost.localdomain systemd[1]: Started The PHP FastCGI Process Manager.

Also the Traffic 72req/sec shows that something is Bogus with only 2.319 Requests within the last 1:17h ...

Really noone who cares?! :-/

thx, bye from Austria
Andreas Schnederle-Wagner
 [2018-02-16 15:24 UTC] schnederle at futureweb dot at
Guess Dev's won't look here anymore because it's reported for PHP 5.4
I opened a new Bug Report for PHP 7.2 - please vote it up ... ;-)

https://bugs.php.net/bug.php?id=75968

thx
 [2018-02-23 16:55 UTC] nikic@php.net
-Status: Open +Status: Duplicate
 [2018-02-23 16:55 UTC] nikic@php.net
Root cause is bug #73342, marking as duplicate.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Apr 25 19:01:33 2024 UTC