This is a pain to replicate, but we spent 2 hours with AWS engineers (plural) working this one out.
Load Balancer to Apache2 to PHP7.2-FPM
Wordpress 5 - Database Connection error (Firewall / security group blocking access to Database)
PHP builds the response and output correctly: when we used curl directly against the server with the "Host:" header, we could see the full HTML response generated by WordPress.
When going via a load balancer on AWS (Application Load Balancer), the underlying Apache connection to the server was never closed, so the load balancer just kept waiting for the connection to close until it timed out.
When WordPress handles the DB connection error and produces a complete error output, it should close the connection to Apache.
Instead it seems to hang and leave the connection open, meaning Apache does not close its connection correctly.
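The direct-to-server check described above (curl with a Host header) can be approximated with a small Python sketch. The function name, the result keys and the 90 second timeout are hypothetical choices for illustration. It records when the headers arrive and when the body finishes, so a response that is fully generated but never terminated would show up as a long gap or a read timeout.

```python
import http.client
import socket
import time


def fetch_with_host(server_ip, host_header, port=80, path="/", timeout=90.0):
    """GET `path` from `server_ip`, sending `host_header` as the Host header."""
    conn = http.client.HTTPConnection(server_ip, port, timeout=timeout)
    start = time.monotonic()
    try:
        # "Connection: close" asks the server to close the socket when done,
        # so an unterminated response cannot hide behind keep-alive.
        conn.request("GET", path,
                     headers={"Host": host_header, "Connection": "close"})
        resp = conn.getresponse()
        headers_after = time.monotonic() - start
        try:
            body = resp.read()
            finished = True
        except socket.timeout:
            # Headers arrived but the body was never terminated in time.
            body = b""
            finished = False
        return {
            "status": resp.status,
            "body": body,
            "finished": finished,
            "headers_after": headers_after,
            "total": time.monotonic() - start,
        }
    finally:
        conn.close()
```

A call such as fetch_with_host("203.0.113.10", "www.example.com") (both values are placeholders) would then show whether "total" stays close to "headers_after" or runs into the timeout.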
Sorry, but your problem does not imply a bug in PHP itself. For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions. Due to the volume
of reports we can not explain in detail here why your report is not
a bug. The support channels will be able to provide an explanation
Thank you for your interest in PHP.
Besides the fact that the problem clearly appears to be with the LB, this is too complicated to deal with on a bug tracker. Too many variables. If you can eventually track down the problem to something specifically with php-fpm (which has no knowledge of your LB because Apache is in between) then we can work from there.
Wait, you're telling me a PHP application not responding correctly is not a bug...
So what, you deliberately leave the connection to apache2 open when a handled error happens, even though with no error it closes the connection correctly?
So just to be clear: 2 members of staff on our side and 2 AWS engineering staff were all looking into this problem for 4 hours.
We had tcpdump running, and a static Apache (HTML) site was working the whole time, so this was only happening when PHP was being used.
We identified this from the tcpdump: we could see that everything from the load balancer to the server was working.
We could even see the server sending the output content from PHP-FPM as it was created.
The only problem was that the PHP-FPM handler waited for the socket timeout to occur (default of 60 seconds) even though it had finished executing the files.
However, when WordPress does not have a DB error, it does not do the same: the site loads and responds in under 2 seconds.
So unless you're telling me that there is a bug in the most used PHP application in the world where, in one specific case, it errors, creates the output and then puts the process to sleep for more than 60 seconds before closing the connection, this is a PHP bug.
Apache2 works without PHP with no problem, and Apache2 is using the PHP connector in FPM mode. Every part of where this error could be coming from is PHP project code.
Maybe I was mistaken but it sounded originally like you were saying that everything between Apache and PHP works fine unless you add a load balancer. Now you're saying that everything between the LB and Apache works fine unless you add PHP?
I'm not sure what you mean by closing the connection, as FastCGI keeps the connection shared between all workers in the pool, so a single worker error should never close the connection between Apache2 and FPM. Could you elaborate a bit more on what exactly the application error is and what you expect FPM to do? Also, could you attach your FPM config?
From the description it seems to me like you might have an idle process but not really sure. If that's the case, have you tried setting pm.process_idle_timeout?
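Note that pm.process_idle_timeout only has an effect when the pool runs with pm = ondemand, so it is worth checking the pool's process manager mode before changing it. A rough Python sketch of that check, assuming the pool file is INI-like enough for configparser to read; the function name and the sample text are illustrative and are not the reporter's actual config.

```python
import configparser


def pool_pm_settings(pool_conf_text, pool="www"):
    """Report the process manager mode of one FPM pool section."""
    # interpolation=None keeps any '%' in values literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(pool_conf_text)
    section = parser[pool]
    mode = section.get("pm", "").strip()
    return {
        "pm": mode,
        "process_idle_timeout": section.get("pm.process_idle_timeout"),
        # The idle timeout is only honoured in ondemand mode.
        "idle_timeout_applies": mode == "ondemand",
    }


# Illustrative pool definition (assumed values, not taken from the report).
SAMPLE = """\
[www]
listen = /run/php/php7.2-fpm.sock
pm = dynamic
pm.max_children = 5
"""
```

With the sample above, pool_pm_settings(SAMPLE) reports pm as "dynamic", in which case setting pm.process_idle_timeout would not change anything.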
OK, so here is an overview again, and I have found some new information.
I will start with the new information.
As soon as PHP has responded successfully once, all the error cases start working.
The default setups are all from the apt-get repo ppa:ondrej/php.
These are the test cases we ran on the system:
1) If I go to a static HTML website via the load balancer, it works.
2) If I go to a static HTML website directly on the server, it works.
3) If I go to a PHP WordPress site via the load balancer while it is in error (could not connect to the database):
a) The load balancer connects to Apache.
b) Apache makes the request to php-fpm.
c) php-fpm runs the script and does not fail (the last instruction was an error_log call and it worked).
d) php-fpm sends the response to Apache (DB connection error).
e) The Apache handler does not get the null terminator, or whatever is supposed to be the terminator.
f) Apache waits for php-fpm to terminate.
g) The load balancer terminates the request due to timeout.
4) If I correct the WordPress "could not connect to DB" error by opening the firewall on the DB server to allow access from the web server and WordPress, then everything else starts to work correctly. I can then go into wp-config.php and break the DB connection, and it will still respond correctly with the error response.
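For reference on the "terminator" in step e): in the FastCGI protocol the end of a request is signalled by an FCGI_END_REQUEST record (type 3) sent after the FCGI_STDOUT records (type 6), not by a null byte. The following Python sketch shows how a captured FPM-to-Apache byte stream could be checked for that record. The helper names are hypothetical, and the thread does not confirm that this record was the missing piece.

```python
import struct

# FastCGI record header: version, type, requestId, contentLength,
# paddingLength, plus one reserved byte (8 bytes in total).
FCGI_HEADER = struct.Struct("!BBHHBx")
FCGI_END_REQUEST = 3
FCGI_STDOUT = 6


def iter_records(stream):
    """Yield (type, request_id, content) for each complete record."""
    offset = 0
    while offset + FCGI_HEADER.size <= len(stream):
        _version, rtype, request_id, content_len, padding_len = \
            FCGI_HEADER.unpack_from(stream, offset)
        offset += FCGI_HEADER.size
        content = stream[offset:offset + content_len]
        if len(content) < content_len:
            return  # truncated record: stop parsing
        offset += content_len + padding_len
        yield rtype, request_id, content


def request_finished(stream, request_id=1):
    """True if the stream contains FCGI_END_REQUEST for `request_id`."""
    return any(rtype == FCGI_END_REQUEST and rid == request_id
               for rtype, rid, _ in iter_records(stream))


def make_record(rtype, request_id, content=b""):
    """Build a version 1 record without padding (for experiments only)."""
    return FCGI_HEADER.pack(1, rtype, request_id, len(content), 0) + content
```

A stream that carries the full error page in FCGI_STDOUT records but no FCGI_END_REQUEST record would match the symptom described: Apache has all the output yet keeps waiting.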
So the condition that was not working: if php-fpm's initial connection errors, FPM seems to fail to finish handling the request and notify Apache that it has finished.
After FPM has run a working script it is fine and handles errors correctly. So the failure seems to occur when a DB error in WordPress is the first thing PHP-FPM runs (even though the script did not truly error; the error was handled).
(Theory) This has led me to think there is a problem between PHP-FPM and MySQLi: if MySQLi errors before it has had a successful connection, it breaks the FPM handler.
The people testing this were an AWS EC2 Linux expert, who checked all the configs on the server and said they were fine;
an AWS network engineer, who checked all the traffic, including going through the pcap file, and spotted that the server was sending the response from PHP all the way to the load balancer but was not NULL-terminating the connection, so the load balancer was expecting more data;
and me, a software programming graduate who has used PHP professionally since PHP 4 and who specialized in socket connections for HPC programming during my degree.
Once we resolved this we carried on investigating and came to the conclusion that it had to be the PHP-FPM connection not sending the terminator for the connection when an error state was the first thing loaded. As I said above, the moment a WordPress site loaded successfully, I could then break the DB connection and it worked correctly.
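At the HTTP layer, a load balancer considers a response finished when the Content-Length has been satisfied, when a chunked body ends with its zero-size chunk, or when the connection closes. The sketch below, with hypothetical function names, checks a raw response captured from a pcap against the first two rules. It is only an illustration of what "expecting more data" can mean; the thread does not establish which rule applied here.

```python
def _chunked_body_complete(body):
    """True if a chunked body contains its terminating zero-size chunk."""
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            return False
        # Chunk size is hexadecimal; ignore any ";extension" suffix.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        try:
            size = int(size_field.decode("ascii"), 16)
        except ValueError:
            return False
        pos = line_end + 2
        if size == 0:
            rest = body[pos:]
            # Either no trailers ("\r\n") or trailers ending in a blank line.
            return rest.startswith(b"\r\n") or b"\r\n\r\n" in rest
        if len(body) < pos + size + 2:
            return False  # chunk data (plus its CRLF) not fully received
        pos += size + 2


def response_is_complete(raw):
    """True if `raw` holds a fully framed HTTP/1.1 response."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return False  # header block not finished yet
    headers = {}
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()
    if b"chunked" in headers.get(b"transfer-encoding", b""):
        return _chunked_body_complete(body)
    if b"content-length" in headers:
        try:
            return len(body) >= int(headers[b"content-length"].decode("ascii"))
        except ValueError:
            return False
    # No framing header: the body only ends when the connection closes.
    return False
```

A chunked response whose body carries the whole error page but lacks the final "0" chunk would return False here, which is consistent with a load balancer that keeps waiting until its own timeout.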
I have just double-checked this: php.ini is in the default configuration, and there are not even any extensions loaded in the php-fpm php.ini at /etc/php/7.2/fpm/php.ini.