Request #76225 ini setting to not empty opcache when php-fpm is reloaded
Submitted: 2018-04-16 00:16 UTC Modified: 2018-04-23 19:44 UTC
From: post at minhost dot no Assigned:
Status: Open Package: opcache
PHP Version: 7.1.16 OS: CentOS 7.4
Private report: No CVE-ID: None

 [2018-04-16 00:16 UTC] post at minhost dot no
Description:
------------
On shared hosting servers, users add and delete domains and install Let's Encrypt SSL certificates many times each day. Each of these actions reloads PHP-FPM, which clears OPcache.

To avoid an empty OPcache on the first visits to users' PHP pages after each PHP-FPM reload, we have set up the file cache as a second-level fallback for OPcache. However, ever since PHP 7.1.12, users sporadically get 503 errors when running update.php in Drupal (and PHP-FPM crashes). We have not been able to figure out what goes wrong, but it never happened in PHP 7.1.11 (we are now on PHP 7.1.16). I suspect this is related to running the file cache as a second-level fallback and to new bugs in the file cache introduced in PHP 7.1.12 or later.

Because of this problem, we would like to stop using the file cache as a second-level fallback and run OPcache in RAM only. For that, we need to avoid OPcache being purged on every PHP-FPM reload.

Can you please add an OPcache .ini setting that allows us to disable purging of OPcache when PHP-FPM is reloaded, so that we can stop using the file cache and use OPcache in RAM only?

Or even better, can you please disable purging of OPcache on PHP-FPM reload, and only purge OPcache when PHP-FPM is restarted?


 [2018-04-16 05:12 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2018-04-16 05:12 UTC] requinix@php.net
Has this 503 bug been reported? I did quick searches here and only found your InfiniteWP bug
https://bugs.php.net/bug.php?id=76205

And pardon my ignorance but why do you have to reload php-fpm when updating the web server's SSL cert? It's separate.
 [2018-04-16 07:46 UTC] post at minhost dot no
PHP-FPM is reloaded after any SSL cert installation because we run DirectAdmin, and that is the behaviour. I have now contacted DirectAdmin support and asked them why this is needed.

I have not yet filed a bug report about the sporadic 503 errors when running update.php in Drupal, because I cannot reproduce the error on every run. It only happens sporadically, and days can pass between customer reports. There is also no useful information in the log files, and because I cannot reproduce the error on demand, I cannot debug it. Do you still think I should create a bug report?
 [2018-04-16 07:50 UTC] requinix@php.net
-Status: Feedback +Status: Open
 [2018-04-16 07:50 UTC] requinix@php.net
I suppose not - we'll just have to ask for more information anyways. The first thing I would ask is whether you're sure the 503 is coming from PHP/php-fpm itself and not the web server or Drupal.
 [2018-04-16 07:56 UTC] post at minhost dot no
Yes I am sure! PHP-FPM has _crashed_ and stays down until I reload PHP-FPM to get the site online again.
 [2018-04-16 11:30 UTC] nikic@php.net
Can you please provide the backtrace(s) from the core dumps of the crashes? That may be enough to diagnose the problem.
 [2018-04-16 11:34 UTC] post at minhost dot no
Is it possible to do that after the crash has happened? As I said, it only happens sporadically. Also, can you please point me to some guides for creating a backtrace from the core dumps AFTER the crash has already happened?
 [2018-04-16 11:41 UTC] nikic@php.net
Sure, that's exactly what core dumps are for :) You need to run "gdb php-fpm path-to-core-file" and then "bt" or "bt full". It may be necessary to specify a full path to the php-fpm binary and to install debug symbols from your package manager, if they aren't installed yet.
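
The two gdb steps described above can also be run non-interactively, which makes it easier to paste the result into a report. This is only a minimal sketch, assuming gdb is installed; the helper name backtrace_from_core is hypothetical and the paths in the usage note are placeholders:

```shell
# Hypothetical helper (not part of PHP or gdb): print a full backtrace from a
# core file without entering the interactive (gdb) prompt.
backtrace_from_core() {
    binary="$1"   # full path to the php-fpm binary that produced the core
    core="$2"     # path to the core dump file
    if [ ! -x "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: backtrace_from_core /path/to/php-fpm /path/to/core" >&2
        return 2
    fi
    # -batch makes gdb exit after running its commands; -ex runs one command.
    gdb -batch -ex "bt full" "$binary" "$core"
}
```

For example, backtrace_from_core /path/to/php-fpm /path/to/core > backtrace.txt saves the output to a file.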
 [2018-04-16 12:03 UTC] post at minhost dot no
Thanks for the information. However this is new to me. I did try to see if GDB is installed, here is some output:

[root@server~]# which -a gdb
/usr/bin/which: no gdb in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/.local/bin:/root/bin)

[root@server~]# locate -eb '\gdb'
/usr/share/gdb

[root@server~]# gdb -help
-bash: gdb: command not found

[root@server~]# gdb php-fpm /usr/local/php71/bin
-bash: gdb: command not found

[root@server ~]# gdb /usr/local/php71/bin
-bash: gdb: command not found

Any help to point me in the right direction? If not, I will have to seek outside technical advice on this, which could take several days or weeks.
 [2018-04-16 13:25 UTC] nikic@php.net
Ah yes, you will have to install gdb first (using "sudo apt-get install gdb" or whatever the equivalent for your operating system is).

Alternatively, if you do not want to install additional software on a production server, it is also in principle possible to copy the core dump to a different machine that has gdb installed and analyze it there. However, this requires that both the php-fpm binary and shared objects are the same on the other machine. This tends to be more finicky, so if you can directly install gdb, I'd recommend going that way.
 [2018-04-17 10:23 UTC] post at minhost dot no
When PHP is compiled with --enable-debug, it runs too slowly for me to do this on production servers. As I said, the crash only happens sporadically, and it can take anything from days to weeks before it happens again. So I can't do the backtrace on my production servers.

Instead, I will try to set up a copy of the sites on a test VPS and get the backtrace on that test server. I can only hope I will be able to trigger the crashes when it is only a copy of the live sites.

I will then try to reproduce the crashes of both InfiniteWP in this bug https://bugs.php.net/bug.php?id=76205 and of the Drupal sites when running update.php (described in the bug report I am commenting on now).

Please note that this bug report was a feature request. But I guess it does not matter if I use the comment field here.

This could take time ...
 [2018-04-17 11:24 UTC] nikic@php.net
It is not necessary to build PHP with --enable-debug (which apart from enabling debug symbols also disables optimization and enables debug assertions). The documentation and ./configure --help output are quite misleading in that regard and should be improved.

For a basic backtrace without file+line information, just using a normal PHP build is fine. To also have file+line information it's possible to pass CFLAGS="-g" to configure, in which case PHP will be built with debug symbols (and optimization). This should not impact performance, but increases the size of the binary.
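
A build along those lines might be configured as sketched below. The helper name configure_with_symbols is hypothetical, and adding -O2 next to -g is an assumption made here so that optimization is stated explicitly and does not depend on the build system's defaults:

```shell
# Hypothetical helper: run ./configure in a PHP source tree with debug symbols
# enabled through CFLAGS, without passing --enable-debug.
configure_with_symbols() {
    src="$1"      # path to the unpacked PHP source tree
    shift         # any remaining arguments are passed on to ./configure
    if [ ! -x "$src/configure" ]; then
        echo "no configure script in $src" >&2
        return 2
    fi
    # -g adds debug symbols; -O2 (an assumption, see above) keeps optimization on.
    (cd "$src" && CFLAGS="-g -O2" ./configure "$@")
}
```

It would be called with the source directory followed by the usual configure options for the existing build.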
 [2018-04-20 15:57 UTC] post at minhost dot no
I was finally able to get a crash on the copy of a site on a test server. However, there were no dump files in /tmp, and I really don't know what I did wrong. Here is what I did to enable backtraces:


# First I changed all php-fpm.conf config files to include this:
rlimit_core = unlimited

# Then I compiled PHP-FPM with --enable-debug - and I confirmed a PHP info page says this: "Debug Build yes"

# Then I installed GDB with yum: yum install gdb

# Then I enabled core dumps in linux like this:

su -
echo '/tmp/core-%e.%p' > /proc/sys/kernel/core_pattern
echo 0 > /proc/sys/kernel/core_uses_pid
ulimit -c unlimited

# Then I logged out of the shell. A few days passed without any crash on the test site on the test server. Then today I was able to reproduce a crash on the test server, and PHP-FPM for the site crashed with a 503 error.

In apache error log for the test site I got this:
[Fri Apr 20 17:30:02.395339 2018] [proxy_fcgi:error] [pid 19834:tid 140118967469824] [client 176.74.214.18:6917] AH01067: Failed to read FastCGI header
[Fri Apr 20 17:30:02.395398 2018] [proxy_fcgi:error] [pid 19834:tid 140118967469824] (104)Connection reset by peer: [client 176.74.214.18:6917] AH01075: Error dispatching request to : 
[Fri Apr 20 17:30:45.711350 2018] [proxy_fcgi:error] [pid 19834:tid 140119471032064] [client 176.74.214.18:6933] AH01067: Failed to read FastCGI header, referer: https://www.fjell-bfk4k.41.no/update.php?op=info
[Fri Apr 20 17:30:45.711443 2018] [proxy_fcgi:error] [pid 19834:tid 140119471032064] (104)Connection reset by peer: [client 176.74.214.18:6933] AH01075: Error dispatching request to : , referer: https://www.fjell-bfk4k.41.no/update.php?op=info

# Then I went into /tmp to look for core dumps, but there are no core dumps in /tmp!

By the way I can run gdb just fine like this

[root@dns2 ~]# gdb
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7_4.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb)

But of course I just get:
(gdb) bt
No stack.
(gdb) help

I would like to run:
gdb /usr/local/php71/sbin/php-fpm71 /tmp/file-name-to-dump-file.123123

But of course I can't, because there are no dump files in /tmp.

What am I doing wrong? I really need some help here.
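
One thing that can be checked at this point: ulimit -c unlimited in a login shell applies only to that shell and the processes it starts, not to a php-fpm service that is already running. On Linux, the limit that actually applies to a running process is listed in /proc/<pid>/limits. A minimal sketch, assuming a Linux /proc filesystem; the helper name core_limit_from is hypothetical:

```shell
# Hypothetical helper: print the soft and hard core-file size limits recorded
# in a /proc/<pid>/limits file (Linux only).
core_limit_from() {
    limits_file="$1"
    # The relevant line looks like:
    # Max core file size        0                    unlimited            bytes
    awk '/^Max core file size/ { print "soft=" $5, "hard=" $6 }' "$limits_file"
}
```

Calling core_limit_from /proc/<pid>/limits with the PID of a php-fpm worker shows whether that worker is allowed to write a core file at all. A soft limit of 0 means it is not.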
 [2018-04-20 16:02 UTC] post at minhost dot no
By the way, when I manually double check by looking into the file:
/proc/sys/kernel/core_pattern

It correctly has this content:
/tmp/core-%e.%p

And looking into:
/proc/sys/kernel/core_uses_pid

It correctly has this content:
0
 [2018-04-20 18:23 UTC] post at minhost dot no
On the test server I have managed to crash two test sites now

One observation: each time I get a crash on the test server, there is no free memory left in opcache, like this (the test server does not have a lot of memory):

Used memory	134214488
Free memory	3240

However, on the production servers I have allocated 32 GB to opcache, and I have never seen it use more than 9 GB.

However, I am not sure how opcache uses the memory limit I set, because the 32 GB is not visible when looking at used memory in the shell:

[root@production-server ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:         128658       14825        1961        6672      111870      106292
Swap:           955         954           1

Could it be that opcache is not able to allocate memory from the already used buffer/cache above? Or is the 32 GB memory limit in opcache already accounted for in the current buffer/cache?

Anyway, I am still not able to get any dump files in /tmp on the test server.
 [2018-04-23 10:57 UTC] post at minhost dot no
I was able to get one step closer to generating core dumps. It seems I needed to add the following to /etc/security/limits.conf:

*               soft    core            unlimited

After doing that I was able to get a core file in /tmp when running this command:

kill -s SIGSEGV $$

However, when php-fpm/opcache crashes, still no core file is generated in /tmp.

I have even changed PrivateTmp=true to be PrivateTmp=false - but still no luck.
 [2018-04-23 18:52 UTC] post at minhost dot no
Please note that whenever php-fpm/opcache crashes, a SIGABRT is logged in php-fpm.log like this:

[23-Apr-2018 20:35:50] WARNING: [pool asle] child 949 exited on signal 6 (SIGABRT) after 169.991198 seconds from start

And no core dump file is generated in /tmp - is it because it is a SIGABRT and not a SIGSEGV? Should I be able to get a core dump file on crash with SIGABRT?
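
To compare which signals the workers die from on different servers, the log lines of the form shown above can be tallied per signal. A minimal sketch, assuming the log format of the line quoted above; the helper name count_exit_signals is hypothetical:

```shell
# Hypothetical helper: count php-fpm worker exits per signal name in a log file.
count_exit_signals() {
    log="$1"
    # Extract the signal name from lines such as:
    #   ... child 949 exited on signal 6 (SIGABRT) after 169.991198 seconds ...
    # then count how often each name occurs and print NAME=COUNT.
    sed -n 's/.*exited on signal [0-9]* (\(SIG[A-Z0-9]*\).*/\1/p' "$log" \
        | sort | uniq -c | awk '{ print $2 "=" $1 }'
}
```

Running count_exit_signals on the php-fpm.log of each server prints one NAME=COUNT line per signal seen in that log.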
 [2018-04-23 18:59 UTC] post at minhost dot no
The crazy thing is that on the test server with a copy of the sites from production server, I only get SIGABRT in php-fpm.log when php-fpm/opcache crashes, but on the production server I get SIGSEGV in php-fpm.log when php-fpm/opcache crashes on the same sites.
 [2018-04-23 19:44 UTC] nikic@php.net
SIGABRT (most likely) indicates that an assertion failure occurred. These only happen in debug builds, so it's expected that you see a different behavior in non-debug builds.

SIGABRT should also be generating core dumps. I'm not sure why this doesn't work for you. One more thing to check would be if you have the rlimit_core option specified in an fpm pool config. It should be either not present at all, or set to rlimit_core=unlimited.
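
A quick way to check this is to search every FPM config file for the option. A minimal sketch, assuming a grep that supports -r (GNU and BSD grep do); the helper name find_rlimit_core is hypothetical and the config directory depends on the installation:

```shell
# Hypothetical helper: list every rlimit_core line below a php-fpm config
# directory, with file name and line number.
find_rlimit_core() {
    conf_dir="$1"
    # -r searches recursively, -n prints line numbers. The pattern allows
    # leading whitespace and skips lines commented out with ';'.
    grep -rn '^[[:space:]]*rlimit_core' "$conf_dir"
}
```

Every line it prints should read rlimit_core = unlimited. If it prints nothing, the option is not set anywhere, which is also acceptable.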
 [2018-04-23 20:23 UTC] post at minhost dot no
Thank you for information about SIGABRT being related to the debug build.

Finally I am able to generate core dump files! On my CentOS 7.4 server I needed to change fs.suid_dumpable from the default 0 to 1.

I might need a couple of hours, then I will update my bug reports.
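
For anyone following the same steps: fs.suid_dumpable controls whether processes that have changed their credentials may dump core, which matters for php-fpm workers that are started as root and then switch to the pool user. A minimal sketch of how the setting can be inspected; the helper name describe_suid_dumpable is hypothetical, and the descriptions paraphrase the Linux kernel sysctl documentation:

```shell
# Hypothetical helper: explain a fs.suid_dumpable value.
describe_suid_dumpable() {
    case "$1" in
        0) echo "0 (default): processes that changed privilege levels do not dump core" ;;
        1) echo "1 (debug): all processes dump core when possible" ;;
        2) echo "2 (suidsafe): such processes dump core only in a restricted way" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}

# Reading the current value (Linux):
#   describe_suid_dumpable "$(sysctl -n fs.suid_dumpable)"
# Changing it until the next reboot (requires root):
#   sysctl -w fs.suid_dumpable=1
```

Since the setting is system-wide, it is probably best set back to 0 once the core dumps have been collected.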
 [2018-04-23 20:59 UTC] post at minhost dot no
I have now added a comment on this InfiniteWP/WordPress bug with backtrace of the core dumps: https://bugs.php.net/bug.php?id=76205 - I will soon create a new bug report for the crash in Drupal, which might be related to this bug.
 [2018-04-23 23:05 UTC] post at minhost dot no
I have now created a bug report for the crash on Drupal sites, including backtrace of core dumps: https://bugs.php.net/bug.php?id=76258
 
PHP Copyright © 2001-2019 The PHP Group
All rights reserved.
Last updated: Fri May 24 22:01:26 2019 UTC