php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77430 php-fpm crashes with Main process exited, code=dumped, status=11/SEGV
Submitted: 2019-01-08 10:10 UTC Modified: 2019-01-17 18:56 UTC
From: claudiu_beta at yahoo dot com Assigned: bukka (profile)
Status: Closed Package: FPM related
PHP Version: 7.3.1RC1 OS: Fedora 28
Private report: No CVE-ID: None
 [2019-01-08 10:10 UTC] claudiu_beta at yahoo dot com
Description:
------------
I'm using php 7.3 RC1 from Remi repo. PHP 7.1, 7.2 and 7.3 running on the same machine with same php-fpm configuration and same pool configuration. I only have issues with php 7.3 which is going down frequently.

Redirecting to /bin/systemctl status php73-php-fpm.service
● php73-php-fpm.service - The PHP FastCGI Process Manager
   Loaded: loaded (/usr/lib/systemd/system/php73-php-fpm.service; disabled; vendor preset: disabled)
   Active: failed (Result: core-dump) since Mon 2019-01-07 14:56:05 EET; 54s ago
  Process: 9541 ExecStart=/opt/remi/php73/root/usr/sbin/php-fpm --nodaemonize (code=dumped, signal=SEGV)
Main PID: 9541 (code=dumped, signal=SEGV)
   Status: "Processes active: 0, idle: 2, Requests: 225, slow: 3, Traffic: 0.9req/sec"

Jan 07 14:48:12 serv systemd[1]: Starting The PHP FastCGI Process Manager...
Jan 07 14:48:12 serv systemd[1]: Started The PHP FastCGI Process Manager.
Jan 07 14:56:05 serv systemd[1]: php73-php-fpm.service: Main process exited, code=dumped, status=11/SEGV
Jan 07 14:56:05 serv systemd[1]: php73-php-fpm.service: Failed with result 'core-dump'.

Installed packages:
php73-php-debuginfo-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-common-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-devel-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pecl-zip-1.15.4-1.fc28.remi.x86_64
php73-php-bcmath-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-xml-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-gd-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-ioncube-loader-10.3.2-1.fc28.remi.x86_64
php73-php-mysqlnd-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-soap-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-fpm-debuginfo-7.3.1~RC1-1.fc28.remi.x86_64
php73-build-1.0-1.fc28.remi.x86_64
php73-php-intl-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pear-1.10.7-4.fc28.remi.noarch
php73-php-mbstring-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-gmp-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pecl-imagick-3.4.3-13.fc28.remi.x86_64
php73-1.0-1.fc28.remi.x86_64
php73-php-pecl-memcached-3.1.3-1.fc28.remi.x86_64
php73-runtime-1.0-1.fc28.remi.x86_64
php73-php-imap-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pdo-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-fpm-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-json-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pecl-igbinary-2.0.8-1.fc28.remi.x86_64
php73-php-xmlrpc-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-process-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-cli-7.3.1~RC1-1.fc28.remi.x86_64
php73-php-pecl-msgpack-2.0.3-1.fc28.remi.x86_64

(gdb) bt
#0  0x000055a8f4f64703 in zlog_stream_set_msg_suffix (stream=stream@entry=0x200, suffix=suffix@entry=0x0, final_suffix=final_suffix@entry=0x55a8f4fc37f6 ", pipe is closed")
    at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/zlog.c:602
#1  0x000055a8f4f61b99 in fpm_stdio_child_said (ev=<optimized out>, which=<optimized out>, arg=0x55a8f6b9bdd0) at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/fpm_stdio.c:199
#2  0x000055a8f4f67026 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/events/epoll.c:145
#3  0x000055a8f4f59914 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/fpm_events.c:409
#4  0x000055a8f4f53ca7 in fpm_run (max_requests=0x7ffc4910c26c) at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/fpm.c:113
#5  0x000055a8f4d31404 in main (argc=2, argv=0x7ffc4910c888) at /usr/src/debug/php73-php-7.3.1~RC1-1.fc28.remi.x86_64/sapi/fpm/fpm/fpm_main.c:1864


Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-01-08 10:16 UTC] remi@php.net
-Type: Security +Type: Bug
 [2019-01-08 10:16 UTC] remi@php.net
No need to be private
 [2019-01-08 19:48 UTC] bukka@php.net
-Status: Open +Status: Assigned -Package: *General Issues +Package: FPM related -Assigned To: +Assigned To: bukka
 [2019-01-08 19:48 UTC] bukka@php.net
> #0  0x000055a8f4f64703 in zlog_stream_set_msg_suffix (stream=stream@entry=0x200, suffix=suffix@entry=0x0, final_suffix=final_suffix@entry=0x55a8f4fc37f6 ", pipe is closed")

This looks like something went really wrong as initially reading of the pipe failed and then the stream pointer is 0x200 which looks like the child structure got corrupted maybe.

Could you share your FPM config and possibly a bit more details about the app? Can you see "pipe is closed" in the logs when using 7.2 or 7.1 (just want check if the pipe is getting closed in there as well)?
 [2019-01-09 06:48 UTC] claudiu_beta at yahoo dot com
I'm using exactly the same templates for all 7.1,7.2,7.3 on few servers. I only have issues with 7.3.
My monitoring script finds php73 service down several times a day, sometimes two or three times each hour.

Maybe a coincidence, I don't know. Till RC6 I had no issues or, let say, just 1-2 issues a week (under 100 pools). After 7.3.0 GA, the number of issues increased a lot (number of pools per server on average 200-300). Different servers, different sites, the same issue.
In logs I have no errors, nor "pipe is closed", nor anything else. I have checked the logs before and after the crash, nothing.
With 7.2, for example, I have 1000+ pools on the same server with no crash issues. I have errors in logs like

[07-Jan-2019 20:34:25] ERROR: unable to read what child say: Bad file descriptor (9)
[07-Jan-2019 20:34:25] ERROR: unable to read what child say: Bad file descriptor (9)
[07-Jan-2019 20:35:27] NOTICE: [pool poolname] child 23542 exited with code 0 after 69.601713 seconds from start
[07-Jan-2019 20:35:27] NOTICE: [pool poolname] child 23797 started

but service will not crash.




php-fpm.conf

include=/etc/opt/remi/php73/php-fpm.d/*.conf

[global]
pid = /var/opt/remi/php73/run/php-fpm/php-fpm.pid
error_log = /var/log/php-fpm/master73.log
syslog.ident = php-fpm73
daemonize = yes
rlimit_files = 30000

and a pool

[poolname]

user = apache
group = apache

listen = /var/opt/remi/php73/run/php-fpm/pool.sock
listen.acl_users = apache
listen.allowed_clients = 127.0.0.1

pm = ondemand
pm.max_children = 15
pm.process_idle_timeout = 4s
pm.max_requests = 30

pm.status_path = /fpmstats

slowlog = /var/log/php-fpm/00000_www-slow.log
request_slowlog_timeout = 60
request_terminate_timeout = 63

chdir = /home/users/domain
catch_workers_output = yes

security.limit_extensions = .php .phtml .phar

php_admin_value[mail.log] = /var/log/php-fpm/php_mail.log
php_admin_value[mail.add_x_header] = On

php_admin_value[error_log] = /var/log/php-fpm/domain.73.log
php_admin_value[doc_root] = "/home/users/domain"
php_admin_value[max_execution_time] = 60
php_admin_value[max_input_time] = 90
php_admin_value[max_input_vars] = 1000
php_admin_value[memory_limit] = 256M
php_admin_value[post_max_size] = 10M
php_admin_value[upload_max_filesize] = 10M
php_admin_value[default_socket_timeout] = 30
php_admin_value[open_basedir] = /home/users/domain/:/tmp/:/usr/share/pear/
php_admin_value[disable_functions] = mail, mb_send_mail, system, exec, shell_exec, passthru, error_log, dl, sys_getloadavg, pfsockopen, openlog, syslog, readlink, symlink, link, leak, popen, escapeshellcmd
 [2019-01-09 19:47 UTC] bukka@php.net
I guess it might be related to child restart possibly due request_terminate_timeout and not doing clean up properly but will need to verify it. Thanks for the config.
 [2019-01-14 09:21 UTC] claudiu_beta at yahoo dot com
I have upgraded to Fedora 29 and php 7.3.1. This is a backtrace from a new coredump.

#0  0x000055eb8b44fc73 in zlog_stream_set_msg_suffix (stream=stream@entry=0x100, suffix=suffix@entry=0x0, final_suffix=final_suffix@entry=0x55eb8b4aedf6 ", pipe is closed")
    at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/zlog.c:602
#1  0x000055eb8b44d109 in fpm_stdio_child_said (ev=<optimized out>, which=<optimized out>, arg=0x55eb8c536250) at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/fpm_stdio.c:199
#2  0x000055eb8b452596 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/events/epoll.c:145
#3  0x000055eb8b444e84 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/fpm_events.c:409
#4  0x000055eb8b43f217 in fpm_run (max_requests=0x7fff9540e34c) at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/fpm.c:113
#5  0x000055eb8b21c924 in main (argc=2, argv=0x7fff9540e968) at /usr/src/debug/php73-php-7.3.1-1.fc29.remi.x86_64/sapi/fpm/fpm/fpm_main.c:1864
 [2019-01-14 12:54 UTC] sjon at hortensius dot net
I reported the exact same issue in 7.3.1 (as released) which NikiC tagged as a duplicate. There aren't necessarily any requests - according to the access-log, the last request was @ 03:43:35 and lasted 50ms (segfault @ 03:43:43).

If you need more input let me know - I have a coredump on a debug-build
 [2019-01-15 20:15 UTC] bukka@php.net
I think that I know what the issue is. The event can be received after the child is freed which means that the shared log stream is uninitialized and it can't be used.

Are you able to test the following patch:

diff --git a/sapi/fpm/fpm/fpm_stdio.c b/sapi/fpm/fpm/fpm_stdio.c
index ba8f6d8213..0ff189c9f0 100644
--- a/sapi/fpm/fpm/fpm_stdio.c
+++ b/sapi/fpm/fpm/fpm_stdio.c
@@ -196,8 +196,10 @@ static void fpm_stdio_child_said(struct fpm_event_s *ev, short which, void *arg)
        }
 
        if (read_fail) {
-               zlog_stream_set_msg_suffix(log_stream, NULL, ", pipe is closed");
-               zlog_stream_finish(log_stream);
+               if (!child->log_stream) {
+                       zlog_stream_set_msg_suffix(log_stream, NULL, ", pipe is closed");
+                       zlog_stream_finish(log_stream);
+               }
                if (read_fail < 0) {
                        zlog(ZLOG_SYSERROR, "unable to read what child say");
                }
 [2019-01-15 20:20 UTC] bukka@php.net
Actually please try this one if you can:

diff --git a/sapi/fpm/fpm/fpm_stdio.c b/sapi/fpm/fpm/fpm_stdio.c
index ba8f6d8213..03d15cbf0d 100644
--- a/sapi/fpm/fpm/fpm_stdio.c
+++ b/sapi/fpm/fpm/fpm_stdio.c
@@ -122,7 +122,7 @@ static void fpm_stdio_child_said(struct fpm_event_s *ev, short which, void *arg)
        struct fpm_event_s *event;
        int fifo_in = 1, fifo_out = 1;
        int in_buf = 0;
-       int read_fail = 0, finish_log_stream = 0;
+       int read_fail = 0, finish_log_stream = 0, create_log_stream;
        int res;
        struct zlog_stream *log_stream;
 
@@ -138,7 +138,8 @@ static void fpm_stdio_child_said(struct fpm_event_s *ev, short which, void *arg)
                event = &child->ev_stderr;
        }
 
-       if (!child->log_stream) {
+       create_log_stream = !child->log_stream;
+       if (create_log_stream) {
                log_stream = child->log_stream = malloc(sizeof(struct zlog_stream));
                zlog_stream_init_ex(log_stream, ZLOG_WARNING, STDERR_FILENO);
                zlog_stream_set_decorating(log_stream, child->wp->config->decorate_workers_output);
@@ -196,8 +197,10 @@ static void fpm_stdio_child_said(struct fpm_event_s *ev, short which, void *arg)
        }
 
        if (read_fail) {
-               zlog_stream_set_msg_suffix(log_stream, NULL, ", pipe is closed");
-               zlog_stream_finish(log_stream);
+               if (create_log_stream) {
+                       zlog_stream_set_msg_suffix(log_stream, NULL, ", pipe is closed");
+                       zlog_stream_finish(log_stream);
+               }
                if (read_fail < 0) {
                        zlog(ZLOG_SYSERROR, "unable to read what child say");
                }
 [2019-01-16 12:23 UTC] remi@php.net
FYI, build including this patch is available in remi-test repository (F28/F29 only)
 [2019-01-17 04:28 UTC] mnikulin at plesk dot com
With the patch "[2019-01-15 20:20 UTC] bukka@php.net"
I faced Bug #77114 "php-fpm master segfaults in fpm_event_epoll_wait/fpm_event_fire"

I have adjusted configuration of pools to make child exit
events frequent enough and provide moderate load to ~30 pools

    pm.process_idle_timeout = 2s
    php_value[error_reporting] = 22519
    pm.max_requests = 2

(gdb) bt
#0  0x0000004200000000 in ?? ()
#1  0x000000000085c1e0 in fpm_event_fire (ev=0x1ce15b0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
#2  0x000000000086f4f8 in fpm_event_epoll_wait (queue=0x1cea5a0, timeout=1000) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:145
#3  0x000000000085bf24 in fpm_event_loop (err=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:409
#4  0x000000000085420f in fpm_run (max_requests=0x7ffcc6873c9c) at /usr/src/debug/sapi/fpm/fpm/fpm.c:113
#5  0x00000000008618e9 in main (argc=2, argv=0x7ffcc6873ed8) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1873

(gdb) frame 1
#1  0x000000000085c1e0 in fpm_event_fire (ev=0x1ce15b0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
465	in /usr/src/debug/sapi/fpm/fpm/fpm_events.c
(gdb) p ev
$44 = (struct fpm_event_s *) 0x1ce15b0
(gdb) print epollfds[15]
$45 = {events = 1, data = {ptr = 0x1ce15b0, fd = 30283184, u32 = 30283184, u64 = 30283184}}

(gdb) print *(struct fpm_event_s *)epollfds[14].data.ptr
$41 = {fd = 34, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x864541 <fpm_pctl_on_socket_accept>, arg = 0x1cd40b0, flags = 10, index = 34, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[15].data.ptr
$42 = {fd = 0, timeout = {tv_sec = 8822805, tv_usec = 30283040}, frequency = {tv_sec = 356482285570, tv_usec = 2}, callback = 0x4200000000, arg = 0x53, flags = 0, index = 0, which = 0}
(gdb) print *(struct fpm_event_s *)epollfds[16].data.ptr
$43 = {fd = 124, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x86a015 <fpm_stdio_child_said>, arg = 0x1ce0dd0, flags = 2, index = 124, which = 2}

Callback in epollfds[15].data.ptr (related to fired ev) looks suspicious.
 [2019-01-17 08:17 UTC] claudiu_beta at yahoo dot com
Thanks to updated, recompiled packages from Remi for Fedora 29, I was able to test the patched version. I can now confirm that the issue is gone.
Thank you all!
 [2019-01-17 18:37 UTC] bukka@php.net
Automatic comment on behalf of bukka
Revision: http://git.php.net/?p=php-src.git;a=commit;h=766b4fd5150031878188a03ea77bb9a0f73e4dc5
Log: Fix bug #77430 (php-fpm crashes with Main process exited)
 [2019-01-17 18:37 UTC] bukka@php.net
-Status: Assigned +Status: Closed
 [2019-01-17 18:39 UTC] bukka@php.net
Automatic comment on behalf of bukka
Revision: http://git.php.net/?p=php-src.git;a=commit;h=766b4fd5150031878188a03ea77bb9a0f73e4dc5
Log: Fix bug #77430 (php-fpm crashes with Main process exited)
 [2019-01-17 18:56 UTC] bukka@php.net
It's merged and will be part of 7.3.2. It's quite difficult to do a test for this but will think about it as it's a bit more generic issue as noted below.

@claudiu Thanks for testing!

@mnikulin That's a different issue so this fix cannot fix it as it's in completely different place. Will try to take a look on that one later.
 [2019-01-18 10:10 UTC] mnikulin at plesk dot com
My point was that I can not confirm that provided fix works perfectly.
I managed to get a test server with usual time to crash in the range
from several seconds to several minutes.
Without the fix provided for this bug I get crash in zlog_stream_set_msg_suffix.
With the patch I get the crash with the same rate.
I am in doubt if failure in zlog_stream_set_msg_suffix
is eliminated completely or it is just delayed and corrupted
epoll data becomes primary source of failure.

Both bugs are annoying enough and not easy to reproduce.
I was really surprised when I have got another semi-mysterious crash.

I have added some speculations to the Bug #77114.

Thank you for the fix,
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sat Dec 21 16:01:28 2024 UTC