php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77114 php-fpm master segfaults in fpm_event_epoll_wait/fpm_event_fire
Submitted: 2018-11-06 11:35 UTC Modified: 2019-07-22 08:41 UTC
Votes:9
Avg. Score:4.2 ± 0.8
Reproduced:8 of 8 (100.0%)
Same Version:3 (37.5%)
Same OS:4 (50.0%)
From: nvizovitin at plesk dot com Assigned:
Status: Duplicate Package: FPM related
PHP Version: 7.2.11 OS: CentOS 7
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: nvizovitin at plesk dot com
New email:
PHP Version: OS:

 

 [2018-11-06 11:35 UTC] nvizovitin at plesk dot com
Description:
------------
php-fpm master process is occasionally killed by SIGSEGV. There are no exact steps to reproduce the issue. The following additional details are known.

Observed on the following configurations:
1. CloudLinux 7.5, PHP 7.0.32.
2. CentOS 7.5.1804, PHP 7.2.9.

In all observed cases segmentation fault occurred during processing of epoll event, called from fpm_event_epoll_wait() on line:
  fpm_event_fire((struct fpm_event_s *)epollfds[i].data.ptr);
when attempting to call the callback in fpm_event_fire():
  (*ev->callback)( (struct fpm_event_s *) ev, ev->which, ev->arg);
presumably due to memory corruption of (struct fpm_event_s *) retrieved from epoll_wait() output.

For PHP 7.0.32 php-fpm was configured with:

./configure --prefix=/opt/plesk/php/7.0 --bindir=/opt/plesk/php/7.0/bin --sbindir=/opt/plesk/php/7.0/sbin --libexecdir=/opt/plesk/php/7.0/lib64 --datadir=/opt/plesk/php/7.0/share --sysconfdir=/opt/plesk/php/7.0/etc --sharedstatedir=/opt/plesk/php/7.0/com --localstatedir=/opt/plesk/php/7.0/var --libdir=/opt/plesk/php/7.0/lib64 --includedir=/opt/plesk/php/7.0/include --infodir=/opt/plesk/php/7.0/share/info --mandir=/opt/plesk/php/7.0/share/man --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --with-libdir=lib64 --cache-file=../config.cache --with-config-file-path=/opt/plesk/php/7.0/etc --with-config-file-scan-dir=/opt/plesk/php/7.0/etc/php.d --disable-debug --without-pear --with-pic --disable-rpath --with-bz2 --with-freetype-dir=/usr --with-png-dir=/usr --with-xpm-dir=/usr --enable-gd-native-ttf --without-gdbm --with-gettext --with-gmp --with-iconv --with-jpeg-dir=/usr --with-openssl --with-pcre-regex --with-zlib --with-layout=GNU --enable-exif --enable-ftp --enable-sockets --enable-sysvsem --enable-sysvshm --enable-sysvmsg --with-kerberos --enable-calendar --with-libxml-dir=/usr --enable-xml --enable-phpdbg --with-enchant=shared,/usr --with-system-tzdata --with-litespeed --with-sodium=/opt/plesk-libsodium-1.0.16 --enable-fpm --with-fpm-acl --without-mysql --disable-pdo --without-mysql --without-gd --disable-dom --disable-dba --without-unixODBC --disable-pdo --disable-xmlreader --disable-xmlwriter --without-sqlite3 --disable-phar --disable-fileinfo --disable-json --without-pspell --disable-wddx --without-curl --disable-posix --disable-sysvmsg --disable-sysvshm --disable-sysvsem --with-fpm-systemd

Additional note: this issue looks remotely similar to https://bugs.php.net/bug.php?id=62418 (but this bug looks abandoned).

Expected result:
----------------
No segmentation faults in master process.

Actual result:
--------------
php-fpm master process segmentation fault upon attempting to call a callback for a epoll event. All observed backtraces are similar.


In php-fpm error.log at the time of segfault (although such errors sometimes are also observed w/o any master crashes):

[05-nov-2018 12:33:55] ERROR: unable to read what child say: Descriptor de fichero err<C3><B3>neo (9)
[05-nov-2018 15:38:29] ERROR: unable to read what child say: Descriptor de fichero err<C3><B3>neo (9)


Backtrace 1 (php-fpm 7.0.32, CloudLinux 7.5, 2018-11-05-15:38:29-2294865):

(gdb) bt full
#0  0x0000000001a0914e in ?? ()
No symbol table info available.
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006b36d1 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 483544, tv_usec = 545085}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006add07 in fpm_run (max_requests=max_requests@entry=0x7ffec911d888) at /usr/src/debug/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000437f5c in main (argc=2, argv=0x7ffec911dbb8) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1881
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 65280, mmap = {len = 0, pos = 0, map = 0x6678756e696c6573, buf = 0x662f7379732f0073 <Address 0x662f7379732f0073 out of bounds>, old_handle = 0x0,
                old_closer = 0x0}, reader = 0x0, fsizer = 0x0, closer = 0x0}}, filename = 0xff00000000000000 <Address 0xff00000000000000 out of bounds>, opened_path = 0x5f4c53534e45504f, type = 842228073, free_filename = 99 'c'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 1
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
147     /usr/src/debug/sapi/fpm/fpm/events/epoll.c: No such file or directory.
(gdb) print epollfds
$1 = (struct epoll_event *) 0x1a46960
(gdb) print i
$2 = <optimized out>
(gdb) print ret
$3 = 3
(gdb) print epollfds[0]
$4 = {events = 1, data = {ptr = 0xa7c920 <signal_fd_event.13366>, fd = 10996000, u32 = 10996000, u64 = 10996000}}
(gdb) print epollfds[1]
$5 = {events = 16, data = {ptr = 0x1acd1d8, fd = 28103128, u32 = 28103128, u64 = 28103128}}
(gdb) print epollfds[2]
$6 = {events = 16, data = {ptr = 0x1acd220, fd = 28103200, u32 = 28103200, u64 = 28103200}}
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$7 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$8 = {fd = 27300038, timeout = {tv_sec = 27300101, tv_usec = 27300107}, frequency = {tv_sec = 27300119, tv_usec = 27300130}, callback = 0x1a090d7, arg = 0x1a090e0, flags = 27300076, index = 0, which = -28423}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$9 = {fd = 27300142, timeout = {tv_sec = 27300156, tv_usec = 27300209}, frequency = {tv_sec = 27300215, tv_usec = 27300167}, callback = 0x1a0914e, arg = 0x1a0915a, flags = 27300197, index = 0, which = -28285}


Backtrace 2 (php-fpm 7.0.32, CloudLinux 7.5, 2018-11-05-12:33:55-1446445):

(gdb) bt full
#0  0x00000000018ff880 in ?? ()
No symbol table info available.
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006b36d1 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 472471, tv_usec = 190982}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006add07 in fpm_run (max_requests=max_requests@entry=0x7fff1d0f1908) at /usr/src/debug/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000437f5c in main (argc=2, argv=0x7fff1d0f1c38) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1881
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 65280, mmap = {len = 0, pos = 0, map = 0x6678756e696c6573, buf = 0x662f7379732f0073 <Address 0x662f7379732f0073 out of bounds>, old_handle = 0x0,
                old_closer = 0x0}, reader = 0x0, fsizer = 0x0, closer = 0x0}}, filename = 0xff00000000000000 <Address 0xff00000000000000 out of bounds>, opened_path = 0x5f4c53534e45504f, type = 842228073, free_filename = 99 'c'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 1
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
147     /usr/src/debug/sapi/fpm/fpm/events/epoll.c: No such file or directory.
(gdb) print ret
$1 = 3
 
(gdb) print epollfds[0]
$6 = {events = 1, data = {ptr = 0xa7c920 <signal_fd_event.13366>, fd = 10996000, u32 = 10996000, u64 = 10996000}}
(gdb) print epollfds[1]
$7 = {events = 16, data = {ptr = 0x18ff8d8, fd = 26212568, u32 = 26212568, u64 = 26212568}}
(gdb) print epollfds[2]
$8 = {events = 16, data = {ptr = 0x18ff920, fd = 26212640, u32 = 26212640, u64 = 26212640}}
 
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$9 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$10 = {fd = 0, timeout = {tv_sec = 0, tv_usec = 65}, frequency = {tv_sec = 7018986666877744431, tv_usec = 7809632559044715890}, callback = 0x2e53455f73652f65, arg = 0x434c2f382d465455, flags = 1397050719, index = 1162297683,
  which = 12115}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$11 = {fd = 111, timeout = {tv_sec = 81, tv_usec = 26212592}, frequency = {tv_sec = 139861315026945, tv_usec = 0}, callback = 0x18ff880, arg = 0x18c1020, flags = 25956640, index = 0, which = 4256}
 
(gdb) print *fpm_event_queue_fd
$18 = {prev = 0x0, next = 0x18c1000, ev = 0x18ffa60}
(gdb) print *fpm_event_queue_fd->ev
$20 = {fd = 41, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6ba810 <fpm_stdio_child_said>, arg = 0x18ff9f0, flags = 2, index = 41, which = 2}
(gdb) print *fpm_event_queue_fd->next
$21 = {prev = 0x18c0fe0, next = 0x18ff790, ev = 0x18ffa18}
(gdb) print *fpm_event_queue_fd->next->ev
$22 = {fd = 38, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6ba810 <fpm_stdio_child_said>, arg = 0x18ff9f0, flags = 2, index = 38, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->ev
$23 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->ev
$24 = {fd = 32, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18f45a0, flags = 10, index = 32, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->ev
$25 = {fd = 31, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18c07b0, flags = 10, index = 31, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->ev
$26 = {fd = 30, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18c0160, flags = 10, index = 30, which = 2}
...
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->ev
$36 = {fd = 10, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x183a870, flags = 10, index = 10, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->ev
$37 = {fd = 9, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x1815b70, flags = 10, index = 9, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next
$38 = {prev = 0x1985400, next = 0x0, ev = 0x18caec0}
 
(gdb) print *fpm_event_queue_timer
$39 = {prev = 0x0, next = 0x18ff7b0, ev = 0xa7d420 <heartbeat.13415>}
(gdb) print *fpm_event_queue_timer->ev
$40 = {fd = -1, timeout = {tv_sec = 472471, tv_usec = 773859}, frequency = {tv_sec = 10, tv_usec = 0}, callback = 0x6bd7a0 <fpm_systemd_heartbeat>, arg = 0x0, flags = 4, index = -1, which = 1}
(gdb) print *fpm_event_queue_timer->next->ev
$41 = {fd = -1, timeout = {tv_sec = 472472, tv_usec = 166451}, frequency = {tv_sec = 1, tv_usec = 0}, callback = 0x6b6d50 <fpm_pctl_perform_idle_server_maintenance_heartbeat>, arg = 0x0, flags = 4, index = -1, which = 1}
(gdb) print *fpm_event_queue_timer->next
$42 = {prev = 0x18ff820, next = 0x0, ev = 0xa7ca20 <heartbeat.13686>}


Backtrace 3 (php-fpm 7.2.9, CentOS 7.5.1804, pid17709-time1535372443):

(gdb) bt full
#0  0x00000000006fd156 in fpm_event_fire (ev=0x15986d0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:467
No locals.
#1  0x0000000000707346 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006fd4f1 in fpm_event_loop (err=err@entry=0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 1990412, tv_usec = 373331}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6fc9d0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006f7a77 in fpm_run (max_requests=max_requests@entry=0x7ffe34023c58) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000438550 in main (argc=2, argv=0x7ffe34023f88) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_main.c:1880
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 0, mmap = {len = 6866955149093720143, pos = 31632276063478121, map = 0x6168732f7273752f,
                buf = 0x6c61636f6c2f6572 <Address 0x6c61636f6c2f6572 out of bounds>, old_handle = 0x0, old_closer = 0xff00}, reader = 0x0, fsizer = 0xff00000000000000, closer = 0x5f4c53534e45504f}},
          filename = 0x70616332336169 <Address 0x70616332336169 out of bounds>, opened_path = 0x6168732f7273752f, type = 1815045490, free_filename = 111 'o'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 0
#0  0x00000000006fd156 in fpm_event_fire (ev=0x15986d0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:467
467     in /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c
(gdb) print *ev
$1 = {fd = 1801807218, timeout = {tv_sec = 4047978859204596295, tv_usec = 7307090632390554936}, frequency = {tv_sec = 6874014757463356002, tv_usec = 4047978859204594761}, callback = 0x726800382d393538, arg = 0x6800696b73746176,
  flags = 1380474738, index = 1330858286, which = 14381}
(gdb) print epollfds
$2 = (struct epoll_event *) 0x1595100
(gdb) print epollfds[0]
$3 = {events = 1, data = {ptr = 0xadf840 <signal_fd_event.13578>, fd = 11401280, u32 = 11401280, u64 = 11401280}}
(gdb) print epollfds[1]
$4 = {events = 16, data = {ptr = 0x1598688, fd = 22644360, u32 = 22644360, u64 = 22644360}}
(gdb) print epollfds[2]
$5 = {events = 16, data = {ptr = 0x15986d0, fd = 22644432, u32 = 22644432, u64 = 22644432}}
(gdb) print epollfds[3]
$6 = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$7 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6fc9d0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$8 = {fd = 7300965, timeout = {tv_sec = 4051037683542732371, tv_usec = 7809636479359858997}, frequency = {tv_sec = 7811212552778113897, tv_usec = 3264919828641826143}, callback = 0x6700312d39353838, arg = 0x6564006e616d7265,
  flags = 776291423, index = 760173385, which = 14392}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$9 = {fd = 1801807218, timeout = {tv_sec = 4047978859204596295, tv_usec = 7307090632390554936}, frequency = {tv_sec = 6874014757463356002, tv_usec = 4047978859204594761}, callback = 0x726800382d393538, arg = 0x6800696b73746176,
  flags = 1380474738, index = 1330858286, which = 14381}


Notice that in all backtraces epollfds[0] value is OK, while epollfds[1] and epollfds[2] seem to contain corrupted values (in particular, callback pointer is invalid, which likely caused segmentation fault).

Patches

Pull Requests

Pull requests:

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-01-18 09:57 UTC] mnikulin at plesk dot com
Just some speculations, I may be wrong.

It seems that it is possible to get signal notification
before read on children stdout/stderr file descriptor event
in epoll_wait() results. I am afraid that fpm_children_bury()
may free some fpm_child_s structures but epoll events with greater
indices may have the same pointers in epoll_event.data.ptr,
so the attempt to call ev->callback obtained from overwritten
memory may cause segfault.

I have no idea which way zero pointer may appear in

fpm_event_epoll_wait()
sapi/fpm/fpm/events/epoll.c:139
		/* do we have a valid ev ptr ? */
		if (!epollfds[i].data.ptr) {
			continue;
		}

if everything works correctly. I am suspicious concerning
the following code as well

fpm_stdio_child_said()
sapi/fpm/fpm/fpm_stdio.c:129
	if (!arg) {
		return;
	}

From my point of view it may be struggling with
symptoms, leaving the root cause untouched. I admit
that it makes failures more rare (and harder to reproduce).

Deleting of some data while iterating over events require
great care. I would prefer to have io handling and
freeing of memory separated.

I have a test server with tuned pool configuration to make
children death frequent enough

    pm = ondemand
    pm.process_idle_timeout = 2s
    php_value[error_reporting] = 22519
    pm.max_requests = 2

I have ~30 pools, maybe it is enough o have a single pool with
max_children ~ 100.

Under moderate load crashes of master process happen often enough
(from several seconds to several minutes).
 [2019-01-18 11:34 UTC] mnikulin at plesk dot com
backtrace variant with patch from Bug #77430 applied.
Notice ev->index = 1 that means that the event has been removed.

(gdb) bt
#0  0x000000000086e717 in zlog_stream_finish (stream=0xf0) at /usr/src/debug/sapi/fpm/fpm/zlog.c:811
#1  0x000000000086a471 in fpm_stdio_child_said (ev=0x1dc92e0, which=2, arg=0x1dc9270) at /usr/src/debug/sapi/fpm/fpm/fpm_stdio.c:218
#2  0x000000000085c1e0 in fpm_event_fire (ev=0x1dc92e0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
#3  0x000000000086f4f8 in fpm_event_epoll_wait (queue=0x1dc9370, timeout=862) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:145
#4  0x000000000085bf24 in fpm_event_loop (err=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:409
#5  0x000000000085420f in fpm_run (max_requests=0x7ffeb1eeb58c) at /usr/src/debug/sapi/fpm/fpm/fpm.c:113
#6  0x00000000008618e9 in main (argc=2, argv=0x7ffeb1eeb7c8) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1873
(gdb) f 2
#2  0x000000000085c1e0 in fpm_event_fire (ev=0x1dc92e0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
465	/usr/src/debug/sapi/fpm/fpm/fpm_events.c: No such file or directory.
(gdb) p ev
$1 = (struct fpm_event_s *) 0x1dc92e0
(gdb) p *ev
$2 = {fd = 83, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x86a015 <fpm_stdio_child_said>, arg = 0x1dc9270, flags = 2, index = -1, which = 2}
 [2019-01-18 11:41 UTC] mnikulin at plesk dot com
It seems that the following quick hack with delayed bury
at least significantly improves stability.

--- sapi/fpm/fpm/fpm_events.c.orig	2019-01-18 11:09:38.000000000 +0000
+++ sapi/fpm/fpm/fpm_events.c	2019-01-18 11:12:53.000000000 +0000
@@ -51,6 +51,14 @@
 }
 /* }}} */
 
+static struct fpm_event_s children_bury_timer;
+
+static void fpm_postponed_children_bury(struct fpm_event_s *ev, short which, void *arg) /* {{{ */
+{
+	fpm_children_bury();
+}
+/* }}} */
+
 static void fpm_got_signal(struct fpm_event_s *ev, short which, void *arg) /* {{{ */
 {
 	char c;
@@ -72,7 +80,8 @@
 		switch (c) {
 			case 'C' :                  /* SIGCHLD */
 				zlog(ZLOG_DEBUG, "received SIGCHLD");
-				fpm_children_bury();
+				fpm_event_set_timer(&children_bury_timer, 0, &fpm_postponed_children_bury, NULL);
+				fpm_event_add(&children_bury_timer, 0);
 				break;
 			case 'I' :                  /* SIGINT  */
 				zlog(ZLOG_DEBUG, "received SIGINT");
 [2019-07-10 03:09 UTC] mnikulin at plesk dot com
Bug #77185 suggests a way to reproduce the issue.
With older libc (e.g. in CentOS-7) the following environment
variable may be set to certainly catch use after free cases

    MALLOC_PERTURB_=165

The patch fpm-race-condition.patch attached to the Bug #65398
does not eliminate the problem.
 [2019-07-10 04:06 UTC] mnikulin at plesk dot com
The following pull request has been associated:

Patch Name: Prevent use after free in fpm_event_epoll_wait
On GitHub:  https://github.com/php/php-src/pull/4385
Patch:      https://github.com/php/php-src/pull/4385.patch
 [2019-07-22 08:41 UTC] nikic@php.net
-Status: Open +Status: Duplicate
 [2019-07-22 08:41 UTC] nikic@php.net
Marking this as a duplicate of bug #77185, which has detailed analysis and is now fixed.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Wed Dec 04 08:01:29 2024 UTC