php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77114 php-fpm master segfaults in fpm_event_epoll_wait/fpm_event_fire
Submitted: 2018-11-06 11:35 UTC Modified: 2019-07-22 08:41 UTC
Votes:9
Avg. Score:4.2 ± 0.8
Reproduced:8 of 8 (100.0%)
Same Version:3 (37.5%)
Same OS:4 (50.0%)
From: nvizovitin at plesk dot com Assigned:
Status: Duplicate Package: FPM related
PHP Version: 7.2.11 OS: CentOS 7
Private report: No CVE-ID: None
View Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
If you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: nvizovitin at plesk dot com
New email:
PHP Version: OS:

 

 [2018-11-06 11:35 UTC] nvizovitin at plesk dot com
Description:
------------
php-fpm master process is occasionally killed by SIGSEGV. There are no exact steps to reproduce the issue. The following additional details are known.

Observed on the following configurations:
1. CloudLinux 7.5, PHP 7.0.32.
2. CentOS 7.5.1804, PHP 7.2.9.

In all observed cases segmentation fault occurred during processing of epoll event, called from fpm_event_epoll_wait() on line:
  fpm_event_fire((struct fpm_event_s *)epollfds[i].data.ptr);
when attempting to call the callback in fpm_event_fire():
  (*ev->callback)( (struct fpm_event_s *) ev, ev->which, ev->arg);
presumably due to memory corruption of (struct fpm_event_s *) retrieved from epoll_wait() output.

For PHP 7.0.32 php-fpm was configured with:

./configure --prefix=/opt/plesk/php/7.0 --bindir=/opt/plesk/php/7.0/bin --sbindir=/opt/plesk/php/7.0/sbin --libexecdir=/opt/plesk/php/7.0/lib64 --datadir=/opt/plesk/php/7.0/share --sysconfdir=/opt/plesk/php/7.0/etc --sharedstatedir=/opt/plesk/php/7.0/com --localstatedir=/opt/plesk/php/7.0/var --libdir=/opt/plesk/php/7.0/lib64 --includedir=/opt/plesk/php/7.0/include --infodir=/opt/plesk/php/7.0/share/info --mandir=/opt/plesk/php/7.0/share/man --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --with-libdir=lib64 --cache-file=../config.cache --with-config-file-path=/opt/plesk/php/7.0/etc --with-config-file-scan-dir=/opt/plesk/php/7.0/etc/php.d --disable-debug --without-pear --with-pic --disable-rpath --with-bz2 --with-freetype-dir=/usr --with-png-dir=/usr --with-xpm-dir=/usr --enable-gd-native-ttf --without-gdbm --with-gettext --with-gmp --with-iconv --with-jpeg-dir=/usr --with-openssl --with-pcre-regex --with-zlib --with-layout=GNU --enable-exif --enable-ftp --enable-sockets --enable-sysvsem --enable-sysvshm --enable-sysvmsg --with-kerberos --enable-calendar --with-libxml-dir=/usr --enable-xml --enable-phpdbg --with-enchant=shared,/usr --with-system-tzdata --with-litespeed --with-sodium=/opt/plesk-libsodium-1.0.16 --enable-fpm --with-fpm-acl --without-mysql --disable-pdo --without-mysql --without-gd --disable-dom --disable-dba --without-unixODBC --disable-pdo --disable-xmlreader --disable-xmlwriter --without-sqlite3 --disable-phar --disable-fileinfo --disable-json --without-pspell --disable-wddx --without-curl --disable-posix --disable-sysvmsg --disable-sysvshm --disable-sysvsem --with-fpm-systemd

Additional note: this issue looks remotely similar to https://bugs.php.net/bug.php?id=62418 (but this bug looks abandoned).

Expected result:
----------------
No segmentation faults in master process.

Actual result:
--------------
php-fpm master process segmentation fault upon attempting to call a callback for a epoll event. All observed backtraces are similar.


In php-fpm error.log at the time of segfault (although such errors sometimes are also observed w/o any master crashes):

[05-nov-2018 12:33:55] ERROR: unable to read what child say: Descriptor de fichero err<C3><B3>neo (9)
[05-nov-2018 15:38:29] ERROR: unable to read what child say: Descriptor de fichero err<C3><B3>neo (9)


Backtrace 1 (php-fpm 7.0.32, CloudLinux 7.5, 2018-11-05-15:38:29-2294865):

(gdb) bt full
#0  0x0000000001a0914e in ?? ()
No symbol table info available.
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006b36d1 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 483544, tv_usec = 545085}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006add07 in fpm_run (max_requests=max_requests@entry=0x7ffec911d888) at /usr/src/debug/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000437f5c in main (argc=2, argv=0x7ffec911dbb8) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1881
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 65280, mmap = {len = 0, pos = 0, map = 0x6678756e696c6573, buf = 0x662f7379732f0073 <Address 0x662f7379732f0073 out of bounds>, old_handle = 0x0,
                old_closer = 0x0}, reader = 0x0, fsizer = 0x0, closer = 0x0}}, filename = 0xff00000000000000 <Address 0xff00000000000000 out of bounds>, opened_path = 0x5f4c53534e45504f, type = 842228073, free_filename = 99 'c'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 1
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
147     /usr/src/debug/sapi/fpm/fpm/events/epoll.c: No such file or directory.
(gdb) print epollfds
$1 = (struct epoll_event *) 0x1a46960
(gdb) print i
$2 = <optimized out>
(gdb) print ret
$3 = 3
(gdb) print epollfds[0]
$4 = {events = 1, data = {ptr = 0xa7c920 <signal_fd_event.13366>, fd = 10996000, u32 = 10996000, u64 = 10996000}}
(gdb) print epollfds[1]
$5 = {events = 16, data = {ptr = 0x1acd1d8, fd = 28103128, u32 = 28103128, u64 = 28103128}}
(gdb) print epollfds[2]
$6 = {events = 16, data = {ptr = 0x1acd220, fd = 28103200, u32 = 28103200, u64 = 28103200}}
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$7 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$8 = {fd = 27300038, timeout = {tv_sec = 27300101, tv_usec = 27300107}, frequency = {tv_sec = 27300119, tv_usec = 27300130}, callback = 0x1a090d7, arg = 0x1a090e0, flags = 27300076, index = 0, which = -28423}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$9 = {fd = 27300142, timeout = {tv_sec = 27300156, tv_usec = 27300209}, frequency = {tv_sec = 27300215, tv_usec = 27300167}, callback = 0x1a0914e, arg = 0x1a0915a, flags = 27300197, index = 0, which = -28285}


Backtrace 2 (php-fpm 7.0.32, CloudLinux 7.5, 2018-11-05-12:33:55-1446445):

(gdb) bt full
#0  0x00000000018ff880 in ?? ()
No symbol table info available.
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006b36d1 in fpm_event_loop (err=err@entry=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 472471, tv_usec = 190982}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006add07 in fpm_run (max_requests=max_requests@entry=0x7fff1d0f1908) at /usr/src/debug/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000437f5c in main (argc=2, argv=0x7fff1d0f1c38) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1881
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 65280, mmap = {len = 0, pos = 0, map = 0x6678756e696c6573, buf = 0x662f7379732f0073 <Address 0x662f7379732f0073 out of bounds>, old_handle = 0x0,
                old_closer = 0x0}, reader = 0x0, fsizer = 0x0, closer = 0x0}}, filename = 0xff00000000000000 <Address 0xff00000000000000 out of bounds>, opened_path = 0x5f4c53534e45504f, type = 842228073, free_filename = 99 'c'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 1
#1  0x00000000006bd306 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:147
147     /usr/src/debug/sapi/fpm/fpm/events/epoll.c: No such file or directory.
(gdb) print ret
$1 = 3
 
(gdb) print epollfds[0]
$6 = {events = 1, data = {ptr = 0xa7c920 <signal_fd_event.13366>, fd = 10996000, u32 = 10996000, u64 = 10996000}}
(gdb) print epollfds[1]
$7 = {events = 16, data = {ptr = 0x18ff8d8, fd = 26212568, u32 = 26212568, u64 = 26212568}}
(gdb) print epollfds[2]
$8 = {events = 16, data = {ptr = 0x18ff920, fd = 26212640, u32 = 26212640, u64 = 26212640}}
 
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$9 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$10 = {fd = 0, timeout = {tv_sec = 0, tv_usec = 65}, frequency = {tv_sec = 7018986666877744431, tv_usec = 7809632559044715890}, callback = 0x2e53455f73652f65, arg = 0x434c2f382d465455, flags = 1397050719, index = 1162297683,
  which = 12115}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$11 = {fd = 111, timeout = {tv_sec = 81, tv_usec = 26212592}, frequency = {tv_sec = 139861315026945, tv_usec = 0}, callback = 0x18ff880, arg = 0x18c1020, flags = 25956640, index = 0, which = 4256}
 
(gdb) print *fpm_event_queue_fd
$18 = {prev = 0x0, next = 0x18c1000, ev = 0x18ffa60}
(gdb) print *fpm_event_queue_fd->ev
$20 = {fd = 41, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6ba810 <fpm_stdio_child_said>, arg = 0x18ff9f0, flags = 2, index = 41, which = 2}
(gdb) print *fpm_event_queue_fd->next
$21 = {prev = 0x18c0fe0, next = 0x18ff790, ev = 0x18ffa18}
(gdb) print *fpm_event_queue_fd->next->ev
$22 = {fd = 38, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6ba810 <fpm_stdio_child_said>, arg = 0x18ff9f0, flags = 2, index = 38, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->ev
$23 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b2bb0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->ev
$24 = {fd = 32, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18f45a0, flags = 10, index = 32, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->ev
$25 = {fd = 31, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18c07b0, flags = 10, index = 31, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->ev
$26 = {fd = 30, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x18c0160, flags = 10, index = 30, which = 2}
...
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->ev
$36 = {fd = 10, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x183a870, flags = 10, index = 10, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->ev
$37 = {fd = 9, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6b7620 <fpm_pctl_on_socket_accept>, arg = 0x1815b70, flags = 10, index = 9, which = 2}
(gdb) print *fpm_event_queue_fd->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next
$38 = {prev = 0x1985400, next = 0x0, ev = 0x18caec0}
 
(gdb) print *fpm_event_queue_timer
$39 = {prev = 0x0, next = 0x18ff7b0, ev = 0xa7d420 <heartbeat.13415>}
(gdb) print *fpm_event_queue_timer->ev
$40 = {fd = -1, timeout = {tv_sec = 472471, tv_usec = 773859}, frequency = {tv_sec = 10, tv_usec = 0}, callback = 0x6bd7a0 <fpm_systemd_heartbeat>, arg = 0x0, flags = 4, index = -1, which = 1}
(gdb) print *fpm_event_queue_timer->next->ev
$41 = {fd = -1, timeout = {tv_sec = 472472, tv_usec = 166451}, frequency = {tv_sec = 1, tv_usec = 0}, callback = 0x6b6d50 <fpm_pctl_perform_idle_server_maintenance_heartbeat>, arg = 0x0, flags = 4, index = -1, which = 1}
(gdb) print *fpm_event_queue_timer->next
$42 = {prev = 0x18ff820, next = 0x0, ev = 0xa7ca20 <heartbeat.13686>}


Backtrace 3 (php-fpm 7.2.9, CentOS 7.5.1804, pid17709-time1535372443):

(gdb) bt full
#0  0x00000000006fd156 in fpm_event_fire (ev=0x15986d0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:467
No locals.
#1  0x0000000000707346 in fpm_event_epoll_wait (queue=<optimized out>, timeout=<optimized out>) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/events/epoll.c:147
        ret = 3
        i = <optimized out>
        __func__ = "fpm_event_epoll_wait"
#2  0x00000000006fd4f1 in fpm_event_loop (err=err@entry=0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:411
        q = <optimized out>
        ms = <optimized out>
        tmp = <optimized out>
        timeout = <optimized out>
        ret = <optimized out>
        q2 = <optimized out>
        now = {tv_sec = 1990412, tv_usec = 373331}
        signal_fd_event = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6fc9d0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
        __func__ = "fpm_event_loop"
#3  0x00000000006f7a77 in fpm_run (max_requests=max_requests@entry=0x7ffe34023c58) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm.c:115
        wp = 0x0
#4  0x0000000000438550 in main (argc=2, argv=0x7ffe34023f88) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_main.c:1880
        exit_status = 0
        c = -1
        use_extended_info = 0
        file_handle = {handle = {fd = 0, fp = 0x0, stream = {handle = 0x0, isatty = 0, mmap = {len = 6866955149093720143, pos = 31632276063478121, map = 0x6168732f7273752f,
                buf = 0x6c61636f6c2f6572 <Address 0x6c61636f6c2f6572 out of bounds>, old_handle = 0x0, old_closer = 0xff00}, reader = 0x0, fsizer = 0xff00000000000000, closer = 0x5f4c53534e45504f}},
          filename = 0x70616332336169 <Address 0x70616332336169 out of bounds>, opened_path = 0x6168732f7273752f, type = 1815045490, free_filename = 111 'o'}
        orig_optind = 1
        orig_optarg = 0x0
        ini_entries_len = 0
        max_requests = 500
        requests = 0
        fcgi_fd = 0
        fpm_config = 0x0
        fpm_prefix = 0x0
        fpm_pid = 0x0
        test_conf = 0
        force_daemon = 0
        force_stderr = 0
        php_information = 0
        php_allow_to_run_as_root = 0
        __func__ = "main"
(gdb) frame 0
#0  0x00000000006fd156 in fpm_event_fire (ev=0x15986d0) at /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c:467
467     in /home/builder/buildbot/php-7.2-bcos7x64/build/7.2/php/work/php-7.2.9/sapi/fpm/fpm/fpm_events.c
(gdb) print *ev
$1 = {fd = 1801807218, timeout = {tv_sec = 4047978859204596295, tv_usec = 7307090632390554936}, frequency = {tv_sec = 6874014757463356002, tv_usec = 4047978859204594761}, callback = 0x726800382d393538, arg = 0x6800696b73746176,
  flags = 1380474738, index = 1330858286, which = 14381}
(gdb) print epollfds
$2 = (struct epoll_event *) 0x1595100
(gdb) print epollfds[0]
$3 = {events = 1, data = {ptr = 0xadf840 <signal_fd_event.13578>, fd = 11401280, u32 = 11401280, u64 = 11401280}}
(gdb) print epollfds[1]
$4 = {events = 16, data = {ptr = 0x1598688, fd = 22644360, u32 = 22644360, u64 = 22644360}}
(gdb) print epollfds[2]
$5 = {events = 16, data = {ptr = 0x15986d0, fd = 22644432, u32 = 22644432, u64 = 22644432}}
(gdb) print epollfds[3]
$6 = {events = 0, data = {ptr = 0x0, fd = 0, u32 = 0, u64 = 0}}
(gdb) print *(struct fpm_event_s *)epollfds[0].data.ptr
$7 = {fd = 6, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x6fc9d0 <fpm_got_signal>, arg = 0x0, flags = 2, index = 6, which = 2}
(gdb) print *(struct fpm_event_s *)epollfds[1].data.ptr
$8 = {fd = 7300965, timeout = {tv_sec = 4051037683542732371, tv_usec = 7809636479359858997}, frequency = {tv_sec = 7811212552778113897, tv_usec = 3264919828641826143}, callback = 0x6700312d39353838, arg = 0x6564006e616d7265,
  flags = 776291423, index = 760173385, which = 14392}
(gdb) print *(struct fpm_event_s *)epollfds[2].data.ptr
$9 = {fd = 1801807218, timeout = {tv_sec = 4047978859204596295, tv_usec = 7307090632390554936}, frequency = {tv_sec = 6874014757463356002, tv_usec = 4047978859204594761}, callback = 0x726800382d393538, arg = 0x6800696b73746176,
  flags = 1380474738, index = 1330858286, which = 14381}


Notice that in all backtraces epollfds[0] value is OK, while epollfds[1] and epollfds[2] seem to contain corrupted values (in particular, callback pointer is invalid, which likely caused segmentation fault).

Patches

Pull Requests

Pull requests:

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-01-18 09:57 UTC] mnikulin at plesk dot com
Just some speculations, I may be wrong.

It seems that it is possible to get signal notification
before read on children stdout/stderr file descriptor event
in epoll_wait() results. I am afraid that fpm_children_bury()
may free some fpm_child_s structures but epoll events with greater
indices may have the same pointers in epoll_event.data.ptr,
so the attempt to call ev->callback obtained from overwritten
memory may cause segfault.

I have no idea which way zero pointer may appear in

fpm_event_epoll_wait()
sapi/fpm/fpm/events/epoll.c:139
		/* do we have a valid ev ptr ? */
		if (!epollfds[i].data.ptr) {
			continue;
		}

if everything works correctly. I am suspicious concerning
the following code as well

fpm_stdio_child_said()
sapi/fpm/fpm/fpm_stdio.c:129
	if (!arg) {
		return;
	}

From my point of view it may be struggling with
symptoms, leaving the root cause untouched. I admit
that it makes failures more rare (and harder to reproduce).

Deleting of some data while iterating over events require
great care. I would prefer to have io handling and
freeing of memory separated.

I have a test server with tuned pool configuration to make
children death frequent enough

    pm = ondemand
    pm.process_idle_timeout = 2s
    php_value[error_reporting] = 22519
    pm.max_requests = 2

I have ~30 pools, maybe it is enough o have a single pool with
max_children ~ 100.

Under moderate load crashes of master process happen often enough
(from several seconds to several minutes).
 [2019-01-18 11:34 UTC] mnikulin at plesk dot com
backtrace variant with patch from Bug #77430 applied.
Notice ev->index = 1 that means that the event has been removed.

(gdb) bt
#0  0x000000000086e717 in zlog_stream_finish (stream=0xf0) at /usr/src/debug/sapi/fpm/fpm/zlog.c:811
#1  0x000000000086a471 in fpm_stdio_child_said (ev=0x1dc92e0, which=2, arg=0x1dc9270) at /usr/src/debug/sapi/fpm/fpm/fpm_stdio.c:218
#2  0x000000000085c1e0 in fpm_event_fire (ev=0x1dc92e0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
#3  0x000000000086f4f8 in fpm_event_epoll_wait (queue=0x1dc9370, timeout=862) at /usr/src/debug/sapi/fpm/fpm/events/epoll.c:145
#4  0x000000000085bf24 in fpm_event_loop (err=0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:409
#5  0x000000000085420f in fpm_run (max_requests=0x7ffeb1eeb58c) at /usr/src/debug/sapi/fpm/fpm/fpm.c:113
#6  0x00000000008618e9 in main (argc=2, argv=0x7ffeb1eeb7c8) at /usr/src/debug/sapi/fpm/fpm/fpm_main.c:1873
(gdb) f 2
#2  0x000000000085c1e0 in fpm_event_fire (ev=0x1dc92e0) at /usr/src/debug/sapi/fpm/fpm/fpm_events.c:465
465	/usr/src/debug/sapi/fpm/fpm/fpm_events.c: No such file or directory.
(gdb) p ev
$1 = (struct fpm_event_s *) 0x1dc92e0
(gdb) p *ev
$2 = {fd = 83, timeout = {tv_sec = 0, tv_usec = 0}, frequency = {tv_sec = 0, tv_usec = 0}, callback = 0x86a015 <fpm_stdio_child_said>, arg = 0x1dc9270, flags = 2, index = -1, which = 2}
 [2019-01-18 11:41 UTC] mnikulin at plesk dot com
It seems that the following quick hack with delayed bury
at least significantly improves stability.

--- sapi/fpm/fpm/fpm_events.c.orig	2019-01-18 11:09:38.000000000 +0000
+++ sapi/fpm/fpm/fpm_events.c	2019-01-18 11:12:53.000000000 +0000
@@ -51,6 +51,14 @@
 }
 /* }}} */
 
+static struct fpm_event_s children_bury_timer;
+
+static void fpm_postponed_children_bury(struct fpm_event_s *ev, short which, void *arg) /* {{{ */
+{
+	fpm_children_bury();
+}
+/* }}} */
+
 static void fpm_got_signal(struct fpm_event_s *ev, short which, void *arg) /* {{{ */
 {
 	char c;
@@ -72,7 +80,8 @@
 		switch (c) {
 			case 'C' :                  /* SIGCHLD */
 				zlog(ZLOG_DEBUG, "received SIGCHLD");
-				fpm_children_bury();
+				fpm_event_set_timer(&children_bury_timer, 0, &fpm_postponed_children_bury, NULL);
+				fpm_event_add(&children_bury_timer, 0);
 				break;
 			case 'I' :                  /* SIGINT  */
 				zlog(ZLOG_DEBUG, "received SIGINT");
 [2019-07-10 03:09 UTC] mnikulin at plesk dot com
Bug #77185 suggests a way to reproduce the issue.
With older libc (e.g. in CentOS-7) the following environment
variable may be set to certainly catch use after free cases

    MALLOC_PERTURB_=165

The patch fpm-race-condition.patch attached to the Bug #65398
does not eliminate the problem.
 [2019-07-10 04:06 UTC] mnikulin at plesk dot com
The following pull request has been associated:

Patch Name: Prevent use after free in fpm_event_epoll_wait
On GitHub:  https://github.com/php/php-src/pull/4385
Patch:      https://github.com/php/php-src/pull/4385.patch
 [2019-07-22 08:41 UTC] nikic@php.net
-Status: Open +Status: Duplicate
 [2019-07-22 08:41 UTC] nikic@php.net
Marking this as a duplicate of bug #77185, which has detailed analysis and is now fixed.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Dec 01 17:01:32 2024 UTC