php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #78405 Fpm keeps killing idle children claiming they timed out
Submitted: 2019-08-12 13:22 UTC Modified: -
From: sjon@php.net Assigned:
Status: Open Package: FPM related
PHP Version: 7.3.8 OS: archLinux
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2019-08-12 13:22 UTC] sjon@php.net
Description:
------------
I have request_terminate_timeout=5 configured and my fpm logs are littered with these sort of errors:

child 5268, script '' (request: \" \") execution timed out (5.850341 sec), terminating
child 5268 exited on signal 15 (SIGTERM) after 6.609585 seconds from start

the first timestamp is always ~ 5-6s, but the second one fluctuates wildly between 6 and 800 seconds). It's weird how there is no script or request logged, but looking at the strace that is correct - these childs are idle...

why are they getting terminated?

Actual result:
--------------
strace on the master:

10:15:46.708090 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f90dd4d3ed0) = 5268
...
10:15:53.277060 read(53, "\0", 1023)    = 1
10:15:53.284168 read(53, 0x7ffd8f35d9b0, 1023) = -1 EAGAIN (Resource temporarily unavailable)
10:15:53.291516 getpid()                = 369
10:15:53.298583 getpid()                = 369
10:15:53.306064 kill(5268, SIGTERM)     = 0
10:15:53.310332 write(4, "[06-Aug-2019 10:15:53] WARNING: [pool xxx] child 5268, script '' (request: \" \") execution timed out (5.850341 sec), terminating\n", 142) = 142
10:15:53.311267 getpid()                = 369
10:15:53.311843 getpid()                = 369
10:15:53.312393 epoll_wait(29, [{EPOLLHUP, {u32=3302801088, u64=94552712855232}}, {EPOLLHUP, {u32=3302801016, u64=94552712855160}}], 177, 480) = 2
10:15:53.314538 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=5268, si_uid=33, si_status=SIGTERM, si_utime=3, si_stime=3} ---
10:15:53.314705 getpid()                = 369
10:15:53.315533 write(17, "C", 1)       = 1
10:15:53.315899 rt_sigreturn({mask=[]}) = 2
10:15:53.316174 read(54, "", 1023)      = 0
10:15:53.316526 epoll_ctl(29, EPOLL_CTL_DEL, 54, 0x7ffd8f35d95c) = 0
10:15:53.316743 close(54)               = 0
10:15:53.317029 getpid()                = 369
10:15:53.317338 read(42, "", 1023)      = 0
10:15:53.317489 epoll_ctl(29, EPOLL_CTL_DEL, 42, 0x7ffd8f35d95c) = 0
10:15:53.317635 close(42)               = 0
10:15:53.317871 getpid()                = 369
10:15:53.318330 getpid()                = 369
10:15:53.318497 epoll_wait(29, [{EPOLLIN, {u32=3281711456, u64=94552691765600}}], 177, 474) = 1
10:15:53.318777 read(16, "C", 1)        = 1
10:15:53.318968 wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGTERM}], WNOHANG|WSTOPPED, NULL) = 5268
10:15:53.319370 write(4, "[06-Aug-2019 10:15:53] WARNING: [pool xxx] child 5268 exited on signal 15 (SIGTERM) after 6.609585 seconds from start\n", 132) = 132


strace of the child:
10:15:47.449977 write(10, "xxx -  06/aug/2019:10:15:46 +0200 \"GET /xxx.php\" 200 /xxx.php 721.021ms 4096Kb 8.32%\n", 192) = 192
10:15:47.450308 write(4, "\1\6\0\1\37\370\0\0Set-Cookie: SessionID="..., 8192) = 8192
10:15:47.450432 setitimer(ITIMER_PROF, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}, NULL) = 0
10:15:47.450518 umask(022)              = 000
10:15:47.450618 close(8)                = 0
10:15:47.450724 close(7)                = 0
10:15:47.450807 close(6)                = 0
10:15:47.451105 close(5)                = 0
10:15:47.451301 sendto(11, "\1\0\0\0\1", 5, MSG_DONTWAIT, NULL, 0) = 5
10:15:47.451448 close(11)               = 0
10:15:47.451542 sendto(9, "\1\0\0\0\1", 5, MSG_DONTWAIT, NULL, 0) = 5
10:15:47.451634 close(9)                = 0
10:15:47.451738 fcntl(3, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
10:15:47.451818 write(4, "\1\6\0\1\v\200\0\0html>"..., 2968) = 2968
10:15:47.451940 munmap(0x7f90cf800000, 2097152) = 0
10:15:47.452132 setitimer(ITIMER_PROF, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}, NULL) = 0
10:15:47.452213 times({tms_utime=3, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 1431814353
10:15:47.452284 read(4, "\1\5\0\1\0\0\0\0", 8) = 8
10:15:47.452359 read(4, 0x7ffd8f34ddb8, 8) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
10:15:53.310090 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=369, si_uid=0} ---
10:15:53.310291 rt_sigaction(SIGTERM, NULL, {sa_handler=0x55fec30b6bf0, sa_mask=~[ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP TSTP TTIN TTOU SYS RTMIN RT_1], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f90df9517e0}, 8) = 0
10:15:53.310468 rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f90df9517e0}, NULL, 8) = 0
10:15:53.310711 rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
10:15:53.311334 getpid()                = 5268
10:15:53.311631 kill(5268, SIGTERM)     = 0
10:15:53.311929 --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=5268, si_uid=33} ---
10:15:53.313948 +++ killed by SIGTERM +++

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-09-24 08:59 UTC] sjon@php.net
This might be "caused" by fastcgi_keep_conn being enabled. Possibly fpm keeps the child running due to keepalive being enabled but the master doesn't realize this and kills the child thinking it is timing out. Should this work or should this become a feature request?

See also https://stackoverflow.com/a/51457613/3749523
 
PHP Copyright © 2001-2019 The PHP Group
All rights reserved.
Last updated: Sun Dec 08 15:01:25 2019 UTC