php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #60961 Graceful Restart (USR2) isn't very graceful
Submitted: 2012-02-02 20:25 UTC Modified: 2021-12-09 22:15 UTC
Votes:149
Avg. Score:4.7 ± 0.7
Reproduced:134 of 135 (99.3%)
Same Version:40 (29.9%)
Same OS:26 (19.4%)
From: phpbugs at oops dot mooo dot com Assigned: bukka (profile)
Status: Assigned Package: FPM related
PHP Version: 7.0 OS: Debian Squeeze
Private report: No CVE-ID: None
 [2012-02-02 20:25 UTC] phpbugs at oops dot mooo dot com
Description:
------------
I just compiled a new PHP+APC with the CVE-2012-0830 fix. It looks like all/some of the the active requests died. I had the same problem when upgrading to 5.3.8 and 5.3.9 too.

# cat /var/log/nginx/error.log
 *empty*
# cat /var/run/php-fpm.pid
2161
# ps aux | fgrep -i 2161
root      2161  0.1  0.2 123692  4520 ?        Ss   06:28   0:00 php-fpm: master process (/opt/php/etc/php-fpm.conf)
# kill -USR2 2161
# cat /var/log/nginx/error.log
2012/02/02 21:08:26 [error] 25004#0: *7381002 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7381001 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7372696 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7381238 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7374985 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7369723 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7360478 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7371999 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7375111 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX
2012/02/02 21:08:26 [error] 25004#0: *7381000 readv() failed (104: Connection reset by peer) while reading upstream, client: xx.xx.xx.xx, server: XXX

This gives 502 Bad Gateway on the client.


Patches

Pull Requests

Pull requests:

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2012-05-23 13:30 UTC] fat@php.net
-Assigned To: +Assigned To: fat
 [2012-07-17 06:40 UTC] megazubr at gmail dot com
Centos 5.8, php 5.4.4 compiled from source with --enable-fpm
When doing USR2 on master pid of php-fpm daemon
1. kill child(502 Bad gatway on long_loop.php in same time)
2. restart daemon with another master pid

As i see, it's not a graceful reload.

master pid - 12788
child(long_loop.php) - 12795

strace -f -s 8000 -p 12788

[pid 12795] munmap(0x2b017edd1000, 1168) = 0
[pid 12795] munmap(0x2b017edd2000, 1168) = 0
[pid 12795] munmap(0x2b017edd3000, 1168) = 0
[pid 12795] munmap(0x2b017edb5000, 112) = 0
[pid 12795] time(NULL)                  = 1342505962
[pid 12795] dup2(1, 2)                  = 2
[pid 12795] close(4)                    = 0
[pid 12795] dup2(13, 0)                 = 0
[pid 12795] geteuid()                   = 0
[pid 12795] setgid(500)                 = 0
[pid 12795] open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 4
[pid 12795] read(4, "65536\n", 31)      = 6
[pid 12795] close(4)                    = 0
[pid 12795] open("/etc/group", O_RDONLY) = 4
[pid 12795] fcntl(4, F_GETFD)           = 0
[pid 12795] fcntl(4, F_SETFD, FD_CLOEXEC) = 0
[pid 12795] fstat(4, {st_mode=S_IFREG|0644, st_size=693, ...}) = 0
[pid 12795] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, 
-1, 0) = 0x2b017b3ce000
[pid 12795] lseek(4, 0, SEEK_CUR)       = 0
[pid 12795] read(4, 
"root:x:0:root\nbin:x:1:root,bin,daemon\ndaemon:x:2:root,bin,daemon\nsys:x:3:roo
t,bin,adm\nadm:x:4:root,adm,daemon\ntty:x:5:\ndisk:x:6:root\nlp:x:7:daemon,lp\nm
em:x:8:\nkmem:x:9:\nwheel:x:10:root\nmail:x:12:mail,exim\nnews:x:13:news\nuucp:x
:14:uucp\nman:x:15:\ngames:x:20:\ngopher:x:30:\ndip:x:40:\nftp:x:50:\nlock:x:54:
\nnobody:x:99:\nusers:x:100:\nnscd:x:28:\nfloppy:x:19:\nvcsa:x:69:\npcap:x:77:\n
utmp:x:22:\nutempter:x:35:\nslocate:x:21:\naudio:x:63:\nrpc:x:32:\nmailnull:x:47
:\nsmmsp:x:51:\necryptfs:x:101:\nsshd:x:74:\ndbus:x:81:\navahi:x:70:\nrpcuser:x:
29:\nnfsnobody:x:65534:\nhaldaemon:x:68:\navahi-
autoipd:x:102:\nmysql:x:27:\nftpgroup:x:107:\nexim:x:93:\nscreen:x:84:\nfwh:x:50
0:\nqwqwqw:x:501:\n1q2w3e:x:502:\n1q:x:503:\n2w:x:504:\n", 4096) = 693
[pid 12795] read(4, "", 4096)           = 0
[pid 12795] close(4)                    = 0
[pid 12795] munmap(0x2b017b3ce000, 4096) = 0
[pid 12795] setgroups(1, [500])         = 0
[pid 12795] setuid(500)                 = 0
[pid 12795] prctl(0x4, 0x1, 0, 0, 0)    = 0
[pid 12795] clock_gettime(CLOCK_MONOTONIC, {1029872, 690313456}) = 0
[pid 12795] close(6)                    = 0
[pid 12795] close(8)                    = 0
[pid 12795] rt_sigaction(SIGTERM, {SIG_DFL, [], SA_RESTORER, 0x322c6302f0}, 
NULL, 8) = 0
[pid 12795] rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x322c6302f0}, NULL, 
8) = 0
[pid 12795] rt_sigaction(SIGUSR1, {SIG_DFL, [], SA_RESTORER, 0x322c6302f0}, 
NULL, 8) = 0
[pid 12795] rt_sigaction(SIGUSR2, {SIG_DFL, [], SA_RESTORER, 0x322c6302f0}, 
NULL, 8) = 0
[pid 12795] rt_sigaction(SIGCHLD, {SIG_DFL, [], SA_RESTORER, 0x322c6302f0}, 
NULL, 8) = 0
[pid 12795] rt_sigaction(SIGQUIT, {0x7c33f0, [], SA_RESTORER|SA_RESTART, 
0x322c6302f0}, NULL, 8) = 0
[pid 12795] close(9)                    = 0
[pid 12795] close(10)                   = 0
[pid 12795] close(11)                   = 0
[pid 12795] close(12)                   = 0
[pid 12795] close(13)                   = 0
[pid 12795] clock_gettime(CLOCK_MONOTONIC, {1029872, 692338456}) = 0
[pid 12795] accept(0, {sa_family=AF_INET, sin_port=htons(60781), 
sin_addr=inet_addr("127.0.0.1")}, [16]) = 4
[pid 12795] clock_gettime(CLOCK_MONOTONIC, {1029872, 692611456}) = 0
[pid 12795] time(NULL)                  = 1342505962
[pid 12795] times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 
1821051559
[pid 12795] poll([{fd=4, events=POLLIN}], 1, 5000) = 1 ([{fd=4, 
revents=POLLIN}])
[pid 12795] read(4, "\1\1\0\1\0\10\0\0", 8) = 8
[pid 12795] read(4, "\0\1\0\0\0\0\0\0", 8) = 8
[pid 12795] read(4, "\1\4\0\1\4\313\5\0", 8) = 8
[pid 12795] read(4, 
"\17'SCRIPT_FILENAME/home/_htdocs/public_html/long_loop.php\f\0QUERY_STRING\16\3
REQUEST_METHODGET\f\0CONTENT_TYPE\16\0CONTENT_LENGTH\v\16SCRIPT_NAME/long_loop.p
hp\v\16REQUEST_URI/long_loop.php\f\16DOCUMENT_URI/long_loop.php\r\31DOCUMENT_ROO
T/home/_htdocs/public_html\17\10SERVER_PROTOCOLHTTP/1.1\21\7GATEWAY_INTERFACECGI
/1.1\17\vSERVER_SOFTWAREnginx/1.2.2\v\16REMOTE_ADDR89.221.202.206\v\5REMOTE_PORT
58343\v\16SERVER_ADDR174.142.69.107\v\2SERVER_PORT80\v\20SERVER_NAMEfwh.webasyst
.com\t\20HTTP_HOSTfwh.webasyst.com\17\nHTTP_CONNECTIONkeep-
alive\22\tHTTP_CACHE_CONTROLmax-age=0\17\200\0\0\215HTTP_USER_AGENTMozilla/5.0 
(X11; Linux x86_64) AppleWebKit/535.19 (KHTML, like Gecko) Ubuntu/11.10 
Chromium/18.0.1025.168 Chrome/18.0.1025.168 Safari/535.19\v?
HTTP_ACCEPTtext/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\24\21
HTTP_ACCEPT_ENCODINGgzip,deflate,sdch\24#HTTP_ACCEPT_LANGUAGEru-RU,ru;q=0.8,en-
US;q=0.6,en;q=0.4\23 HTTP_ACCEPT_CHARSETwindows-1251,utf-
8;q=0.7,*;q=0.3\v\200\0\0012HTTP_COOKIE__utma=144069043.2077055053.1307503204.13
42164095.1342166094.132; __utmc=144069043; 
__utmz=144069043.1341978575.128.65.utmcsr=feedburner|utmccn=Feed:%20webasystcom/
ru%20(Webasyst)|utmcmd=feed; PHPSESSID=uvg77mtbskgn2i0avlmmtaola4; 
_csrf=5003fbcba78297.60899625; 
last_page=1%5E%5E%5E%2Fwebasyst%2Fhosting%2F\0\0\0\0\0", 1232) = 1232
[pid 12795] read(4, "\1\4\0\1\0\0\0\0", 8) = 8
[pid 12795] clock_gettime(CLOCK_MONOTONIC, {1029872, 694032456}) = 0
[pid 12795] gettimeofday({1342505962, 584135}, NULL) = 0
[pid 12795] stat("/home/_htdocs/public_html/.user.ini", 0x7fffba4db660) = -1 
ENOENT (No such file or directory)
[pid 12795] setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={60, 0}}, NULL) 
= 0
[pid 12795] rt_sigaction(SIGPROF, {0x7078d0, [PROF], SA_RESTORER|SA_RESTART, 
0x322c6302f0}, {SIG_DFL, [], 0}, 8) = 0
[pid 12795] rt_sigprocmask(SIG_UNBLOCK, [PROF], NULL, 8) = 0
[pid 12795] lstat("/home/_htdocs/public_html/long_loop.php", 
{st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs", {st_mode=S_IFDIR|0711, st_size=4096, ...}) = 
0
[pid 12795] lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html/long_loop.php", 
{st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs", {st_mode=S_IFDIR|0711, st_size=4096, ...}) = 
0
[pid 12795] lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html/long_loop.php", 
{st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs", {st_mode=S_IFDIR|0711, st_size=4096, ...}) = 
0
[pid 12795] lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html/long_loop.php", 
{st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] lstat("/home/_htdocs/public_html", {st_mode=S_IFDIR|0755, 
st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs", {st_mode=S_IFDIR|0711, st_size=4096, ...}) = 
0
[pid 12795] lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 12795] lstat("/home/_htdocs", {st_mode=S_IFDIR|0711, st_size=4096, ...}) = 
0
[pid 12795] lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 12795] open("/home/_htdocs/public_html/long_loop.php", O_RDONLY) = 6
[pid 12795] fstat(6, {st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] fstat(6, {st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] fstat(6, {st_mode=S_IFREG|0644, st_size=76, ...}) = 0
[pid 12795] mmap(NULL, 76, PROT_READ, MAP_SHARED, 6, 0) = 0x2b017b3ce000
[pid 12795] clock_gettime(CLOCK_MONOTONIC, {1029872, 698631456}) = 0
[pid 12795] getcwd("/usr/local/nginx/logs"..., 4095) = 22
[pid 12795] chdir("/home/_htdocs/public_html") = 0
[pid 12795] setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={30, 0}}, NULL) 
= 0
[pid 12795] munmap(0x2b017b3ce000, 76)  = 0
[pid 12795] close(6)                    = 0
[pid 12795] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 12795] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], SA_RESTORER, 
0x322c6302f0}, 8) = 0
[pid 12795] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 12795] nanosleep({1, 0},  <unfinished ...>
[pid 12788] <... epoll_wait resumed> {}, 1, 737) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029873, 411240456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029873, 411340456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029873, 411447456}) = 0
[pid 12788] epoll_wait(16,  <unfinished ...>
[pid 12795] <... nanosleep resumed> {1, 0}) = 0
[pid 12795] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 12795] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], SA_RESTORER, 
0x322c6302f0}, 8) = 0
[pid 12795] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 12795] nanosleep({1, 0},  <unfinished ...>
[pid 12788] <... epoll_wait resumed> {}, 1, 1000) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029874, 411323456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029874, 411442456}) = 0
[pid 12788] getsockopt(13, SOL_TCP, TCP_INFO, "\n\0\0\0\0\0\0\0\300\306-
\0\0\0\0\0\30\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\240\23]=\0\0
\0\0\240\23]=\240\23]=\0\0\0\0\0\0\0\0\0\0\0\0\260q\v\0\377\377\377\177\2\0\0\0\
0\0\0\0\3\0\0\0", [92]) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029874, 411797456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029874, 411921456}) = 0
[pid 12788] epoll_wait(16,  <unfinished ...>
[pid 12795] <... nanosleep resumed> {1, 0}) = 0
[pid 12795] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 12795] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], SA_RESTORER, 
0x322c6302f0}, 8) = 0
[pid 12795] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid 12795] nanosleep({1, 0},  <unfinished ...>
[pid 12788] <... epoll_wait resumed> 13bda920, 1, 1000) = -1 EINTR (Interrupted 
system call)
[pid 12788] --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
[pid 12788] write(8, "2", 1)            = 1
[pid 12788] rt_sigreturn(0x8)           = -1 EINTR (Interrupted system call)
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029875, 99361456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029875, 99472456}) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC, {1029875, 99580456}) = 0
[pid 12788] epoll_wait(16, {{EPOLLIN, {u32=15680416, u64=15680416}}}, 1, 312) = 
1
[pid 12788] read(6, "2", 1)             = 1
[pid 12788] gettimeofday({1342505964, 989857}, NULL) = 0
[pid 12788] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3477, ...}) = 
0
[pid 12788] write(4, "[17-Jul-2012 02:19:24] NOTICE: Reloading in progress 
...\n", 57) = 57
[pid 12788] kill(12795, SIGQUIT <unfinished ...>
[pid 12795] <... nanosleep resumed> 0x7fffba4d94a0) = ? ERESTART_RESTARTBLOCK 
(To be restarted)
[pid 12788] <... kill resumed> )        = 0
[pid 12795] --- SIGQUIT (Quit) @ 0 (0) ---
[pid 12788] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 12795] close(0 <unfinished ...>
[pid 12788] <... clock_gettime resumed> {1029875, 100807456}) = 0
[pid 12795] <... close resumed> )       = 0
[pid 12788] read(6,  <unfinished ...>
[pid 12795] socket(PF_FILE, SOCK_STREAM, 0 <unfinished ...>
[pid 12788] <... read resumed> 0x7fffba4db8b7, 1) = -1 EAGAIN (Resource 
temporarily unavailable)
[pid 12795] <... socket resumed> )      = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 12795] rt_sigreturn(0x1 <unfinished ...>
[pid 12788] <... clock_gettime resumed> {1029875, 101356456}) = 0
[pid 12795] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
[pid 12788] kill(12795, SIGTERM <unfinished ...>
[pid 12795] rt_sigprocmask(SIG_BLOCK, [CHLD],  <unfinished ...>
[pid 12788] <... kill resumed> )        = 0
[pid 12795] <... rt_sigprocmask resumed> [], 8) = 0
[pid 12788] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 12795] --- SIGTERM (Terminated) @ 0 (0) ---
Process 12795 detached
<... clock_gettime resumed> {1029875, 101834456}) = 0
clock_gettime(CLOCK_MONOTONIC, {1029875, 103461456}) = 0
clock_gettime(CLOCK_MONOTONIC, {1029875, 103566456}) = 0
clock_gettime(CLOCK_MONOTONIC, {1029875, 103843456}) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
write(8, "C", 1)                        = 1
rt_sigreturn(0x8)                       = 0
epoll_wait(16, {{EPOLLIN, {u32=15680416, u64=15680416}}}, 1, 308) = 1
read(6, "C", 1)                         = 1
wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGTERM}], WNOHANG|WSTOPPED, NULL) 
= 12795
clock_gettime(CLOCK_MONOTONIC, {1029875, 104400456}) = 0
gettimeofday({1342505964, 994394}, NULL) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3477, ...}) = 0
write(4, "[17-Jul-2012 02:19:24] NOTICE: reloading: execvp(\"/usr/sbin/php-
fpm\", {\"/usr/sbin/php-fpm\", \"--fpm-config\", \"/etc/php-fpm.conf\"})\n", 
130) = 130
munmap(0x2b017b3cf000, 1168)            = 0
munmap(0x2b017b3d0000, 1168)            = 0
munmap(0x2b017b3d1000, 1168)            = 0
munmap(0x2b017b3d2000, 1168)            = 0
munmap(0x2b017b3d3000, 1168)            = 0
munmap(0x2b017b3d4000, 1168)            = 0
munmap(0x2b017b3d5000, 1168)            = 0
munmap(0x2b017b3d6000, 1168)            = 0
munmap(0x2b017b3d7000, 1168)            = 0
munmap(0x2b017eb57000, 1168)            = 0
munmap(0x2b017eb58000, 1168)            = 0
munmap(0x2b017eb59000, 1168)            = 0
munmap(0x2b017ed65000, 1168)            = 0
 [2013-02-13 09:47 UTC] php-bugs at puzzled dot xs4all dot nl
CentOS 6.3 with php-5.3.3 and the FPM code from 5.3.20 still has the same issue. FWIW this makes it challenging to use nginx with php-fpm because it results in a 502 Bad Gateway. I'm more than happy to help testing a patch.
 [2013-02-13 15:57 UTC] phpbugs at oops dot mooo dot com
Try setting process_control_timeout to something higher than 0.
 [2013-09-18 16:22 UTC] baohx2000 at gmail dot com
Having this issue on 5.5.3.  Sending USR2 to master process kills the master and 
orphans the children.  Have to then kill all children before I can restart the 
master process.
 [2014-02-19 19:35 UTC] ptr dot wang at gmail dot com
php5-fpm 5.5.8+dfsg-3+sury.org~precise+2 on ubuntu precise has the same problem.
killing USR2 signal to php5-fpm master process causes all php5-fpm processes (include the master process) replaced, and the active requests end up with HTTP 502 Bad gateway error on the nginx side.
 [2014-07-24 11:29 UTC] yinchuan at baixing dot com
CentOS 6.5 + Nginx 1.4.7 + PHP 5.4.9 with php-fpm
"fastcgi_keep_conn on;" in nginx.conf
set pm.max_request=100, then when thd 100th reqest reach, php process exit, and nginx log error:"... recv() failed (104: Connection reset by peer) while reading response header from upstream, client: XXX"

when set "fastcgi_keep_conn off;", everything is ok.
 [2015-06-05 11:22 UTC] razvanphp at yahoo dot com
I have the same problem with PHP 5.4.39-0+deb7u2, latest Debian Wheezy.

Nginx mainline 1.9 uses unix socket connection to php-fpm, but it results in 502 when reloading php-fpm service.

Found this in the code, that seems to cause this:

static void sig_soft_quit(int signo) /* {{{ */
{
       int saved_errno = errno;

       /* closing fastcgi listening socket will force fcgi_accept() exit immediately */
       close(0);
       if (0 > socket(AF_UNIX, SOCK_STREAM, 0)) {
               zlog(ZLOG_WARNING, "failed to create a new socket");
       }
       fpm_php_soft_quit();
       errno = saved_errno;
}
 [2017-01-02 16:28 UTC] konrad dot baumgart at moneyhouse dot de
This bug affects many people, but it seems the assigned person does not contribute to bugfixing anymore.

Can we get it unassigned/reassigned?

Bug is still present with nginx 1.10 and php 7 communicating with unix socket
 [2017-01-26 11:05 UTC] albertcasademont at gmail dot com
Yes please, a graceful reload is needed urgently. And more generally, php-fpm needs some love
 [2017-02-01 14:49 UTC] requinix@php.net
-Status: Assigned +Status: Open -Assigned To: fat +Assigned To:
 [2017-02-01 14:49 UTC] requinix@php.net
-PHP Version: 5.3.9 +PHP Version: 7.0
 [2017-10-21 16:11 UTC] luke at viovet dot co dot uk
This is also affecting PHP 7.1 on Ubuntu 14.04 and 16.04. Please fix asap.
 [2017-11-30 17:17 UTC] federico dot omoto at gmail dot com
And this is also affecting the recently released PHP 7.2.0.
Is this bug expected to be fixed in the short term?
 [2018-07-01 08:35 UTC] i at xstudio dot me
mark!
 [2018-08-23 15:33 UTC] danny dot kopping at hetzner dot co dot za
Hi

We have been experiencing this issue for a while, and after some investigation - we think we have isolated the issue.

This is how we understand it:
When FPM receives a USR2 signal, it attempts to wait for its children to finish processing, and then sends a QUIT signal to each child. If a child does not respond in the amount of time defined in "process_control_timeout" then the master process sends a TERM.

Now, while FPM is waiting for its children to terminate, it goes into a "reloading" state, it would seem. It prints the message "Reloading in progress ..." to its log. During this time, one can send as many USR2 signals to the master process as one likes - it will not cause any problems. However, if one sends a USR2 signal after the `execvp("/usr/sbin/php-fpm7.1", {"/usr/sbin/php-fpm7.1", "--nodaemonize", "--fpm-config", "/etc/php/7.1/fpm/php-fpm.conf"})` syscall and before receiving the "fpm is running, pid xxx" log entry - then it will crash the master process and consequently all the children.

Our suspicion is that the signal handling is buggy between startup time and moving into the "ready to accept connections" state (without looking at the source).
 [2019-10-03 11:43 UTC] mike@php.net
The following pull request has been associated:

Patch Name: php-fpm: graceful restart without blocking/losing requests
On GitHub:  https://github.com/php/php-src/pull/3758
Patch:      https://github.com/php/php-src/pull/3758.patch
 [2019-12-05 14:21 UTC] ingo dot bez at gmx dot de
May I ask carefully if and when this problem will be solved?
Seven years open, a patch functional since the beginning of 2017 and public for 11 months now... Wow.
 [2021-12-09 22:15 UTC] bukka@php.net
-Assigned To: +Assigned To: bukka
 [2021-12-09 22:15 UTC] bukka@php.net
There is now an improvement in PHP 8.1 that might help a bit. It's from this PR https://github.com/php/php-src/pull/6753 that makes it possible to configure max spawn rate so it's done much faster. I will keep this open as it's not a full solution but it can significantly reduce the mentioned issues.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue Dec 03 17:01:29 2024 UTC