php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77140 FPM occassionally hangs
Submitted: 2018-11-12 09:54 UTC Modified: -
Votes:2
Avg. Score:4.0 ± 1.0
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: tozz at kijkt dot tv Assigned:
Status: Open Package: FPM related
PHP Version: 7.0.32 OS: Debian 9
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2018-11-12 09:54 UTC] tozz at kijkt dot tv
Description:
------------
We sometimes face the issue that the FPM-daemon of PHP no longer responds to any queries/hits/requests. We've seen this issue occur on multiple on our servers, and it tends to happen to multiple servers over the course of a few days (2/3 days) and afterwards it can be fine for weeks. So it seems there is something happening on these servers that triggers it, but I have no clue what it is.

What happens is:
- All requests for sites using this FPM hang indefinitly (timeouts are seemingly ignored)
- After a while, Apache has 0 free slots and goes down as well (due to not having available slots)
- Restarting Apache solves the slot issue, but FPM is still not working, so slots drop to zero after a while again.
- Restarting FPM instantly resolves all issues (all busy Apache slots are freed, everything starts working again).

I suspect this has something to do with frequent restarts of the FPM daemon (due to external configuration changes), but cannot be certain. There are no issues in the FPM logfile.

When I strace the process I see:

strace: Process 1217 attached
getpid()                                = 1217
epoll_wait(26, [], 381, 90)             = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 238)            = 0
getpid()                                = 1217
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 27
getsockopt(27, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
setsockopt(27, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0
sendmsg(27, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, msg_namelen=21, msg_iov=[{iov_base="READY=1\nSTATUS=Processes active:"..., iov_len=87}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 87
close(27)                               = 0
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 761)            = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, [], 381, 1000)           = 0
getpid()                                = 1217
getpid()                                = 1217
getpid()                                = 1217
epoll_wait(26, ^Cstrace: Process 1217 detached
 <detached ...>


I have no experience with 'gdb', so I don't really know what to do, but I tried to get a trace. This didn't look very usefull to me, but who knows:

(gdb) bt
#0  0x00007f82247150a3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x000055f6ceccfc43 in ?? ()
#2  0x000055f6cecc6144 in fpm_event_loop ()
#3  0x000055f6cecc0b37 in fpm_run ()
#4  0x000055f6cea3bc6b in main ()
(gdb) 

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7f82277a58c0 (LWP 1217) "php-fpm" 0x00007f82247150a3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:84
(gdb) 


I tried to replicate the issue, but failed to do so. I also have limited time to debug this issue, because it is causing downtime.

We also see this issue on multiple PHP versions (ranging from 5.6 to 7.2). 


Patches

Add a Patch

Pull Requests

Add a Pull Request

 
PHP Copyright © 2001-2018 The PHP Group
All rights reserved.
Last updated: Tue Dec 11 22:01:26 2018 UTC