php-fpm + persistent sockets = frequent 502 bad gateway


Leon Sorokin

Oct 11, 2013, 12:58:52 PM
to highloa...@googlegroups.com
Hello all,

Put on your reading glasses - this will be a long-ish one.

First, what I'm doing. I'm building a web-app interface for some particularly slow TCP devices. Opening a socket to them takes 200ms and an fwrite/fread cycle takes another 300ms. To avoid repeating the connect on each request, I'm opening a persistent TCP socket, which cuts the response time by the aforementioned 200ms. I was hoping PHP-FPM would share the persistent connections between requests from different clients (and indeed it does!), but there are some issues which I haven't been able to resolve after 2 days of searching the internet, reading logs and modifying settings. I have narrowed it down somewhat, though.

Setup:
- Ubuntu 13.04 x64 Server (fully updated) on Linode
- PHP 5.5.0-6~raring+1 (fpm-fcgi)
- nginx/1.5.2

Relevant config:
nginx
-----
worker_processes 4;

php-fpm/pool.d
--------------
pm = dynamic
pm.max_children = 2
pm.start_servers = 2
pm.min_spare_servers = 2

Let's go from coarse to fine detail of what happens. After a fresh start I have 4x nginx processes and 2x php5-fpm processes waiting to handle requests. Then I send requests to the script every couple of seconds. The first takes a while to open the socket connection and returns with the data in about 500ms, the second returns data in 300ms (yay, it's re-using the socket), the third also succeeds in about 300ms, the fourth request = 502 Bad Gateway, same with the fifth. The sixth request once again returns data, except now it takes 500ms again. The process repeats for several cycles, after which every 4 requests result in 2x 502 Bad Gateways and 2x 500ms data responses.

If I double all the fpm pool values and have 4x php-fpm processes running, the cycle settles in with 4x successful 500ms responses followed by 4x Bad Gateway errors. If I don't use persistent sockets, this issue goes away, but then every request takes 500ms. What I suspect is happening is that the persistent socket keeps each php-fpm process from idling and ties it up, so the next one gets chosen until none are left. As they error out, maybe they are restarted and become available on the next round-robin loop, but the socket dies with the process. I haven't yet checked the 'slowlog', but the nginx error log shows lots of this:

*188 recv() failed (104: Connection reset by peer) while reading response header from upstream, client:...

All the suggestions on the internet for fixing nginx/php-fpm 502 Bad Gateway errors relate to high load or fastcgi_pass misconfiguration. This is not the case here. Increasing buffers/sizes, changing timeouts, switching from a unix socket to a tcp socket for fastcgi_pass, upping connection limits on the system... none of this stuff applies here.

I've had some other success with setting pm = ondemand rather than dynamic, but as soon as the initial fpm process gets killed off after idling, the persistent socket is gone for all subsequent php-fpm spawns. For the php script, I'm using stream_socket_client() with the STREAM_CLIENT_PERSISTENT flag, a while/stream_select() loop to detect socket data, and fread($sock, 4096) to grab the data. I don't call fclose(), obviously.
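A minimal sketch of the approach described above, for readers who want something concrete to look at. This is not the attached psock.php; the function names openDevice() and readReply() are hypothetical, and the device address is whatever you pass in.

```php
<?php
// Hypothetical sketch of the persistent-socket pattern described in this post.
// openDevice() and readReply() are illustrative names, not from psock.php.

/** Open (or re-use) a persistent TCP stream to the device at $address. */
function openDevice(string $address, float $timeout = 2.0)
{
    $sock = stream_socket_client(
        $address,
        $errno,
        $errstr,
        $timeout,
        STREAM_CLIENT_CONNECT | STREAM_CLIENT_PERSISTENT
    );
    if ($sock === false) {
        throw new RuntimeException("connect failed: $errstr ($errno)");
    }
    return $sock;
}

/** Read until the peer is quiet for $idleSec seconds or closes the stream. */
function readReply($sock, int $idleSec = 1): string
{
    $data = '';
    while (true) {
        $read = [$sock];
        $write = null;
        $except = null;
        $ready = stream_select($read, $write, $except, $idleSec, 0);
        if ($ready === false || $ready === 0) {
            break; // select error, or nothing arrived within the idle window
        }
        $chunk = fread($sock, 4096);
        if ($chunk === false || $chunk === '') {
            break; // peer closed the connection
        }
        $data .= $chunk;
    }
    return $data;
}
```

A request handler would call openDevice() with a placeholder address such as 'tcp://192.0.2.10:4001', fwrite() a command, call readReply(), and deliberately skip fclose() so the stream survives for the next request served by the same worker.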

If anyone has some additional questions or advice on how to get a persistent socket without tying up the php-fpm processes beyond the request completion, or maybe some other things to try, I'd appreciate it.

Some useful links:
http://serverfault.com/questions/302...fpm-recv-error
http://serverfault.com/questions/178...d-on-a-test-se
http://serverfault.com/questions/520...onal-duplicate
http://www.linuxquestions.org/questi...ckopen-552084/
http://stackoverflow.com/questions/1...ent-php-socket
http://devzone.zend.com/303/extensio...zend/#Heading3
http://stackoverflow.com/questions/2...m-socket-alive
http://php.net/manual/en/install.fpm.configuration.php
https://www.google.com/search?q=recv...&bih=953&dpr=1

Antony Dovgal

Oct 14, 2013, 7:21:22 AM
to highloa...@googlegroups.com
On 2013-10-11 20:58, Leon Sorokin wrote:
> Let's go from coarse to fine detail of what happens. After a fresh start I have 4x nginx processes and 2x php5-fpm processes waiting to handle requests. Then I send requests to the script every couple of seconds. The first takes a while to open the socket connection and returns with the data in about 500ms, the second returns data in 300ms (yay, it's re-using the socket), the third also succeeds in about 300ms, the fourth request = 502 Bad Gateway, same with the fifth. The sixth request once again returns data, except now it takes 500ms again. The process repeats for several cycles, after which every 4 requests result in 2x 502 Bad Gateways and 2x 500ms data responses.

What's in the FPM logs?

--
Wbr,
Antony Dovgal
---
http://pinba.org - realtime profiling for PHP

Leon Sorokin

Oct 16, 2013, 3:34:11 PM
to highloa...@googlegroups.com
Nothing particularly helpful, just stuff like: [14-Oct-2013 11:51:54] NOTICE: configuration file /etc/php5/fpm/php-fpm.conf test is successful

As the comment in http://serverfault.com/questions/302822/nginx-php-fpm-recv-error says, maybe the logs don't show anything wrong because there are no errors in the actual code. Though I would expect at least the process restarts to show up. Maybe there is some more info here: http://rtcamp.com/tutorials/php/fpm-slow-log/
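For anyone following along, here is a rough sketch of the FPM settings that tend to make worker exits and slow requests show up in the logs. The directive names are standard php-fpm ones, but the file locations and the 1s threshold are assumptions for an Ubuntu php5-fpm install, not values taken from my setup.

```ini
; Sketch only: paths and thresholds below are assumptions.

; php-fpm.conf, [global] section
error_log = /var/log/php5-fpm.log
; default is notice; debug is much more verbose
log_level = debug

; pool.d/www.conf, inside the pool section
; forward worker stdout/stderr to the main error log
catch_workers_output = yes
slowlog = /var/log/php5-fpm.slow.log
; dump a PHP backtrace for any request slower than this
request_slowlog_timeout = 1s
```

FPM needs a restart or reload after changing these.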

Antony Dovgal

Oct 17, 2013, 9:28:10 AM
to highloa...@googlegroups.com
On 2013-10-16 23:34, Leon Sorokin wrote:
> Nothing particularly helpful, just stuff like: [14-Oct-2013 11:51:54] NOTICE: configuration file /etc/php5/fpm/php-fpm.conf test is successful

No warnings at all?
That means it doesn't really crash there, otherwise you'd see something like this:
Jul 25 14:16:42.994796 [WARNING] [pool default] child 12741 exited on signal 11 (SIGSEGV - core dumped) after 588529.945226 seconds from start

Do you have a short but complete reproduce script?
I'd have to reproduce it myself in order to understand what's going on there.

Leon Sorokin

Oct 19, 2013, 2:08:12 AM
to highloa...@googlegroups.com
I've attached a reduced test case. Try refreshing this script multiple times with an nginx/php-fpm setup. You should get at least a couple of 502 Bad Gateway responses after every <number of FPM workers> requests.

Thanks!
psock.php
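
If refreshing by hand gets tedious, something like the following could drive the requests and count the status codes. It is only a sketch: httpStatus() and tallyStatuses() are hypothetical helper names, the URL you pass in is your own, and it assumes the cURL extension is available on the machine running it.

```php
<?php
// Hypothetical driver for hitting the test script repeatedly.
// Helper names are illustrative; they are not part of psock.php.

/** Fetch $url once and return the HTTP status code (0 on transport failure). */
function httpStatus(string $url): int
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 5);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

/** Call $fetch $n times and count how often each status code came back. */
function tallyStatuses(callable $fetch, int $n): array
{
    $counts = [];
    for ($i = 0; $i < $n; $i++) {
        $code = $fetch();
        $counts[$code] = ($counts[$code] ?? 0) + 1;
    }
    ksort($counts);
    return $counts;
}
```

Passing a closure that calls httpStatus() on the script's URL, with $n set to a few multiples of the worker count, should make the alternating 200/502 pattern visible in the returned array if the bug is present.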

Antony Dovgal

Oct 21, 2013, 7:11:45 AM
to highloa...@googlegroups.com
On 2013-10-19 10:08, Leon Sorokin wrote:
> I've attached a reduced test case. Try refreshing this script multiple times with an nginx/php-fpm setup. You should get at least a couple of 502 Bad Gateway responses after every <number of FPM workers> requests.

Thank you, the reproduce case was really helpful.

This is now fixed in Git:
https://bugs.php.net/bug.php?id=65936
http://git.php.net/?p=php-src.git;a=commit;h=b636c03426193ecf0b7e166126a14b70ce8185e9

Leon Sorokin

Oct 21, 2013, 11:43:20 AM
to highloa...@googlegroups.com
Awesome, thanks for looking into it!

Jose Luis Canciani

Oct 21, 2013, 5:18:21 PM
to highloa...@googlegroups.com
Could this be affecting Curl when reusing connections?


--
 
---
You received this message because you are subscribed to the Google Groups "highload-php-en" group.
To unsubscribe from this group and stop receiving emails from it, send an email to highload-php-...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Antony Dovgal

Oct 22, 2013, 4:28:33 AM
to highloa...@googlegroups.com
On 2013-10-22 01:18, Jose Luis Canciani wrote:
> Could this be affecting Curl when reusing connections?

I don't think so, the bug itself affected only PHP streams, not cURL resources.
However, there is cURL transport for PHP streams, so you might want to clarify your question a little.

Jose Luis Canciani

Oct 22, 2013, 9:02:37 AM
to highloa...@googlegroups.com
We have been seeing some sporadic seg faults in our app, and we do use curl with persistent connections; it sounded very similar to this issue. That's why I asked. No, we do not use curl://, just the plain curl_ functions.

Thanks.

Antony Dovgal

Oct 22, 2013, 9:20:58 AM
to highloa...@googlegroups.com
On 2013-10-22 17:02, Jose Luis Canciani wrote:
> We have been seeing some sporadic seg faults in our app, and we do use curl with persistent connections; it sounded very similar to this issue. That's why I asked. No, we do not use curl://, just the plain curl_ functions.

Please provide a reproduce case, I'll take a look at it.
You might want to try the latest PHP version first to be sure it's still reproducible.

Jose Luis Canciani

Oct 22, 2013, 10:41:51 AM
to highloa...@googlegroups.com
Thanks, we are still trying to find a controlled case, but haven't managed to yet. And it's very rare, so that is making things difficult. Someone in the company will post it when we have more info.

Thanks again!
Jose

Antony Dovgal

Oct 22, 2013, 10:59:01 AM
to highloa...@googlegroups.com
On 2013-10-22 18:41, Jose Luis Canciani wrote:
> Thanks, we are still trying to find a controlled case, but haven't managed to yet. And it's very rare, so that is making things difficult. Someone in the company will post it when we have more info.

A backtrace might also contain some useful data.
Here's how to get one: https://bugs.php.net/bugs-generating-backtrace.php

Leon Sorokin

Oct 31, 2013, 2:47:22 AM
to highloa...@googlegroups.com
I think you added the changelog note in NEWS under the wrong PHP ver (5.5.5). 5.5.5 has been out since Oct 17.

Antony Dovgal

Oct 31, 2013, 7:05:09 AM
to highloa...@googlegroups.com
On 2013-10-31 10:47, Leon Sorokin wrote:
> I think you added the changelog note in NEWS under the wrong PHP ver (5.5.5). 5.5.5 has been out since Oct 17.

Oh, indeed.
Thanks for noticing.