nginx+php-fpm: Random 104 Connection reset by peer errors


dissoman

Dec 15, 2010, 5:25:33 AM
to highload-php-en
Hi all,
I have a production machine that has been running for quite some time.
Since day one, I have been seeing the following errors in the logs:

2010/10/27 08:40:40 [error] 8459#0: *25044705 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 72.44.44.132, server: XXXXX.com, request: "HEAD /some/url HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.XXXXX.com"

I have noticed that in every case the failing request is an HTTP HEAD.
Not a single GET or POST request has failed this way.
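
For anyone who wants to check the same thing on their own logs, here is a minimal sketch in Python that tallies these errors by request method. It assumes each nginx error-log entry sits on a single line (as nginx writes it, before any mail wrapping); the function name count_resets_by_method is hypothetical.

```python
import re
from collections import Counter

# Matches the request method in an nginx error-log entry such as:
#   ... request: "HEAD /some/url HTTP/1.1", upstream: ...
REQUEST_RE = re.compile(r'request: "([A-Z]+) ')


def count_resets_by_method(lines):
    """Count '104: Connection reset by peer' entries per HTTP method."""
    counts = Counter()
    for line in lines:
        if "104: Connection reset by peer" not in line:
            continue  # not the error we are interested in
        match = REQUEST_RE.search(line)
        # Entries without a parsable request field are grouped as UNKNOWN.
        counts[match.group(1) if match else "UNKNOWN"] += 1
    return counts
```

Passing an open log file (or any iterable of lines) returns a Counter keyed by method, so a result containing only HEAD would confirm the pattern described above.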

I tried upgrading from php 5.1.8 with spawn-fcgi to php 5.3.3 with
php-fpm, but that didn't help much.
The only thing I gained is more information in the php-fpm logs.
I see the following quite often, in conjunction with the above
errors:

Dec 15 10:18:26.102944 [WARNING] [pool www] child 16744 said into stderr: "zend_mm_heap corrupted"
Dec 15 10:18:26.109527 [WARNING] [pool www] child 16744 exited with code 1 after 8.278630 seconds from start

I also see the following in /var/log/messages:

Dec 15 10:10:39 pb-main kernel: [12692976.574969] php-fpm[6044]: segfault at 10 ip 000000000058b6e7 sp 00007fffb93bee10 error 6 in php-fpm[400000+2ae000]
Dec 15 10:13:51 pb-main kernel: [12693167.973450] php-fpm[8189]: segfault at 860c6ea1 ip 00000000005b9a40 sp 00007fffb93bf360 error 4 in php-fpm[400000+2ae000]
Dec 15 10:14:48 pb-main kernel: [12693224.995135] php-fpm[9654]: segfault at 10 ip 000000000058b6e7 sp 00007fffb93bee10 error 6 in php-fpm[400000+2ae000]
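
For reference, the fields in these kernel lines can be decoded mechanically. Below is a minimal sketch in Python, assuming x86 page-fault error codes (bit 0: the page was present, bit 1: the access was a write, bit 2: the fault happened in user mode) and that each kernel message is on a single line. The function name decode_segfault is hypothetical.

```python
import re

# Parses kernel lines of the form:
#   php-fpm[6044]: segfault at 10 ip ... sp ... error 6 in php-fpm[400000+2ae000]
# All numeric fields are hexadecimal.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "fault_address": int(m.group("addr"), 16),
        "object": m.group("obj"),
        # Instruction pointer relative to the start of the mapped object.
        "offset_in_object": ip - base,
        "access": "write" if err & 2 else "read",
        "page_present": bool(err & 1),
        "user_mode": bool(err & 4),
    }
```

Applied to the first line above, this reports a user-mode write to an unmapped page at address 0x10, at offset 0x18b6e7 into php-fpm, which would be consistent with dereferencing a near-null pointer.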

I am not sure whether all of these are related to one another, but I
thought I would give as much information as possible.

I am using CentOS 5.5 with nginx 0.7.65 as the HTTP server.
My nginx configuration for FastCGI is as follows:

location ~ \.php$ {
    fastcgi_pass   localhost:9000;  # port where FastCGI processes were spawned
    fastcgi_index  index.php;
    fastcgi_param  SCRIPT_FILENAME    /srv/www/site.com/html$fastcgi_script_name;  # same path as above

    fastcgi_param  QUERY_STRING       $query_string;
    fastcgi_param  REQUEST_METHOD     $request_method;
    fastcgi_param  CONTENT_TYPE       $content_type;
    fastcgi_param  CONTENT_LENGTH     $content_length;

    fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
    fastcgi_param  REQUEST_URI        $request_uri;
    fastcgi_param  DOCUMENT_URI       $document_uri;
    fastcgi_param  DOCUMENT_ROOT      $document_root;
    fastcgi_param  SERVER_PROTOCOL    $server_protocol;

    fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
    fastcgi_param  SERVER_SOFTWARE    nginx/$nginx_version;

    fastcgi_param  REMOTE_ADDR        $remote_addr;
    fastcgi_param  REMOTE_PORT        $remote_port;
    fastcgi_param  SERVER_ADDR        $server_addr;
    fastcgi_param  SERVER_PORT        $server_port;
    fastcgi_param  SERVER_NAME        $server_name;
}



I have been researching this for months and have still come up with
nothing. I really hope someone here can advise.
Thanks in advance!

dissoman

Jan 19, 2011, 6:34:24 AM
to highload-php-en
Can anyone help?
It seems that many of the IPs whose requests are failing belong to
Amazon cloud servers. Just in case that rings a bell for
anyone :)

ronanchilvers

Jan 27, 2011, 4:54:34 AM
to highloa...@googlegroups.com
Hi dissoman

Did you discover why this is happening? I've seen the same thing here:
PHP 5.3.5 FPM listening on port 9000 with nginx in front. The server runs
very fast but erratically stops serving pages, and I've seen events in
my logs similar to yours:

[11421122.928133] php5-fpm[16687]: segfault at 10 ip 6e77e2 sp 7fffffffa740 error 4 in php5-fpm[400000+833000]
[11421142.518008] php5-fpm[16890]: segfault at 10 ip 6e77e2 sp 7fffffffa740 error 4 in php5-fpm[400000+833000]

I'm not seeing the same nginx events as you are. Server load is very low,
RAM usage is fine, IO is fine.

Would be good to hear any progress you've made diagnosing this.

Posted at Nginx Forum: http://forum.nginx.org/read.php?3,158814,170014#msg-170014

dissoman

Jan 30, 2011, 11:29:50 AM
to highload-php-en
Unfortunately not. I am still trying to figure out the problem. It looks
to me like it might be a PHP bug.
I am now trying to get the children to at least write a core dump
when they die, but so far without success. Can someone give me some
pointers on that?

dissoman

Jan 30, 2011, 12:44:52 PM
to highload-php-en
OK, some progress. GDB output follows below.
I will continue researching. Please reply if you recognize this problem.


Core was generated by `php-fpm'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000579b27 in ?? ()
(gdb) bt
#0 0x0000000000579b27 in ?? ()
#1 0x00000000005b60b9 in zend_objects_free_object_storage ()
#2 0x00000000005b99e5 in zend_objects_store_del_ref_by_handle_ex ()
#3 0x00000000005b9a23 in zend_objects_store_del_ref ()
#4 0x000000000058b755 in _zval_ptr_dtor ()
#5 0x00000000005a4788 in zend_hash_destroy ()
#6 0x0000000000597bcf in _zval_dtor_func ()
#7 0x000000000058b755 in _zval_ptr_dtor ()
#8 0x00007f63edcfe543 in ?? () from /usr/lib64/php/modules/memcache.so
#9 0x00007f63edcfe590 in ?? () from /usr/lib64/php/modules/memcache.so
#10 0x00007f63edcfe808 in mmc_pool_free () from /usr/lib64/php/modules/memcache.so
#11 0x00000000005a726e in ?? ()
#12 0x00000000005a53a4 in zend_hash_del_key_or_index ()
#13 0x00000000005a74e9 in _zend_list_delete ()
#14 0x000000000058b755 in _zval_ptr_dtor ()
#15 0x00000000005a4788 in zend_hash_destroy ()
#16 0x00000000005b6099 in zend_object_std_dtor ()
#17 0x00000000005b60b9 in zend_objects_free_object_storage ()
#18 0x00000000005b99e5 in zend_objects_store_del_ref_by_handle_ex ()
#19 0x00000000005b9a23 in zend_objects_store_del_ref ()
#20 0x000000000058b755 in _zval_ptr_dtor ()
#21 0x00000000005a4788 in zend_hash_destroy ()
#22 0x0000000000597bcf in _zval_dtor_func ()
#23 0x000000000058b755 in _zval_ptr_dtor ()
#24 0x00000000005a4788 in zend_hash_destroy ()
#25 0x00000000005b6099 in zend_object_std_dtor ()
#26 0x00000000005b60b9 in zend_objects_free_object_storage ()
#27 0x00000000005b9dc6 in zend_objects_store_free_object_storage ()
#28 0x000000000058ba05 in ?? ()
#29 0x0000000000598172 in ?? ()
#30 0x000000000054634e in php_request_shutdown ()
#31 0x000000000062546f in ?? ()
#32 0x0000003260e1d994 in __libc_start_main () from /lib64/libc.so.6
#33 0x0000000000421ba9 in _start ()

Jérôme Loyet

Jan 30, 2011, 12:59:22 PM
to highloa...@googlegroups.com
I see that the memcache.so module appears in the gdb backtrace.

Do you still have the problem without loading memcache (if you can try that)?
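
A quick way to see which loaded modules take part in a crash is to group the backtrace frames by the shared object gdb reports. This is a minimal sketch in Python, assuming unwrapped gdb "bt" output where every frame has the form "#N 0xADDR in SYMBOL (...) [from /path/to/lib.so]"; the name frames_by_module is hypothetical.

```python
import re
from collections import Counter

# One gdb frame: "#8  0x00007f63edcfe543 in ?? () from /usr/lib64/.../memcache.so"
# The " from <path>" part is only present for frames inside a shared object.
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-f]+ in \S+ \(.*?\)(?: from (\S+))?")


def frames_by_module(backtrace_text):
    """Count backtrace frames per shared object named by gdb."""
    modules = Counter()
    for line in backtrace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            continue  # not a frame line (e.g. "(gdb) bt")
        # Frames without "from ..." belong to the main binary as far as gdb can tell.
        modules[m.group(1) or "(main executable)"] += 1
    return modules
```

Any module other than the main executable that shows up here is a reasonable first candidate to disable or upgrade while testing.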


dissoman

Jan 30, 2011, 1:48:59 PM
to highload-php-en
Jerome - Unfortunately this problem happens only in the production
environment. I cannot afford to lose memcached, as that would
increase the load many times over.
However, I found this: http://pecl.php.net/bugs/bug.php?id=16818
The problem described there is IDENTICAL to mine. I also use WordPress
and the same plugin.
I will try upgrading to memcache-3.0.4. I hope it works.


Jérôme Loyet

Jan 30, 2011, 2:13:32 PM
to highloa...@googlegroups.com
2011/1/30 dissoman <diss...@gmail.com>:

> Jerome - Unfortunately this problem happens only in production
> environment. I cannot afford myself to lose the memcached as it will
> increase the loads many times fold.
> However, I found this http://pecl.php.net/bugs/bug.php?id=16818
> This problem is IDENTICAL to mine. I also use wordpress and the same
> plugin.
> I will try upgrading to memcache-3.0.4 I hope it will work..

It looks the same. Please give us your feedback once you have done the upgrade.

dissoman

Jan 31, 2011, 5:19:59 AM
to highload-php-en
I have upgraded the memcache module to the latest version, 3.0.5, and
the server has been running with it for an hour now.
I don't see any more segfaults, but "104: Connection reset by peer"
errors still pop up once in a while.
The only difference is that they are now GETs and not HEADs.
At the moment, I am quite lost as to where to go next.
Any suggestions?

If anyone needs it, I found an RPM for the memcache module at http://pkgs.org/
