errors and config optimize

daniel

unread,

Jun 3, 2008, 3:30:49 PM6/3/08

to highload-php-en

hey all,

I've recently moved over from LS to nginx+php-fpm and I'm loving it.
My only concern at the moment is the error logs I keep getting on our
server.

server specs:
Xeon DualCore DualProc
6gb ram
raid0 scsi hds
ubuntu 32bit

this server is running
memcached
nginx (0.6.31) +php-fpm (0.5.8) which is hosting the main php files
and some static content.
eaccelerator as cache

another server is running the mysql + nginx for most of the static
content of the page (no php)

current load at this minute: load average: 5.12, 5.52, 5.90
Mem: 4150036k total, 1883004k used, 2267032k free, 393228k
buffers

nginx:
Active connections: 3223
Reading: 37 Writing: 369 Waiting: 2817

configs:

php-fpm
max_children 50
rlimit_files 2048
max_requests 10000

nginx:
worker_processes 6;
worker_connections 1024;
keepalive_timeout 75 20;

errors:

(this one gets repeated the most)
*2680454 connect() failed (110: Connection timed out) while connecting
to upstream, client: xx.xx.15.3, server: xxx,com, request: "GET ..
HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000",

4534#0: *2948650 upstream prematurely closed connection while reading
response header from upstream, client: ...., upstream: "fastcgi://
127.0.0.1:9000",

When I lowered the worker_connections in nginx to 512 I got:
[alert] 4533#0: *2460021 1024 worker_connections is not enough while
connecting to upstream, client: .... request: "GET ... HTTP/1.1",
upstream: "fastcgi://127.0.0.1:9000",

Thanks for any kind of input. And if you need more stats or logs just
ask :)

Andrei Nigmatulin

unread,

Jun 4, 2008, 11:16:55 AM6/4/08

to highloa...@googlegroups.com

This is common error when all php workers are busy. The most likely scenario
is all your workers are waiting for a database response. You can try out
php-fpm-0.5.9-rc1 with option request_slowlog_timeout=1 to find exact place
in php code where hangs took place. Alternatively, you can use xdebug or any
other profiling tool to find out why your scripts executes so long.

> 4534#0: *2948650 upstream prematurely closed connection while reading
> response header from upstream, client: ...., upstream: "fastcgi://
> 127.0.0.1:9000",

This is failure of php to serve fastcgi request. Please check your php-fpm.log
for possible explanations.

> When I lowered the worker_connections in nginx to 512 I got:
> [alert] 4533#0: *2460021 1024 worker_connections is not enough while
> connecting to upstream, client: .... request: "GET ... HTTP/1.1",
> upstream: "fastcgi://127.0.0.1:9000",
>
>
>
>
>
> Thanks for any kind of input. And if you need more stats or logs just
> ask :)

--
Andrei Nigmatulin
GPG PUB KEY 6449830D

Now I lay me down to sleep(3)
Pray the OS my core to keep
If I die before I wake
Pray the Disk my core to take

daniel

unread,

Jun 4, 2008, 6:28:02 PM6/4/08

to highload-php-en

Ok we have feedback from the new dumper:

Jun 05 00:10:01.021810 pid 17787 (pool default)
+++ dump failed

Jun 05 00:10:01.021946 pid 17977 (pool default)
+++ dump failed

Jun 05 00:10:01.022092 pid 17989 (pool default)
+++ dump failed

Jun 05 00:10:01.022304 pid 17837 (pool default)
script_filename = /..../index.php
[0xbfa6bb60] mysql_query() /..../mysql.php:204
[0xbfa6c350] countrows() /..../user.php:1301
[0xbfa6c8f0] get_user_homepage() /..../user_index.php:146
[0xbfa6ccb0] +++ dump failed

Jun 05 00:10:01.129184 pid 18006 (pool default)
script_filename = /...../index.php
[0xbfa6ccb0] session_start() /..../index.php:4

I've also located them in the php-fpm log:

Jun 05 00:10:01.021831 [ERROR] fpm_trace_get_strz(), line 225:
ptrace(PEEKDATA) failed: Input/output error (5)
Jun 05 00:10:01.021865 [NOTICE] fpm_php_trace(), line 359: finished
trace of 17787
Jun 05 00:10:01.021911 [NOTICE] fpm_children_bury(), line 188: child
17977 stopped for tracing
Jun 05 00:10:01.021925 [NOTICE] fpm_php_trace(), line 331: about to
trace 17977
Jun 05 00:10:01.021966 [ERROR] fpm_trace_get_strz(), line 225:
ptrace(PEEKDATA) failed: Input/output error (5)
Jun 05 00:10:01.022003 [NOTICE] fpm_php_trace(), line 359: finished
trace of 17977
Jun 05 00:10:01.022057 [NOTICE] fpm_children_bury(), line 188: child
17989 stopped for tracing
Jun 05 00:10:01.022071 [NOTICE] fpm_php_trace(), line 331: about to
trace 17989
Jun 05 00:10:01.022112 [ERROR] fpm_trace_get_strz(), line 225:
ptrace(PEEKDATA) failed: Input/output error (5)

Jun 05 00:10:01.022224 [NOTICE] fpm_children_bury(), line 188: child
17837 stopped for tracing
Jun 05 00:10:01.022278 [NOTICE] fpm_php_trace(), line 331: about to
trace 17837
Jun 05 00:10:01.022465 [ERROR] fpm_trace_get_strz(), line 225:
ptrace(PEEKDATA) failed: Input/output error (5)
Jun 05 00:10:01.022544 [NOTICE] fpm_php_trace(), line 359: finished
trace of 17837
Jun 05 00:10:01.022612 [NOTICE] fpm_got_signal(), line 48: received
SIGCHLD
Jun 05 00:10:01.022685 [NOTICE] fpm_got_signal(), line 48: received
SIGCHLD

Jun 05 00:10:01.129082 [NOTICE] fpm_php_trace(), line 359: finished
trace of 17814
Jun 05 00:10:01.129148 [NOTICE] fpm_children_bury(), line 188: child
18006 stopped for tracing
Jun 05 00:10:01.129163 [NOTICE] fpm_php_trace(), line 331: about to
trace 18006
Jun 05 00:10:01.129274 [NOTICE] fpm_php_trace(), line 359: finished
trace of 18006
Jun 05 00:10:01.129395 [NOTICE] fpm_got_signal(), line 48: received
SIGCHLD

Hopefully you can help me clear out why they break (or where at
least), and why the dump fails?

fpm conf:
max_children 300
request_slowlog_timeout 10s

On Jun 4, 5:16 pm, Andrei Nigmatulin <andrei.nigmatu...@gmail.com>
wrote:

Andrei Nigmatulin

unread,

Jun 6, 2008, 11:28:44 AM6/6/08

to highloa...@googlegroups.com

On Thursday 05 June 2008 02:28, daniel wrote:
> Ok we have feedback from the new dumper:
>
>
> Jun 05 00:10:01.021810 pid 17787 (pool default)
> +++ dump failed
>
> Jun 05 00:10:01.021946 pid 17977 (pool default)
> +++ dump failed
>
> Jun 05 00:10:01.022092 pid 17989 (pool default)
> +++ dump failed
>
> Jun 05 00:10:01.022304 pid 17837 (pool default)
> script_filename = /..../index.php
> [0xbfa6bb60] mysql_query() /..../mysql.php:204
> [0xbfa6c350] countrows() /..../user.php:1301
> [0xbfa6c8f0] get_user_homepage() /..../user_index.php:146
> [0xbfa6ccb0] +++ dump failed

This is the most important information in your case: this backtrace in theory
might help you to track down slow sql request, which executes too long and
can cause connection timeout errors.

> Jun 05 00:10:01.129184 pid 18006 (pool default)
> script_filename = /...../index.php
> [0xbfa6ccb0] session_start() /..../index.php:4
>

> Jun 05 00:10:01.022224 [NOTICE] fpm_children_bury(), line 188: child
> 17837 stopped for tracing
> Jun 05 00:10:01.022278 [NOTICE] fpm_php_trace(), line 331: about to
> trace 17837
> Jun 05 00:10:01.022465 [ERROR] fpm_trace_get_strz(), line 225:
> ptrace(PEEKDATA) failed: Input/output error (5)
> Jun 05 00:10:01.022544 [NOTICE] fpm_php_trace(), line 359: finished
> trace of 17837
> Jun 05 00:10:01.022612 [NOTICE] fpm_got_signal(), line 48: received
> SIGCHLD
> Jun 05 00:10:01.022685 [NOTICE] fpm_got_signal(), line 48: received
> SIGCHLD
>

> Hopefully you can help me clear out why they break (or where at
> least), and why the dump fails?

There are several reasons for that, I will investigate what might be in your
case.

>
> fpm conf:
> max_children 300
> request_slowlog_timeout 10s

--

shunz

unread,

Jul 7, 2008, 3:35:04 AM7/7/08

to highload-php-en

How to get this status?
======================

nginx:
Active connections: 3223
Reading: 37 Writing: 369 Waiting: 2817

======================

Andrei Nigmatulin

unread,

Jul 7, 2008, 4:24:14 AM7/7/08

to highloa...@googlegroups.com

On Monday 07 July 2008 11:35, shunz wrote:
> How to get this status?
> ======================
> nginx:
> Active connections: 3223
> Reading: 37 Writing: 369 Waiting: 2817
> ======================

http://wiki.codemongers.com/NginxHttpStubStatusModule

Reply all

Reply to author

Forward