In addition, the php-cgi processes run as an LDAP user, so our LDAP
server must be contacted whenever a permissions check is done (such as
reading a PHP file).
So that's all working fine, but we want to move to php-fpm because
spawn-fcgi has some stability problems.
We set up a test server with php-5.2.11 and php-fpm-6.0 (lighttpd +
php-fpm with 5 children), which the development team used for about a week.
We didn't receive any errors and everything went smoothly.
Next we tried to put this into production, replacing our spawn-fcgi
setup. So we set php-fpm to have 512 children, run as our LDAP user,
and started it running... Boom Sound!
The whole thing blew up in our faces.
After setting everything back the way it was, we started trying to
figure out why php-fpm couldn't handle real-life load.
We found several interesting things, but we think the most likely
culprit is the fact that each php-fpm process has two LDAP connections
open at all times. It ran fine when there were only 5 processes with
persistent connections to the LDAP server, but once we had 512
processes (actually more, because we made it through the upgrade on
two servers) our LDAP server couldn't keep up.
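
As a rough check on those numbers, here is a minimal sketch of the arithmetic. It assumes every child on every upgraded web server holds exactly the two connections we observed; the function name estimate_ldap_conns is made up for illustration.

```shell
# Estimate how many LDAP connections the php-fpm children hold open.
# Arguments: number of web servers, children per server, connections per child.
# (Hypothetical helper; the figures below are the ones quoted in this thread.)
estimate_ldap_conns() {
    echo $(( $1 * $2 * $3 ))
}

estimate_ldap_conns 1 5 2      # test server: prints 10
estimate_ldap_conns 2 512 2    # two production servers: prints 2048
```

Ten connections are negligible, while 2048 is double the 1024-descriptor limit that comes up later in this thread.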
On php-fpm's side, we had a flood of errors like the following:
Mar 22 18:41:27.532665 [NOTICE] fpm_children_make(), line 352: child
18504 (pool default) started
Mar 22 18:41:27.532732 [ERROR] fpm_stdio_prepare_pipes(), line 197:
pipe() failed: Too many open files (24)
Mar 22 18:41:27.532853 [WARNING] fpm_children_bury(), line 215: child
14583 (pool default) exited with code 255 after 7.024447 seconds from
start
Mar 22 18:41:27.532892 [WARNING] fpm_stdio_child_said(), line 167:
child 14583 (pool default) said into stderr: "Mar 22 18:41:20.533268
[ERROR] fpm_unix_init_child(), line 215: setuid(3002) failed: Resource
temporarily unavailable (11)"
Mar 22 18:41:27.532918 [WARNING] fpm_stdio_child_said(), line 167:
child 14583 (pool default) said into stderr: "Mar 22 18:41:20.533327
[ERROR] fpm_child_init(), line 140: child failed to initialize (pool
default)", pipe is closed
So we are going to try again, after raising the file limit on the LDAP
server and the web servers. We are also going to try with a much
lower child count for php-fpm.
But we're still wondering, "Why does php-fpm have those persistent
connections to the LDAP server?"
Spawn-fcgi only connects to the LDAP server as needed.
Can anyone enlighten me?
Thanks,
Dan
Do you have the same php.ini configuration in both setups?
Are you actually using PHP's LDAP API to authenticate users in scripts,
a la ldap_connect(), or are you just using LDAP on your server and
running your FastCGI pool as a user in LDAP?
If it is the former, I can understand you running out of connections -
you should cleanly ldap_unbind() when done.
If it is the latter, that is a bit more disturbing and means your
server's ldap configuration is probably suboptimal. My thoughts then
would be to make sure slapd is allowed more than the default 1024 file
descriptors and that all client machines run nscd to eliminate
redundant lookups.
- ammon
On Wed, Mar 24, 2010 at 8:57 AM, Dan <d...@danielaharon.com> wrote:
> The php.ini configs are identical.
>
I'm just curious about the differences between spawn-fcgi and php-fpm.
Why is it that running as an LDAP user with php-fpm results in a
persistent connection for each child to the LDAP server, while doing
the same with spawn-fcgi does not?
Thanks,
Daniel Aharon
In the error messages you shared in your original post, I don't see
anything stating that the excess of file handles has anything to do
with LDAP. My two thoughts:
1. With only a handful of fcgi workers, verify that they are in fact
holding sockets to your auth server with something like: lsof -i
tcp:ldap
2. The parent fpm worker probably keeps an fd open listening to all of
the children it spins off. Try increasing the number of allowed file
descriptors before spinning things up: ulimit -n 2048
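
To turn the first check into a per-process count, something like the sketch below could work. It assumes lsof's default column layout (a header row, with the PID in the second column); the helper name count_ldap_conns_by_pid is hypothetical.

```shell
# Tally LDAP sockets per process from `lsof -i tcp:ldap` output on stdin.
# Assumes the default lsof layout: one header row, PID in column 2.
# (Hypothetical helper name; not part of lsof or php-fpm.)
count_ldap_conns_by_pid() {
    awk 'NR > 1 { n[$2]++ } END { for (p in n) print p, n[p] }' | sort -n
}

# Usage on a web server:
#   lsof -i tcp:ldap | count_ldap_conns_by_pid
```

Each output line is a PID followed by the number of LDAP sockets it holds, so a worker with persistent connections stands out.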
- ammon
Yeah, that is the method we used to see the ldap connections for the
php-fpm children.
> 2. The parent fpm worker probably keeps an fd open listening to all of
> the children it spins off, try increasing the number of allowed file
> descriptors before spinning things up: ulimit -n 2048
Yeah, I think we are going to do that both on the LDAP server and on
the web servers.
We discovered later that the LDAP server had a ton of "too many open
files" errors. ulimit -n was at 1024 and slapd is a single process.
The LDAP server is really the root cause of the problem.
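
For anyone wanting to confirm the limit a running slapd actually has, here is a small sketch. It assumes a Linux host where /proc/PID/limits is available; the helper name soft_nofile_limit is made up.

```shell
# Print the soft "Max open files" limit from /proc/<pid>/limits text on stdin.
# Assumes the Linux /proc limits layout, where the soft limit is the 4th field.
# (Hypothetical helper name.)
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }'
}

# Usage on the LDAP server (assumes pidof is installed and slapd is running):
#   soft_nofile_limit < "/proc/$(pidof slapd)/limits"
```

This reads the limit of the running process, which can differ from what ulimit -n reports in a login shell.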
> There's really no good reason for a process to make its own connection
> to the LDAP server - much less retain it. This sort of thing should
> all be dealt with by your kernel's PAM extensions.
I know, I just don't understand it. Perhaps I did something wrong
during the compile?
I'll have to look into that nscd a bit more. It looks like it would
be worthwhile to install, given our setup.
- ammon