-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
I noticed that on our server fastcgi sometimes aborts a request when the
select system call is interrupted, instead of retrying the call:
[Fri Oct 31 20:14:47 2008] [error] [client xx.xxx.xx.xx] (4)Interrupted system call: FastCGI: comm with server "/xxx[...]/php5" aborted: select() failed, referer: http://www.aniki.info/Gott_Gauss
I googled this and found the following patch:
http://article.gmane.org/gmane.comp.web.fastcgi.devel/2514
However, I think sleeping 1s before retrying is unnecessary (and having a
maximum number of retries might be a bit paranoid), so I'm not saying this
patch should be applied as-is.
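
To illustrate what I mean by retrying without a sleep or a retry limit, here
is a minimal standalone sketch. It uses plain select() rather than Apache's
ap_select(), and the helper name select_retry_eintr is made up for this
example; it is not taken from mod_fastcgi. Because the contents of the timeout
(and, per older manpages, the fd sets) should not be relied on after an error,
the sketch restores both before each retry.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not part of mod_fastcgi): call select() and
 * transparently retry when a signal interrupts it (EINTR).
 * The caller's fd sets and timeout are restored before every attempt.
 * Note: each retry waits for the full timeout again, so the total wait
 * can exceed the timeout if signals keep arriving.
 */
static int select_retry_eintr(int nfds, fd_set *rd, fd_set *wr,
                              const struct timeval *timeout)
{
    fd_set rd_saved, wr_saved;
    int rc;

    /* Remember the sets the caller asked about. */
    if (rd) rd_saved = *rd;
    if (wr) wr_saved = *wr;

    do {
        struct timeval tv;

        if (rd) *rd = rd_saved;
        if (wr) *wr = wr_saved;
        if (timeout) tv = *timeout;

        rc = select(nfds, rd, wr, NULL, timeout ? &tv : NULL);
    } while (rc < 0 && errno == EINTR);

    return rc; /* >0: ready fds, 0: timeout, <0: error other than EINTR */
}
```

A caller would use it exactly like select(), minus the exception set, and
only treat a negative return value as a real error.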
I looked at the libapache-mod-fastcgi-2.4.6 source package and AFAICS it also
has this bug.
Note that the select(2) manpage clearly states:
|Under Linux, select() may report a socket file descriptor as "ready for
|reading", while nevertheless a subsequent read blocks. This could for
|example happen when data has arrived but upon examination has wrong
|checksum and is discarded. There may be other circumstances in which a
|file descriptor is spuriously reported as ready. Thus it may be safer
|to use O_NONBLOCK on sockets that should not block.
So this applies even though the linked mail only talks about this problem
occurring on AIX.
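
As a rough sketch of what the manpage suggests, assuming the socket is put
into O_NONBLOCK mode: a read that fails with EAGAIN is then treated as "not
ready after all" rather than as an error. The names set_nonblocking,
read_if_ready and READ_NOT_READY are invented for this example and do not
exist in mod_fastcgi.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical return code: the fd had no data despite being reported ready. */
#define READ_NOT_READY (-2)

/* Switch fd to non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Read from a non-blocking fd.
 * Returns the number of bytes read (0 means EOF), -1 on a real error,
 * or READ_NOT_READY if the read would have blocked (spurious readiness).
 */
static ssize_t read_if_ready(int fd, void *buf, size_t len)
{
    ssize_t n;

    /* An interrupted read is simply retried. */
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return READ_NOT_READY;
    return n;
}
```

On READ_NOT_READY the caller would just go back to waiting in select()
instead of aborting the request.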
FYI, I'm not submitting this bug directly from the server: reportbug fails
there with 'out of memory' errors because of a 32MB memory ulimit for my user,
and I don't want to run reportbug as root. The server is running etch/stable,
and I have edited the 'System Information' below accordingly.
- -- System Information:
Debian Release: etch/stable
Architecture: i686 (x86)
Kernel: Linux 2.6.25.4
Locale: LANG=en_IN, LC_CTYPE=en_IN (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iD8DBQFJC2xnzQZOfTz8JZwRAu+DAKCjdlP3YHXbAZ8J+odPMTiX5ShjHQCfZWzZ
fxDZ/qLegIKOqCAORtott7s=
=HwyU
-----END PGP SIGNATURE-----
I'm using the following modified patch now:
Index: libapache-mod-fastcgi-2.4.2/mod_fastcgi.c
===================================================================
--- libapache-mod-fastcgi-2.4.2.orig/mod_fastcgi.c 2008-11-02 16:42:49.000000000 +0000
+++ libapache-mod-fastcgi-2.4.2/mod_fastcgi.c 2008-11-02 16:50:46.000000000 +0000
@@ -2178,12 +2178,15 @@
}
/* wait on the socket */
- select_status = ap_select(nfds, &read_set, &write_set, NULL, &timeout);
+ /* Interrupted system calls do happen now and then, so retry on EINTR */
+ do {
+ select_status = ap_select(nfds, &read_set, &write_set, NULL, &timeout);
+ } while (select_status < 0 && errno == EINTR);
if (select_status < 0)
{
ap_log_rerror(FCGI_LOG_ERR_ERRNO, r, "FastCGI: comm with server "
- "\"%s\" aborted: select() failed", fr->fs_path);
+ "\"%s\" aborted: select() failed: \"%s\"", fr->fs_path, strerror(errno));
state = STATE_ERROR;
break;
}
@@ -2246,11 +2249,19 @@
}
rv = fcgi_buf_socket_recv(fr->serverInputBuffer, fr->fd);
+ /*
+ * select(2) states: Under Linux, select() may report a socket
+ * file descriptor as "ready for reading", while nevertheless a
+ * subsequent read blocks.
+ * Act as if the FD was not set if socket_recv returns EAGAIN.
+ */
+ if (rv < 0 && errno == EAGAIN)
+ break;
if (rv < 0)
{
ap_log_rerror(FCGI_LOG_ERR, r, "FastCGI: comm with server "
- "\"%s\" aborted: read failed", fr->fs_path);
+ "\"%s\" aborted: read failed: \"%s\"", fr->fs_path, strerror(errno));
state = STATE_ERROR;
break;
}
--
Tobias PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.