So I've been having a problem in production that I can't seem to track
down, nor can I even figure out how to go about debugging it.
After running for a day or so (sometimes less), Juggernaut stops
responding. The only messages in the log are the following:
ERROR -- : Bad request
http://192.168.1.78/streams/connection_logout
(Errno::EMFILE: Too many open files - socket(2))
The URL being called is the callback that fires when a connection is
terminated. But juggernaut won't respond to external requests either -
it looks like the process just has too many open sockets.
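EMFILE means the process has hit its open-file-descriptor limit. For reference, here's roughly how I've been checking the limit and the current fd count on Linux ($$ below just stands in for the juggernaut PID, e.g. 12766):

```shell
# Inspect the fd limit and current fd usage for a process (Linux).
# $$ (this shell's own PID) stands in for the juggernaut PID here.
ulimit -n                              # fd limit in this shell
ls /proc/$$/fd | wc -l                 # fds currently open by that process
grep 'Max open files' /proc/$$/limits  # the limit the kernel enforces
```

When the second number creeps up toward the first, socket(2) starts failing with EMFILE.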
When I do an lsof on the juggernaut process, I get the following:
juggernau 12766 user 10u IPv4 119101789 TCP 192.168.1.78:commplex-link->adsl-xxx-xxx-xxx-xxx.dsl.hstntx.sbcglobal.net:53495 (ESTABLISHED)
juggernau 12766 user 11u IPv4 119077236 TCP 192.168.1.78:commplex-link->xxx.xxx.xxx.xxx:51571 (ESTABLISHED)
juggernau 12766 user 12u IPv4 119068904 TCP 192.168.1.78:commplex-link->cpe-xx-xx-xx-xx.socal.res.rr.com:50261 (ESTABLISHED)
juggernau 12766 user 13u IPv4 119066637 TCP 192.168.1.78:commplex-link->ipxx-xx-xx-xx.ri.ri.cox.net:4139 (ESTABLISHED)
I have about three times as many of these connections to the juggernaut
server as I have actual connected clients according to
Juggernaut.show_clients.
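To quantify that mismatch, one thing I've tried is counting the ESTABLISHED sockets in an lsof snapshot and diffing that number against Juggernaut.show_clients. A sketch (the sample snapshot below is made up; in practice it would come from `lsof -p 12766 > lsof.txt`):

```shell
# Count ESTABLISHED sockets in an lsof snapshot so the number can be
# diffed against Juggernaut.show_clients. The snapshot here is a
# made-up sample; in practice: lsof -p 12766 > lsof.txt
cat > lsof.txt <<'EOF'
juggernau 12766 user 10u IPv4 TCP 192.168.1.78:5001->client1:53495 (ESTABLISHED)
juggernau 12766 user 11u IPv4 TCP 192.168.1.78:5001->client2:51571 (ESTABLISHED)
juggernau 12766 user  9u IPv4 TCP *:5001 (LISTEN)
EOF
grep -c 'ESTABLISHED' lsof.txt   # prints 2 for this sample
```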
So it looks like connections somehow aren't getting cleaned up, but I
don't really know how to verify this.
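The best idea I've had for testing it is a crude watchdog that logs the process's fd count once a minute, so the growth rate can be lined up against connect/disconnect activity (a sketch; 12766 is the PID from the lsof output above):

```shell
#!/bin/sh
# Crude fd-leak watchdog (sketch): log the open-fd count of the
# juggernaut process once a minute. 12766 is the PID from the lsof
# output above; the loop exits when the process goes away.
PID=12766
while kill -0 "$PID" 2>/dev/null; do
  printf '%s %s\n' "$(date '+%F %T')" "$(ls "/proc/$PID/fd" | wc -l)"
  sleep 60
done
```

If the count climbs while show_clients stays flat, that would confirm the cleanup theory.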
Anyone have any ideas?