socket leak

socket leak Jan Rychter 4/7/11 5:43 AM
After upgrading Ring from 0.2.5 -> 0.3.7 we noticed our application
would crash after some time because of "too many open files".
Investigation showed that incoming connections are the culprit, from
jetty's QueuedThreadPool. Sockets are being left open.

I'm sorry I can't investigate this further right now; there are
some urgent things that I really need to work on first. In the
meantime, I wanted to give a heads-up, and perhaps someone will be able
to guess the reason for the leak.

We use run-jetty to start our server. I know 0.2.5 works great and is
very stable, while 0.3.7 leaks. I have also narrowed it down to
ring/jetty, i.e. just changing the version in project.clj from 0.3.7
back to 0.2.5 fixes the problem. It isn't our code, unless we're doing
something stupid that ring 0.3.7 exposed.
Re: socket leak James Reeves 4/7/11 3:24 PM
On 7 April 2011 08:43, Jan Rychter <jryc...@gmail.com> wrote:
> After upgrading Ring from 0.2.5 -> 0.3.7 we noticed our application
> would crash after some time because of "too many open files".
> Investigation showed that incoming connections are the culprit, from
> jetty's QueuedThreadPool. Sockets are being left open.

What state were the sockets being left in? TIME_WAIT?

I tried to replicate the error, and I noticed via netstat that a lot
of sockets were open and set to TIME_WAIT. This appears to be a
problem with Java NIO and the settings of the operating system:

http://jira.codehaus.org/browse/JETTY-999?focusedCommentId=191250&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_191250

Is that the problem you were seeing?

- James

Re: socket leak Jan Rychter 4/8/11 2:57 AM
On Apr 8, 12:24 am, James Reeves <jree...@weavejester.com> wrote:
> On 7 April 2011 08:43, Jan Rychter <jrych...@gmail.com> wrote:
>
> > After upgrading Ring from 0.2.5 -> 0.3.7 we noticed our application
> > would crash after some time because of "too many open files".
> > Investigation showed that incoming connections are the culprit, from
> > jetty's QueuedThreadPool. Sockets are being left open.
>
> What state were the sockets being left in? TIME_WAIT?

I checked now -- they are indeed in TIME_WAIT.

> I tried to replicate the error, and I noticed via netstat that a lot
> of sockets were open and set to TIME_WAIT. This appears to be a
> problem with Java NIO and the settings of the operating system:
>
> http://jira.codehaus.org/browse/JETTY-999?focusedCommentId=191250&pag...
>
> Is that the problem you were seeing?

It might be, but I am not sure. It's true I see lots of sockets in
TIME_WAIT. But I got the "too many open files" error on our testing
server, with a very low connection load. We're talking perhaps one
connection every few seconds, and we ran out of descriptors after about
two days of this. The time scales do not really correspond to TIME_WAIT
states.

A quick check shows that the problem seems to be under control on Mac
OS X: the app, when bombarded with requests, ends up with < 4000
sockets in TIME_WAIT.

However, I don't think it is just the OS settings. Let's see:

ubuntu$ cat /proc/sys/net/ipv4/tcp_fin_timeout
60

we have a default TIME_WAIT period of 60s under Linux, and

bongo:/Main/jwr>sysctl net.inet.tcp.msl
net.inet.tcp.msl: 15000

Mac OS X has a default TCP MSL of 15s (you can't configure the
TIME_WAIT period directly in Mac OS X), so the TIME_WAIT period is at
least 30s (2*MSL).

I do not think the difference between 30s and 60s is *that*
significant, especially given that:

ubuntu$ ulimit -n
1024

bongo:/Main/jwr>ulimit -n
256
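
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes every connection ends up in TIME_WAIT on the server side, and it picks 3 seconds for "one connection every few seconds"; both the helper name time_wait_sockets and the interval value are made up for illustration.

```shell
#!/bin/sh
# Steady-state number of sockets sitting in TIME_WAIT:
#   sockets = TIME_WAIT duration (s) / interval between connections (s)
time_wait_sockets() {
    echo $(( $1 / $2 ))
}

interval=3   # assumed value for "one connection every few seconds"

# 60s and 30s are the TIME_WAIT periods quoted above; 1024 and 256 are
# the `ulimit -n` values from the two machines.
echo "Linux:    $(time_wait_sockets 60 "$interval") in TIME_WAIT, fd limit 1024"
echo "Mac OS X: $(time_wait_sockets 30 "$interval") in TIME_WAIT, fd limit 256"
```

Under these assumptions the count settles at around 20 sockets on Linux and 10 on Mac OS X, far below either descriptor limit, which fits the observation that TIME_WAIT alone does not explain running out of descriptors after two days.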

So, I don't know -- it might be the same problem, or it might not.
Would something change between ring 0.2.5 and ring 0.3.7 for the
problem to suddenly appear? I am *really* sure we did not have this
with 0.2.5; we've been using that in production for many months now.

In case it helps, YourKit profiler shows these sockets as not closed,
i.e. there is no closing stack trace, only the opening one.

--J.
Re: socket leak James Reeves 4/8/11 4:48 AM
On 8 April 2011 05:57, Jan Rychter <jryc...@gmail.com> wrote:
> Would something change between ring 0.2.5 and ring 0.3.7 for the
> problem to suddenly appear? I am *really* sure we did not have this
> with 0.2.5, we've been using that in production for many months now.

Version 0.2.5 used an earlier version of Jetty (6.1.14, rather than
6.1.26). Other than that, there haven't been many significant changes,
although it's possible that a small bug fix produced this behaviour.

When you downgraded to 0.2.5, did you just replace the
ring-jetty-adapter jar, or did you change the version in your
project.clj file (and therefore, downgrade the Jetty version and
ring-servlet version at the same time)?

- James

Re: socket leak Jan Rychter 4/9/11 3:40 AM
On Apr 8, 1:48 pm, James Reeves <jree...@weavejester.com> wrote:
I only changed one line in project.clj, then cleaned and redownloaded
all dependencies. So jetty and ring-servlet were downgraded as well.

[sorry for the broken formatting in my previous post -- Google Groups
has a horrific web interface]

--J.
Re: socket leak Jan Rychter 9/26/11 2:36 AM
I'll resurrect an old thread, since the issue still exists. I recently found some time to track this down. To recap the story so far: my ring application started to predictably crash under load with "Too many open files" after several hours when I switched from Ring 0.2.5 to 0.3.7.

I upgraded ring to 0.3.11 and confirmed the problem is there.

I then performed a binary search of jetty versions from 6.1.14 to 6.1.26, i.e. I replaced only the two jetty libs, leaving the rest as-is:

6.1.26 - fails
6.1.25 - OK
6.1.23 - OK
6.1.20 - OK
6.1.14 - OK

The likely culprit is http://jira.codehaus.org/browse/JETTY-547 (Jetty should rely on socket.shutdownOutput() to close sockets).

The symptoms are reproducible after 1-3 hours on my Mac OS X system and after 8-12 hours on a Linux box. Investigating with YourKit shows that sockets are NOT being closed and some remain in open state until file descriptors are exhausted. Interestingly enough, it isn't all sockets that remain open, just some, in batches, it seems. This is why even on my Mac system (limited to 256 fds per process) it takes hours of stress testing to discover the problem.

Netstat shows that no sockets linger in TIME_WAIT.
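
For anyone who wants to watch the same two numbers during a stress test, a minimal sketch follows. The helper names count_states and fd_count are made up for illustration, and 12345 is a stand-in for the JVM's pid. It assumes `netstat -ant` prints the connection state in the last column, which is the case on both Linux and Mac OS X.

```shell
#!/bin/sh
# Tally TCP sockets by state from `netstat -ant` output read on stdin.
# Header lines are skipped because their first field does not start
# with "tcp"; the state is taken from the last column.
count_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

# Count the file descriptors held by one process.
# Uses /proc on Linux and falls back to lsof elsewhere. lsof also lists
# entries that are not descriptors (cwd, txt), so treat the fallback
# number as an upper bound.
fd_count() {
    if [ -d "/proc/$1/fd" ]; then
        ls "/proc/$1/fd" | wc -l
    else
        lsof -p "$1" | tail -n +2 | wc -l
    fi
}

# Usage:
#   netstat -ant | count_states
#   fd_count 12345     # 12345 stands in for the JVM's pid
```

If the descriptor count keeps climbing while the TIME_WAIT tally stays flat, that matches the pattern described here: sockets left open by the process rather than sockets waiting out TIME_WAIT.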

I should probably raise this with jetty people, but I thought I'd post here, for those who have long-running applications (under heavier loads) using Ring. Just a heads-up — you might encounter this problem. In fact, I don't understand why more people don't complain about it.

--J.

Re: socket leak Jan Rychter 9/27/11 8:24 AM
On Monday, September 26, 2011 11:36:00 AM UTC+2, Jan Rychter wrote:
> I'll resurrect an old thread, since the issue still exists. I recently found some time to track this down. To recap the story so far: my ring application started to predictably crash under load with "Too many open files" after several hours when I switched from Ring 0.2.5 to 0.3.7.
>
> I upgraded ring to 0.3.11 and confirmed the problem is there.

So I guess the question is — will ring move back to jetty-6.1.25, or should I fork it and build my own version? The current ring with 6.1.26 is unusable in our production environments because of the socket problem in jetty.

--J.

Re: socket leak Constantine Vetoshev 9/27/11 10:49 AM
Have you considered using the latest Ring, excluding its Jetty
dependency, and adding your own? Leiningen supports this, e.g.:

:dependencies [[org.mortbay.jetty/jetty "6.1.25"]
  [ring/ring-jetty-adapter "0.3.11" :exclusions [org.mortbay.jetty/jetty]]]

Something similar should be possible with Maven also.

Re: socket leak Jan Rychter 9/27/11 11:00 AM
On Tuesday, September 27, 2011 7:49:33 PM UTC+2, Constantine Vetoshev wrote:
> Have you considered using the latest Ring, excluding its Jetty
> dependency, and adding your own? Leiningen supports this, e.g.:
>
> :dependencies [[org.mortbay.jetty/jetty "6.1.25"]
>   [ring/ring-jetty-adapter "0.3.11" :exclusions [org.mortbay.jetty/jetty]]]

Nice! Thanks for this helpful advice -- I did this for jetty and jetty-util, and it is much easier than forking ring.

--J.

Re: socket leak James Reeves 9/27/11 2:38 PM

I think I probably will, but in the meantime you can use Constantine's
solution. I believe that the dependencies specified in your
project.clj will always override Ring's own dependencies, so you're
not forced to use specific versions if they don't work for you.

- James

Re: socket leak Jan Rychter 9/28/11 1:21 AM
Constantine's solution works just fine for me. I just finished a nightly stress test, no problems found. 

I filed a bug with the jetty people: [#JETTY-1438] Sockets are not getting closed (likely introduced in #JETTY-547), at http://jira.codehaus.org/browse/JETTY-1438 -- we will see if anything happens there. I am very surprised that more people don't encounter this problem.

thanks,
--J.