Jenkins leaking TIME_WAIT connections(?) (was: Re: Apache proxy problems / Jenkins problem or setup problem?)

Martin B.

Jul 25, 2011, 9:28:47 AM
to jenkin...@googlegroups.com
(Note: references to my earlier messages on jenkinsci-users are at the end.)

Hi experts! :-)

To my novice's eye it seems that the code that keeps the build queue and
the executors in sync is leaking TIME_WAIT connections on the server side.

Analysis
--------

### Setup

My Jenkins is running at port 8082 on a Windows machine. I have
installed it with the default installer and it's running as a Windows
service.

### refreshPart x 2

The Jenkins sidebar contains the following code:

<table id="buildQueue" class="pane">...
<script defer="defer">
refreshPart('buildQueue',"/jenkins/ajaxBuildQueue");
</script>
<table id="executors" class="pane">...
<script defer="defer">
refreshPart('executors',"/jenkins/ajaxExecutors");
</script>

where refreshPart is defined in hudson-behaviour.js and reloads the
buildQueue and the executors info every 5 seconds.


### Reloads

This means my Firefox will make two connections to the server every
5 seconds.

Looking at the open TCP ports on the server, I see that after I open
a job list view in Firefox, the number of connections
{server:8082} <-> {mypc:####} in TIME_WAIT state goes up by approx. 2
per 5 seconds, settling permanently at 52(!) connections in
TIME_WAIT state after a few minutes.

This means that every browser page that shows the Jenkins sidebar will
eventually take up about 50 ports on the server side.
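
As a rough sanity check of these figures, here is a minimal sketch in Python. It assumes connections arrive at a constant rate and that every one of them ends up in TIME_WAIT on the server, so the steady-state count equals rate times lifetime. The function names are made up for illustration; only the numbers 2, 5 and 52 come from the observations above.

```python
POLL_INTERVAL_S = 5        # refreshPart interval reported above
REQUESTS_PER_POLL = 2      # ajaxBuildQueue + ajaxExecutors
STEADY_STATE_SOCKETS = 52  # TIME_WAIT entries observed per open page


def implied_time_wait_seconds(sockets, requests_per_poll, poll_interval_s):
    """Lifetime each socket must spend in TIME_WAIT for `sockets` entries
    to persist at a constant arrival rate (count = rate * lifetime)."""
    return sockets * poll_interval_s / requests_per_poll


def sockets_for_pages(pages, sockets_per_page=STEADY_STATE_SOCKETS):
    """Steady-state TIME_WAIT entries for a number of open sidebar pages."""
    return pages * sockets_per_page


if __name__ == "__main__":
    print(implied_time_wait_seconds(STEADY_STATE_SOCKETS,
                                    REQUESTS_PER_POLL, POLL_INTERVAL_S))
    print(sockets_for_pages(3))
```

Under these assumptions the observed plateau corresponds to each connection lingering for roughly 130 seconds, and three open pages would hold about 156 entries.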

#### TIME_WAIT

I monitor these TIME_WAIT connections with the TCPView tool from
SysInternals, but I guess I could also read this via netstat.
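
For the netstat route, a minimal sketch of how the counting could be scripted in Python follows. It assumes the Windows `netstat -an` column layout (Proto, Local Address, Foreign Address, State) and only matches on the local address; `count_states` and `snapshot` are hypothetical helper names, not part of any tool mentioned here.

```python
import subprocess
from collections import Counter


def count_states(netstat_output, port):
    """Count TCP states for rows whose local address ends in :port.

    Assumes the row shape: Proto, Local Address, Foreign Address, State.
    """
    suffix = f":{port}"
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if (len(fields) >= 4
                and fields[0].upper().startswith("TCP")
                and fields[1].endswith(suffix)):
            states[fields[3]] += 1
    return states


def snapshot(port):
    """Run `netstat -an` once and summarise the states for one local port."""
    result = subprocess.run(["netstat", "-an"],
                            capture_output=True, text=True, check=True)
    return count_states(result.stdout, port)
```

Calling `snapshot(8082)` every few seconds and logging the `TIME_WAIT` entry would give the same trend that TCPView shows interactively.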

As far as I understand, they are leftovers of the (very short)
connections made by the client, and they remain when the server does an [Active
Close](http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html)
after the client has been served.

### 503 errors with Apache proxy

I now have an Apache instance running on this server that is serving
this Jenkins instance at port 80 via local proxying

{mypc} <-> {server(Apache):80} <-> {server(Jenkins):8082}

When I do access Jenkins via the proxy, the "leaking" TIME_WAIT
connections will be between (only) Apache:80 and Jenkins:8082 on the
localhost.

It now appears that when the number of TIME_WAIT connections reaches
about 1000 (that would be 1000/50==20 open windows for a few minutes),
Apache can no longer open a local port for a proxy connection and
subsequently goes into 503-service-temporarily-unavailable.


### Timeout with direct connects on port:8082

When accessing server:8082/jenkins directly with a lot of open FF pages
(from my dev machine), the Jenkins instance stops responding when
the number of TIME_WAIT connections reaches about 600-700.

The server only starts to respond again after a few minutes (FF will wait
for it, so it apparently isn't that long).


Question
--------

* Does my Analysis make sense?
* What to do about this problem?

Clearly neither the 503 error nor the minute-long delay is acceptable if
this can happen as soon as a bunch of devs have 2 or 3 windows open.


thanks a lot,
Martin

On 25.07.2011 14:43, Martin B. wrote:
> On 25.07.2011 10:37, Martin B. wrote:
>> On 22.07.2011 16:15, Martin B. wrote:
>>> Hi!
>>>
>>> I'm running Jenkins (on port 8082) behind an apache proxy (on 80)
>>>
>>> from time to time I get the message (when accessing through apache):
>>> 503 Service Temporarily Unavailable error
>>>
>>> The apache logs show:
>>> [Fri Jul 22 16:08:58 2011]
>>> [error] (OS 10048)Only one usage of each socket address (protocol/
>>> network address/port) is normally permitted. :
>>> proxy: HTTP: attempt to connect to 127.0.0.1:8082 (localhost) failed
>>> [Fri Jul 22 16:08:58 2011]
>>> [error] ap_proxy_connect_backend disabling worker for (localhost)
>>> [Fri Jul 22 16:09:03 2011]
>>> [error] proxy: HTTP: disabled connection for (localhost)
>>> ...
>>>
>>>
>>> Does this mean I have some problem with the setup (something else on
>>> port 8082 ??) or is this something else?
>>>
>>
>> Hmmm ...
>>
>> -> http://stackoverflow.com/questions/163603/apache-sockets-not-closing
>> and
>> ->
>> https://wiki.jenkins-ci.org/display/JENKINS/Running+Jenkins+behind+Apache
>>
>> I'll try it with the proxy-nokeepalive option and see if that'll help.
>>
>
> Pfff .. it does appear proxy-nokeepalive actually worsens the situation.
>
> I'll have to dig around further it seems ...
>
> - Martin
>

Martin B.

Jul 25, 2011, 12:11:25 PM
to jenkin...@googlegroups.com

Update
------

It seems it is not directly related to the "open" TIME_WAIT connections
after all.

I just had this occur (using the server via the Apache proxy) while the
number of TIME_WAIT connections on the server, which I am monitoring at
the moment, was only about 30.

Any input welcome. I'm really lost.

cheers,
Martin

Martin B.

Jul 27, 2011, 6:26:35 AM
to jenkin...@googlegroups.com
Update -

I'm now trying with what I found here:
http://forums.devshed.com/apache-development-15/os-10048-error-406218.html

+> After a recent upgrade from Windows Server
+> 2000 to Windows Server 2003, I'm getting
+> intermittent 502 errors ... using a reverse
+> proxy (ProxyPass and ProxyPassReverse) to
+> connect to a J2EE web container on the same machine.
+> ...
+> As it turns out, this was because we were running
+> out of local ports on the server. To fix, we
+> increased the max "local port" from
+> 5000 to 10000.

I have now upped my
[MaxUserPort](http://support.microsoft.com/kb/196271) setting on this
Win2003 box to 15000.

Although I am very sure I was *not* running out of local ports -- there
was no way we had used up anywhere near the available ~4000 (1024 - 5000)
ports as far as I could tell -- I guess it's worth a try.
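
To put rough numbers on the port budget, here is a small Python sketch. It assumes the range starts at 1024 and is inclusive, and that each TIME_WAIT entry pins one local port on the proxy hop; the figure of 52 sockets per page is the one observed earlier in this thread, and the helper names are hypothetical.

```python
FIRST_EPHEMERAL_PORT = 1024  # lower bound quoted above (assumed inclusive)


def ephemeral_ports(max_user_port, first_port=FIRST_EPHEMERAL_PORT):
    """Number of local ports available between first_port and max_user_port."""
    return max_user_port - first_port + 1


def pages_to_exhaust(max_user_port, sockets_per_page=52):
    """Whole sidebar pages that fit before the port range is used up."""
    return ephemeral_ports(max_user_port) // sockets_per_page


if __name__ == "__main__":
    for max_user_port in (5000, 15000):
        print(max_user_port,
              ephemeral_ports(max_user_port),
              pages_to_exhaust(max_user_port))
```

With these assumptions the default limit of 5000 leaves 3977 ports (about 76 pages), and 15000 leaves 13977 (about 268 pages), which supports the suspicion that plain port exhaustion is unlikely with only a handful of open windows.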

cheers,
Martin

akostadinov

Jul 28, 2011, 3:07:58 AM
to Jenkins Developers
http://developerweb.net/viewtopic.php?id=2941

Have you tried doing your tests with a separate client machine?
According to the link above, TIME_WAIT occurs on the client side and
when you use the same machine for client and server it's not very
clear what happens where...

As for the Apache proxy, you may use Tomcat or JBoss + mod_proxy_ajp to
avoid multiple connections between proxy and server.