Router holds too many connections


xiangy...@emc.com

Jan 28, 2015, 4:01:40 AM
to vcap...@cloudfoundry.org
We recently found an issue in our CF v172 environment. Four applications are deployed, each with 3 instances. They need to communicate with each other, which means one application makes REST calls to another. But sometimes a call fails with a connect timeout when accessing another application. We found that the connection request to the other application never reached the router: the connection on the DEA stays in SYN_SENT state and then times out.

When we checked the TCP connections on the router, we found a lot of them (over 10,000) between the router and the DEAs in ESTABLISHED state. Is this a bug? Why does the router keep all those connections and never release them?

Thanks,
Maggie


James Bayer

Jan 28, 2015, 4:08:52 AM
to vcap...@cloudfoundry.org
maggie, do you have a reproducer test you can share?

--
You received this message because you are subscribed to the Google Groups "Cloud Foundry Developers" group.
To view this discussion on the web visit https://groups.google.com/a/cloudfoundry.org/d/msgid/vcap-dev/c5c2ad9b-deb9-48dc-8a10-b1e8f6b62c28%40cloudfoundry.org.

To unsubscribe from this group and stop receiving emails from it, send an email to vcap-dev+u...@cloudfoundry.org.



--
Thank you,

James Bayer

xiangy...@emc.com

Jan 28, 2015, 4:22:16 AM
to vcap...@cloudfoundry.org
Here are my test steps.

1. Deploy 4 applications with 3 instances each. One of them accesses another application by making REST calls from Java code.
2. Access the application from a browser. After a while, it shows the timeout error "java.net.ConnectException: Connection timed out". The connection request was not accepted by the router.
3. The output of the "netstat -etnop" command shows a lot of connections between the router and the DEAs in "ESTABLISHED" state.

Thanks,
Maggie



xiangy...@emc.com

Jan 28, 2015, 4:40:28 AM
to vcap...@cloudfoundry.org
Here is the result of netstat

tcp6       0      0     routerip%2730182:www    deaip%27301:36572      ESTABLISHED      990/gorouter

I think this connection was opened by the router, and the router never closes it. If I stop the application, the connection is closed.
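For anyone who wants to quantify this from saved netstat output, here is a minimal sketch (not part of the original report) that counts TCP connections per peer host and state. The function name `count_tcp_states`, the sample addresses, and the simplified column layout are all assumptions made for illustration; real `netstat -etnop` output carries extra columns after the state, which this parser ignores.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per (peer host, state) in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with tcp/tcp6 and carry at least:
        # proto, recv-q, send-q, local address, foreign address, state.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Drop the port from the foreign address to group by peer host.
        peer_host = fields[4].rsplit(":", 1)[0]
        counts[(peer_host, fields[5])] += 1
    return counts


# Hypothetical sample; the addresses are placeholders, not from the thread.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp6 0 0 10.0.0.1:80 10.0.0.2:36572 ESTABLISHED
tcp6 0 0 10.0.0.1:80 10.0.0.2:36573 ESTABLISHED
tcp6 0 0 10.0.0.1:80 10.0.0.3:40000 TIME_WAIT
"""

if __name__ == "__main__":
    for (peer, state), n in sorted(count_tcp_states(SAMPLE).items()):
        print(peer, state, n)
```

Running the same count a few minutes apart should show whether the ESTABLISHED total for a given DEA keeps growing.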

Thanks a lot for helping!


Maggie


James Bayer

Jan 28, 2015, 10:45:51 AM
to vcap...@cloudfoundry.org
can you please share both the java client and the java server code?

how you are making the calls, receiving them, closing (or not) the connections, etc could be relevant.


Jason Huang

Jan 28, 2015, 1:46:07 PM
to vcap...@cloudfoundry.org
Hi James,

I am Maggie's co-worker. To answer your question:

The Java client uses the Spring REST Template from the Spring 3.2.2 release library to make REST calls.

The Spring REST Template in turn uses org.apache.http.impl.client.DefaultHttpClient with the following properties, which manages all HTTP connections made to the server.


image1.png



Please note that the Java Client has not customized any Http Connection parameters.


Thanks,


Jason









xiangy...@emc.com

Jan 28, 2015, 8:43:02 PM
to vcap...@cloudfoundry.org
Hi, James

I also tested the same applications on my local machine, and all connections were closed within a short time. In the CF environment, I can see more than one connection established on the router. Most of them are closed within a short time, but the one I mentioned in my previous post is never closed unless I stop the client application.

Thanks,
Maggie



Dieu Cao

Jan 29, 2015, 1:59:51 AM
to vcap...@cloudfoundry.org, xiangy...@emc.com
I've added a bug [1] to the Runtime tracker to look into this. Thanks for the report!

-Dieu

Christopher Piraino

Jan 29, 2015, 7:39:58 PM
to vcap...@cloudfoundry.org, xiangy...@emc.com
Maggie,

What does your CF deployment look like? Specifically, what infrastructure are you running on and what are you using for a load balancer? We tried to reproduce your problem on a bosh-lite with v172 and on an AWS environment running v197, but were not able to.

What is your manifest setting for "request_timeout_in_seconds"? That should be the timeout for the gorouter to close HTTP connections.

With that information it should be easier to figure out what is going on.

Best,
Chris and Ketan, CF Runtime Team


xiangy...@emc.com

Jan 29, 2015, 10:12:19 PM
to vcap...@cloudfoundry.org, xiangy...@emc.com, cpir...@pivotal.io
Hi, Chris

I deployed CF v172 on a VMware ESXi 5.1 server using bosh-lite. We don't have a load balancer. I have attached the manifest file I used. request_timeout_in_seconds is 300. You can just deploy one client application instance and one server application instance in CF. One thing I want to point out is that we use a new ticket in each REST call.

I would also like to share some additional information. I found that the connection from the client application to the router was kept alive even though no data was being transferred. I found the following message in the client application log file.

10:50:32.242 [http-bio-61049-exec-2] DEBUG o.a.h.i.c.PoolingClientConnectionManager - Connection [id: 0][route: {}->http://11testupgrade9-test.domain] can be kept alive indefinitely

Our client application runs in Tomcat and the server application is an OSGi-based application. In my local environment, connections are closed when they time out, even though there is "Connection: keep-alive" in the HTTP request header. But in CF, the connection is kept alive indefinitely. I then tried to force the connection to close by adding "Connection: close" to the HTTP request header, and this time all connections from the router to the client application were closed.
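To illustrate the difference between the two header values at the HTTP level, independent of Spring, Apache HttpClient, or the gorouter, here is a minimal sketch using only the Python standard library. It starts a throwaway local HTTP/1.1 server and sends one request without and one with "Connection: close". The handler class `EchoHandler` and the helpers `fetch` and `demo` are hypothetical names for this illustration; it shows the protocol behavior only and says nothing about how the router itself is implemented.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow persistent (keep-alive) connections

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        # close_connection is True when the request carried "Connection: close";
        # tell the client that the server will close the socket as well.
        if self.close_connection:
            self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, close):
    """Send one GET; return True if the connection will be closed afterwards."""
    headers = {"Connection": "close"} if close else {}
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers)
        resp = conn.getresponse()
        resp.read()
        return resp.will_close
    finally:
        conn.close()  # the client releases its socket explicitly


def demo():
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(port, close=False), fetch(port, close=True)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Without the header, an HTTP/1.1 connection is persistent by default and stays open until one side closes it or times it out; with "Connection: close", both sides agree to drop it after the response.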

So would you please help to answer some questions?

1. Why doesn't timeout setting on router work?

2. Why can't we use "Connection: keep-alive" in the HTTP request? Does it mean we always need to close the connection manually in our code? If the connection was not released, why does the router need to create a new TCP connection?

Would you please let me know if you need more information?

Thanks,
Maggie
cf_ngis_v2_ci_172.yml

Christopher Piraino

Jan 30, 2015, 8:03:07 PM
to xiangy...@emc.com, vcap...@cloudfoundry.org
Hi Maggie,

Taking a look at the v172 version of the gorouter, I realized that I misled you slightly on the "request_timeout_in_seconds" field. In v172, that field is the time the gorouter will wait for a response from the CF application. It does not have anything to do with client connections. Later versions of the gorouter disable any keep-alive headers from the client[1].

I have hopefully answered your first question above. For your second question, the gorouter will not time out any client connections; only connections from the router to CF applications are subject to a timeout. This means that any connection kept open by the client will remain open indefinitely. A "keep-alive" connection should be closed by the client when it is done with it.
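One way a client can take responsibility for its own keep-alive connections is to evict pooled connections that have been idle for too long. The sketch below shows that idea only; the class `IdleEvictingPool` and its methods are hypothetical and are not the Apache HttpClient API (HttpClient 4.x offers its own facility for this, closeIdleConnections on the connection manager, which is worth checking in its documentation). The clock is injectable so the behavior can be checked without sleeping.

```python
import time


class IdleEvictingPool:
    """Toy connection pool that closes connections idle for too long."""

    def __init__(self, max_idle_seconds, clock=time.monotonic):
        self.max_idle_seconds = max_idle_seconds
        self._clock = clock
        self._idle = []  # (connection, time it was returned to the pool)

    def release(self, conn):
        """Return a connection to the pool and record when it became idle."""
        self._idle.append((conn, self._clock()))

    def close_idle(self):
        """Close every connection idle for max_idle_seconds or more."""
        now = self._clock()
        kept, closed = [], 0
        for conn, released_at in self._idle:
            if now - released_at >= self.max_idle_seconds:
                conn.close()
                closed += 1
            else:
                kept.append((conn, released_at))
        self._idle = kept
        return closed

    def acquire(self):
        """Hand out a still-fresh idle connection, or None if there is none."""
        self.close_idle()
        if self._idle:
            return self._idle.pop()[0]
        return None
```

Calling `close_idle()` periodically (for example from a background timer) bounds how long an unused connection can stay ESTABLISHED on the router side.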

Let me know if that helps and if you have any more questions.

Best,
Chris and Dan, CF Runtime Team

xiangy...@emc.com

Feb 2, 2015, 9:56:49 PM
to vcap...@cloudfoundry.org, xiangy...@emc.com, cpir...@pivotal.io
Hi, Chris

Thanks a lot for your great help!

As I understand it, with the change in proxy.go, our application will not have to add "Connection: close" to the request header. The router will close the connection even though our application sends "Connection: keep-alive" in the request header. Is that right? And would you please let me know which CF build contains this change? v196 or later?

Thanks,
Maggie

xiangy...@emc.com

Feb 4, 2015, 9:36:45 PM
to vcap...@cloudfoundry.org, xiangy...@emc.com, cpir...@pivotal.io
Hi, Chris

Would you please help to answer my question?


As I understand it, with the change in proxy.go, our application will not have to add "Connection: close" to the request header. The router will close the connection even though our application sends "Connection: keep-alive" in the request header. Is that right? And would you please let me know which CF build contains this change? v196 or later?


Thanks,
Maggie

Christopher Piraino

Feb 5, 2015, 12:02:31 PM
to Maggie Meng, vcap...@cloudfoundry.org
Hi Maggie,

Sorry for the late reply, yes that is correct. The change was introduced in v173, so anything v173 and above should work for you.

Let me know if anything else comes up.

Best,
Chris Piraino

xiangy...@emc.com

Feb 5, 2015, 8:53:37 PM
to vcap...@cloudfoundry.org, xiangy...@emc.com, cpir...@pivotal.io
Chris, thanks a lot.