How to detect dead TCP connections in Vert.x


Terry Cho

Feb 12, 2014, 3:01:08 AM
to ve...@googlegroups.com
Hello all. I have a scenario similar to a push server.
200,000 Windows clients will connect to a Vert.x server over TCP.
The problem is that I want to detect dead connections.
By a dead connection I mean the following:
netstat shows the connection as "ESTABLISHED".
But if we unplug the network cable or power off the Windows client, the connection stays in the "ESTABLISHED" state, because the server never receives a FIN from the client.
What I can do is
enable TCP keepalive on the server and have the client open its socket with the keepalive option (SO_KEEPALIVE).
But as you know, the default TCP keepalive idle time is about 2 hours.
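
For reference, a minimal sketch of switching keepalive on for a plain Java socket. It uses only `java.net.Socket`; the class and method names (`KeepAliveExample`, `newKeepAliveSocket`) are made up for illustration and are not part of Vert.x. The probe timing still comes from the operating system defaults, which is why this option alone does not detect a dead peer quickly.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper for illustration; not part of Vert.x.
class KeepAliveExample {
    // Creates an unconnected socket with SO_KEEPALIVE switched on.
    static Socket newKeepAliveSocket() throws SocketException {
        Socket socket = new Socket();
        // The OS, not the application, decides when keepalive probes are sent.
        socket.setKeepAlive(true);
        return socket;
    }
}
```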

So I'm asking whether there is a more elegant way to check that a client is alive.
I also want to close dead TCP connections from the Vert.x side.

One idea is to handle the heartbeat check at the application level.

The Windows client sends a heartbeat message to the server periodically,
and the Vert.x server stores a "last updated time" for each connection in a ConcurrentMap.
I can then add a timer event handler for each connection.
The timer handler checks the connection's last updated time and, if it is more than 10 seconds old, closes the connection.
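
The bookkeeping described above could be sketched roughly as follows. This is only an illustration under assumptions: `HeartbeatTracker`, its method names, and the use of string connection IDs are hypothetical, and the sketch uses one shared sweep over the map instead of one timer per connection, which is a swapped-in design choice and not what the post describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: tracks the last heartbeat time per connection ID.
class HeartbeatTracker {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Call whenever a heartbeat (or any other data) arrives on a connection.
    void touch(String connectionId, long nowMillis) {
        lastSeenMillis.put(connectionId, nowMillis);
    }

    // Call when a connection closes so its entry does not leak.
    void forget(String connectionId) {
        lastSeenMillis.remove(connectionId);
    }

    // Removes and returns the IDs whose last heartbeat is older than the timeout.
    List<String> removeExpired(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            boolean tooOld = nowMillis - entry.getValue() > timeoutMillis;
            // remove(key, value) only succeeds if no newer heartbeat arrived meanwhile.
            if (tooOld && lastSeenMillis.remove(entry.getKey(), entry.getValue())) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

In a Vert.x server, `touch` would presumably be called from each socket's data handler, and a single `vertx.setPeriodic(...)` timer would call `removeExpired` and close the sockets that belong to the returned IDs.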

Any suggestions or case studies are welcome.
Thanks in advance.

Norman Maurer

Feb 12, 2014, 3:04:33 AM
to ve...@googlegroups.com
Your only bet is to implement your own heartbeat.
--
You received this message because you are subscribed to the Google Groups "vert.x" group.
To unsubscribe from this group and stop receiving emails from it, send an email to vertx+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Terry Cho

Feb 12, 2014, 7:49:42 AM
to ve...@googlegroups.com, norman...@googlemail.com
Hello Norman.
The answer I am looking for is not just "your own heartbeat".
I would like a guide, reference implementation, or example for "your own heartbeat".
Do you have any ideas on this?

On Wednesday, February 12, 2014 at 5:04:33 PM UTC+9, Norman Maurer wrote:

Randall Richard

Feb 12, 2014, 11:58:17 AM
to ve...@googlegroups.com, Norman Maurer
Your description is pretty much how I do it, although I have the server pinging the client.  If the server's ping routine recognizes that it hasn't gotten a response from the client within the specified time frame, then it will simply close the connection.
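
A rough sketch of the server-side ping bookkeeping this describes, under assumptions: `PingMonitor` and its methods are hypothetical names, connections are identified by strings, and all calls are assumed to come from a single thread (for example one event loop), so a plain `HashMap` is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: remembers when each still-unanswered ping was sent.
class PingMonitor {
    private final Map<String, Long> pendingPingSentAt = new HashMap<>();
    private final long replyTimeoutMillis;

    PingMonitor(long replyTimeoutMillis) {
        this.replyTimeoutMillis = replyTimeoutMillis;
    }

    // Record that a ping was just written to this connection.
    // putIfAbsent keeps the time of the oldest unanswered ping.
    void pingSent(String connectionId, long nowMillis) {
        pendingPingSentAt.putIfAbsent(connectionId, nowMillis);
    }

    // The client answered, so the connection is considered alive.
    void pongReceived(String connectionId) {
        pendingPingSentAt.remove(connectionId);
    }

    // Removes and returns connections whose ping has gone unanswered too long.
    List<String> collectTimedOut(long nowMillis) {
        List<String> timedOut = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pendingPingSentAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > replyTimeoutMillis) {
                timedOut.add(entry.getKey());
                it.remove();
            }
        }
        return timedOut;
    }
}
```

The caller would close every connection returned by `collectTimedOut`.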

-Randall

Web Dev

Feb 13, 2014, 8:06:02 PM
to ve...@googlegroups.com, norman...@googlemail.com
Hi Terry,

lol don't just demand source code. At least try to implement it yourself. If you get stuck, then post some of your source code and lots of people will help out.

Ryan Chazen

Feb 14, 2014, 3:48:06 AM
to ve...@googlegroups.com, norman...@googlemail.com
It is actually a fairly difficult thing to do well: the naive approach of looping through all 200K connections and sending a ping request is pretty heavy on CPU and network.

Personally I would do it like this:

Don't do any keep alive checking by default.
Set a limit on the number of expected connections, say 300K. Keep a counter of open connections and update it every time a connection opens or closes. When that counter goes above 300K, you know there must be a bunch of inactive connections to clean up. So when the counter hits 300K, send a ping message to every client and disconnect the ones that don't reply. This means you only clear out old connections when there is an actual hardware need to clear them, which saves a lot of CPU/network/client battery on unnecessary checking. For this to work, the 300K should be well above your expected maximum number of real connections, so that you don't trigger it too often. You would also want to limit how many times the ping test can be triggered within a short interval.

This method minimizes the amount of work you need to do for keepalives while still ensuring your server doesn't get overloaded with old dead connections.
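
The counter-plus-rate-limit idea could look something like the sketch below. `SweepTrigger` and its methods are hypothetical names, the minimum interval between sweeps is an assumed parameter, and the class is assumed to be used from a single thread.

```java
// Hypothetical helper: decides when a full ping sweep should be triggered.
class SweepTrigger {
    private final int maxConnections;          // e.g. 300_000 in the post above
    private final long minSweepIntervalMillis; // assumed rate limit between sweeps
    private int openConnections = 0;
    private long lastSweepMillis = 0;
    private boolean sweptBefore = false;

    SweepTrigger(int maxConnections, long minSweepIntervalMillis) {
        this.maxConnections = maxConnections;
        this.minSweepIntervalMillis = minSweepIntervalMillis;
    }

    // Call when a connection opens; returns true if a sweep should start now.
    boolean connectionOpened(long nowMillis) {
        openConnections++;
        boolean overLimit = openConnections > maxConnections;
        boolean rateLimited = sweptBefore
                && nowMillis - lastSweepMillis < minSweepIntervalMillis;
        if (overLimit && !rateLimited) {
            sweptBefore = true;
            lastSweepMillis = nowMillis;
            return true;
        }
        return false;
    }

    // Call when a connection closes, including ones dropped by a sweep.
    void connectionClosed() {
        openConnections--;
    }

    int openConnections() {
        return openConnections;
    }
}
```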

Frank Reiter

Feb 14, 2014, 10:44:53 AM
to ve...@googlegroups.com

This has the potential to fail catastrophically if you misjudged the peak traffic or have an unexpected surge in demand. You might consider instead pinging half a dozen old connections for each new one that comes in. If there is indeed a high percentage of dead ones, this will cause the total number to decline. If, on the other hand, they are mostly active, you will have avoided the burden of testing thousands of connections when you can least afford the load.

Frank.

Ryan Chazen

Feb 14, 2014, 11:20:41 AM
to ve...@googlegroups.com
So keep some kind of list sorted by connection age (or, even better, by last connection activity), and then you can efficiently pull and test the oldest X connections.

E.g.,
Use a linked list to hold the sorted list. Whenever there is activity on a connection, remove it from the linked list and add it to the tail in constant time.
When a new connection opens, you can remove the first X connections from the head of the list (they will be the connections that have gone longest without activity) and then test them. If the test fails, you drop the connection. If the test succeeds, you add it to the tail of the list. If the oldest connection had activity recently, there is no need to test anything. This is all still constant time and should work well regardless of the number of connections.
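
One possible sketch of this ordering in Java. Note that `java.util.LinkedList.remove(Object)` has to search the list, so this sketch swaps in an access-ordered `LinkedHashMap`, which moves an entry to the tail in constant time. `IdleQueue`, its methods, and the `minIdleMillis` parameter are hypothetical, and timestamps passed to `touch` are assumed to be non-decreasing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: keeps connection IDs ordered from least to most recently active.
class IdleQueue {
    // accessOrder = true: put() on an existing key moves that entry to the tail.
    private final LinkedHashMap<String, Long> lastActivity =
            new LinkedHashMap<>(16, 0.75f, true);

    // Activity (or a passed liveness test) moves the connection to the tail.
    void touch(String connectionId, long nowMillis) {
        lastActivity.put(connectionId, nowMillis);
    }

    // Call when a connection is closed.
    void remove(String connectionId) {
        lastActivity.remove(connectionId);
    }

    // Removes and returns up to maxCount connections from the head that have
    // been idle for at least minIdleMillis; the caller then tests them.
    List<String> pollIdle(int maxCount, long nowMillis, long minIdleMillis) {
        List<String> candidates = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastActivity.entrySet().iterator();
        while (it.hasNext() && candidates.size() < maxCount) {
            Map.Entry<String, Long> oldest = it.next();
            if (nowMillis - oldest.getValue() < minIdleMillis) {
                break; // every later entry was active even more recently
            }
            candidates.add(oldest.getKey());
            it.remove();
        }
        return candidates;
    }
}
```

On each new connection the server could call `pollIdle(6, now, threshold)`, ping the returned connections, call `touch` again for those that reply, and close the rest.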

Thanks Frank, I'll use this for my own stuff too when I get a chance. I wonder whether it could be generalized into a library, or whether it's too specific to each app's implementation?



