Proactively closing longpoll connections for clients that disappear from the network


Traun Leyden

Mar 18, 2015, 5:21:08 PM
to golan...@googlegroups.com

I'm trying to figure out why persistent http connections show up in lsof as established for so long (approximately 90 minutes) after mobile clients have disappeared from the network abruptly.  
The way I'm testing is to run the following handler on a server:


and connect with an android test app to the /longpoll endpoint.  After it's connected I'm turning off the wifi on the device which takes it completely offline, since it has no other way to access the network.  

I'm expecting this http handler code:

_, err = response.Write([]byte("\n"))
flush()

to cause it to detect that the client is no longer around and close the connection.  (roughly based on the advice in this blog post)

Is that a reasonable expectation?  If not, is there any way to force connections to close where the client has disappeared?  

Or to phrase the problem in another way, with lots of mobile clients connecting and abruptly disappearing without closing their connections (not sending FIN packets), the http server ends up with a lot of "zombie" connections and can eventually run out of file descriptors and is no longer able to service new clients trying to connect.  How are people avoiding that scenario?

James Bardin

Mar 18, 2015, 7:06:43 PM
to golan...@googlegroups.com

Are you using a current version of Go? TCP keepalive should be set on the server's connection by default. 

On Wednesday, March 18, 2015 at 5:21:08 PM UTC-4, Traun Leyden wrote:

I'm expecting this http handler code:

_, err = response.Write([]byte("\n"))
flush()

to cause it to detect that the client is no longer around and close the connection.  (roughly based on the advice in this blog post)

Is that a reasonable expectation?  If not, is there any way to force connections to close where the client has disappeared?  


Nope. There's no way to quickly detect that a client is gone by writing to a TCP connection. The packets will be sent into a black hole until they time out.

 
Or to phrase the problem in another way, with lots of mobile clients connecting and abruptly disappearing without closing their connections (not sending FIN packets), the http server ends up with a lot of "zombie" connections and can eventually run out of file descriptors and is no longer able to service new clients trying to connect.  How are people avoiding that scenario?


You use http.CloseNotifier, which is implemented by the standard ResponseWriter. 
If the client has regular activity, I also like to implement an idle timeout by walking read and write deadlines forward after each action on the underlying connection.

Nathan Fisher

Mar 19, 2015, 12:59:39 AM
to Traun Leyden, golan...@googlegroups.com
This seems like a reasonable article discussing the matter as it relates to Go:


Note: you'll probably want to look at what is provided by tcpKeepAliveListener (linked below) as any changes you make to the socket will circumvent the use of this wrapper. This is because you'll need to make the listener first and then pass it to the server via Serve instead of calling ListenAndServe.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Nathan Fisher

Traun Leyden

Mar 19, 2015, 11:34:53 AM
to golan...@googlegroups.com


On Wednesday, March 18, 2015 at 4:06:43 PM UTC-7, James Bardin wrote:

Are you using a current version of Go? TCP keepalive should be set on the server's connection by default. 


Yep, I'm using go 1.4
 
On Wednesday, March 18, 2015 at 5:21:08 PM UTC-4, Traun Leyden wrote:

I'm expecting this http handler code:

_, err = response.Write([]byte("\n"))
flush()

to cause it to detect that the client is no longer around and close the connection.  (roughly based on the advice in this blog post)

Is that a reasonable expectation?  If not, is there any way to force connections to close where the client has disappeared?  


Nope. There's no way to quickly detect that a client is gone by writing to a TCP connection. The packets will be sent into a black hole until they time out.


Ok, that corresponds to the behavior I'm seeing.
 
 
Or to phrase the problem in another way, with lots of mobile clients connecting and abruptly disappearing without closing their connections (not sending FIN packets), the http server ends up with a lot of "zombie" connections and can eventually run out of file descriptors and is no longer able to service new clients trying to connect.  How are people avoiding that scenario?


You use http.CloseNotifier, which is implemented by the standard ResponseWriter.

I was under the impression that CloseNotifier was only useful for clients that explicitly closed their side of the connection, as opposed to dead peers (as described in section 2.3 of the TCP-Keepalive-HOWTO, which is the same scenario I'm testing)
 
 
If the client has regular activity, I also like to implement an idle timeout by walking read and write deadlines forward after each action on the underlying connection.

In my case, the client has been unplugged and so has no activity. 

James Bardin

Mar 19, 2015, 12:05:20 PM
to Traun Leyden, golan...@googlegroups.com
On Thu, Mar 19, 2015 at 11:34 AM, Traun Leyden <traun....@gmail.com> wrote:


On Wednesday, March 18, 2015 at 4:06:43 PM UTC-7, James Bardin wrote:

Are you using a current version of Go? TCP keepalive should be set on the server's connection by default. 


Yep, I'm using go 1.4


Ah, I forgot that the idle time is set fairly high. TCP keepalive won't kick in for 2 hours, and it will then take another 10 minutes or so to close the connection.
There's an issue here to make other parameters available https://github.com/golang/go/issues/8328

Don't use the keepalive library linked in that HOWTO article. It leaks file descriptors, and leaves each connection in blocking mode. 



You use http.CloseNotifier, which is implemented by the standard ResponseWriter.

I was under the impression that CloseNotifier was only useful for clients that explicitly closed their side of the connection, as opposed to dead peers (as described in section 2.3 of the TCP-Keepalive-HOWTO, which is the same scenario I'm testing)

That's correct. I just meant that you should use that as part of your code to detect client closes as soon as possible, but yes, it doesn't solve disconnects. 

 
 
 
If the client has regular activity, I also like to implement an idle timeout by walking read and write deadlines forward after each action on the underlying connection.

In my case, the client has been unplugged and so has no activity. 


I mean before they're unplugged, if there is regular activity you can set deadlines on read and write operations, i.e. you can stipulate that a client with no activity for 30min should be disconnected. 

Traun Leyden

Mar 19, 2015, 12:20:26 PM
to golan...@googlegroups.com, traun....@gmail.com


On Wednesday, March 18, 2015 at 9:59:39 PM UTC-7, Nathan Fisher wrote:

Note: you'll probably want to look at what is provided by tcpKeepAliveListener (linked below) as any changes you make to the socket will circumvent the use of this wrapper. This is because you'll need to make the listener first and then pass it to the server via Serve instead of calling ListenAndServe.


Thanks for the hint.  Are you suggesting to make a custom Listener and then call SetKeepAlivePeriod() on the connection object?  Do you have any examples you can point me to?

 


Traun Leyden

Mar 19, 2015, 12:28:50 PM
to golan...@googlegroups.com, traun....@gmail.com


On Thursday, March 19, 2015 at 9:05:20 AM UTC-7, James Bardin wrote:


Ah, I forgot that the idle time is set fairly high. TCP keepalive won't kick in for 2 hours, and it will then take another 10 minutes or so to close the connection.
There's an issue here to make other parameters available https://github.com/golang/go/issues/8328


Oh nice, I'll watch that issue.  I did figure out that by tuning the Linux kernel parameters, I was able to cut the time that sockets from disconnected clients hang around from 2 hours to 3 minutes:

echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 6 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes
echo 8 > /proc/sys/net/ipv4/tcp_retries2

The downside is that it affects every process on the machine.  

Don't use the keepalive library linked in that HOWTO article. It leaks file descriptors, and leaves each connection in blocking mode. 

Which library exactly?   https://github.com/felixge/tcpkeepalive?
 

James Bardin

Mar 19, 2015, 12:36:43 PM
to Traun Leyden, golan...@googlegroups.com
The code in server.go, specifically Server.ListenAndServe and tcpKeepAliveListener, is an example of how to create a custom listener.

yes, that one ;)  

Felix Geisendoerfer

Mar 19, 2015, 2:52:27 PM
to golan...@googlegroups.com, traun....@gmail.com
Hi,


On Thursday, March 19, 2015 at 9:05:20 AM UTC-7, James Bardin wrote:

Don't use the keepalive library linked in that HOWTO article. It leaks file descriptors, and leaves each connection in blocking mode. 

Author here, I think you're right. I'm not sure how I missed this, since it's clearly spelled out by the TCPConn#File docs. Is there any way to get the fd from a TCPConn without the dup / blocking mode being set?

James Bardin

Mar 19, 2015, 4:03:14 PM
to golan...@googlegroups.com, traun....@gmail.com
No. (OK, you can get the real fd via reflection. Though I've used it for debugging purposes, don't do that ;) There's a mutex protecting it for a reason.)

I *think* this is OK, but I haven't really tested it, nor would it fit directly into your api. (typed offhand, not even checked)

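f, err := conn.File() // returns a dup of the fd; per the File docs the socket ends up in blocking mode
if err != nil {
	return err
}
fd := int(f.Fd())
syscall.SetsockoptInt(fd, syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs) // TCP_KEEPIDLE is Linux-specific
syscall.SetNonblock(fd, true) // put the socket back into non-blocking mode
f.Close() // closes only the dup, not conn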

 