I'm currently trying to break the 64511 concurrent TCP sessions barrier in my Go application.
I have optimized my kernel for high TCP concurrency as suggested by Richard Jones on his blog back in 2008 (see the metabrew.com blog below). Unfortunately it doesn't have any effect on the number of concurrent sessions.
As you have only 64511 unprivileged ports available per IP address, I have added several additional IP addresses to my system and made Go loop through these addresses. When I use a LocalAddr with port 0, Go immediately starts complaining with "address already in use" as soon as I pass +/- 64511 connections.
So I tried to loop over my IP range in combination with a specified port for the LocalAddr. Unfortunately this doesn't improve the number of concurrent connections.
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
On Fri, Aug 16, 2013 at 3:45 AM, Paul van Brouwershaven <pa...@vanbrouwershaven.com> wrote:
> I'm currently trying to break the 64511 concurrent TCP sessions barrier in my Go application.
> I have optimized my kernel for high TCP concurrency as suggested by Richard Jones on his blog back in 2008 (see the metabrew.com blog below). Unfortunately it doesn't have any effect on the number of concurrent sessions.
> As you have only 64511 unprivileged ports available per IP address, I have added several additional IP addresses to my system and made Go loop through these addresses. When I use a LocalAddr with port 0, Go immediately starts complaining with "address already in use" as soon as I pass +/- 64511 connections.

This is probably something that can be fixed; I'd file an issue.

> So I tried to loop over my IP range in combination with a specified port for the LocalAddr. Unfortunately this doesn't improve the number of concurrent connections.

What is the failure mode when you do this? Do you still get "address already in use" after 64511? Do you know that your router can handle NAT with one (physical) port having more than 65536 source ports? Does the tcpdump have any hints?
On Friday, 16 August 2013 20:02:50 UTC+2, Kyle Lemons wrote:
> On Fri, Aug 16, 2013 at 3:45 AM, Paul van Brouwershaven <pa...@vanbrouwershaven.com> wrote:
>> I'm currently trying to break the 64511 concurrent TCP sessions barrier in my Go application.
>> I have optimized my kernel for high TCP concurrency as suggested by Richard Jones on his blog back in 2008 (see the metabrew.com blog below). Unfortunately it doesn't have any effect on the number of concurrent sessions.
>> As you have only 64511 unprivileged ports available per IP address, I have added several additional IP addresses to my system and made Go loop through these addresses. When I use a LocalAddr with port 0, Go immediately starts complaining with "address already in use" as soon as I pass +/- 64511 connections.
> This is probably something that can be fixed; I'd file an issue.

I have filed an issue: https://code.google.com/p/go/issues/detail?id=6176

>> So I tried to loop over my IP range in combination with a specified port for the LocalAddr. Unfortunately this doesn't improve the number of concurrent connections.
> What is the failure mode when you do this? Do you still get "address already in use" after 64511? Do you know that your router can handle NAT with one (physical) port having more than 65536 source ports? Does the tcpdump have any hints?

Basically you can't have more than 65535 ports, so that's the reason I'm trying to use 64511 (65535-1024) ports per IP address. This would mean that 1.0.0.1:65000 and 1.0.0.2:65000 each have a connection at the same time (as advised by Richard Jones). Unfortunately this gives me the same "address already in use" error message as when I specify port 0 (auto).
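The port arithmetic in this message can be checked with a short sketch. The names `portsPerIP` and `localEndpoints` are hypothetical and chosen only for illustration; the calculation restates the 65535-1024 figure from the thread and counts distinct local ip:port pairs, which is an upper bound and not a promise of what the kernel or network will allow.

```go
package main

import "fmt"

const (
	maxPort           = 65535 // highest TCP port number
	firstUnprivileged = 1024  // lower bound used in the thread (65535-1024)
)

// portsPerIP returns the number of source ports the thread assumes
// are usable on a single local IP address.
func portsPerIP() int {
	return maxPort - firstUnprivileged
}

// localEndpoints returns how many distinct local ip:port pairs exist
// when n local IP addresses are configured.
func localEndpoints(n int) int {
	return n * portsPerIP()
}

func main() {
	fmt.Println(portsPerIP())      // ports per address
	fmt.Println(localEndpoints(4)) // four addresses, as an example
}
```

With four local addresses this gives 258044 candidate local endpoints, so the ~64511 ceiling reported here is not explained by the port count alone.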
I'm currently trying to break the 64511 concurrent TCP sessions barrier in my Go application.
[...]
1 Million TCP connections (Linux):
http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
2 Million TCP connections (FreeBSD):
http://blog.whatsapp.com/index.php/2012/01/1-million-is-so-2011/
[...] I have added several additional IP addresses to my system and made Go loop through these addresses. When I use a LocalAddr with port 0, Go immediately starts complaining with "address already in use" as soon as I pass +/- 64511 connections. So I tried to loop over my IP range in combination with a specified port for the LocalAddr. Unfortunately this doesn't improve the number of concurrent connections.
My connection handling code:
    RETRY:
        // get the first source ip:port address in the queue
        ipNext = <-iprange
        // manage our own port numbers per IP to overcome the 64511 limit
        host, port, _ := net.SplitHostPort(ipNext)
        nextPort, _ := strconv.Atoi(port)
        nextPort++
        if nextPort > 65535 {
            nextPort = 1024
        }
        // add this address with the next port number back to the end of the queue
        iprange <- host + ":" + strconv.Itoa(nextPort)
        d := net.Dialer{Timeout: 1 * time.Second, Deadline: time.Now().Add(2 * time.Second)}
        d.LocalAddr, err = net.ResolveTCPAddr("tcp4", ipNext)
        if err != nil {
            log.Fatal(err)
        }
        dc, err := d.Dial("tcp", dst)
        if err != nil && strings.Contains(err.Error(), "address already in use") {
            goto RETRY
        }
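The port rotation step in the handling code above can be isolated into a small function, which also makes it possible to check the errors that the original discards. This is only a sketch: `nextLocalAddr` is a hypothetical name, and 192.0.2.1 is a documentation placeholder address, not one from the thread.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// nextLocalAddr takes a local "ip:port" string and returns the same host
// with the port advanced by one, wrapping from 65535 back to 1024.
func nextLocalAddr(addr string) (string, error) {
	host, port, err := net.SplitHostPort(addr)
	if err != nil {
		return "", err
	}
	p, err := strconv.Atoi(port)
	if err != nil {
		return "", err
	}
	p++
	if p > 65535 {
		p = 1024
	}
	return net.JoinHostPort(host, strconv.Itoa(p)), nil
}

func main() {
	next, err := nextLocalAddr("192.0.2.1:65535")
	fmt.Println(next, err)
}
```

Note that the function only changes the port. The host changes solely because several ip:port entries rotate through the queue, so the queue must be seeded with one entry per local address.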
Right, I understand the approach. However, it seems likely that a (commodity) router might make the inappropriate assumption that one single (physical) port will have only one IP address, and only build in a NAT table that has enough rows to accommodate the 65535 source ports per (physical) port. If this hypothesis were true, adding more IPs wouldn't improve the total number of TCP flows you can maintain, but adding a second IP on a second interface connected to a second port on the router would double the effective number of flows you could maintain.
On Friday, 16 August 2013 11:45:40 UTC+1, Paul van Brouwershaven wrote:
> I'm currently trying to break the 64511 concurrent TCP sessions barrier in my Go application.
> [...]
> 1 Million TCP connections (Linux):
> http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
> 2 Million TCP connections (FreeBSD):
> http://blog.whatsapp.com/index.php/2012/01/1-million-is-so-2011/

Note that those two articles are not about the same thing. Make sure you're not confusing outgoing TCP connections with incoming TCP connections.
If you just want a 1M TCP connections badge, do it with incoming connections.

> [...] I have added several additional IP addresses to my system and made Go loop through these addresses. When I use a LocalAddr with port 0, Go immediately starts complaining with "address already in use" as soon as I pass +/- 64511 connections. So I tried to loop over my IP range in combination with a specified port for the LocalAddr. Unfortunately this doesn't improve the number of concurrent connections.

Does "doesn't improve" mean you get the same error, or something else?
I don't see where the host (for ipNext, for LocalAddr) gets incremented. Based on the above, I can only presume it doesn't, and this would explain the behaviour you're seeing.

Personally, I'd go for a simpler approach. For each local IP address, launch a goroutine that dials out repeatedly (with exponential backoff, if you will). I wouldn't bother with trying to manage ephemeral ports either: just keep dialing independently on each IP, and whatever limit you hit will probably be the ephemeral limit anyway.
Please try using net.DialTCP to control the source address you connect from. You can get the list of available addresses from the net package
using InterfaceAddrs(). You should construct a *net.TCPAddr for the
source address with one of the IP addresses assigned to your outgoing
interface and the port set to 0 to allow the operating system to
choose an ephemeral outgoing port.
Can you please post executable sample code.
Please try running your program under the race detector. I can see at least one data race, on lasterror which may be confusing your results.
Here is demo code that runs 60k concurrent connections with a timeout of 2 seconds.
The code prints a status line like the one below every 5 seconds:
2013/08/19 10:51:33 GO: 60007 || TCP: ESTABLISHED 4 SYN_SENT 28232 || ERROR: dial tcp 10.2.149.179:80: address already in use
Your goroutines are closing the connection almost immediately after dialing:

    dconn, err := d.Dial("tcp", ipaddr+":80")
    if err != nil {
        [...]
    }
    defer dconn.Close()
The closing of the connections would explain why you can never get >60k connections ESTABLISHED (or SYN_SENT).
Furthermore, closing the connections will cause the tcp socket to go into TIME_WAIT state (for up to 60s by default on Linux), during which time you cannot rebind the socket. Eventually, you'll just run out of ports and this will happen far sooner than you'd expect from looking at the ESTABLISHED connections alone.
BTW, it seems you're dialing out to a different ip:port each time. Technically, connections are identified by the tuple (proto, laddr, lport, daddr, dport). So, theoretically, you could create multiple connections (think >1M) bound to the same laddr, lport. SO_REUSEPORT was only recently added to Linux though, so it may not be in Go yet: http://grokbase.com/t/gg/golang-dev/13373v3nfm/net-so-reuseport-in-linux-3-9
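The identification rule described above can be illustrated with a comparable struct used as a map key. This sketch shows only how the tuple distinguishes connections; it says nothing about whether the operating system will actually let two sockets bind the same local ip:port. The `flow` type and `distinct` function are hypothetical names, the two destination addresses are the ones that appear in the logs in this thread, and 192.0.2.1 is a placeholder source.

```go
package main

import "fmt"

// flow is the tuple that identifies a TCP connection.
type flow struct {
	proto string
	laddr string
	lport int
	daddr string
	dport int
}

// distinct counts how many different connections a list of flows describes.
func distinct(flows []flow) int {
	seen := make(map[flow]struct{})
	for _, f := range flows {
		seen[f] = struct{}{}
	}
	return len(seen)
}

func main() {
	flows := []flow{
		{"tcp", "192.0.2.1", 65000, "10.2.149.179", 80},
		{"tcp", "192.0.2.1", 65000, "10.5.175.151", 80}, // same source, other destination
		{"tcp", "192.0.2.1", 65000, "10.2.149.179", 80}, // exact duplicate of the first
	}
	fmt.Println(distinct(flows))
}
```

The first two entries share a local ip:port yet count as separate connections because their destinations differ; only the third collides with an existing tuple.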
On Tue, Aug 20, 2013 at 2:24 AM, Benjamin Measures <saint....@gmail.com> wrote:
> Your goroutines are closing the connection almost immediately after dialing:
>
>     dconn, err := d.Dial("tcp", ipaddr+":80")
>     if err != nil {
>         [...]
>     }
>     defer dconn.Close()

They are very short connections, but they don't close immediately. The "defer dconn.Close()" closes the connection when the function returns. I'm not downloading anything in the example program, but the timing is the same.

> The closing of the connections would explain why you can never get >60k connections ESTABLISHED (or SYN_SENT).
> Furthermore, closing the connections will cause the tcp socket to go into TIME_WAIT state (for up to 60s by default on Linux), during which time you cannot rebind the socket. Eventually, you'll just run out of ports and this will happen far sooner than you'd expect from looking at the ESTABLISHED connections alone.

Most of the connections are in SYN_SENT status (waiting on connection); I don't have issues with TIME_WAIT. I'm running with high concurrency and it's no problem to start more than 60k connections per second, except that I get "address already in use" errors as soon as I go over the +/- 64511 limit.

TIME_WAIT 2917
CLOSE_WAIT 5
FIN_WAIT1 2049
SYN_SENT 54926
ESTABLISHED 836
FIN_WAIT2 1063
CLOSING 13
LAST_ACK 61

> BTW, it seems you're dialing out to a different ip:port each time - technically, connections are identified by the tuple proto,laddr,lport,daddr,dport. So, theoretically, you could create multiple connections (think >1M) bound to the same laddr,lport. SO_REUSEPORT was not long added to Linux though, so it may not be in Go yet: http://grokbase.com/t/gg/golang-dev/13373v3nfm/net-so-reuseport-in-linux-3-9

Sounds interesting, but is this not for incoming connections only?
"For TCP, so_reuseport allows multiple listener sockets to be bound to the same port."
I don't think you can really say you have 1M TCP connections unless you have 1M that say ESTABLISHED there. I'd work on getting to 64511 ESTABLISHED first, and then move on to multiple IPs. It still looks to me like some router somewhere isn't able to actually route enough connections for you, thus tons of SYN_SENT -- no SYN/ACK, because the ACK can't be routed back to you.
> Are you getting these address already in use by coming up with the ports yourself or by letting the OS choose them?
I'm letting the OS select the ports now. But I get the same error when I make the selection.
2013/08/20 08:57:46 sockets: used 64723 TCP: inuse 64542 orphan 31 tw 0 alloc 64632 mem 771 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 1 memory 1480
2013/08/20 08:58:30 sockets: used 64407 TCP: inuse 62746 orphan 31 tw 0 alloc 64316 mem 772 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0
When I run the test program with 64100 goroutines I can even get to 64632, but I soon run into the "address already in use" error.
2013/08/20 09:07:33 sockets: used 708 TCP: inuse 614 orphan 30 tw 0 alloc 616 mem 770 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0
2013/08/20 09:08:04 sockets: used 64807 TCP: inuse 64632 orphan 30 tw 4 alloc 64715 mem 771 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0
2013/08/20 09:08:23 *.*.*.1 -> 10.5.175.151 || dial tcp 10.5.175.151:80: address already in use
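The counters in these log lines look like the contents of /proc/net/sockstat joined onto one line, although the thread does not say how they were produced. Under that assumption, a small parser makes it easier to compare fields such as TCP inuse and tw between samples. `parseSockstat` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSockstat splits a line such as
// "sockets: used 708 TCP: inuse 614 orphan 30 ..." into sections
// ("sockets", "TCP", ...) that each map a counter name to its value.
// Tokens before the first section, such as a timestamp, are ignored.
func parseSockstat(line string) map[string]map[string]int {
	out := make(map[string]map[string]int)
	fields := strings.Fields(line)
	section := ""
	for i := 0; i < len(fields); i++ {
		f := fields[i]
		if strings.HasSuffix(f, ":") {
			section = strings.TrimSuffix(f, ":")
			out[section] = make(map[string]int)
			continue
		}
		if section == "" || i+1 >= len(fields) {
			continue
		}
		n, err := strconv.Atoi(fields[i+1])
		if err != nil {
			continue
		}
		out[section][f] = n
		i++ // skip the value that was just consumed
	}
	return out
}

func main() {
	line := "2013/08/20 09:08:04 sockets: used 64807 TCP: inuse 64632 orphan 30 tw 4 alloc 64715 mem 771"
	stats := parseSockstat(line)
	fmt.Println("TCP inuse:", stats["TCP"]["inuse"], "tw:", stats["TCP"]["tw"])
}
```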
On Tuesday, 20 August 2013 10:19:44 UTC+1, Paul van Brouwershaven wrote:
> 2013/08/20 08:57:46 sockets: used 64723 TCP: inuse 64542 orphan 31 tw 0 alloc 64632 mem 771 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 1 memory 1480
> 2013/08/20 08:58:30 sockets: used 64407 TCP: inuse 62746 orphan 31 tw 0 alloc 64316 mem 772 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0

The TCP sockets inuse has decreased. This would suggest sockets are being closed.

> When I run the test program with 64100 routines I can even get to 64632 but soon run into the error "address already in use" error.
> 2013/08/20 09:07:33 sockets: used 708 TCP: inuse 614 orphan 30 tw 0 alloc 616 mem 770 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0
> 2013/08/20 09:08:04 sockets: used 64807 TCP: inuse 64632 orphan 30 tw 4 alloc 64715 mem 771 UDP: inuse 0 mem 0 UDPLITE: inuse 0 RAW: inuse 0 FRAG: inuse 0 memory 0
> 2013/08/20 09:08:23 *.*.*.1 -> 10.5.175.151 || dial tcp 10.5.175.151:80: address already in use

Using more goroutines will cause connections to be opened (and closed) at a greater rate. That the error doesn't appear until ~20s after peak/high inuse would support the CLOSE_WAIT issue as outlined earlier in this thread.
for a total of 65514. No tuning other than what you posted in https://code.google.com/p/go/issues/detail?id=6176#c4