The max number of goroutines and file descriptors?


Jingcheng Zhang

May 21, 2012, 3:12:32 AM
to golang-nuts
Hello,

I wrote a client and a server in Go:

client - starts N goroutines, each of which dials the server, sends a request, receives a response, and then closes the connection;
server - for each client, starts a goroutine to serve it.

The problem is that N is limited: to nearly 1000 on Windows and 6000 on Linux.

How can I achieve higher concurrency (more goroutines and file descriptors) on both Windows and Linux, for both client and server?

Thanks very much.

--
Best regards,
Jingcheng Zhang
Beijing, P.R.China

Ian Lance Taylor

May 21, 2012, 9:41:36 AM
to Jingcheng Zhang, golang-nuts
Jingcheng Zhang <dio...@gmail.com> writes:

> I wrote a client and a server in Go:
>
> client - start N goroutines, each of which dials the server, send a
> request, and receive a response, then close the connection;
> server - for each client, start a goroutine to serve it.
>
> The question is that N is limited, nearly 1000 under Windows, and 6000
> under Linux.
>
> How can I get a higher number of concurrency (like goroutines, file
> descriptors) on both Windows & Linux, for client & server?

Those numbers seem too low. The test/chan/goroutines.go test creates
10,000 goroutines, and it could create more, except that we want the test suite
to run quickly. What happens when you go above your limit?

Ian

Daniel Morsing

May 21, 2012, 9:51:06 AM
to Ian Lance Taylor, Jingcheng Zhang, golang-nuts
I suspect the limiting factor here is the number of file descriptors open.

You could probably implement some sort of cache to restrict the number
of open FDs.

Regards,
Daniel Morsing
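
One common way to restrict the number of descriptors open at once, in the spirit of the suggestion above, is a counting semaphore built from a buffered channel. The following is only a minimal sketch and not code from this thread: runLimited and its parameters are hypothetical names, and the work function stands in for the real dial/write/read/close sequence.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited calls work(i) for every i in [0, total) on its own goroutine,
// but lets at most limit calls run at the same time. It returns the highest
// number of calls that were in flight at once.
func runLimited(total, limit int, work func(i int)) int {
	sem := make(chan struct{}, limit) // counting semaphore with limit slots
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
		peak     int
	)
	for i := 0; i < total; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // give the slot back

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			work(i) // e.g. dial, send the request, read the response, close

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runLimited(2000, 100, func(i int) {
		// Placeholder for the real network round trip.
	})
	fmt.Println("peak concurrent workers:", peak)
}
```

With this shape the number of goroutines created can stay large while the number of sockets open at any moment never exceeds limit.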

Jingcheng Zhang

May 21, 2012, 1:43:12 PM
to Ian Lance Taylor, golang-nuts
Hello Ian,

I wrote a simple client program like this:

package main

import "net"

const n = 2000

var ok = make(chan bool, n)

func main() {
    for i := 0; i < n; i++ {
        go serve(i)
    }
    for i := 0; i < n; i++ {
        <-ok
    }
    println("finished.")
}

var req = "GET / HTTP/1.0\r\n\r\n"

func serve(i int) {
    c, e := net.Dial("tcp", "96.44.158.100:80")
    if e != nil {
        panic(e.Error())
    }
    defer c.Close()
    c.Write([]byte(req))
    b := make([]byte, 1024)
    n, e := c.Read(b)
    if e != nil {
        panic(e.Error())
    }
    println(string(b[:n]))
}

When I run this program, the first several hundred requests succeed, but the remaining requests fail like this:

goroutine 2002 [chan receive]:
net.(*ioSrv).ExecIO(0x116431e8, 0x126514c0, 0x131866f0, 0x0, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:191 +0x4b4
net.(*netFD).Read(0x12bdc000, 0x1318b400, 0x400, 0x400, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:385 +0x1c6
net.(*TCPConn).Read(0x1316ac10, 0x1318b400, 0x400, 0x400, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/tcpsock_posix.go:87 +0xb0
main.serve()
D:/code/src/demo/main.go:29 +0x147
created by main.main
D:/code/src/demo/main.go:11 +0x3a

goroutine 2003 [runnable]:
syscall.Syscall6(0x75c44c2d, 0x5, 0x9d00, 0x12661020, 0x126e0f10, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/runtime/zsyscall_windows_386.c:97 +0x49
syscall.GetQueuedCompletionStatus(0x9d00, 0x12661020, 0x126e0f10, 0x126e0f08, 0xffffffff, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/syscall/zsyscall_windows_386.go:489 +0x76
net.(*resultSrv).Run(0x116431f0, 0x0)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:107 +0x86
created by net.startServer
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:211 +0xfc

goroutine 2004 [select]:
net.(*ioSrv).ProcessRemoteIO(0x116431e8, 0x0)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:138 +0x183
created by net.startServer
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:216 +0x17c

What is happening in my program?

Jingcheng Zhang

May 21, 2012, 1:50:16 PM
to Daniel Morsing, Ian Lance Taylor, golang-nuts
Sometimes the client completes all the requests without any problem and sometimes it does not, so file descriptors may not be the cause. Windows limits the local temporary port range (49xxx - 65535), but that is still about 15000 ports, and I use only 2000 of them. I hit this problem on Linux too, with the maximum number of file descriptors set to 32768 and the local port range set to 1025-65535. Maybe the goroutine scheduler is not stable?

Jingcheng Zhang

May 21, 2012, 1:53:12 PM
to Ian Lance Taylor, golang-nuts
Sorry, I forgot a line at the end of function serve():

    ok <- true

But the result is the same.

minux

May 21, 2012, 2:07:05 PM
to Jingcheng Zhang, Ian Lance Taylor, golang-nuts
On Tue, May 22, 2012 at 1:43 AM, Jingcheng Zhang <dio...@gmail.com> wrote:
> package main
>
> import "net"
>
> const n = 2000
>
> var ok = make(chan bool, n)
>
> func main() {
>     for i := 0; i < n; i++ {
>         go serve(i)
>     }
>     for i := 0; i < n; i++ {
>         <-ok
>     }
>     println("finished.")
> }
>
> var req = "GET / HTTP/1.0\r\n\r\n"
>
> func serve(i int) {
>     c, e := net.Dial("tcp", "96.44.158.100:80")
>     if e != nil {
>         panic(e.Error())
>     }
>     defer c.Close()
>     c.Write([]byte(req))
>     b := make([]byte, 1024)
>     n, e := c.Read(b)
>     if e != nil {
>         panic(e.Error())
>     }
>     println(string(b[:n]))
> }
>
> When I run this program, the first several hundred requests succeed, but the remaining requests fail like this:

You didn't paste the full panic trace.
My test on Linux showed this (I changed the address to localhost:8080 and ran godoc there):
panic: dial tcp 127.0.0.1:8080: too many open files 

which clearly identified the problem.

Jingcheng Zhang

May 21, 2012, 2:19:32 PM
to minux, Ian Lance Taylor, golang-nuts
Hello minux,

I do panic with e.Error() in the code, but that panic does not seem to run at all. It looks as if the goroutine panics inside the "net" library, where I cannot catch it, like this:

goroutine 1991 [runnable]:
syscall.Syscall(0x75586bdd, 0x3, 0x9c28, 0x1261dc08, 0x10, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/runtime/zsyscall_windows_386.c:74 +0x49
syscall.connect(0x9c28, 0x1261dc08, 0x10, 0x0, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/syscall/zsyscall_windows_386.go:1252 +0x5e
syscall.Connect(0x9c28, 0x116427e0, 0x1261dc00, 0x0, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/syscall/syscall_windows.go:576 +0x74
net.(*netFD).connect(0x12bb2e00, 0x116427e0, 0x1261dc00, 0x463b70, 0x3, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:273 +0x38
net.socket(0x463b70, 0x3, 0x2, 0x1, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/sock.go:56 +0x2ed
net.internetSocket(0x463b70, 0x3, 0x0, 0x0, 0x116216f0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/ipsock_posix.go:138 +0x25a
net.DialTCP(0x463b70, 0x3, 0x0, 0x12623210, 0x12623201, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/tcpsock_posix.go:243 +0x13d
net.dialAddr(0x463b70, 0x3, 0x469010, 0x10, 0x11636380, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/dial.go:102 +0x14c
net.Dial(0x463b70, 0x3, 0x469010, 0x10, 0x0, ...)
C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/dial.go:96 +0x9e
main.serve()
D:/code/src/demo/main.go:22 +0x3a

created by main.main
D:/code/src/demo/main.go:11 +0x3a

André Moraes

May 21, 2012, 2:20:21 PM
to Jingcheng Zhang, Ian Lance Taylor, golang-nuts
>
> When I run this program, the first several hundred requests succeed, but
> the remaining requests fail like this:
>
> goroutine 2002 [chan receive]:
> net.(*ioSrv).ExecIO(0x116431e8, 0x126514c0, 0x131866f0, 0x0, 0x0, ...)
> C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:191
> +0x4b4
> net.(*netFD).Read(0x12bdc000, 0x1318b400, 0x400, 0x400, 0x0, ...)
> C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/fd_windows.go:385
> +0x1c6
> net.(*TCPConn).Read(0x1316ac10, 0x1318b400, 0x400, 0x400, 0x0, ...)
> C:/Users/ADMINI~1/AppData/Local/Temp/2/bindist894290290/go/src/pkg/net/tcpsock_posix.go:87

If you take a look at this trace, you can see that the problem is I/O
and not the Go scheduler.
The scheduler is very good and can handle many more than 2000
goroutines. As Ian said, the test starts 10,000 goroutines
(5x as many as in your case).
--
André Moraes
http://amoraes.info

André Moraes

May 21, 2012, 2:31:44 PM
to Jingcheng Zhang, golang-nuts
> descriptors set to 32768, and local port range set to 1025-65535.  Maybe the
> goroutine scheduler is not stable?
>

It is stable; this code runs without any problem on my computer:

package main

import "fmt"

type data chan int

const max = 2000 * 100

func work(dataCh data) {
    dataCh <- 1
}

func releaseTheKraken(dataCh data) {
    for i := 0; i < max; i++ {
        go work(dataCh)
    }
    fmt.Printf("Kraken released\n")
}

func cageTheKraken(dataCh data) {
    for i := 0; i < max; i++ {
        // just consume from the channel
        <-dataCh
    }
    fmt.Printf("Kraken caged\n")
}

func main() {
    dataCh := make(data, 1)
    releaseTheKraken(dataCh)
    cageTheKraken(dataCh)
}

You can't run it in the playground because of restrictions in the sandbox
that runs it, but on a real computer everything is OK. It only ran into
problems when I changed

const max = 2000 * 100
to
const max = 2000 * 1000

minux

May 21, 2012, 2:32:15 PM
to Jingcheng Zhang, Ian Lance Taylor, golang-nuts
On Tue, May 22, 2012 at 2:19 AM, Jingcheng Zhang <dio...@gmail.com> wrote:
> I do panic with e.Error() in the code, but that panic does not seem to run at all. It looks as if the goroutine panics inside the "net" library, where I cannot catch it, like this:

You are still missing the top of the trace.
You can use:
go run main.go 2>out.txt

and then view out.txt to see what went wrong.

Jingcheng Zhang

May 21, 2012, 2:34:10 PM
to André Moraes, golang-nuts
I think you are right, but the fact that "I start 2000 goroutines, each of which connects to the same server and requests a service, but the program failed" does not make a good impression on newcomers to Go.

Currently I use Go to write a client program for stress testing because of its simplicity and its ability to start a lot of goroutines, but the result is not very good. I hope I can solve this problem, which may help build support for using Go as a major language in our team.

André Moraes

May 21, 2012, 2:41:29 PM
to Jingcheng Zhang, golang-nuts
On Mon, May 21, 2012 at 3:34 PM, Jingcheng Zhang <dio...@gmail.com> wrote:
> I think you are right, but the fact that "I start 2000 goroutines, each of
> which connects to the same server and requests a service, but the program
> failed" does not make a good impression on newcomers to Go.
>

Go can't overcome the limits that the OS imposes.

Also, all the traces that you sent came from a Windows machine, right?
Can you test on a Linux box and do the ulimit magic?

Also, this could help:
http://msdn.microsoft.com/en-us/library/aa560610%28v=bts.20%29.aspx
http://stackoverflow.com/questions/2185834/maximum-number-of-socket-in-java

Basically this is a known issue on Windows (and on almost any OS). You
will need to change some Windows registry values to increase the
number of open ports.
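
The "ulimit magic" refers to the per-process limit on open file descriptors, and a Go program can inspect that limit itself. The following is a minimal sketch under stated assumptions: it targets Unix-like systems only (syscall.Getrlimit is not available on Windows), and openFileLimits is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimits returns the soft and hard limits on the number of open
// file descriptors for the current process. The soft limit is the one
// that actually triggers "too many open files"; an unprivileged process
// may raise it up to the hard limit.
func openFileLimits() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	// Explicit conversions keep this compiling on platforms where the
	// Rlimit fields are signed.
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimits()
	if err != nil {
		fmt.Println("cannot read RLIMIT_NOFILE:", err)
		return
	}
	fmt.Printf("open files: soft limit %d, hard limit %d\n", soft, hard)
}
```

Printing these values at startup of a stress client makes it obvious whether the intended number of concurrent connections can fit at all.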

Jingcheng Zhang

May 21, 2012, 2:41:45 PM
to minux, Ian Lance Taylor, golang-nuts
Thanks, the backtrace logs are so large that I didn't see the first panic. It is:

panic: WSARecv tcp 192.168.1.2:65457: The specified network name is no longer available.

So maybe the local ports are exhausted?

DisposaBoy

May 21, 2012, 2:43:33 PM
to golan...@googlegroups.com, André Moraes


On Monday, May 21, 2012 7:34:10 PM UTC+1, Jingcheng Zhang wrote:
> I think you are right, but the fact that "I start 2000 goroutines, each of which connects to the same server and requests a service, but the program failed" does not make a good impression on newcomers to Go.
>
> Currently I use Go to write a client program for stress testing because of its simplicity and its ability to start a lot of goroutines, but the result is not very good. I hope I can solve this problem, which may help build support for using Go as a major language in our team.


As far as I can see, there is no problem with Go here. You're hitting a system resource limit, which means you need either to improve the design of your program or to increase the limits.
 

Jingcheng Zhang

May 21, 2012, 3:07:38 PM
to minux, golang-nuts
"The specified network name is no longer available." should be a server-side error, not a client issue.
The server is Nginx. I'll try running the client on a Linux box tomorrow.

Thanks, everyone, for your help!

Rémy Oudompheng

May 21, 2012, 4:55:29 PM
to Jingcheng Zhang, minux, Ian Lance Taylor, golang-nuts
2012/5/21 Jingcheng Zhang <dio...@gmail.com>:
> Thanks, the backtrace logs are so large that I didn't see the first panic.
> It is:
>
> panic: WSARecv tcp 192.168.1.2:65457: The specified network name is no
> longer available.
>
> So maybe the local ports are exhausted?

You would get more readable logs if you didn't use panic but
fmt.Println or log.Print to display error messages.

Rémy.

Jingcheng Zhang

May 21, 2012, 11:17:17 PM
to minux, golang-nuts
I tried the same code on Linux with the following configuration, and the result is OK even for 50000 concurrent connections between a client and a server on the same box:

/etc/sysctl.conf:

fs.file-max = 2000000
net.ipv4.ip_local_port_range = 1025 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_max_syn_backlog = 8192

/etc/security/limits.conf:

* hard nofile 1000000
* soft nofile 1000000

So I was hitting the limits, as other people said. I'll try to find the equivalent options for Windows.

Thanks, everyone :)
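
The port range setting matters because every outgoing TCP connection to the same server address and port needs its own local port, so the size of net.ipv4.ip_local_port_range caps how many such connections one client address can hold open at once. The arithmetic for the range above can be sketched as follows; portsInRange is a hypothetical helper written only for this illustration.

```go
package main

import "fmt"

// portsInRange returns how many local ports an inclusive range provides.
func portsInRange(low, high int) int {
	if high < low {
		return 0
	}
	return high - low + 1
}

func main() {
	const wanted = 50000 // concurrent connections in the test above
	available := portsInRange(1025, 65535)
	fmt.Println("local ports available:", available)
	fmt.Println("enough for", wanted, "connections:", available >= wanted)
}
```

The range 1025-65535 provides 64511 local ports, which is why 50000 simultaneous connections to one server fit once the file descriptor limits are raised as well.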

dvliman

Aug 1, 2013, 5:09:47 PM
to golan...@googlegroups.com
not meaning to resurrect a closed thread -- but I find this article very useful (can be informative for others?)

given how cheap it is to create goroutines :) we need to limit the number of file descriptors open at a time