I've run into a problem writing a Go program that makes a large number of concurrent HTTP requests (about 20k). My plan was to use one goroutine per request and let the Go runtime multiplex them across a small pool of OS threads (roughly GOMAXPROCS of them). However, the program keeps crashing after a few minutes with:

runtime: program exceeds 10000-thread limit
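The core of the program is essentially this (a simplified sketch that ignores the custom dialing in my fetch package, with a placeholder slice standing in for my real URL source):

package main

import (
    "io"
    "io/ioutil"
    "net/http"
    "sync"
)

func main() {
    urls := []string{ /* ~20k URLs in the real program */ }

    var wg sync.WaitGroup
    for _, u := range urls {
        wg.Add(1)
        go func(u string) {
            defer wg.Done()
            resp, err := http.Get(u)
            if err != nil {
                return // the real code records the error
            }
            // Drain and close the body so the connection can be reused.
            io.Copy(ioutil.Discard, resp.Body)
            resp.Body.Close()
        }(u)
    }
    wg.Wait()
}

Aggregating the goroutine states from a dump taken around the crash (threads1-500.txt):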
→ grep "^goroutine " threads1-500.txt | perl -npe 's/\d+/n/' | sort | uniq -c
  412 goroutine n [IO wait, 1 minutes]:
  162 goroutine n [IO wait, 2 minutes]:
  111 goroutine n [IO wait, 3 minutes]:
   24 goroutine n [IO wait, 4 minutes]:
 6570 goroutine n [IO wait]:
  133 goroutine n [chan receive, 1 minutes]:
   54 goroutine n [chan receive, 2 minutes]:
   52 goroutine n [chan receive, 3 minutes]:
   12 goroutine n [chan receive, 4 minutes]:
 4894 goroutine n [chan receive]:
 3906 goroutine n [chan send]:
  484 goroutine n [runnable]:
 1053 goroutine n [select, 1 minutes]:
  297 goroutine n [select, 2 minutes]:
  156 goroutine n [select, 3 minutes]:
   25 goroutine n [select, 4 minutes]:
17562 goroutine n [select]:
    2 goroutine n [semacquire]:
  422 goroutine n [sleep]:
    1 goroutine n [syscall, 6 minutes, locked to thread]:
    1 goroutine n [syscall, 6 minutes]:
   12 goroutine n [syscall]:
There were only 14 goroutines in [syscall] state, so I looked instead at the ones in [runnable] that had Syscall in their call stacks. They were almost all network operations from my HTTP requests: syscall.connect, syscall.write, or syscall.read.
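I pulled those out with something like this (exact invocation approximate):

→ grep -A 3 "^goroutine .*\[runnable\]" threads1-500.txt | grep "syscall\."

Here's one example: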
goroutine 700108 [runnable]:
syscall.Syscall(0x62, 0x3829, 0xc36758ac4c, 0x10, 0xffffffffffffffff, 0x0, 0x24)
    /usr/local/go/src/syscall/asm_darwin_amd64.s:20 +0x5
syscall.connect(0x3829, 0xc36758ac4c, 0xc300000010, 0x0, 0x0)
    /usr/local/go/src/syscall/zsyscall_darwin_amd64.go:64 +0x56
syscall.Connect(0x3829, 0x4b93ac8, 0xc36758ac40, 0x0, 0x0)
    /usr/local/go/src/syscall/syscall_unix.go:198 +0x7f
net.(*netFD).connect(0xc3675cde30, 0x0, 0x0, 0x4b93ac8, 0xc36758ac40, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/fd_unix.go:75 +0x6c
net.(*netFD).dial(0xc3675cde30, 0x4b96d58, 0x0, 0x4b96d58, 0xc31de53e00, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/sock_posix.go:139 +0x37a
net.socket(0x450f340, 0x3, 0x2, 0x1, 0x0, 0xc31de53e00, 0x4b96d58, 0x0, 0x4b96d58, 0xc31de53e00, ...)
    /usr/local/go/src/net/sock_posix.go:91 +0x422
net.internetSocket(0x450f340, 0x3, 0x4b96d58, 0x0, 0x4b96d58, 0xc31de53e00, 0x0, 0x0, 0x0, 0x1, ...)
    /usr/local/go/src/net/ipsock_posix.go:137 +0x148
net.dialTCP(0x450f340, 0x3, 0x0, 0xc31de53e00, 0x0, 0x0, 0x0, 0x200000003, 0x0, 0x0)
    /usr/local/go/src/net/tcpsock_posix.go:156 +0x125
net.DialTCP(0x450f340, 0x3, 0x0, 0xc31de53e00, 0x4015000, 0x0, 0x0)
    /usr/local/go/src/net/tcpsock_posix.go:152 +0x25c
fetch.dialSingle(0x450f340, 0x3, 0xc36758abe0, 0x15, 0x0, 0x0, 0x4b96cc8, 0xc31de53e00, 0xecc4a675a, 0xc20cdfbbee, ...)
    .../src/fetch/dial.go:41 +0x200
fetch.func·001(0xecc4a675a, 0xc20cdfbbee, 0x47a0160, 0x0, 0x0, 0x0, 0x0)
    .../src/fetch/dial.go:17 +0xbd
fetch.dial(0x450f340, 0x3, 0x4b96cc8, 0xc31de53e00, 0xc31c79db60, 0xecc4a675a, 0xcdfbbee, 0x47a0160, 0x0, 0x0, ...)
    .../src/fetch/dial.go:31 +0x6f
fetch.func·002(0x450f340, 0x3, 0xc36758abe0, 0x15, 0x0, 0x0, 0x0, 0x0)
    .../src/fetch/dial.go:19 +0x38d
net/http.(*Transport).dial(0xc2c29b8630, 0x450f340, 0x3, 0xc36758abe0, 0x15, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:479 +0x84
net/http.(*Transport).dialConn(0xc2c29b8630, 0x0, 0xc285048430, 0x4, 0xc36758abe0, 0x15, 0xc21c2faf00, 0x0, 0x0)
    /usr/local/go/src/net/http/transport.go:564 +0x1678
net/http.func·019()
    /usr/local/go/src/net/http/transport.go:520 +0x42
created by net/http.(*Transport).getConn
    /usr/local/go/src/net/http/transport.go:522 +0x335
So what I'm wondering is: how do you make a large number of concurrent HTTP requests without ending up with a thread per request? Is this not a common scenario for Go? Do I need to limit the concurrency of dialConn myself with a custom Transport? Won't that hurt overall throughput? And are there plans to address this in a future release of Go?
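For concreteness, by "limit it myself" I mean something along these lines: a sketch that caps concurrent dials with a counting semaphore (connLimit = 512 is an arbitrary number I'd have to tune):

package main

import (
    "net"
    "net/http"
    "time"
)

// connLimit is a guess; I'd have to tune it.
const connLimit = 512

// Buffered channel used as a counting semaphore to cap concurrent dials.
var dialSem = make(chan struct{}, connLimit)

var client = &http.Client{
    Transport: &http.Transport{
        Dial: func(network, addr string) (net.Conn, error) {
            dialSem <- struct{}{}        // acquire a slot
            defer func() { <-dialSem }() // release when the dial returns
            return net.DialTimeout(network, addr, 30*time.Second)
        },
    },
}

func main() {
    resp, err := client.Get("http://example.com/") // placeholder URL
    if err != nil {
        return
    }
    resp.Body.Close()
}

That only bounds the dials rather than in-flight requests, but the dials are where the threads seem to pile up.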
Thanks,
Jason