HTTP transport keeps leaking goroutines


ibi...@gmail.com

Mar 25, 2014, 9:01:46 AM
to golan...@googlegroups.com
I recently wrote a proxy checker. It sends hundreds of thousands of requests to a URL to check whether proxies are available. Here is part of the checker code, the checkProxy function:

func checkProxy(proxy string, timeout int) bool {
	proxyUrl, err := url.Parse("http://" + proxy)
	if err != nil {
		return false
	}

	httpClient := &http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyURL(proxyUrl),
			Dial: func(netw, addr string) (net.Conn, error) {
				deadline := time.Now().Add(time.Duration(timeout) * time.Second)
				c, err := net.DialTimeout(netw, addr, time.Second*time.Duration(timeout))
				if err != nil {
					return nil, err
				}
				c.SetDeadline(deadline)
				return c, nil
			},
			ResponseHeaderTimeout: time.Duration(timeout) * time.Second,
			DisableKeepAlives:     true,
		},
	}

	req, err := http.NewRequest("GET", "http://www.urltocheck.com", nil)
	if err != nil {
		return false
	}

	req.Close = true
	resp, err := httpClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()

	body, _ := ioutil.ReadAll(resp.Body)
	return strings.Contains(string(body), "xxx")
}

After importing "net/http/pprof" and running for an hour or two, I could see the writeLoop and readLoop goroutines keep growing. The longer I run the checker, the more there are (it can exceed ten thousand):
8054 @ 0x41a716 0x4080d4 0x407d22 0x488021 0x41a8e0
#	0x4080d4	selectgo+0x384				/usr/local/go/src/pkg/runtime/chan.c:996
#	0x407d22	runtime.selectgo+0x12			/usr/local/go/src/pkg/runtime/chan.c:840
#	0x488021	net/http.(*persistConn).writeLoop+0x271	/usr/local/go/src/pkg/net/http/transport.go:791

8030 @ 0x41a716 0x4072d2 0x407718 0x4879cf 0x41a8e0
#	0x4879cf	net/http.(*persistConn).readLoop+0x68f	/usr/local/go/src/pkg/net/http/transport.go:778

And memory leaks badly. I found a bug reported two years ago, https://code.google.com/p/go/issues/detail?id=4531 (net/http: Transport leaks goroutines when request.ContentLength is explicitly short), which has been marked fixed, but I think the transport is still leaking somewhere. Can anyone help? Thanks a lot!

 

Billy Shea

Mar 25, 2014, 9:46:04 PM
to golan...@googlegroups.com
My Go version is go1.2.1 linux/amd64, on a CentOS 6.3 64-bit machine.

James Bardin

Mar 25, 2014, 11:00:01 PM
to golan...@googlegroups.com
You do realize that the number on the left is the goroutine ID, and not the number of currently running goroutines? (since your stack trace does only show 2)

Billy Shea

Mar 25, 2014, 11:05:41 PM
to golan...@googlegroups.com
It's not the goroutine ID, it's the number of goroutines. I'm sure about that: I create about 10000 goroutines, but the stack trace shows a total of 36000+ goroutines. The profile URL is debug/pprof/goroutine?debug=1; if I switch to debug/pprof/goroutine?debug=2, I can see the full goroutine stack dump, but it's too long to list...
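For readers who want to check the counts themselves: the same goroutine profile can be written from inside the program with runtime/pprof. This is only a minimal sketch; goroutineProfile is a hypothetical helper name, not something from the posts above. With debug=1 the output groups goroutines that share a stack and prefixes each group with its count; with debug=2 it prints one full stack per goroutine.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// goroutineProfile (hypothetical helper) returns the goroutine profile as text.
// debug=1 groups identical stacks and prefixes each group with its count;
// debug=2 prints every goroutine's stack separately.
func goroutineProfile(debug int) (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, debug); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// runtime.NumGoroutine gives the total without any stack information.
	fmt.Println("goroutines:", runtime.NumGoroutine())

	out, err := goroutineProfile(1)
	if err != nil {
		fmt.Println("profile error:", err)
		return
	}
	fmt.Print(out)
}
```

Logging runtime.NumGoroutine() periodically is a cheap way to see whether the total keeps climbing between batches.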

James Bardin

Mar 25, 2014, 11:13:09 PM
to golan...@googlegroups.com
Oops, that may be the case (hard to confirm on mobile :) )

You might want to try tip just to be sure, there's been some work done around http client and keepalive connections.

Billy Shea

Mar 25, 2014, 11:18:45 PM
to golan...@googlegroups.com
Yes, I've set DisableKeepAlives to true to disable keep-alive connections, and set req.Close = true, but it doesn't help; the goroutines keep growing. I'll try tip.

On Wednesday, March 26, 2014 at 11:13:09 AM UTC+8, James Bardin wrote:

James Bardin

Mar 26, 2014, 9:25:13 AM
to golan...@googlegroups.com

The only way I can imagine this leaking is that you occasionally get a short response, with a missing or partial body, and the remote server isn't closing the connection (I've seen this more often than you'd think).

Try setting an overall timeout on the entire process. If you're trying tip, in go1.3 the http.Client has a Timeout parameter which can handle this. With the current release you can try http://godoc.org/github.com/mreiferson/go-httpclient, which implements basically the same thing, but in Transport.RequestTimeout.
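A rough sketch of the overall-timeout idea using the http.Client Timeout field (Go 1.3 and later, as mentioned above). The helper names fetchWithTimeout and demoStalledServer are made up for illustration, and the local test server that sends headers and then stalls is an assumption standing in for a misbehaving proxy or remote server.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchWithTimeout (hypothetical helper) performs a GET whose total duration,
// including reading the body, is bounded by http.Client.Timeout.
func fetchWithTimeout(url string, timeout time.Duration) ([]byte, error) {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// demoStalledServer starts a local server that sends headers, never sends the
// promised body, and keeps the connection open. It reports whether the client
// gave up with an error instead of blocking.
func demoStalledServer() bool {
	release := make(chan struct{})
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Length", "100")
		w.WriteHeader(http.StatusOK)
		w.(http.Flusher).Flush()
		<-release // stall until the demo is over
	}))
	defer srv.Close()    // runs second: waits for the handler to return
	defer close(release) // runs first: lets the stalled handler return

	_, err := fetchWithTimeout(srv.URL, 200*time.Millisecond)
	return err != nil
}

func main() {
	fmt.Println("timed out:", demoStalledServer())
}
```

Unlike ResponseHeaderTimeout, this deadline also covers the time spent reading the body, which is the case described in the previous paragraph.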

Billy Shea

Mar 26, 2014, 11:25:24 AM
to golan...@googlegroups.com
Thanks a lot. I've tried tip; the leak is slower, but it's still leaking. I'll try the package you recommended. Thanks for your help.

On Wednesday, March 26, 2014 at 9:25:13 PM UTC+8, James Bardin wrote:

Julian

Apr 5, 2014, 5:11:21 AM
to golan...@googlegroups.com
Hi,

I'm having the exact same problem. I have a program that creates a great many HTTP requests using proxies (via Transport.Proxy).
After I stop a batch of 1000 requests and take a stack trace, I have a lot of goroutines like these, which never finish:

goroutine 3317 [chan receive]:
net/http.(*persistConn).readLoop(0xc216c06680)
	/usr/local/go/src/pkg/net/http/transport.go:778 +0x68f
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 2908 [select]:
net/http.(*persistConn).writeLoop(0xc216a68880)
	/usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 3318 [select]:
net/http.(*persistConn).writeLoop(0xc216c06680)
	/usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 3513 [select]:
net/http.(*persistConn).writeLoop(0xc21627d180)
	/usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 2205 [chan receive]:
net/http.(*persistConn).readLoop(0xc213a08b80)
	/usr/local/go/src/pkg/net/http/transport.go:778 +0x68f
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 2206 [select]:
net/http.(*persistConn).writeLoop(0xc213a08b80)
	/usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

goroutine 3391 [chan receive]:
net/http.(*persistConn).readLoop(0xc216a3cc00)
	/usr/local/go/src/pkg/net/http/transport.go:778 +0x68f
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:528 +0x607

goroutine 3392 [select]:
net/http.(*persistConn).writeLoop(0xc216a3cc00)
	/usr/local/go/src/pkg/net/http/transport.go:791 +0x271
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:529 +0x61e

I'm using go 1.2.1 on ubuntu/amd64 and almost the same code as the OP: disable keep alive, custom dial func with dial timeout and conn deadline.
ioutil.ReadAll would also hang sometimes but I found a workaround using io.Copy and a timer that runs resp.Body.Close() after a specified timeout.
Now I just have leaky goroutines from the http/transport.go read/write loops.
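The io.Copy-plus-timer workaround described above might look roughly like the sketch below. This is a guess at its shape, not the actual code: readWithDeadline and demoStuckBody are hypothetical names, and an io.Pipe stands in for a response body that stops delivering data.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"time"
)

// readWithDeadline (hypothetical helper) copies body into memory, but a timer
// closes the body if the copy takes longer than timeout, which unblocks a Read
// that would otherwise hang.
func readWithDeadline(body io.ReadCloser, timeout time.Duration) ([]byte, error) {
	timer := time.AfterFunc(timeout, func() { body.Close() })
	defer timer.Stop()
	defer body.Close()

	var buf bytes.Buffer
	_, err := io.Copy(&buf, body)
	return buf.Bytes(), err
}

// demoStuckBody simulates a peer that sends a few bytes and then goes silent
// without closing the connection.
func demoStuckBody() error {
	pr, pw := io.Pipe()
	defer pw.Close()
	go pw.Write([]byte("partial")) // the writer never sends anything else

	_, err := readWithDeadline(pr, 100*time.Millisecond)
	return err
}

func main() {
	fmt.Println("read error after forced close:", demoStuckBody())
}
```

With a real *http.Response the same helper would be called as readWithDeadline(resp.Body, timeout); whether closing the body from another goroutine is enough to release the transport's own goroutines is exactly what this thread is about, so treat it as a mitigation rather than a fix.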

ucg...@gmail.com

Apr 5, 2014, 7:41:39 AM
to golan...@googlegroups.com
There is a bug in "net\http\client.go".
I created an issue: http://code.google.com/p/go/issues/detail?id=7620.
Just modify "net\http\client.go" (lines marked with + are added):
------------------------------------------
if err != nil {
	if resp != nil {
+		if resp.Body != nil {
+			resp.Body.Close()
+		}
		log.Printf("RoundTripper returned a response & error; ignoring response")
	}
	return nil, err
}
------------------------------------------
Why has nobody dealt with the issue since I reported it weeks ago?


On Tuesday, March 25, 2014 at 9:01:46 PM UTC+8, Billy Shea wrote:

Billy Shea

Apr 5, 2014, 9:07:55 AM
to golan...@googlegroups.com
I've tried this client, http://godoc.org/github.com/mreiferson/go-httpclient, and the leak continues. Here's my new transport; I've set as many timeouts as I could :(

transport := &httpclient.Transport{
	Proxy: http.ProxyURL(proxyUrl),
	Dial: func(netw, addr string) (net.Conn, error) {
		deadline := time.Now().Add(time.Duration(timeout) * time.Second)
		c, err := net.DialTimeout(netw, addr, time.Second*time.Duration(timeout))
		if err != nil {
			return nil, err
		}
		c.SetDeadline(deadline)
		return c, nil
	},
	ResponseHeaderTimeout: time.Duration(timeout) * time.Second,
	DisableKeepAlives:     true,
	ConnectTimeout:        time.Duration(timeout) * time.Second,
	RequestTimeout:        time.Duration(timeout) * time.Second * 2,
	ReadWriteTimeout:      time.Duration(timeout) * time.Second,
}

But my proxy checker keeps leaking goroutines as before; this has had me stuck for over a month. Does the HTTP transport have some request rate limit? I can see thousands of goroutines in net.runtime_pollWait like this:

4421 @ 0x416549 0x429630 0x428e9d 0x44ebd6 0x44eca2 0x451826 0x460cc9 0x53e48e 0x53e389 0x499646 0x4167e0
#	0x429630	netpollblock+0x130			/usr/local/go/src/pkg/runtime/netpoll.goc:349
#	0x428e9d	net.runtime_pollWait+0x5d		/usr/local/go/src/pkg/runtime/netpoll.goc:146
#	0x44ebd6	net.(*pollDesc).Wait+0x46		/usr/local/go/src/pkg/net/fd_poll_runtime.go:84
#	0x44eca2	net.(*pollDesc).WaitWrite+0x42		/usr/local/go/src/pkg/net/fd_poll_runtime.go:93
#	0x451826	net.(*netFD).Write+0x466		/usr/local/go/src/pkg/net/fd_unix.go:325
#	0x460cc9	net.(*conn).Write+0xe9			/usr/local/go/src/pkg/net/net.go:130
#	0x53e48e	bufio.(*Writer).flush+0xde		/usr/local/go/src/pkg/bufio/bufio.go:501
#	0x53e389	bufio.(*Writer).Flush+0x39		/usr/local/go/src/pkg/bufio/bufio.go:490
#	0x499646	net/http.(*persistConn).writeLoop+0x206	/usr/local/go/src/pkg/net/http/transport.go:890

4445 @ 0x416549 0x429630 0x428e9d 0x44ebd6 0x44ec42 0x4504d2 0x460ba9 0x49a7ba 0x4a38df 0x53c573 0x53c75d 0x498a16 0x4167e0
#	0x429630	netpollblock+0x130			/usr/local/go/src/pkg/runtime/netpoll.goc:349
#	0x428e9d	net.runtime_pollWait+0x5d		/usr/local/go/src/pkg/runtime/netpoll.goc:146
#	0x44ebd6	net.(*pollDesc).Wait+0x46		/usr/local/go/src/pkg/net/fd_poll_runtime.go:84
#	0x44ec42	net.(*pollDesc).WaitRead+0x42		/usr/local/go/src/pkg/net/fd_poll_runtime.go:89
#	0x4504d2	net.(*netFD).Read+0x332			/usr/local/go/src/pkg/net/fd_unix.go:232
#	0x460ba9	net.(*conn).Read+0xe9			/usr/local/go/src/pkg/net/net.go:122
#	0x49a7ba	net/http.noteEOFReader.Read+0x7a	/usr/local/go/src/pkg/net/http/transport.go:1148
#	0x4a38df	net/http.(*noteEOFReader).Read+0xdf	/usr/local/go/src/pkg/net/http/chunked.go:1
#	0x53c573	bufio.(*Reader).fill+0x143		/usr/local/go/src/pkg/bufio/bufio.go:92
#	0x53c75d	bufio.(*Reader).Peek+0x11d		/usr/local/go/src/pkg/bufio/bufio.go:120
#	0x498a16	net/http.(*persistConn).readLoop+0xd6	/usr/local/go/src/pkg/net/http/transport.go:770

Am I creating more goroutines than the net package can handle, which breaks the transport? I currently have 8500 goroutines that keep sending HTTP requests via proxies with a producer/consumer concurrency pattern:
go addJobs(jobs, results)
for i := 0; i < workers; i++ {
	go doJobs(jobs, timeout)
}
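For anyone trying to reproduce this, a self-contained version of that producer/consumer pattern could look like the sketch below. It is an assumption about the overall shape only: runPool, the result type, and the check callback are hypothetical, and the real addJobs/doJobs are not shown in this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool (hypothetical) feeds every job to a fixed number of worker
// goroutines and collects one boolean result per job.
func runPool(jobs []string, workers int, check func(string) bool) map[string]bool {
	type result struct {
		job string
		ok  bool
	}
	jobCh := make(chan string)
	resCh := make(chan result)

	// Consumers: a fixed number of workers, so the goroutine count is bounded.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobCh {
				resCh <- result{job: j, ok: check(j)}
			}
		}()
	}

	// Producer: send all jobs, then close the results once every worker is done.
	go func() {
		for _, j := range jobs {
			jobCh <- j
		}
		close(jobCh)
		wg.Wait()
		close(resCh)
	}()

	out := make(map[string]bool, len(jobs))
	for r := range resCh {
		out[r.job] = r.ok
	}
	return out
}

func main() {
	// The check function here is a stand-in for something like checkProxy.
	res := runPool([]string{"a", "bb", "ccc"}, 2, func(s string) bool { return len(s) > 1 })
	fmt.Println(len(res), "results")
}
```

With this structure the pool itself never holds more than workers+1 goroutines, so any growth beyond that in the profile has to come from somewhere else, such as the transport's readLoop/writeLoop pairs.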

Does anyone else have the same problem? I need your help!

-Billy


On Saturday, April 5, 2014 at 5:11:21 PM UTC+8, Julian wrote:

Robert Melton

Apr 5, 2014, 9:30:01 AM
to Billy Shea, golang-nuts
In the testing around this issue, are you controlling both sides?
I.e., a known well-behaved server you can interrogate for information?
I would be curious whether you wrote both a server in Go and the client
and confirmed the behavior. Then you could also give people a fully
working test case to play with and reproduce the errors.



--
Robert Melton | http://robertmelton.com

Carlos Castillo

Apr 6, 2014, 2:43:23 AM
to golan...@googlegroups.com, ucg...@gmail.com

Billy Shea

Apr 6, 2014, 8:42:26 PM
to golan...@googlegroups.com, ucg...@gmail.com
This patch solved my problem exactly: https://codereview.appspot.com/84850043. The goroutines have stopped leaking! Cheers, and special thanks to ucgggg and snaury.

On Saturday, April 5, 2014 at 7:41:39 PM UTC+8, ucg...@gmail.com wrote:

Julian

Apr 7, 2014, 10:55:52 AM
to golan...@googlegroups.com, ucg...@gmail.com
I didn't try your patch but after looking at it I disabled compression in the http.Transport I'm using and the goroutines stopped leaking, so I suppose you're right, that is where the problem lies.
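As a sketch of that workaround: http.Transport has a DisableCompression field that stops the transport from asking for gzip on its own. The helper names newPlainTransport and observedAcceptEncoding below are made up for this example, and the local test server exists only to show which Accept-Encoding header the request carries.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newPlainTransport (hypothetical helper) returns a transport that neither
// requests gzip automatically nor reuses connections.
func newPlainTransport() *http.Transport {
	return &http.Transport{
		DisableCompression: true,
		DisableKeepAlives:  true,
	}
}

// observedAcceptEncoding sends one request through tr to a local test server
// and returns the Accept-Encoding header the server received.
func observedAcceptEncoding(tr *http.Transport) (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("Accept-Encoding")
	}))
	defer srv.Close()

	resp, err := (&http.Client{Transport: tr}).Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	got, err := observedAcceptEncoding(newPlainTransport())
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("Accept-Encoding sent: %q\n", got)
}
```

Responses then arrive uncompressed unless the caller sets Accept-Encoding explicitly, so this trades bandwidth for avoiding the transport's gzip handling.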

Tom Maiaroto

Sep 23, 2014, 2:04:39 PM
to golan...@googlegroups.com, ucg...@gmail.com
Yes. This.
This is plaguing me BAD right now.

goroutine 869 [select, 2 minutes]:
net/http.(*persistConn).readLoop(0xc208956fd0)
	/usr/local/go/src/pkg/net/http/transport.go:868 +0x829
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:600 +0x93f

goroutine 778 [select, 2 minutes]:
net/http.(*persistConn).writeLoop(0xc208956bb0)
	/usr/local/go/src/pkg/net/http/transport.go:885 +0x38f
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:601 +0x957

goroutine 515 [select, 2 minutes]:
net/http.(*persistConn).readLoop(0xc2086878c0)
	/usr/local/go/src/pkg/net/http/transport.go:868 +0x829
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:600 +0x93f

goroutine 664 [select, 2 minutes]:
net/http.(*persistConn).readLoop(0xc208957760)
	/usr/local/go/src/pkg/net/http/transport.go:868 +0x829
created by net/http.(*Transport).dialConn
	/usr/local/go/src/pkg/net/http/transport.go:600 +0x93f

They'll sit like that forever. If I refresh the pprof web page, it'll just say 24 minutes, etc. The count on the main page never decreases either. Eventually, since I keep making more requests on a schedule, it just crashes the system. How long that takes varies.

I've been trying to close things all over. I tried a client wrapper that lets me set timeouts, etc. I just can't get it to work and am a little frustrated by it to be honest.

I'm on version 1.3.1 ... Is there something else I need still?
Is this referenced patch (5 months old now) not in the stable build?

James Bardin

Sep 23, 2014, 2:20:41 PM
to Tom Maiaroto, golan...@googlegroups.com
The gzip related bug was fixed in go1.3, so you're likely leaking connections in some other way.


Tom Maiaroto

Sep 23, 2014, 2:26:09 PM
to golan...@googlegroups.com, tom.ma...@gmail.com
Hmm, drat.
This is a hard one to track down. I've tried various http client wrappers and the native client on my own. Nothing seems to work.
Is there anything obvious you can suggest I look into?
Do you know of a solid package to use when looping and making many requests (hundreds to thousands) within a goroutine?
Thanks.

Tom Maiaroto

Sep 23, 2014, 3:21:05 PM
to golan...@googlegroups.com, tom.ma...@gmail.com
This is what I've been using:
https://gist.github.com/seantalts/11266762

Could it be a particular URL? I'm calling the Facebook API.
Is there anything a server response could do to create these issues?

James Bardin

Sep 23, 2014, 3:45:36 PM
to Tom Maiaroto, golan...@googlegroups.com

http.Client has a Timeout field, so you don't need your TimeoutTransport. In your case though, you've basically re-implemented Transport.ResponseHeaderTimeout, since you return before Body is read at all.

No idea what your actual code looks like, but your test stops before calling resp.Body.Close if there's a read error. If you used that same logic with an early return it would be a problem.
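To make the early-return point concrete, here is a small sketch in which the body is closed on every path, including when the read fails. The names fetch and serveAndFetch are hypothetical, and the local test server that declares a longer Content-Length than it actually sends is an assumption used to provoke a read error.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// fetch (hypothetical helper) reads the whole body. The deferred Close runs on
// every return path after a response was received, including the early return
// on a read error.
func fetch(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err // no response, so there is no body to close
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err // early return; the deferred Close still runs
	}
	return body, nil
}

// serveAndFetch starts a local server that declares `declared` bytes but
// writes only payload, then fetches from it.
func serveAndFetch(declared int, payload string) ([]byte, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Length", strconv.Itoa(declared))
		io.WriteString(w, payload)
	}))
	defer srv.Close()
	return fetch(&http.Client{}, srv.URL)
}

func main() {
	body, err := serveAndFetch(5, "hello")
	fmt.Printf("complete body: %q, err: %v\n", body, err)

	_, err = serveAndFetch(10, "hi") // server sends fewer bytes than declared
	fmt.Println("short body error:", err)
}
```

The important part is that the Close is deferred immediately after the error check on the response, before any code that might return early.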

Tom Maiaroto

Sep 23, 2014, 7:48:50 PM
to golan...@googlegroups.com, tom.ma...@gmail.com
K. Thanks!
I had to get medieval on this thing and track it down. Otherwise I would have been trying over and over something that was fixed =) So thanks for pointing that out.