Memory leak in io.Copy?


Sascha Matzke

Feb 10, 2012, 4:43:13 AM2/10/12
to golang-nuts
Hi guys,

when sending several thousand HTTP requests I've noticed some
irregular memory consumption.

Even after stopping the requests and manually triggering a GC, the
allocated memory is pretty high.

I ran the process with memory profiling enabled, which led to this:

https://docs.google.com/open?id=0B3CCB7CMFJ8FNGVlZDVkYjItNzc4Yy00ZmQ4LWExYTUtNzI3MjY3OTUzYmZh

Any idea what I'm doing wrong here?

Best regards,

Sascha
--
Through the darkness of future past
the magician longs to see
One chants out between two worlds
Fire walk with me.

Dave Cheney

Feb 10, 2012, 4:49:50 AM2/10/12
to Sascha Matzke, golang-nuts
Hello,

Can you please post some more details, including the source?

Are you using 386 or amd64?

How are you computing the memory usage of your process?

The runtime does not currently return memory to the operating system,
but there is a change in the works to improve the situation.
http://codereview.appspot.com/5451057/

Cheers

Dave

Sascha Matzke

Feb 10, 2012, 4:56:47 AM2/10/12
to golang-nuts
Hi,

I'm on 386. Here's the source - it's the last method in the diagram
written by me:

func (fw *FederatedWriter) sendRequest(url string, data []byte) error {
	t := time.Now()
	defer fw.federationRequestTime.AddValue(int64(time.Now().Sub(t)))
	req, _ := http.NewRequest("POST", url, bytes.NewBuffer(data))
	req.Header.Add("Content-Type", "application/json")
	resp, reqerr := fw.httpClient.Do(req)
	if reqerr != nil {
		return reqerr
	}
	resp.Body.Close()
	return nil
}

If I don't stop the process manually it dies with:

runtime: memory allocated by OS not in usable range
runtime: memory allocated by OS not in usable range
runtime: out of memory: cannot allocate 65536-byte block (536870912 in
use)
throw: out of memory

Sascha

Sascha Matzke

Feb 10, 2012, 5:17:19 AM2/10/12
to golang-nuts

Sascha Matzke

Feb 10, 2012, 5:27:41 AM2/10/12
to Dave Cheney, golang-nuts
I saw several thousand of those:

goroutine 13328 [chan receive]:
net/http.(*persistConn).readLoop(0x3660eb00, 0x35f4f3a0)
/home/sascha/Development/Go/go/src/pkg/net/http/transport.go:528 +0x1b5
created by net/http.(*Transport).getConn
/home/sascha/Development/Go/go/src/pkg/net/http/transport.go:375 +0x591

No idea why there are so many of them.

Andy Balholm

Feb 10, 2012, 11:36:40 AM2/10/12
to golan...@googlegroups.com, Dave Cheney
Apparently you have some goroutines that aren't ever terminating. Goroutines are pretty lightweight, but several thousand of them could use quite a bit of memory, especially if they have local variables pointing to large data structures, keeping them from being garbage collected.

Sascha Matzke

Feb 10, 2012, 11:46:11 AM2/10/12
to Andy Balholm, golan...@googlegroups.com, Dave Cheney
The problem is that they're not *my* goroutines but some net/http
internal persistent connection handling routines.

Is there anything I can do to make them go away?

Andy Balholm

Feb 10, 2012, 11:56:17 AM2/10/12
to golan...@googlegroups.com, Dave Cheney
I don't really know what's going on, but it's something to do with persistent HTTP connections (connections that are left open in case you want to send another request to the same server).

Apparently all your requests are going to different servers, so each one results in a persistent connection hanging around.

This sounds familiar; I had a program recently that was crashing with "too many open files". I wonder if it was the same problem.

Try http.Transport.CloseIdleConnections. I've never used it, but it sounds like it does what you need.

Russ Cox

Feb 10, 2012, 12:09:11 PM2/10/12
to Andy Balholm, golan...@googlegroups.com, Dave Cheney
One common reason you'd see many more goroutines
and HTTP connections than expected is if you were
failing to call resp.Body.Close() on an HTTP response.
If the HTTP library doesn't know you are done reading
the response, it cannot reuse the connection.

Russ

Sascha Matzke

Feb 10, 2012, 12:12:21 PM2/10/12
to Andy Balholm, golan...@googlegroups.com, Dave Cheney
Hi,

quite the contrary... All requests are going to the same server.

It's a simple forwarding (and if not able to, store in file and
forward later) HTTP logging proxy.

There's an input and an output channel, plus a buffer file which is used
when the output channel is clogged (entries stored to the file will be
sent later, when the output channel is less than 75% full).

There's a limited number of workers working on the output channel (in
the example here there were 4 worker goroutines).

Sascha Matzke

Feb 10, 2012, 12:14:18 PM2/10/12
to r...@golang.org, Andy Balholm, golan...@googlegroups.com, Dave Cheney
Hi Russ,

I call resp.Body.Close() - here's the code again:


func (fw *FederatedWriter) sendRequest(url string, data []byte) error {
	t := time.Now()
	defer fw.federationRequestTime.AddValue(int64(time.Now().Sub(t)))
	req, _ := http.NewRequest("POST", url, bytes.NewBuffer(data))
	req.Header.Add("Content-Type", "application/json")
	resp, reqerr := fw.httpClient.Do(req)
	if reqerr != nil {
		return reqerr
	}
	resp.Body.Close()
	return nil
}

Jim Robinson

Feb 10, 2012, 1:01:49 PM2/10/12
to golan...@googlegroups.com, r...@golang.org, Andy Balholm, Dave Cheney
On Friday, February 10, 2012 9:14:18 AM UTC-8, Sascha Matzke wrote:
      resp, reqerr := fw.httpClient.Do(req)
      if reqerr != nil {
              return reqerr
      }
      resp.Body.Close()


Won't this leak whenever you get an error?

Sascha Matzke

Feb 10, 2012, 1:20:05 PM2/10/12
to Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm, Dave Cheney
Maybe, if resp is non-nil in case of an error.

But all the responses I get are 204s; at least that's what the server answers to every single request.

Sascha Matzke

Feb 13, 2012, 1:32:11 AM2/13/12
to Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm, Dave Cheney
Hi all,

To make sure resp.Body.Close is called, I changed the code to:

func (fw *FederatedWriter) sendRequest(url string, data []byte) error {
	t := time.Now()
	defer fw.federationRequestTime.AddValue(int64(time.Now().Sub(t)))
	req, _ := http.NewRequest("POST", url, bytes.NewBuffer(data))
	req.Header.Add("Content-Type", "application/json")
	resp, reqerr := fw.httpClient.Do(req)
	if resp != nil && resp.Body != nil {
		defer resp.Body.Close()
	}
	if reqerr != nil {
		return reqerr
	}
	return nil
}

Unfortunately this doesn't change anything. The process still crashes after several thousand requests (and with several thousand persistConn goroutines in the stack trace).

Sascha

Dave Cheney

Feb 13, 2012, 2:35:24 AM2/13/12
to Sascha Matzke, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
Hi

I'd like to help, but it is not clear from the memory profile or the
snippet of code you provided exactly what is going on.

Are you able to provide a single file test case that exhibits the
problem? Just strip out the parts of your app that join your federated
writer to some http.Handler and wrap a main() around it. If that
reproduces the problem then we stand a sizeable chance of getting to
the bottom of the issue.

Cheers

Dave

Brad Fitzpatrick

Feb 13, 2012, 3:11:28 AM2/13/12
to Dave Cheney, golang-nuts, Jim Robinson, Andy Balholm, r...@golang.org, Sascha Matzke

Yes, that would be wonderful.  I'd love to debug if I could reproduce.

Sascha Matzke

Feb 13, 2012, 3:57:07 AM2/13/12
to Dave Cheney, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
Here's an example that exhibits the same behavior.


The interesting bit is that the extra goroutines appear only when the http.Client is shared between the sendRequest() goroutines. When I allocate a separate http.Client for each sendRequest() (before the for-loop), there are no dangling persistConn.readLoop() goroutines. The out-of-memory error occurs nevertheless.

Dave Cheney

Feb 13, 2012, 4:18:47 AM2/13/12
to Sascha Matzke, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
Your defer in sendRequest is never called because the function never exits.

Place the inner part of the loop in a separate function, including the defer, and I think this will solve the problem. 

Sascha Matzke

Feb 13, 2012, 4:25:46 AM2/13/12
to Dave Cheney, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
Hi,

the defer is just a copy & paste error on my side (remember, in the original code sendRequest is a function).

Removed "defer" - behavior is the same.

Brad Fitzpatrick

Feb 13, 2012, 4:44:24 AM2/13/12
to Sascha Matzke, Dave Cheney, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
Nice, thanks.  I haven't looked into it yet, but it looks like a nice example.

Does the problem go away if you change 204 to 200?

Sascha Matzke

Feb 13, 2012, 4:54:29 AM2/13/12
to Brad Fitzpatrick, Dave Cheney, Jim Robinson, golan...@googlegroups.com, r...@golang.org, Andy Balholm
I changed the RC to 200 and saw no difference. I then altered the response to really return some data ([]byte("Hello, World!")), and that made no difference either.

Florian Weimer

Feb 13, 2012, 9:18:28 AM2/13/12
to golan...@googlegroups.com
* Sascha Matzke:

> the defer is just an copy & paste error on my side (remember in the
> original code sendRequest is a function).
>
> Removed "defer" - behavior is the same.
>
> http://pastebin.com/UKuXbSar

It seems that the issue is that the goroutine calling putIdleConn does
not exit if putIdleConn discards the connection because the maximum is
hit.

Please try the attached patch.

diff --git a/src/pkg/net/http/transport.go b/src/pkg/net/http/transport.go
--- a/src/pkg/net/http/transport.go
+++ b/src/pkg/net/http/transport.go
@@ -235,15 +235,15 @@
return ""
}

-func (t *Transport) putIdleConn(pconn *persistConn) {
+func (t *Transport) putIdleConn(pconn *persistConn) bool {
t.lk.Lock()
defer t.lk.Unlock()
if t.DisableKeepAlives || t.MaxIdleConnsPerHost < 0 {
pconn.close()
- return
+ return false
}
if pconn.isBroken() {
- return
+ return false
}
key := pconn.cacheKey
max := t.MaxIdleConnsPerHost
@@ -252,9 +252,10 @@
}
if len(t.idleConn[key]) >= max {
pconn.close()
- return
+ return false
}
t.idleConn[key] = append(t.idleConn[key], pconn)
+ return true
}

func (t *Transport) getIdleConn(cm *connectMethod) (pconn *persistConn) {
@@ -565,7 +566,9 @@
lastbody = resp.Body
waitForBodyRead = make(chan bool)
resp.Body.(*bodyEOFSignal).fn = func() {
- pc.t.putIdleConn(pc)
+ if !pc.t.putIdleConn(pc){
+ return
+ }
waitForBodyRead <- true
}
} else {
@@ -578,7 +581,9 @@
// read it (even though it'll just be 0, EOF).
lastbody = nil

- pc.t.putIdleConn(pc)
+ if !pc.t.putIdleConn(pc) {
+ return
+ }
}
}

Brad Fitzpatrick

Feb 13, 2012, 7:09:15 PM2/13/12
to Florian Weimer, golan...@googlegroups.com
Sascha, Florian, thanks!

That patch is almost right.  But it definitely helped me see the issue.

Fix is out for review here, with a test: http://codereview.appspot.com/5656046/

- Brad

darre...@gmail.com

Jun 2, 2015, 8:34:22 AM6/2/15
to golan...@googlegroups.com
I have a simple HTTP log server in Go (1.4.2) that serves about 4,000 requests per second. The HTTP connections are all short-lived, i.e. Connection: close is sent for every request.

It runs OK after startup, using up to 80 MB or so of resident memory for several days. But over time, after running continuously for a month or so, it eats up 1 GB of resident memory:

Is io.Copy still leaking? (An SVG from pprof is attached.) Any ideas about what's going on?

(pprof) top 
486.21MB of 500.88MB total (97.07%)
Dropped 274 nodes (cum <= 2.50MB)
Showing top 10 nodes out of 30 (cum >= 4MB)
      flat  flat%   sum%        cum   cum%
  446.16MB 89.08% 89.08%   446.66MB 89.18%  io.Copy
      11MB  2.20% 91.27%       11MB  2.20%  runtime.deferproc_m
    8.50MB  1.70% 92.97%     8.50MB  1.70%  newproc_m
    5.02MB  1.00% 93.97%     5.02MB  1.00%  bufio.NewReaderSize
    4.52MB   0.9% 94.87%     4.52MB   0.9%  net/http.newBufioWriterSize
       4MB   0.8% 95.67%     5.50MB  1.10%  main.NewClientInfo
    3.01MB   0.6% 96.27%     3.01MB   0.6%  main.NewLogFile
    2.50MB   0.5% 96.77%    12.04MB  2.40%  net/http.(*Server).newConn
       1MB   0.2% 96.97%        3MB   0.6%  net.(*netFD).accept
    0.50MB   0.1% 97.07%        4MB   0.8%  net/http.ReadRequest


pprof001.svg

Darren Hoo

Jun 4, 2015, 10:03:43 PM6/4/15
to golan...@googlegroups.com
After I removed these lines of code, io.Copy no longer seems to be leaking.

Did I misuse closeNotify?

func (zd *ZoneDispatcher) RegisterNotify(close <-chan bool) {
	atomic.AddUint32(&zd.ActiveConn, 1)
	startAt := time.Now()
	<-close
	finishAt := time.Now()
	diff := finishAt.Sub(startAt)

	atomic.AddUint32(&zd.ActiveConn, ^uint32(0))
	atomic.AddUint64(&zd.FinishedConn, 1)

	i := uint32(diff / 1e9)

	if i >= 5 {
		fmt.Fprintf(os.Stderr, "Time: %v\n", i)
	}
}

func (zd *ZoneDispatcher) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	atomic.AddUint64(&zd.FinishedConn, 1)
	notify := w.(http.CloseNotifier).CloseNotify()
	go zd.RegisterNotify(notify)

	w.Header().Set("Server", ServerVersion)
	...
}

Maybe I should use http.Server.ConnState to do connection tracking instead of using CloseNotify.

Matthew Zimmerman

Jun 4, 2015, 10:22:47 PM6/4/15
to Darren Hoo, golan...@googlegroups.com
In your profile, run "list io.Copy" to see where/what the allocation is, then follow the rabbit and see what isn't releasing it.

I've had leaks like this before because I'd kept a reference to the underlying slice. Even slicing off 20 bytes of 10k will keep all 10k in memory until you release the 20-byte slice.

Conversion to string and back to []byte is an easy way of making a copy; not sure how performant it is but it's never come up in any of my profiles when I've used it.
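The retention problem Matthew describes, and the copy-based fix, in a small sketch:

```go
package main

import "fmt"

func main() {
	big := make([]byte, 10*1024) // 10 KB backing array

	// A 20-byte reslice still shares big's backing array, so the
	// whole 10 KB stays reachable as long as 'pinned' is.
	pinned := big[:20]

	// Copying into a fresh slice breaks the reference, letting the
	// GC reclaim the large array once 'big' goes out of scope.
	freed := make([]byte, 20)
	copy(freed, big[:20])

	fmt.Println(cap(pinned), cap(freed)) // prints "10240 20"
}
```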
