Advice re memory allocation leak

Jim Robinson

May 30, 2016, 2:42:52 PM5/30/16
to golang-nuts
I'm baffled by what appears to be a small memory leak in my application.
I'm hoping someone here can give me some advice beyond trying to
reproduce the problem with a smaller set of code.  I'll do that if I have to,
but I'm hoping I'm just doing something obviously wrong in the code I
discuss below.

I've got a little library that was inspired by (stolen from) Rob's demo
on lexing a few years back:


I use it to make it easier to write applications that parse fixed-format records.
As part of the logic there's an Emit method that sends the current token over
a channel.  It converts the token to a string and sends it back within a struct
Item that captures the token type, its position, and a string value:


// Emit reports the current item to the client
func (l *Lexer) Emit(t ItemType) {
	l.items <- Item{t, l.rpos - int64(l.pos-l.start), string(l.buf[l.start:l.pos])}
	l.Skip()
}

The l.buf referenced in that code is a []byte, and the Skip method either
sets l.start to l.pos, advancing the position within l.buf, or copies the
remainder of l.buf to the start of the []byte and resets l.pos and l.start
to zero.


// Skip advances over the current item without reporting it
func (l *Lexer) Skip() {
	// We're at a point where we know we have completely read a
	// token.  If we've read 90% or more of l.buf's capacity, shift
	// the unread content to the start of the buffer.  Otherwise
	// just move l.start to the current position.
	n := cap(l.buf)
	r := n - l.pos
	if n/10 >= r {
		l.buf, l.start, l.pos = append(l.buf[0:0], l.buf[l.pos:]...), 0, 0
	} else {
		l.start = l.pos
	}
}
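As a worked illustration of that 10% threshold (the capacity and position
numbers here are invented for the example, not taken from the real lexer),
the compaction decision boils down to:

```go
package main

import "fmt"

// shouldCompact mirrors the test in Skip above: with capacity n and read
// position pos, the unread remainder r is shifted to the front of the
// buffer only once at least 90% of the buffer has been consumed.
func shouldCompact(capacity, pos int) bool {
	r := capacity - pos // bytes not yet consumed
	return capacity/10 >= r
}

func main() {
	fmt.Println(shouldCompact(100, 50)) // false: 50 bytes still unread
	fmt.Println(shouldCompact(100, 92)) // true: only 8 bytes unread
}
```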

In my particular application that uses the lexer library, the only time the
string is actually used is to convert it to a number using strconv.Atoi,
or to a time.Time using time.Date, and then it (and its Item struct) are
allowed to fall out of reference, to be garbage collected.
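A minimal sketch of that consumption pattern, with the Item field names and
the channel plumbing assumed for illustration (the real struct layout may
differ):

```go
package main

import (
	"fmt"
	"strconv"
)

// ItemType and Item approximate the struct described above: token type,
// position, and string value. Field names here are assumptions.
type ItemType int

type Item struct {
	Type  ItemType
	Pos   int64
	Value string
}

// consume reads Items off the channel and converts each string value with
// strconv.Atoi. After each iteration the Item and its string fall out of
// scope, leaving them eligible for garbage collection.
func consume(items <-chan Item) (int, error) {
	sum := 0
	for it := range items {
		n, err := strconv.Atoi(it.Value)
		if err != nil {
			return 0, err
		}
		sum += n
	}
	return sum, nil
}

func main() {
	items := make(chan Item, 2)
	items <- Item{0, 0, "10"}
	items <- Item{0, 2, "32"}
	close(items)
	sum, _ := consume(items)
	fmt.Println(sum) // 42
}
```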

I didn't think any of this could result in a leak, but when I watch my
program over time it very slowly increases in size by 4 bytes on a
regular basis.  A new lexer is created every 5 minutes, used to parse
a chunk of log file data, and then discarded.  From what I've observed
the program grows in 4 byte increments 2 or 3 times per hour.

When I look at it in the pprof profiler, what I see indicates a number of
allocations from the l.Emit method, where I'm generating a string
out of the sub-slice of l.buf and putting it into a struct that I pass back
along a channel:

l.items <- Item{t, l.rpos - int64(l.pos-l.start), string(l.buf[l.start:l.pos])}

Am I doing something wrong in the code above?  Either with the
construction of the Item struct or the string?  I played around with the
code to see if I could isolate the growth, and I believe it's from the string
allocation.  If I substitute the above with:

b := make([]byte, l.pos-l.start)
copy(b, l.buf[l.start:l.pos])
l.items <- Item{t, l.rpos - int64(l.pos-l.start), string(b)}

The pprof output points to the creation of b.  I'm confused about why this
would be a leak, since as far as I can tell I am no longer referencing
the string once I've finished an iteration of my program (everything
falls out of scope at that point).


Jim

Tamás Gulácsi

May 30, 2016, 3:33:30 PM5/30/16
to golang-nuts


Creating a string from a []byte slice will allocate - strings are immutable, so it has to copy the underlying slice somewhere else to keep it safe (not modified).
 

James A. Robinson

May 30, 2016, 3:56:42 PM5/30/16
to Tamás Gulácsi, golang-nuts
On Mon, May 30, 2016 at 12:33 PM Tamás Gulácsi <tgula...@gmail.com> wrote:
Creating a string from a []byte slice will allocate - strings are immutable, so it has to copy the underlying slice somewhere else to keep it safe (not modified)

Sure, I understand that, but what I don't understand is why it apparently is hanging around despite that copy going out of scope (and no pointers referring to it).

Jim

James Bardin

May 30, 2016, 9:09:15 PM5/30/16
to golang-nuts
Allocating 4 bytes 2 or 3 times per hour is nowhere near enough to trigger a garbage collection via allocation. Have you watched the gctrace output or the runtime MemStats to show that your memory increase is unbounded?

James A. Robinson

May 30, 2016, 11:04:52 PM5/30/16
to James Bardin, golang-nuts
On Mon, May 30, 2016 at 6:09 PM James Bardin <j.ba...@gmail.com> wrote:
Allocating 4 bytes 2 or 3 times per hour is nowhere near enough to trigger a garbage collection via allocation. Have you watched the gctrace output or the runtime MemStats to show that your memory increase is unbounded?

I've been using runtime/pprof to periodically (at the end of each cycle) run runtime.GC and then calling

pprof.Lookup("heap").WriteTo(os.Stdout, 1)

What I'm seeing over time is a steady increase of entries like the following:

0: 0 [1: 32] @ 0x44ef37 0x44edbf 0x4935db 0x495a70 0x49292c 0x466c41
#       0x4935db        github.com/jimrobinson/lexrec.(*Lexer).Emit+0xeb        /Users/jimr/proj/github/src/github.com/jimrobinson/lexrec/lex.go:308
#       0x495a70        github.com/jimrobinson/lexrec.ExceptRun.func1+0x90      /Users/jimr/proj/github/src/github.com/jimrobinson/lexrec/lex.go:408
#       0x49292c        github.com/jimrobinson/lexrec.(*Lexer).run+0xfc         /Users/jimr/proj/github/src/github.com/jimrobinson/lexrec/lex.go:178

Over a period of 12 cycles it climbed from 8 instances to 28, and didn't drop in the subsequent 6 cycles.

cycle count
1 8
2 15
3 17
4 17
5 18
6 18
7 21
8 22
9 22
10 25
11 26
12 28
13 28
14 28
15 28
16 28
17 28
18 28

The HeapInuse numbers reported for that test were:

573440
737280
802816
843776
786432
770048
794624
811008
794624
770048
770048
811008
761856
811008
802816
835584
761856
794624
794624

I suppose it's possible that the GC just doesn't think the usage warrants collection, but this slow and steady increase surprises me given what the program is doing: every 5 minutes it wakes up, reads 2-3 MB of data via the lexer, and spits out a JSON array over the network, at which point all the resources allocated for processing fall out of scope.

Jim

Jim Robinson

May 31, 2016, 8:48:00 AM5/31/16
to golang-nuts, ji...@highwire.org
I gathered the memory profiling output over the course of an entire
night.  While top is showing the same behavior, a slow increase
in resident memory, the statistics from the memory profiler show
a normal up-and-down pattern in HeapAlloc and HeapInuse.


Jim Robinson

May 31, 2016, 2:11:05 PM5/31/16
to golang-nuts, ji...@highwire.org
And after running for > 24 hours it appears to have leveled out at around 9 MB.

I'm wondering if it's possible to trigger a "stop the world and GC everything"
process and then dump a snapshot of currently allocated memory.  That
would have let me understand right away whether what I was seeing was
a leak (which is what I was thinking) vs. the normal GC cycle not collecting
everything it possibly can (which I'm now guessing may be what is going on).
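Something close to that exists: a sketch combining runtime.GC with
debug.FreeOSMemory and a heap-profile dump.  Note that the heap profile
reflects live allocations as of the most recent garbage collection, so
forcing a collection first gives an up-to-date snapshot:

```go
package main

import (
	"os"
	"runtime"
	"runtime/debug"
	"runtime/pprof"
)

// dumpLiveHeap forces a full collection, returns unused memory to the OS
// (FreeOSMemory itself also triggers a GC), and then writes the heap
// profile, which now reflects only live allocations.
func dumpLiveHeap() error {
	runtime.GC()         // force a full collection
	debug.FreeOSMemory() // also release unused spans back to the OS
	return pprof.Lookup("heap").WriteTo(os.Stdout, 1)
}

func main() {
	if err := dumpLiveHeap(); err != nil {
		panic(err)
	}
}
```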


Jim
 

James Bardin

May 31, 2016, 2:25:52 PM5/31/16
to Jim Robinson, golang-nuts, ji...@highwire.org
I'm not sure what you're looking for here. Your Sys and HeapSys
haven't changed at all, so there haven't been any more memory
allocations from the system.

The goal of the GC isn't to release every possible span of unused
memory, it's to use the memory as efficiently as possible, which
includes reducing future allocations by re-using the memory.

(if you're still watching stats in top, don't. RSS doesn't give you
any details on process memory usage. Something like "proportional set
size" can be a little more useful, but trying to track small changes
that way is fruitless)

Jim Robinson

Jun 1, 2016, 12:04:55 AM6/1/16
to golang-nuts
On Tuesday, May 31, 2016 at 11:25:52 AM UTC-7, James Bardin wrote:
I'm not sure what you're looking for here. Your Sys and HeapSys
haven't changed at all, so there haven't been any more memory
allocations from the system.

The goal of the GC isn't to release every possible span of unused
memory, it's to use the memory as efficiently as possible, which
includes reducing future allocations by re-using the memory.

(if you're still watching stats in top, don't. RSS doesn't give you
any details on process memory usage. Something like "proportional set
size" can be a little more useful, but trying to track small changes
that way is fruitless)


Thank you for the feedback, I'm satisfied I understand what was
going on now and it's something I can keep in mind for the future.

To answer your question as to why I was looking at top: sysadmins
look at RSS, and would ask for an explanation as to why a program
appears to gain a megabyte in size over the period of a day.  The
question will be raised: "Is this thing leaking?  Is it going to grow by
a megabyte a day forever?"

So yes, I was looking at RSS in top because in most programs that's
a perfectly reasonable way to tell how much memory something is
basically using.

Dumping out many runs of memory profile statistics satisfied
me that it wasn't a leak I was seeing.  So what I was looking for,
as you asked, was, I think, "a way to *quickly* tell that there
isn't a memory leak, something that doesn't take me hours of
running stats to determine."  Being able to run some sort of
GC that forces it to clean up everything that is unreferenced
would have let me satisfy the question with a single run of the
program.

Looking at a memory profile dump that showed a steadily increasing
number of allocations from a specific line of code, it wasn't obvious
to me that the GC was merely deciding it wasn't worthwhile/optimal
to free up those references.


Jim
