I am writing an NBD server in Go (see
https://github.com/abligh/gonbdserver if you are interested).
All is well when the client uses small read/write sizes (something I am not in control of). However, when the client uses random large read/write sizes, the program appears to leak memory. I say "appears" because if GC() is called the memory is given back (so there is no real leak), but in practice the packets come in fast enough that Go simply uses more and more memory until swap is exhausted (i.e. it appears unable to GC() fast enough).
Operation is pretty simple. A packet comes in, which is either a read or a write. For a write, a []byte is allocated (of exactly the correct length and capacity), exactly that number of bytes is read from the network connection, a struct containing that slice is put into a channel, a worker reads the struct from the channel, then writes the []byte to disk. A similar process is followed for a read, except that a worker reads from disk into a preallocated []byte, puts it in a channel, and the sender discards it after transmission.
I've been playing with pprof, and as far as I can tell, whilst a large amount of memory is taken up (both resident and virtual), nearly all of it is released on a GC (I have a SIGUSR1 handler that triggers one). What I care about is the large blocks (multi-megabyte slices of bytes), and all of these are released. But whilst the program is running they seem to stay on the heap, or at least are not released to the operating system. In essence, these blocks are returned to the OS less than 50% of the time.
If I were writing this in C, I'd allocate my large blocks with mmap(MAP_ANON, ...) rather than on the normal heap.
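The closest Go analogue I can see is syscall.Mmap, which hands back a []byte the GC never tracks, at the cost of manual release. A sketch (helper names are mine):

```go
package main

import "syscall"

// mmapBuf allocates a buffer outside the Go heap via an anonymous
// mapping: the direct analogue of mmap(MAP_ANON, ...) in C. The GC
// never sees this memory, so it must be released with munmapBuf.
func mmapBuf(size int) ([]byte, error) {
	return syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

// munmapBuf returns the mapping to the OS immediately.
func munmapBuf(b []byte) error {
	return syscall.Munmap(b)
}
```

The obvious downside is that every allocation site must be paired with an explicit unmap, reintroducing the manual lifetime management Go normally avoids.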
Is there a comparable way in Go? I thought about recycling the []byte slices (there's a pool allocator somewhere, I believe), but that's not ideal as they are not all the same size. I realise I could maintain a separate pool for each power of two or something.
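The pool allocator I'm thinking of is sync.Pool in the standard library; the per-power-of-two idea would look something like this (helper names are mine, and the size-class cap is arbitrary):

```go
package main

import "sync"

const maxClass = 26 // classes up to 1<<26 (64 MiB); arbitrary cap for this sketch

// pools[i] serves buffers of capacity 1<<i: one sync.Pool per power of two.
var pools [maxClass + 1]sync.Pool

// class returns the smallest i such that 1<<i >= n.
func class(n int) int {
	i := 0
	for (1 << uint(i)) < n {
		i++
	}
	return i
}

// getBuf returns a slice of length n backed by a pooled power-of-two buffer,
// allocating a fresh one if the pool for that class is empty.
func getBuf(n int) []byte {
	c := class(n)
	if b, ok := pools[c].Get().([]byte); ok {
		return b[:n]
	}
	return make([]byte, n, 1<<uint(c))
}

// putBuf returns a buffer to the pool matching its capacity, provided the
// capacity is an exact power of two within range; otherwise the GC keeps it.
func putBuf(b []byte) {
	c := cap(b)
	if c == 0 || c&(c-1) != 0 || class(c) > maxClass {
		return
	}
	pools[class(c)].Put(b[:c])
}
```

The wasted space is bounded (at most half of each rounded-up buffer), but sync.Pool contents can still be dropped at any GC, so it only reduces allocation churn rather than pinning memory.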
I can illustrate this behaviour with a simple 'go test' invocation on the above program:
go test -v -run '^TestConnectionIntegrityHuge$' -longtests
on Linux; the behaviour appears to be similar on OS X. This is Go 1.6.1.
Is this behaviour expected? Is there a suggested workaround?
--
Alex Bligh