Hi all,
I've recently been debugging an interesting memory "leak" inside Docker,
and have tracked it down to the Go runtime. The "leak" involves
overcommitted memory that is never freed. While I am aware that the Go
GC doesn't free memory to the operating system (meaning `munmap`,
because MADV_DONTNEED is not a substitute), I believe there is a
legitimate resource leak issue inside the Go runtime.
Unfortunately I can't seem to provide a better "small test case" than
running Docker (sorry about that). I'm working on a better test
case, but this is really causing us some grief so hopefully someone can
give us a hand. I'm going to be referring to the Docker daemon < 1.11
(this problem still affects the currently in-development version of
Docker but the process interactions are much more complicated in that
version even though the base problem is the same).
If you run the following script:
% echo 1 | sudo tee /proc/sys/vm/overcommit_memory
% for i in {1..1000}; do docker run --name shell_$i -dit busybox sh; done
% for i in {1..1000}; do docker rm -f shell_$i; done
You have started and then destroyed 1000 containers. You can verify that
there are no resource leaks inside the Docker daemon if you start the
Docker daemon like so:
# DEBUG=1 docker daemon -H tcp://:8080
and then navigate to
http://127.0.0.1:8080/pprof/heap. But if you look
at the overcommitted amount of memory in /proc/meminfo, it'll look
something like this:
% grep Commit /proc/meminfo
CommitLimit: 1472088 kB
Committed_AS: 3535880 kB
However, if one assumes that this is just heap memory (sysUnused) that
will be reused if we start that many containers again, that doesn't
appear to be the case. If you run the same "start 1000 containers"
command again (bearing in mind the old containers were all completely
purged from existence), the memory **will still climb**. If you can't
reproduce it with 1000 containers, do it with 500 or something.
Now, the weird thing is that this growth in overcommitted memory doesn't
appear to go on forever. For me (on a machine with 8GB of physical
memory), it stops growing once it reaches ~8GB. Does anyone know if this
is some "feature" of the Go runtime, that it decides to reuse heap
memory after it's exhausted as much overallocation as possible?
I played around with the Go runtime a little bit, and I ended up
modifying sysUnused in src/runtime/mem_linux.go to call `munmap` rather
than `madvise(v, n, _MADV_DONTNEED)`. One would assume this would cause
my program to segfault. *It didn't*. Nor did it seem to reduce the
overcommitting problem. So I'm pretty much out of my depth here. Is there
any help anyone can give me? Thanks so much.
Is there some MemStats trickery I can try, or some `runtime` method that
can help me debug this issue or provide you with information?
--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/