basics of understanding golang memory usage

Mike Spreitzer

Aug 14, 2020, 6:58:16 PM
to golang-nuts
I have several basic questions.  I examined a couple of golang programs, and got really confused.

I started by looking at a kube-proxy from Kubernetes release 1.15 (probably either 1.15.9 or 1.15.12) on Linux 4.15.0 using `/proc/[pid]/statm`. 
As a reminder, here is what man procfs has to say about that:
              Provides information about memory usage, measured in pages.
              The columns are:

                  size       (1) total program size
                             (same as VmSize in /proc/[pid]/status)
                  resident   (2) resident set size
                             (same as VmRSS in /proc/[pid]/status)
                  shared     (3) number of resident shared pages
                             (i.e., backed by a file)
                             (same as RssFile+RssShmem in /proc/[pid]/status)
                  text       (4) text (code)
                  lib        (5) library (unused since Linux 2.6; always 0)
                  data       (6) data + stack
                  dt         (7) dirty pages (unused since Linux 2.6; always 0)
Here is what I saw:
/proc/10368$ cat statm
35207 8684 6285 4026 0 26265 0
The page size is 4 KiB.  That’s about 16 MB of code plus 108 MB of data+stack, plus another 20 MB of something else.  What is that something else?  Why is the data+stack so huge?
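For anyone who wants to reproduce that arithmetic, here is a minimal sketch that reads /proc/self/statm and converts the page counts to bytes (field order per the man page excerpt above):

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strconv"
	"strings"
)

func main() {
	// Field order per man procfs: size resident shared text lib data dt.
	raw, err := ioutil.ReadFile("/proc/self/statm")
	if err != nil {
		panic(err)
	}
	fields := strings.Fields(string(raw))
	names := []string{"size", "resident", "shared", "text", "lib", "data", "dt"}
	pageSize := int64(os.Getpagesize())
	for i, name := range names {
		pages, _ := strconv.ParseInt(fields[i], 10, 64)
		fmt.Printf("%-8s %6d pages = %d bytes\n", name, pages, pages*pageSize)
	}
}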

Next I looked at `maps`:
/proc/10368$ sudo cat maps
00400000-013ba000 r-xp 00000000 fc:01 777880                             /usr/local/bin/kube-proxy
013ba000-026e9000 r--p 00fba000 fc:01 777880                             /usr/local/bin/kube-proxy
026e9000-02749000 rw-p 022e9000 fc:01 777880                             /usr/local/bin/kube-proxy
02749000-02770000 rw-p 00000000 00:00 0 
c000000000-c000200000 rw-p 00000000 00:00 0 
c000200000-c000400000 rw-p 00000000 00:00 0 
c000400000-c000a00000 rw-p 00000000 00:00 0 
c000a00000-c004000000 rw-p 00000000 00:00 0 
7f4b47346000-7f4b49937000 rw-p 00000000 00:00 0 
7ffcd6169000-7ffcd618a000 rw-p 00000000 00:00 0                          [stack]
7ffcd61d0000-7ffcd61d3000 r--p 00000000 00:00 0                          [vvar]
7ffcd61d3000-7ffcd61d5000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

The block from c000000000 to c004000000 is about 67 MB, and the block from 7f4b47346000 to 7f4b49937000 is about another 40 MB.  So that must be most of the 108 MB data+stack.  The fact that they are distinct memory ranges suggests they are two different kinds of usage.  Which range is which usage?
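One way to probe this from inside a process is to compare those mappings against runtime.MemStats.  Here is a rough sketch, under the assumption (which the maps output above supports) that the heap landed at the runtime's first arena hint, 0xc000000000, so the heap arenas are exactly the mappings whose start address begins with "c0":

package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	f, err := os.Open("/proc/self/maps")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Assumption: only heap arenas start at 0xc0..., as in the maps above.
	var arena uint64
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		rng := strings.Fields(sc.Text())[0] // "start-end" in hex
		if !strings.HasPrefix(rng, "c0") {
			continue
		}
		parts := strings.Split(rng, "-")
		start, _ := strconv.ParseUint(parts[0], 16, 64)
		end, _ := strconv.ParseUint(parts[1], 16, 64)
		arena += end - start
	}
	fmt.Printf("mapped at 0xc0...: %d bytes\n", arena)
	fmt.Printf("MemStats.HeapSys:  %d bytes\n", m.HeapSys)
}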


The first block for /usr/local/bin/kube-proxy is the 4026 pages that procfs reports for text.  Why are the other two /usr/local/bin/kube-proxy blocks not reported as text?  They add up to about that mysterious 20 MB.  Is the first block included in "number of resident shared pages (i.e., backed by a file)"?

Next, I tried scraping /metrics from a different golang process.  I have lots of questions about the memory-related metrics.  Is there any more definition of the go_memstats_… metrics beyond their HELP lines?
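For reference, these metrics come from the client_golang Go collector, which republishes runtime.MemStats fields.  A minimal sketch of a process exposing them (the default registry already includes that collector):

package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// The default registry includes the Go collector, which republishes
	// runtime.MemStats fields as the go_memstats_... gauges.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}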

For example, what is the difference between
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 4.881872e+06
and
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 7.33184e+06
?

It clearly is not
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.885952e+07
… and what is that, anyway?

And then there is this:
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 5.7155584e+07
… is that “released” as in “cumulative over release operations in the past” or as in “currently in released-to-OS state”?

Same question for “obtained” in
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.619136e+07

Then there is
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 917504
Does “use by the stack allocator” refer to overhead of the allocator, or to use for stacks themselves?

There are several metrics described as being “obtained from system” for various purposes, but they do not add up to what appears to be the total.
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.619136e+07

# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384

# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 131072

# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 917504
Those add up to 67,256,320, but then there is
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.1762168e+07
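The runtime documentation says Sys is the sum of all the XSys fields, so maybe the gap is in the XSys metrics not quoted above (gc_sys, buck_hash_sys, other_sys).  A quick sketch to check:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// Per the runtime docs, Sys is the sum of the XSys fields.
	sum := m.HeapSys + m.StackSys + m.MSpanSys + m.MCacheSys +
		m.BuckHashSys + m.GCSys + m.OtherSys
	fmt.Printf("Sys       = %d\n", m.Sys)
	fmt.Printf("sum(XSys) = %d\n", sum)
}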

Thanks,
Mike

Mike Spreitzer

Aug 17, 2020, 11:30:32 PM
to golang-nuts
FYI, after asking on Slack I got some useful pointers:

- https://golang.org/pkg/runtime/#MemStats - definitions of memory stats from the golang runtime
- https://github.com/prometheus/client_golang/blob/master/prometheus/go_collector.go - the code that publishes golang runtime memory stats as Prometheus metrics.
- https://man7.org/linux/man-pages/man2/madvise.2.html - has some clues about memory blocks in special in-between states.  This explains why some of the memory that a golang process holds is described above as "released to the OS".
- https://github.com/golang/go/issues/32284 - has an interesting discussion.  I looked around more than a little, and could not find this material collected in a wiki page, blog entry, or article.
- https://godoc.org/golang.org/x/debug/cmd/viewcore - tool for crawling coredumps, mentioned in issue 32284.

So I tried viewing the memory usage of a program from four angles.  I got some consistency, and some inconsistencies.

Mike Spreitzer

Aug 18, 2020, 1:56:30 AM
to golang-nuts
I collected data on an example program from four angles.  See https://docs.google.com/document/d/1KUz7IjnD93X2VTVkRAhhNa7rHScCaIH8GLrsoaDIW_g for the raw data and my puny attempts to correlate the four views.  There are some exact equalities, some near matches, some gross differences in things that seem like they should be the same, and things I wasn't able to match in any way.  This raises lots of questions.

Why is memory.usage_in_bytes from cgroups so much smaller than the other top-line measurements?  What is statm.size counting that the coredump "all" is not?

I assume that the text part of the connection-agent binary is shared.  Why is the readonly data part of the connection-agent binary _not_ shared?

What is statm.data counting that MemStats.Sys is not?

Why is MemStats.HeapIdle so much bigger than the coredump's heap free spans?
Why is MemStats.heapReleased so much bigger than the coredump's heap released?
In MemStats, (HeapIdle - HeapReleased) = 1,490,944
In coredump, (heap free spans) - (heap released) = 1,613,824 --- not so very different.  What is making the roughly 52 MB difference between MemStats and the coredump?
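For what it's worth, the runtime docs describe HeapIdle minus HeapReleased as an estimate of the memory that could be returned to the OS but is being retained by the runtime; a tiny sketch to print it:

package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// Per the runtime docs, HeapIdle minus HeapReleased estimates memory
	// that could be returned to the OS but is being retained by the runtime.
	fmt.Printf("HeapIdle - HeapReleased = %d bytes\n", m.HeapIdle-m.HeapReleased)
}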

Why is the bss in the coredump so huge?  Does it correspond to anything in any of the other views?

What do the three big anonymous blocks in the procfs view correspond to in the other views?

Thanks,
Mike


mspr...@us.ibm.com

Sep 24, 2020, 3:39:46 PM
to golang-nuts
With help from Nelson Mimura Gonzalez and Bryan S Rosenburg I was able to make some progress on the relationship between the cgroups view and the procfs view.

In the cgroups view, `memory.usage_in_bytes` is the sum of `memory.stat:rss` + `memory.stat:cache` + `memory.stat:mapped_file` + `memory.kmem.usage_in_bytes`.

The cgroups `memory.stat:rss` is one page less than the sum of these procfs values: (sum of Rss of anonymous pmap blocks) + (sum of Anonymous of fd pmap blocks).
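Here is a sketch of that first check, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory (adjust cgDir to the cgroup the process of interest actually lives in):

package main

import (
	"bufio"
	"fmt"
	"io/ioutil"
	"os"
	"strconv"
	"strings"
)

const cgDir = "/sys/fs/cgroup/memory" // assumption: root memory cgroup

func readUint(path string) uint64 {
	raw, err := ioutil.ReadFile(path)
	if err != nil {
		panic(err)
	}
	n, _ := strconv.ParseUint(strings.TrimSpace(string(raw)), 10, 64)
	return n
}

func main() {
	// Parse memory.stat into a name -> value map.
	stat := map[string]uint64{}
	f, err := os.Open(cgDir + "/memory.stat")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		n, _ := strconv.ParseUint(fields[1], 10, 64)
		stat[fields[0]] = n
	}

	usage := readUint(cgDir + "/memory.usage_in_bytes")
	kmem := readUint(cgDir + "/memory.kmem.usage_in_bytes")
	sum := stat["rss"] + stat["cache"] + stat["mapped_file"] + kmem
	fmt.Printf("usage_in_bytes                   = %d\n", usage)
	fmt.Printf("rss + cache + mapped_file + kmem = %d\n", sum)
}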

Regards,
Mike