Change in virtual memory patterns in Go 1.12


Rémy Oudompheng

Apr 2, 2019, 3:49:44 AM
to golang-nuts
Hello,

In a large-heap program I am working on, I noticed a peculiar change in the way the runtime reserves virtual memory: with a comparable heap size (about 150GB) and virtual memory size (growing to 400-500GB, probably due to a kind of fragmentation), the number of distinct memory mappings has apparently increased between Go 1.11 and Go 1.12, reaching the system limit (the Linux vm.max_map_count setting).
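For anyone who wants to watch this from inside a process, here is a minimal Linux-only sketch (my own illustration, not code from the application) that compares the current mapping count against the limit:

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
)

func main() {
	// Each line of /proc/self/maps describes one distinct mapping (VMA).
	maps, err := ioutil.ReadFile("/proc/self/maps")
	if err != nil {
		panic(err)
	}
	// The kernel limit that a process can run into (vm.max_map_count).
	limit, err := ioutil.ReadFile("/proc/sys/vm/max_map_count")
	if err != nil {
		panic(err)
	}
	fmt.Printf("mappings: %d, vm.max_map_count: %s",
		bytes.Count(maps, []byte{'\n'}), limit)
}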

Has anyone else experienced this? I don't believe it qualifies as a bug, but I was a bit surprised (especially as I wasn't aware of that system limit).

Rémy

Austin Clements

Apr 2, 2019, 10:15:47 AM
to Rémy Oudompheng, Michael Knyszek, golang-nuts
Hi Rémy. We often fight with vm.max_map_count in the runtime, sadly. Most likely this comes from the way the runtime interacts with Linux's transparent huge page support. When we scavenge (release to the OS) only part of a huge page, we tell the OS not to turn that huge page frame back into a huge page since that would just make that memory used again. Unfortunately, each time we do this counts as a separate "mapping" just to track that one flag. These "mappings" are always at least 2MB, but you have a large enough virtual address space to reach the max_map_count even then. You can see this in /proc/PID/smaps, which should list mostly contiguous neighboring regions that differ only in a single "VmFlags" bit.
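The splitting effect is easy to demonstrate outside the runtime. Here is a minimal sketch (Linux-only, and it assumes the kernel was built with transparent huge page support, otherwise the madvise call fails with EINVAL): clearing the huge page flag on a 2MB slice in the middle of a large anonymous mapping forces the kernel to split one VMA into three.

package main

import (
	"fmt"
	"io/ioutil"
	"strings"
	"syscall"
)

// mappingCount returns the number of lines in /proc/self/maps, one per VMA.
func mappingCount() int {
	data, err := ioutil.ReadFile("/proc/self/maps")
	if err != nil {
		panic(err)
	}
	return strings.Count(string(data), "\n")
}

func main() {
	// Reserve a 64MB anonymous region: a single VMA.
	mem, err := syscall.Mmap(-1, 0, 64<<20,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_PRIVATE|syscall.MAP_ANONYMOUS)
	if err != nil {
		panic(err)
	}
	before := mappingCount()
	// Clear the huge page flag on a 2MB subrange in the middle. The kernel
	// has to record the differing flag, so it splits the VMA into three.
	if err := syscall.Madvise(mem[8<<20:10<<20], syscall.MADV_NOHUGEPAGE); err != nil {
		panic(err)
	}
	after := mappingCount()
	fmt.Printf("mappings before=%d after=%d\n", before, after)
}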

We did make memory scavenging more aggressive in Go 1.12 (+Michael Knyszek), though I would have expected it to converge to roughly the same "huge page flag fragmentation" as before over the course of five to ten minutes. Is your application's virtual memory footprint the same between 1.11 and 1.12, or did it grow?

You could try disabling the huge page flag manipulation to confirm and/or fix this. In $GOROOT/src/runtime/internal/sys/arch_amd64.go (or whichever GOARCH is appropriate), set HugePageSize to 0. Though there's a danger that Linux's transparent huge pages could blow up your application's resident set size if you do that.
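Concretely, the edit is a one-line change, sketched here against the Go 1.12 source layout (double-check the constant in your checkout before editing):

// $GOROOT/src/runtime/internal/sys/arch_amd64.go (Go 1.12)
const (
	// ...
	// 0 disables the runtime's huge page flag manipulation entirely;
	// on amd64 this is normally 1 << 21 (2MB).
	HugePageSize = 0
	// ...
)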


Rémy Oudompheng

Apr 16, 2019, 1:23:44 AM
to Austin Clements, Michael Knyszek, golang-nuts
Thanks Austin,

The application workload is a kind of fragmentation torture test: it
involves a mixture of many long-lived small and large (>100 MB)
objects, with short-lived small and large objects allocated regularly.
I have tried to create a synthetic reproducer but have not succeeded
so far.

Regarding max_map_count, your explanation is very clear; I apparently
missed the large comment in the runtime explaining all of that.
Do you expect a significant drawback to choosing 16MB rather than 2MB
as the granularity of the huge page flag manipulation in the case of
huge heaps?

Regarding the virtual memory footprint, it changed radically with Go
1.12. It basically looks like a leak: I saw it grow to more than 1TB,
while the actual total heap size never exceeds 180GB.
Although I understand that it is easy to construct a situation where
there is repeatedly no available contiguous interval of >100MB in the
address space, this is a significant difference from Go 1.11, where
the address space would grow to 400-500GB for a similar workload and
stay flat after that. I could not find an obvious change in the
allocator explaining the phenomenon (and unfortunately my resources do
not allow for an easy live comparison of both program lifetimes).

Am I right in saying that the scavenging method or frequency does not
(and cannot) affect the virtual memory footprint and its dynamics at
all?

Regards,
Rémy.

Austin Clements

Apr 16, 2019, 11:04:07 AM
to Rémy Oudompheng, Michael Knyszek, golang-nuts
On Tue, Apr 16, 2019 at 1:23 AM Rémy Oudompheng <remyoud...@gmail.com> wrote:
> Thanks Austin,
>
> The application workload is a kind of fragmentation torture test: it
> involves a mixture of many long-lived small and large (>100 MB)
> objects, with short-lived small and large objects allocated regularly.
> I have tried to create a synthetic reproducer but have not succeeded
> so far.
>
> Regarding max_map_count, your explanation is very clear; I apparently
> missed the large comment in the runtime explaining all of that.
> Do you expect a significant drawback to choosing 16MB rather than 2MB
> as the granularity of the huge page flag manipulation in the case of
> huge heaps?

Most likely this will just cause less use of huge pages in your application. This could slow it down by putting more pressure on the TLB. In a sense, this is a self-compounding issue since huge pages can be highly beneficial to huge heaps.

> Regarding the virtual memory footprint, it changed radically with Go
> 1.12. It basically looks like a leak: I saw it grow to more than 1TB,
> while the actual total heap size never exceeds 180GB.
> Although I understand that it is easy to construct a situation where
> there is repeatedly no available contiguous interval of >100MB in the
> address space, this is a significant difference from Go 1.11, where
> the address space would grow to 400-500GB for a similar workload and
> stay flat after that. I could not find an obvious change in the
> allocator explaining the phenomenon (and unfortunately my resources
> do not allow for an easy live comparison of both program lifetimes).
>
> Am I right in saying that the scavenging method or frequency does not
> (and cannot) affect the virtual memory footprint and its dynamics at
> all?

It certainly can affect virtual memory footprint because of how scavenging affects the allocator's placement policy. Though even with the increased VSS, I would expect your application to have lower RSS with 1.12. In almost all cases, lower RSS with higher VSS is a fine trade-off, though lower RSS with the same VSS would obviously be better. But it can be problematic when it causes the map count (which is roughly proportional to the VSS) to grow too large. It's also unfortunate that Linux even has this limit; it's the only OS Go runs on that limits the map count.
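As an aside, the divergence is easy to watch from inside the process. A minimal sketch that prints the kernel's VmSize (VSS) and VmRSS (RSS) lines from /proc/self/status, both reported in kB:

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		// VmSize is the virtual footprint; VmRSS is resident memory.
		if strings.HasPrefix(line, "VmSize:") || strings.HasPrefix(line, "VmRSS:") {
			fmt.Println(line)
		}
	}
}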

We've seen one other application experience VSS growth with the 1.12 changes, and it does seem to require a pretty unique allocation pattern. Michael (cc'd) may be zeroing in on the causes of this and may have some patches for you to try if you don't mind. :)

Michael Knyszek

Apr 16, 2019, 1:34:52 PM
to Austin Clements, Rémy Oudompheng, golang-nuts
Hey Rémy,

If you have a chance, could you please try out this patch? It's known to have helped the other application Austin mentioned with its virtual memory footprint, and it should apply cleanly on top of go1.12. Let me know what you see! It would help us confirm the root cause of the VSS growth.

Thanks,
Michael

kevin.a...@gmail.com

Apr 24, 2019, 8:34:34 AM
to golang-nuts
Hi Michael,

I found my way to this thread as we are experiencing similar issues in one of our applications after upgrading from 1.11.1 to 1.12.4. Our application has a lot of persistent, large, busy maps, so we periodically recreate them per https://github.com/golang/go/issues/20135. What we are seeing is that MemStats.HeapSys and MemStats.HeapReleased begin to grow linearly together while MemStats.HeapIdle stays relatively constant. Other measures, like system memory usage and the process RSS, are relatively flat.
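For context, the observation comes from periodically sampling those fields, along the lines of this simplified sketch (an illustration, not our production code):

package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	var m runtime.MemStats
	for {
		// ReadMemStats briefly stops the world, so sample sparingly.
		runtime.ReadMemStats(&m)
		// HeapSys is heap memory obtained from the OS (tracks VSS),
		// HeapIdle is heap memory not currently in use, and HeapReleased
		// is the portion of HeapIdle returned to the OS.
		fmt.Printf("HeapSys=%dMiB HeapIdle=%dMiB HeapReleased=%dMiB\n",
			m.HeapSys>>20, m.HeapIdle>>20, m.HeapReleased>>20)
		time.Sleep(30 * time.Second)
	}
}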

I applied your patch on the 1.12.4 tag and it looked good: HeapSys/HeapReleased stopped growing.

Please let me know if I can provide any other details here.

Thank you

Rémy Oudompheng

May 8, 2019, 3:48:48 AM
to Michael Knyszek, Austin Clements, golang-nuts
Hello,

I didn't have time to try the patch, but now that Go 1.12.5 is out with another fix, I can confirm with high certainty that the problem is gone. I am fairly confident the situation is even better than with Go 1.11, which is consistent with the bug having already been present there, only in a milder form because it applied only to large allocations.

Thanks a lot,

Rémy

Michael Knyszek

May 8, 2019, 11:45:26 AM
to Rémy Oudompheng, Austin Clements, golang-nuts
I'm glad to hear it! What went into Go 1.12.5 should supersede that patch anyway.