GC pauses with large map[string]string.

3,498 views
Skip to first unread message

jonathan....@gmail.com

unread,
Nov 26, 2014, 2:08:01 AM11/26/14
to golan...@googlegroups.com
While doing some profiling on a piece of time-sensitive code, we discovered unusually long GC pauses occurring. We narrowed down the culprit to a large collection of type map[string]string.  This map is created at application startup by allocating it with sufficient capacity and then populating it with a few million entries.  After startup, the contents of the map never change.  Despite this, we found that because the map exists and contains lots of entries, the current Go 1.3 and 1.4rc1 mark-and-sweep garbage collector take upwards of 50-70ms trying to inspect this map and collect the entries in this map.

My objective is to minimize the amount of time spent performing garbage collection. I'm open to the possibility of revising the structure in some way to help make it perform better.  In various posts on this forum I have found that having fewer, yet larger structures can help and that removing pointers where possible all have a significant and positive impact on GC times.  In this use case it's a little more difficult because a string (or slice for that matter) is an implicit pointer.  For what it's worth, our character set is plain old ASCII and the longest string in the map (key or value) is 48 characters.

Is there any mechanism to instruct the GC to avoid collecting a given structure? (For example, making it package scope...) Failing that, are there any ideas to rework the structure in such way that GC can effectively skip over it?

One idea we have kicked around (but would like to avoid) is having the map implemented in C and then using cgo to perform lookups against the unmanaged map.

Dmitry Vyukov

unread,
Nov 26, 2014, 2:16:59 AM11/26/14
to jonathan.s.oliver42, golang-nuts
One simple thing that can help (and also reduce memory usage somewhat)
is to concatenate all strings (both keys and values) into a single
large string, and then insert into the hashmap sub-strings of that
string.

Henrik Johansson

unread,
Nov 26, 2014, 2:29:03 AM11/26/14
to jonathan.s.oliver42, golang-nuts
Perhaps the strings are not needed? I don`t know what you use it for but it sounds like checks for existence and some translation may be on the table.
Perhaps changing the keys to int via some hash function could be a first step?


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tahir

unread,
Nov 26, 2014, 2:56:33 AM11/26/14
to golan...@googlegroups.com
I was thinking about using a fixed size buffer for storage of each string but you will have additional retrieval overheads and larger memory usage.. Might or might not be worth it.

xingtao zhao

unread,
Nov 26, 2014, 3:00:01 AM11/26/14
to golan...@googlegroups.com
use map[[48]byte][48]byte instead?

Robert Melton

unread,
Nov 26, 2014, 3:09:42 AM11/26/14
to Henrik Johansson, jonathan.s.oliver42, golang-nuts
On Wed, Nov 26, 2014 at 2:28 AM, Henrik Johansson <dahan...@gmail.com> wrote:
> Perhaps the strings are not needed? I don`t know what you use it for but it
> sounds like checks for existence and some translation may be on the table.
> Perhaps changing the keys to int via some hash function could be a first
> step?

Would just [][]int8 Keys / [][]int8 Values work -- naturally paired?
That lets you quickly scan the Keys, they are in memory sequentially
so hopefully very good lookup times, without wasting cache even
considering Values... then you hop over and load the matching Values
entry on hit and convert that []int8 into a series of ascii
characters. All fixed size arrays which might help with the GC pause
issue... no maps or strings or pointers required. Fair warning... I
have no idea about Go GC internals... so pinch of salt, buyer beware,
etc.

--
Robert Melton | http://robertmelton.com

Tahir

unread,
Nov 26, 2014, 3:26:16 AM11/26/14
to golan...@googlegroups.com
Alternatively, maybe some kind of radix tree or trie will help.


On Wednesday, November 26, 2014 7:08:01 AM UTC, jonathan....@gmail.com wrote:

Jan Mercl

unread,
Nov 26, 2014, 3:59:22 AM11/26/14
to golang-nuts
On Wed, Nov 26, 2014 at 9:00 AM, xingtao zhao <zhaox...@gmail.com> wrote:
> use map[[48]byte][48]byte instead?

That's IMHO an interesting solution as it does not involve any
pointers. Strings are structs with a pointer field thus GC must
consider them. But GC is also precise so arrays of non pointers should
be ignored by the GC.

-j

Kevin Gillette

unread,
Nov 26, 2014, 4:03:35 AM11/26/14
to golan...@googlegroups.com, dahan...@gmail.com, jonathan....@gmail.com
Is this something particular to strings, or would a few million separately allocated records in any key-and-value pointer/reference-typed map produce the same behavior? Is it something particular to maps?

If it's not specific to maps or strings, but rather just based on the number of allocated "objects", then slices of slices containing offsets would ultimately have the same problem (also, slices can't be map keys).

Dmitry: I seem to recall some recent mention about the possibility of a generational GC -- if true, would that be of the sort in which very long-lived objects eventually get "moved" to an area/category that gets scanned less frequently?

Dmitry Vyukov

unread,
Nov 26, 2014, 4:30:14 AM11/26/14
to Kevin Gillette, golang-nuts, Henrik Johansson, jonathan.s.oliver42
On Wed, Nov 26, 2014 at 12:03 PM, Kevin Gillette
<extempor...@gmail.com> wrote:
> Is this something particular to strings, or would a few million separately
> allocated records in any key-and-value pointer/reference-typed map produce
> the same behavior? Is it something particular to maps?
>
> If it's not specific to maps or strings, but rather just based on the number
> of allocated "objects", then slices of slices containing offsets would
> ultimately have the same problem (also, slices can't be map keys).
>
> Dmitry: I seem to recall some recent mention about the possibility of a
> generational GC -- if true, would that be of the sort in which very
> long-lived objects eventually get "moved" to an area/category that gets
> scanned less frequently?

Yes, that's what generational GC intended for. But we don't any
particular plans about generational GC for Go (most current production
generational GCs also require heap compaction).

Sugu Sougoumarane

unread,
Nov 26, 2014, 7:10:44 PM11/26/14
to golan...@googlegroups.com, jonathan....@gmail.com
If you want to consolidate string allocations, you can look at StringArena: https://github.com/youtube/vitess/blob/master/go/hack/hack.go. It uses unsafe, which may get deprecated in the future.
Consolidating allocations may not help with GC pauses, because I think the total number of pointers chased will still be the same. But reducing the number of allocations overall should still give you some speed up.

And my customary question: Have you looked at memcache?

Caleb Spare

unread,
Nov 26, 2014, 7:22:39 PM11/26/14
to Sugu Sougoumarane, golang-nuts, jonathan....@gmail.com
I also had a problem with very large maps causing long GC pauses (3+
seconds). Changing the map key/value types to not have any pointers
helped somewhat, but the pauses were still ~250ms. I think the GC must
have to scan internal pointers of map types. Rolling my own hashmap
implementation based only on slices and slice offsets brought pause
times down to <1ms.

-Caleb

jonathan....@gmail.com

unread,
Nov 26, 2014, 7:47:39 PM11/26/14
to golan...@googlegroups.com, sou...@google.com, jonathan....@gmail.com
@Caleb,

I would love to see the hashmap implementation. Where is it found in your Github repos?

Jonathan

Caleb Spare

unread,
Nov 26, 2014, 8:05:29 PM11/26/14
to jonathan....@gmail.com, golang-nuts, Sugu Sougoumarane
I'll link it if you promise not to use it ;D

https://github.com/cespare/kvcache/blob/master/refmap.go

This was good enough for my immediate use case, but it's a very
simplistic hashmap and not suitable for general use.

- It's built around my particular key/value type
- It has hard-coded constants for a particular map size (~1e7)
- It resizes all at once, rather than incrementally, making for a
large pause when that happens
- It hasn't been exhaustively tested and might eat your data

It might be an interesting exercise to implement a general
pointer-less hash table (perhaps with 'go generate' specialization).
Or maybe future map/GC implementations will avoid this issue.

-Caleb

jonathan....@gmail.com

unread,
Nov 28, 2014, 1:17:17 AM11/28/14
to golan...@googlegroups.com, jonathan....@gmail.com, sou...@google.com
I swear I must be working with Go incorrectly because I'm still seeing significant latency on GC. (My exact environment is found below) Here's a barebones Go app to demonstrate the problem: http://play.golang.org/p/q1iGjFUB6v

I have a latency plotting tool performing HTTP requests against the application. I run this plotting tool multiple times over a 1-minute period with an increasing number of requests per second, starting from 1 request per second up to a 128K requests per second.

When I run the application as a barebones HTTP server with no behavior, I easily get 99% of requests finishing in under a millisecond. The remaining 1% finish around 2-3ms.

By creating a map[int]int filled with 10 million items (350+ MB of memory), I have seen massive latency spikes during GC. It's not uncommon to see spikes of 70ms during GC. I have tried varying the GC collection percentage from 1-200 but I don't see that much variance in the GC latency. Turning off GC completely eliminates any GC spikes (of course), but that's not really a solution.

Note that I have also tried creating a simple byte buffer of 1 GB+ of memory and I have been able to limit the GC latency by requiring increased collection percentage. For example, with a 1GB heap, the time between GC invocations increases greatly which means there's more to clean. By increasing the rate (decreasing the time between GC), I was able to see the exact same numbers as a barebones example--about 2ms per request.

Here's the trouble: A lot of advice I'm seeing for reducing GC latency is about reducing pointers, etc., but what I'm measuring shows that a map, regardless of the contents (int, string, etc.), appears to be scanned completely during collection and it is slowing things down significantly. The results seem to indicate that the current GC implementation has a difficult time where instances of large maps are concerned.

Am I completely off base here, or is there some obvious bit of context that I'm missing?

My environment:
Two dedicated boxes running Intel 1270v3 (Haswell) CPUs with 16 GB of RAM.
Ubuntu 14.04 x64 LTS
Trying with Go 1.3.3 and Go 1.4rc1
Dedicated gigabit link between the boxes

On a side note, I thought Dmitry's initial response to my post of creating a single string and then doing string slices from there into a map was a really creative and unique solution to the problem, but the numbers didn't show any kind of improvement. I therefore decided to more fully investigate how maps of integers, booleans, etc. affected latency. The finding (as noted above) shows that the size of the map is what matters most--not the contents in my use case.

Caleb Spare

unread,
Nov 28, 2014, 1:41:01 AM11/28/14
to jonathan....@gmail.com, golang-nuts, Sugu Sougoumarane
Yes, this is what I found too. That's why I replaced big Go maps in my application.

As for the key/value types of the maps not slowing down GC -- try using a map[*X]*Y instead -- you'll see a significant further slowdown :)

-Caleb

Tahir

unread,
Nov 28, 2014, 3:41:19 AM11/28/14
to golan...@googlegroups.com, jonathan....@gmail.com, sou...@google.com
I think it is somewhat expected. It is likely that internally, the hashmap uses pointers to buckets. The more values, the more buckets.
As Sugu said, have you tried Memcache ? 

(It's true that from a high level perspective, the idea of deferring/disabling GC on a portion of memory seems attractive, but it will hopefully be a non issue in a few months, especially in your case where you don't write to the structure (no cache issue with the GC thread))

In the meanwhile though, perhaps storing offheap is a more sensible solution.

Gerard

unread,
Nov 28, 2014, 4:49:20 AM11/28/14
to golan...@googlegroups.com
Stupid idea probably but why not having:

map[first_key_letter rune]map[key string]string

That will reduce the map sizes and is still fast because the first_key_character is a rune (fast integer lookup)

Robert Melton

unread,
Nov 28, 2014, 4:50:30 AM11/28/14
to jonathan.s.oliver42, golang-nuts, sou...@google.com
Did you try any of the non-map solutions recommended?
Robert Melton | http://robertmelton.com

jonathan....@gmail.com

unread,
Nov 28, 2014, 11:24:22 AM11/28/14
to golan...@googlegroups.com, jonathan....@gmail.com, sou...@google.com
This has been an incredibly helpful thread with lots of great ideas. I've got a solution now that solves the problem, but I thought I'd post my findings here:

1. Maps are bad--even if they're storing integers or other non-pointer structs. The implementation appears to have lots of pointers inside which must be evaluated and followed during mark/sweep GC.  Using structures with pointers magnifies the expense.
2. Slices are surprisingly bad (including strings and substrings of existing strings). A slice is a pointer to the backing array with a length and capacity. It would appear that the internal pointer that causes the trouble because GC wants to inspect it.

Using Dmitry's solution (where all strings are concatenated into a single string and then substrings are extracted from this single string and thus all extracted slices point to the same backing buffer) I found that this caused only a single additional heap object to be tracked and it used perhaps 25% less memory. Even so, because each of these new slices is inspected by GC, there were still significant GC pauses.

For the present time, the solution for my particular use case is as follows: A custom hashmap implementation which stores the strings represented as [48]byte.

Only by completely eliminating maps AND slices was I able to resolve the expense of GC pauses.

The silver lining in all of this is that I have high hopes for the Go garbage collector. It would appear that there's lots of room for improvement and I'm excited to see what Go 1.5 (and beyond) does.

Thank you for all of your help!

Caleb Spare

unread,
Dec 30, 2014, 5:38:17 PM12/30/14
to golang-nuts
I made an issue to discuss large GC pause times caused by maps:
https://golang.org/issue/9477

-Caleb

Peter Vessenes

unread,
Jan 24, 2015, 4:38:58 AM1/24/15
to golan...@googlegroups.com
Sorry for the double post, but I roughed out a (untested) lock-free pointer-free simple hash map library today based on the conversation here. Comments and improvements welcome. 

Reply all
Reply to author
Forward
0 new messages