Millions of map keys


Christian von Kietzell

May 3, 2016, 7:13:39 AM
to golang-nuts
Hi Go nuts,

would it be a terribly bad idea to have a map with millions of keys? In my case those keys would be short to medium-length strings (anything from 1 to about 20 characters). Values would be slices of int64.

If that could provoke the GC to show less-than-optimal behaviour what are some possible alternatives?


Cheers,
Chris

Matthew Zimmerman

May 3, 2016, 7:31:57 AM
to Christian von Kietzell, golang-nuts

Millions should be fine in my experience; however, you cannot use a slice of anything as a map key, so you'll have to convert it to something else first.



Val

May 3, 2016, 7:59:30 AM
to golang-nuts, cub...@gmail.com
@Matthew the keys are strings, which is already fine. The values are slices.
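For illustration, the shape of map being discussed might look like the sketch below. The type name `index`, the method `add`, and the sample keys are made up for this example; the only facts taken from the thread are that keys are strings and values are slices of int64. It also shows the conversion Matthew alludes to: a byte slice cannot be a key, but its string conversion can.

```go
package main

import "fmt"

// index maps a short string key to a list of int64 values.
// (Hypothetical name, used only for this sketch.)
type index map[string][]int64

// add appends v to the values stored under key.
// Appending to a missing (nil) slice works, so no existence check is needed.
func (ix index) add(key string, v int64) {
	ix[key] = append(ix[key], v)
}

func main() {
	ix := make(index)
	ix.add("alpha", 42)
	ix.add("alpha", 43)

	// A []byte is not a valid map key type; converting it to a string is.
	raw := []byte("beta")
	ix.add(string(raw), 7)

	fmt.Println(len(ix), ix["alpha"], ix["beta"]) // prints: 2 [42 43] [7]
}
```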

Nick Craig-Wood

May 3, 2016, 9:11:39 AM
to Christian von Kietzell, golang-nuts
My understanding is that big maps with no pointers are fine.

However, a string contains an implicit pointer, so I suspect that might
trigger the suboptimal GC behaviour.


--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
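If the pointers in string keys and slice values did turn out to matter, one possible pointer-free layout is sketched below. It is an illustration, not something proposed in the thread: it assumes keys never exceed 24 bytes (the original post only says "about 20"), that each key is written once, and that the total number of values fits in a uint32. All names (`fixedKey`, `span`, `store`, `put`, `get`) are hypothetical. Keys are copied into a fixed-size array and values live in one shared slice, so neither the map's key type nor its value type contains a pointer.

```go
package main

import "fmt"

// maxKeyLen is an assumed upper bound on key length.
const maxKeyLen = 24

// fixedKey holds a short string inline, so it contains no pointers.
// The length is stored so that "a" and "a\x00" remain distinct keys.
type fixedKey struct {
	n    uint8
	data [maxKeyLen]byte
}

// span locates a run of values inside the shared backing slice.
type span struct {
	off, n uint32
}

// store keeps all values in one slice and indexes them by fixedKey.
type store struct {
	idx  map[fixedKey]span
	vals []int64
}

func newStore() *store {
	return &store{idx: make(map[fixedKey]span)}
}

// makeKey converts s to a fixedKey; it fails if s is too long.
func makeKey(s string) (fixedKey, bool) {
	var k fixedKey
	if len(s) > maxKeyLen {
		return k, false
	}
	k.n = uint8(len(s))
	copy(k.data[:], s)
	return k, true
}

// put stores vs under key. Each key is expected to be written once;
// writing it again would leave the old values unreachable in vals.
func (st *store) put(key string, vs []int64) bool {
	k, ok := makeKey(key)
	if !ok {
		return false
	}
	st.idx[k] = span{off: uint32(len(st.vals)), n: uint32(len(vs))}
	st.vals = append(st.vals, vs...)
	return true
}

// get returns the values stored under key as a view into the shared slice.
func (st *store) get(key string) ([]int64, bool) {
	k, ok := makeKey(key)
	if !ok {
		return nil, false
	}
	sp, ok := st.idx[k]
	if !ok {
		return nil, false
	}
	end := sp.off + sp.n
	// The three-index slice caps capacity so a caller's append cannot
	// overwrite the values of the next key.
	return st.vals[sp.off:end:end], true
}

func main() {
	st := newStore()
	st.put("alpha", []int64{1, 2, 3})
	st.put("beta", []int64{4})

	vs, ok := st.get("alpha")
	fmt.Println(vs, ok) // prints: [1 2 3] true
}
```

The trade-off is that values cannot grow in place after they are written, so this only suits data that is built once and then read. Whether it is worth the extra complexity is something to measure rather than assume.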

Egon

May 3, 2016, 10:06:11 AM
to golang-nuts
It's hard to suggest really good optimizations based on an abstract problem. Even the characters in the keys and the values in the slices may suggest a better data structure that is not possible in the generic case.

In other words, can you show a sample of the data or a program that generates a representative dataset (so that the character, length, and value distributions are realistic)? Then also include how the dataset is accessed, how often it is iterated, modified, etc. (see https://github.com/golang/go/wiki/HowToAsk).

As a general suggestion, do the stupid thing first, measure -> if it's good enough, proceed with more important problems :)

+ Egon
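A crude way to "do the stupid thing first, measure" is sketched below: build a plain `map[string][]int64` with a couple of million entries and time a forced collection. The key format, the entry count, and the function names `build` and `gcTime` are assumptions made for this example, and the wall time of `runtime.GC()` is only a rough proxy for what the collector costs in a real program. Running with `GODEBUG=gctrace=1` gives a more detailed picture.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"time"
)

// build creates a map with n string keys, each holding a one-element []int64.
// The "key-<i>" format is arbitrary; real data will have a different distribution.
func build(n int) map[string][]int64 {
	m := make(map[string][]int64, n)
	for i := 0; i < n; i++ {
		m["key-"+strconv.Itoa(i)] = []int64{int64(i)}
	}
	return m
}

// gcTime reports how long one forced, blocking collection takes.
func gcTime() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

func main() {
	m := build(2000000)
	fmt.Println("forced GC took:", gcTime())
	// Using m after the collection keeps the map live while it is measured.
	fmt.Println("entries:", len(m))
}
```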

Alex Bligh

May 3, 2016, 10:12:52 AM
to Christian von Kietzell, Alex Bligh, golang-nuts

On 3 May 2016, at 12:13, Christian von Kietzell <cub...@gmail.com> wrote:

> would it be a terribly bad idea to have a map with millions of keys? In my case those keys would be short to medium-length strings (anything from 1 to about 20 characters). Values would be slices of int64.
>
> If that could provoke the GC to show less-than-optimal behaviour what are some possible alternatives?

Do the keys change frequently? Do the values change frequently? Certainly if the map is static or nearly static there will be no issue.

--
Alex Bligh




Daniel Skinner

May 3, 2016, 11:17:12 AM
to Alex Bligh, Christian von Kietzell, golang-nuts
> As a general suggestion, do the stupid thing first

That can't be stressed enough. If I had a nickel for every time the naive/simple approach worked best ...

halmstad....@gmail.com

May 3, 2016, 2:39:24 PM
to golang-nuts
If you're interested in big, possibly memory-efficient maps, this might be worth a look:
https://github.com/fncodr/godbase/tree/master/maps

They offer more control than built-in maps at least, which is what you often crave when working with big datasets...

Peace

Brad Fitzpatrick

May 3, 2016, 4:35:33 PM
to Christian von Kietzell, golang-nuts
That's fine. I have millions of things in maps.

Use Go 1.6 for best performance.



Dan Kortschak

May 3, 2016, 9:51:42 PM
to Christian von Kietzell, golang-nuts
I'm doing a genomics analysis at the moment that creates maps with
more than 600 million map entries. It works fine.