Random key from a map

6,946 views
Skip to first unread message

Alex Bligh

unread,
Mar 13, 2016, 7:03:15 PM3/13/16
to golang-nuts, Alex Bligh
Let us suppose I have a map, and wish to get a random entry from it (actually in this case I want to delete an entry at random).

Is there a way to do that that is O(1) or at least better than O(n)?

Deleting the first entry (and relying on golang's map randomisation) is tempting, but sadly won't work because in any given run of golang, the keys hash in the same way, and I need the entry to be deleted at random (with respect to the keys).

--
Alex Bligh




Sasha Sobol

unread,
Mar 13, 2016, 7:22:16 PM3/13/16
to golang-nuts, al...@alex.org.uk

Grabbing the first entry works.

Dominik Honnef

unread,
Mar 13, 2016, 7:29:57 PM3/13/16
to golang-nuts
On Sun, Mar 13, 2016 at 04:22:16PM -0700, Sasha Sobol wrote:
>
> Grabbing the first entry works.
> https://play.golang.org/p/M4SnqXpWX0

Random iteration order over maps is an implementation detail, it's not
guaranteed. A Go implementation can have predictable iteration order
and still be spec compliant. Also, it's not _good_ randomness, it's
just random enough to break poorly written tests. It's not good enough
for dice rolls or anything more demanding.

--
Dominik Honnef

Dave Cheney

unread,
Mar 13, 2016, 7:44:32 PM3/13/16
to golang-nuts
Iterating over mapa returns values in an undefined order. There is no guarantee that the order of keys will be random, just as there is no guarantee that order will not be random.

adon...@google.com

unread,
Mar 13, 2016, 8:58:09 PM3/13/16
to golang-nuts, al...@alex.org.uk
It can be done in amortized constant time if you construct a slice of the keys and then pick slice indices at random.  If you build the slice at the same time as the map, the overhead will be even lower.

Alex Bligh

unread,
Mar 14, 2016, 3:57:01 AM3/14/16
to Sasha Sobol, Alex Bligh, golang-nuts

On 13 Mar 2016, at 23:22, Sasha Sobol <sa...@asobol.com> wrote:

>
> Grabbing the first entry works.
> https://play.golang.org/p/M4SnqXpWX0

Grabbing the first entry does not work (as I said in the message) as if my map contains keys "p", "q" and "r", grabbing the first entry will always return the same value (either "p", "q", or "r") during any given run. That is not random.

--
Alex Bligh




Alex Bligh

unread,
Mar 14, 2016, 4:01:23 AM3/14/16
to adon...@google.com, Alex Bligh, golang-nuts

On 14 Mar 2016, at 00:57, adon...@google.com wrote:

> It can be done in amortized constant time if you construct a slice of the keys and then pick slice indices at random. If you build the slice at the same time as the map, the overhead will be even lower.

Let's assume the map is long living and changes constantly. Generating a slice of keys from the map is O(n) at the time of picking. Keeping a slice of keys separately to the map is going to make (e.g.) deletes from the map far worse than current performance. Unless I'm missing something.

--
Alex Bligh




Caleb Spare

unread,
Mar 14, 2016, 4:19:29 AM3/14/16
to Alex Bligh, Alan Donovan, golang-nuts
As long as you keep the index as part of the map value, you can keep
the slice updated by doing O(1) slice operations along with each map
insert, delete, etc. For example, for a delete you can swap in the
last key in the slice into the index to be deleted and decrement the
size of the slice by 1.
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Tamás Gulácsi

unread,
Mar 14, 2016, 4:32:14 AM3/14/16
to golang-nuts
How random has the delete must be?
If your map changes a lot, remove the nth element (with for range), where n=Rand(0,len(map)/2) or some lower or fix limit.

Caleb Spare

unread,
Mar 14, 2016, 4:35:02 AM3/14/16
to Alex Bligh, Alan Donovan, golang-nuts
Here's a quick example of what I mean: http://play.golang.org/p/El1_Ntfqo6

I didn't spend much time testing this so there may be bugs lurking.

-Caleb

Alex Bligh

unread,
Mar 14, 2016, 5:06:19 AM3/14/16
to Tamás Gulácsi, Alex Bligh, golang-nuts

On 14 Mar 2016, at 08:32, Tamás Gulácsi <tgula...@gmail.com> wrote:

> How random has the delete must be?
> If your map changes a lot, remove the nth element (with for range), where n=Rand(0,len(map)/2) or some lower or fix limit.

Finding the nth element is O(n), which is what I was trying to avoid. I wondered whether there was a way to (ab)use the underlying storage, e.g. find a bucket at random, and proceed a random length along the bucket chain.

--
Alex Bligh




Alex Bligh

unread,
Mar 14, 2016, 5:06:33 AM3/14/16
to Caleb Spare, Alex Bligh, Alan Donovan, golang-nuts

On 14 Mar 2016, at 08:34, Caleb Spare <ces...@gmail.com> wrote:

> Here's a quick example of what I mean: http://play.golang.org/p/El1_Ntfqo6
>
> I didn't spend much time testing this so there may be bugs lurking.
>
> -Caleb
>
> On Mon, Mar 14, 2016 at 1:18 AM, Caleb Spare <ces...@gmail.com> wrote:
>> As long as you keep the index as part of the map value, you can keep
>> the slice updated by doing O(1) slice operations along with each map
>> insert, delete, etc. For example, for a delete you can swap in the
>> last key in the slice into the index to be deleted and decrement the
>> size of the slice by 1.

Thanks. That looks promising.

--
Alex Bligh




Konstantin Shaposhnikov

unread,
Mar 14, 2016, 6:08:23 AM3/14/16
to golang-nuts
You can also keep slice of the map keys shuffled. When adding a new key to the map add this key to the end of the slice and swap it with a random element in the slice (including just added). Then you can always use the last key of the slice as a random key.

Both add and delete are O(1).

Dave Cheney

unread,
Mar 14, 2016, 6:09:44 AM3/14/16
to golang-nuts, al...@alex.org.uk
Maybe you don't want a map, but a slice instead, then choosing a random element is simple. What other kinds of access to you do to the "map" besides addition ?

Alex Bligh

unread,
Mar 14, 2016, 6:44:14 AM3/14/16
to Dave Cheney, Alex Bligh, golang-nuts

On 14 Mar 2016, at 10:09, Dave Cheney <da...@cheney.net> wrote:

> Maybe you don't want a map, but a slice instead, then choosing a random element is simple. What other kinds of access to you do to the "map" besides addition ?

The map is frequently indexed by its key, for insert, delete, and get, i.e. all the normal map operations. What I wanted is something just like a map, with the additional property of being able to pick a random element. It's in essence a cache of a fixed size that evicts at random rather by LRU, but that's by the by. I don't need it to be cryptographically random, but I do need it to evict different elements each time called on the same run if the cache is set up the same way (so using the 'first' element in the hash won't work). So I'd really like a put() call that returned the item it evicted if any.

I realise I can do this with another data structure (the slice containing the data with a map between key and index is quite neat, except I think it means I need to keep the key twice so I know the key of the data I'm evicting). I could also use (e.g.) an LLRB tree and descend a random path.

I was really wondering whether there is some way to look into the back end implementation of the map, and (e.g.) pick a random bucket (by choosing a random hash value) and walk the chain a random distance (I'm assuming here it's bucket plus chain) which if the hash algorithm is good should produce a (roughly) random result.

If I'm being honest, in practical terms I'm quite happy with the sledgehammer method of using another data structure (with all the nasty mess of casts that involves if I want it applicable to multiple types of map), but I was wondering if there was some elegant way of *just* using a map.

--
Alex Bligh




Peter Weinberger

unread,
Mar 14, 2016, 7:07:52 AM3/14/16
to Alex Bligh, Dave Cheney, golang-nuts
"but I was wondering if there was some elegant way of *just* using a map."
Not that anyone has thought of. The implementation of map could change at any time, as long as it is consistent with the documentation and stability guarantee. (It did change once: the order of a range on small maps used to be deterministic across runs.)

In Java, where the hash code for keys is under the program's control, they changed the implementation to avoid hash collision attacks. I think the implementation now includes some trees too.


--
Alex Bligh




roger peppe

unread,
Mar 14, 2016, 9:14:09 AM3/14/16
to Alex Bligh, Dave Cheney, golang-nuts
On 14 March 2016 at 10:44, Alex Bligh <al...@alex.org.uk> wrote:
>
> On 14 Mar 2016, at 10:09, Dave Cheney <da...@cheney.net> wrote:
>
>> Maybe you don't want a map, but a slice instead, then choosing a random element is simple. What other kinds of access to you do to the "map" besides addition ?
>
> The map is frequently indexed by its key, for insert, delete, and get, i.e. all the normal map operations. What I wanted is something just like a map, with the additional property of being able to pick a random element. It's in essence a cache of a fixed size that evicts at random rather by LRU, but that's by the by. I don't need it to be cryptographically random, but I do need it to evict different elements each time called on the same run if the cache is set up the same way (so using the 'first' element in the hash won't work).

It actually will work in Go as it currently is, and I don't see that changing
any time soon.

for key := range someMap {
delete(someMap, key)
break
}

It's certainly not cryptographically random, and it's perhaps not even *that*
random, but it may well be good enough for your purposes.

But the "keep an extra slice" code is quite straightforwardly implemented too:

http://play.golang.org/p/hiRpc-bJK_

You might want to add code to shrink the slice if the map shrinks too much
so that you don't get left with garbage from a spike in usage.

> So I'd really like a put() call that returned the item it evicted if any.
>
> I realise I can do this with another data structure (the slice containing the data with a map between key and index is quite neat, except I think it means I need to keep the key twice so I know the key of the data I'm evicting). I could also use (e.g.) an LLRB tree and descend a random path.
>
> I was really wondering whether there is some way to look into the back end implementation of the map, and (e.g.) pick a random bucket (by choosing a random hash value) and walk the chain a random distance (I'm assuming here it's bucket plus chain) which if the hash algorithm is good should produce a (roughly) random result.
>
> If I'm being honest, in practical terms I'm quite happy with the sledgehammer method of using another data structure (with all the nasty mess of casts that involves if I want it applicable to multiple types of map), but I was wondering if there was some elegant way of *just* using a map.
>
> --
> Alex Bligh
>
>
>
>

Alex Bligh

unread,
Mar 14, 2016, 7:01:32 PM3/14/16
to roger peppe, Alex Bligh, Dave Cheney, golang-nuts

On 14 Mar 2016, at 13:13, roger peppe <rogp...@gmail.com> wrote:

> It actually will work in Go as it currently is, and I don't see that changing
> any time soon.
>
> for key := range someMap {
> delete(someMap, key)
> break
> }

So, per
https://play.golang.org/p/hJpbvsgUHj
contrary to what I thought, this does actually work. I had thought the hash function was randomized at runtime, so running the same go program many times would result in a different iteration order of the same hash, but within the same run the iteration order of the same hash would always be the same. It seems I am wrong and you (and various others who pointed this out) are right. However, this relies on what I believe is undocumented behaviour outside golang's 'promise' of back compatibility.

Also note that it isn't very random. On 100 runs through sort | uniq -c

12 Map is map[a:aa b:bb]
11 Map is map[a:aa c:cc]
65 Map is map[b:bb c:cc]
3 Map is map[c:cc a:aa]
9 Map is map[c:cc b:bb]

or putting that in English:

Deleted a: 74%
Deleted b: 14%
Deleted c: 12%

It isn't improved with 1,000 runs.

--
Alex Bligh




Matt Harden

unread,
Mar 16, 2016, 12:55:21 PM3/16/16
to Alex Bligh, roger peppe, Dave Cheney, golang-nuts
I'm curious whether this gets any better with a larger map. Iterating a small but random number of times before deleting would still be O(1) and might improve the randomness of the selection significantly. I agree though that the promise doesn't apply here. A map implementation could always return the keys in the same order and still be valid.

Reply all
Reply to author
Forward
0 new messages