Re: [go-nuts] Get random key in map

6,668 views
Skip to first unread message

Michael Jones

unread,
Mar 7, 2013, 11:28:01 AM3/7/13
to Julian Orth, golan...@googlegroups.com
mapname[keyvalue]

On Thu, Mar 7, 2013 at 5:56 AM, Julian Orth <ju....@gmail.com> wrote:
Hi,

What's the best way to get a random key out of a map? range is effectively randomized in Go 1 so I can range and break after the first result. But that's not in the specification and I don't know if it is efficient.

Julian

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Michael T. Jones | Chief Technology Advocate  | m...@google.com |  +1 650-335-5765

Julian Orth

unread,
Mar 7, 2013, 11:30:46 AM3/7/13
to golan...@googlegroups.com, Julian Orth
I don't understand. I don't have a keyvalue. I'm looking for one.

minux

unread,
Mar 7, 2013, 11:31:39 AM3/7/13
to Julian Orth, golan...@googlegroups.com
On Thu, Mar 7, 2013 at 9:56 PM, Julian Orth <ju....@gmail.com> wrote:
> What's the best way to get a random key out of a map? range is effectively
> randomized in Go 1 so I can range and break after the first result. But
> that's not in the specification and I don't know if it is efficient.
if you care about potential overheads with range, you need to generate
a random key yourself (which means you need to manage key values
outside of the map or use alternative data structures).

range over map do have one-time overheads.

chris dollin

unread,
Mar 7, 2013, 11:33:21 AM3/7/13
to Michael Jones, Julian Orth, golan...@googlegroups.com
On 7 March 2013 16:28, Michael Jones <m...@google.com> wrote:
> mapname[keyvalue]

That doesn't find a random key present in the
map, which is what I believe the OP is asking
for.

If you're going to ask for one, you're probably
going to ask for more, in which case, I'd extract
all the keys into an array and shuffle it. But
that may be overkill.
Chris "allusive" Dollin

Robert Carlsen

unread,
Mar 7, 2013, 11:34:02 AM3/7/13
to golan...@googlegroups.com, Julian Orth
var key keytype
for key, _ = range mapname {
    break
}

Although it isn't guaranteed to be random - it is in the current implementation.

Kevin Gillette

unread,
Mar 7, 2013, 12:01:23 PM3/7/13
to golan...@googlegroups.com, Julian Orth
The reason it's random is so that you cannot count on the ordering. Therefore, it would be wrong to also count on the "randomness". Ideally, it would be random, but not too random, so that people will not use map iteration as a source of randomness.

Patrick Mylund Nielsen

unread,
Mar 7, 2013, 12:09:38 PM3/7/13
to Kevin Gillette, golang-nuts, Julian Orth
Julian, is your map relatively static? If so, it might be much easier to reason about selecting a random int and using it as an index into a []string that simply has all the keys in the map.


--

Julian Orth

unread,
Mar 7, 2013, 12:35:52 PM3/7/13
to golan...@googlegroups.com, Kevin Gillette, Julian Orth
Thanks for the input so far.

I'm trying to implement a cache based on a randomized marking algorithm. Code: http://pastebin.com/aNbXNzBc
The relevant code is in line 60.

I'm using interface{}, but in practice I would create new types for different key/value types. If range is not random enough, I will probably use a random number instead of breaking after the first iteration.

But maybe I shouldn't use map at all?

Julian Orth

unread,
Mar 7, 2013, 12:36:48 PM3/7/13
to golan...@googlegroups.com, Kevin Gillette, Julian Orth
Sorry, the relevant code is in line 63, not 60.

Hotei

unread,
Mar 11, 2013, 9:36:28 AM3/11/13
to golan...@googlegroups.com
Assumption: You cant just take the first key from a range as guaranteed random.

Solution: Build an array of all the keys in keys[].  Get the length of keys[].  Pick a random int between 0 and length-1.  YourKey = keys[randomint]

Perhaps there's a better way?

Nate Finch

unread,
Mar 11, 2013, 2:43:28 PM3/11/13
to golan...@googlegroups.com
This only iterates over enough of the map to get to the correct value:


That's better than copying all the keys out into a separate data structure.

(note if you want the random value to change, you'll have to add random spaces to the text each time you run it, since the playground caches values of already-run programs)

Patrick Mylund Nielsen

unread,
Mar 11, 2013, 2:46:02 PM3/11/13
to Nate Finch, golang-nuts
"Better" is fairly subjective. You're doing a lot of copying there. A separate slice of keys would be better if the contents don't often change, and much more easy to reason about: you know what PRG is used, and you don't depend on iteration order randomization from the implementation.


--

Nate Finch

unread,
Mar 11, 2013, 3:13:23 PM3/11/13
to golan...@googlegroups.com
It depends a lot on how it's used, obviously. I was assuming that you didn't have a "local" copy of the map, but instead were passed the map (as in my GetRand method). Yes, if you are holding on to the map and this happens a lot, you should weigh the performance of the iteration vs. the extra memory of the list of keys.  If you're passed the map, this is at least better than copying all the keys (unless the keys are really small and the values are really big).

Julian Orth

unread,
Mar 11, 2013, 7:01:26 PM3/11/13
to golan...@googlegroups.com
Hello again.

I took some of the ideas in this thread, implemented them, and created some benchmarks. First let's look at the definitions:


In my case, KeyType is int and ValueType is bool. 

In MarkedCache1 I check if the cache reached the full size, and if unmared == 0, then I iterate over all keys and set the marked field to false. In either case I then use range to find the first element which is unmarked and delete it. Then the new key is added.
MarkedCache3 is implemented in the obvious way. I use range here too.
MarkedCache4 uses separately stored keys to get random unmarked elements via a random number generator.
MarkedCache7 goes further and creates an efficient bijection between natural numbers and the stored keys. This allows me to store the cached values in a slice. This means that I don't have to use the keys when I search for unmarked elements.

Here are the Set functions:


Here is the benchmark: http://pastebin.com/cYPitSrH

I tested with
sizeMul = 2, setMul = 3
sizeMul = 2, setMul = 30
sizeMul = 2, setMul = 300
sizeMul = 20, setMul = 300

The results:

sizeMul = 2, setMul = 3
Benchmark1 1000000      3225 ns/op
Benchmark3 1000000      3455 ns/op
Benchmark4 1000000      3408 ns/op
Benchmark7 1000000      2943 ns/op

sizeMul = 2, setMul = 30
Benchmark1   10000    245712 ns/op
Benchmark3   20000       82162 ns/op
Benchmark4   10000    450070 ns/op
Benchmark7   50000    124547 ns/op

sizeMul = 2, setMul = 300
Benchmark1      5000   4326246 ns/op
Benchmark3   10000   1038187 ns/op
Benchmark4      5000   2070510 ns/op
Benchmark7   10000      613224 ns/op

sizeMul = 20, setMul = 300
Benchmark1      2000   3383479 ns/op
Benchmark3   10000   1382566 ns/op
Benchmark4      2000   7889167 ns/op
Benchmark7   10000   1523249 ns/op


PS: The code I posted some days ago is broken.

Rémy Oudompheng

unread,
Mar 11, 2013, 7:16:51 PM3/11/13
to Julian Orth, golan...@googlegroups.com
On 2013/3/12 Julian Orth <ju....@gmail.com> wrote:
> In MarkedCache1 I check if the cache reached the full size, and if unmared
> == 0, then I iterate over all keys and set the marked field to false. In
> either case I then use range to find the first element which is unmarked and
> delete it. Then the new key is added.
> MarkedCache3 is implemented in the obvious way. I use range here too.
> MarkedCache4 uses separately stored keys to get random unmarked elements via
> a random number generator.
> MarkedCache7 goes further and creates an efficient bijection between natural
> numbers and the stored keys. This allows me to store the cached values in a
> slice. This means that I don't have to use the keys when I search for
> unmarked elements.
>
> Here are the Set functions:
>
> MC1: http://pastebin.com/uAsjQXZB
> MC3: http://pastebin.com/eCHpbi1Z

Breaking after the first iteration key is a bad idea. Don't do that.
It's not only badly random, it's just so unrandom you want to get away
from that.
I have tried and failed, and it had very, very nasty effects on production.

Running the below program a few times (go tip on linux/amd64) I get this:
map[2:130 3:99100 1:102 4:128 5:540]
map[4:99103 5:132 1:503 2:135 3:127]
map[1:256 2:129 3:111 4:99378 5:126]

I am pretty sure it was discussed already but can't find the discussion.

// PROGRAM BEGINS HERE
package main

import "fmt"

type T map[string]int

func main() {
m := T{"1": 1, "2": 1, "3": 1, "4": 1, "5": 1}
count := make(T)
for i := 0; i < 100000; i++ {
for k := range m {
count[k]++
break
}
}
fmt.Printf("%+v", count)
}
Reply all
Reply to author
Forward
0 new messages