Generation of random sequences

9 views
Skip to first unread message

sebastien

unread,
Aug 25, 2009, 10:16:40 AM8/25/09
to Clojure
What is the most efficient way to generate list of unique random
numbers? The function must look something like this:

(defn make-random-numbers [dim max]
....)

where dim is dimensionality of the vector, and max is the upper bound
of random numbers.
The problem is not to allow duplicates in the sequence. I've found two
options to solve this task:
1) the most straightforward way is to check if sequence already
contains new element on every addition, but it requires O(n!) steps
for a list of n numbers;
2) using clojure sets will make it easy to add unique numbers, but it
will return sorted sequence, so i need to shuffle it by myself, and i
don't see efficient way to do this.

So, is there any other ways to do the task?

Fogus

unread,
Aug 25, 2009, 10:42:37 AM8/25/09
to Clojure
A quick and dirty way would be to use a map as your intermediate
storage with your generated numbers as keys and some constant as their
assoc'd value. Once you've populated said map with the proper number
of entries (keeping track of clashes along the way) then get a
sequence using `(seq (.keySet myMap))`.

-m

John Harrop

unread,
Aug 25, 2009, 10:45:29 AM8/25/09
to clo...@googlegroups.com
Why not just use a HashSet rather than a SortedSet to begin with? 

Fogus

unread,
Aug 25, 2009, 10:51:27 AM8/25/09
to Clojure
> Why not just use a HashSet rather than a SortedSet to begin with?

Yes, that is clearly better.

-m

Christophe Grand

unread,
Aug 25, 2009, 10:57:59 AM8/25/09
to clo...@googlegroups.com
On Tue, Aug 25, 2009 at 4:16 PM, sebastien <sebast...@gmail.com> wrote:

What is the most efficient way to generate list of unique random
numbers? The function must look something like this:

(defn make-random-numbers [dim max]
  ....)

The simplest I can think of is:
(defn make-random-numbers [dim max]
  (take dim (distinct (repeatedly #(rand-int max)))))

Christophe

--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)

sebastien

unread,
Aug 25, 2009, 12:19:34 PM8/25/09
to Clojure

> The simplest I can think of is:
> (defn make-random-numbers [dim max]
>   (take dim (distinct (repeatedly #(rand-int max)))))

But distinct itself gives order of growth O(n!), through it scans
resulting list for every generated number, isn't it?

Christophe Grand

unread,
Aug 25, 2009, 12:26:32 PM8/25/09
to clo...@googlegroups.com
no, distinct uses a hash-set (nearly-constant lookup) so distinct is O(n).

sebastien

unread,
Aug 25, 2009, 12:38:06 PM8/25/09
to Clojure


On Aug 25, 7:26 pm, Christophe Grand <christo...@cgrand.net> wrote:
> no, distinct uses a hash-set (nearly-constant lookup) so distinct is O(n).

Oh, that's great, i will use it.
Thank you.

Jonathan Smith

unread,
Aug 25, 2009, 1:30:04 PM8/25/09
to Clojure
Why not make a list of numbers from 0 to max, randomly shuffle it,
and do a take on the resulting sequence until you have enough numbers.

Jonathan Smith

unread,
Aug 25, 2009, 1:43:47 PM8/25/09
to Clojure
like this:

(defn shuffle-java ;;from Shawn Hoover post on these google groups
"Shuffles coll using a Java ArrayList. Blows shuffle out of the
water on speed and space."
[coll]
(let [l (java.util.ArrayList. coll)]
(java.util.Collections/shuffle l)
(seq l)))

(defn random-number-list [max dim]
(take dim (shuffle-java (range 0 max))))

user> (random-number-list 100 20)
(38 70 73 57 10 81 32 99 92 19 77 39 27 24 47 17 86 8 58 76)

Jonathan Smith

unread,
Aug 25, 2009, 1:56:35 PM8/25/09
to Clojure
I would have worried that if dim and max were close you'd get to the
last number and keep calling #(rand-int) and never hit the last
number.

how does distinct work that it keeps that situation from occuring?

On Aug 25, 12:26 pm, Christophe Grand <christo...@cgrand.net> wrote:
> no, distinct uses a hash-set (nearly-constant lookup) so distinct is O(n).
>

Christophe Grand

unread,
Aug 25, 2009, 2:24:27 PM8/25/09
to clo...@googlegroups.com
On Tue, Aug 25, 2009 at 7:56 PM, Jonathan Smith <jonathan...@gmail.com> wrote:

I would have worried that if dim and max were close you'd get to the
last number and keep calling #(rand-int) and never hit the last
number.

how does distinct work that it keeps that situation from occuring?

(make-random-numbers 10 10) works but (make-random-numbers 11 10) hangs (trying to get 11 unique integers between 0 and 9...).

My solution works better when dim << max and yous works better when dim is close to max.
 
Christophe
Reply all
Reply to author
Forward
0 new messages