DISCUSS: replace (rand)

43 views
Skip to first unread message

Stuart Halloway

unread,
Dec 2, 2008, 12:04:31 AM12/2/08
to clo...@googlegroups.com
Clojure's rand delegates to Java's Math.random(), which I am pretty
sure has a synchronized block in it.

One problem with living on top of Java is calling into methods that
have no (conceptual) need to be synchronized. This could hurt
performance in an app carefully written in Clojure to avoid mutable
state and locking. Since unsynchronized PRNGs exist, I would suggest
we modify rand to use one. (I am willing to take the lead on writing
one in Clojure if needed.)

Thoughts?

Stuart

Paul Barry

unread,
Dec 2, 2008, 1:05:23 AM12/2/08
to Clojure
Looks like the only synchronization is for lazy initialization of the
instance of Random used by the static method:

public final class Math {

private static Random randomNumberGenerator;

private static synchronized void initRNG() {
if (randomNumberGenerator == null)
randomNumberGenerator = new Random();
}

public static double random() {
if (randomNumberGenerator == null) initRNG();
return randomNumberGenerator.nextDouble();
}

}

public class Random implements java.io.Serializable {
public Random() { this(++seedUniquifier + System.nanoTime()); }
private static volatile long seedUniquifier = 8682522807148012L;

public double nextDouble() {
long l = ((long)(next(26)) << 27) + next(27);
return l / (double)(1L << 53);

bOR_

unread,
Dec 2, 2008, 4:42:04 AM12/2/08
to Clojure
I wanted to ask what algorithm Java is using for calculating its
random numbers (having a choice of algorithms would be great - some
applications favor one above the other), but I found a website digging
into exactly this question:

http://www.math.utah.edu/~beebe/java/random/README

"I therefore had to resort to peeking inside the source, found on Sun
Solaris 9 in /usr/j2se/src.zip, member java/util/Random.java. It uses
a 48-bit linear congruential generator recommended by Donald Knuth (a
good sign), but returns only a 32-bit subset of the computed bits.

The default seed is the current time-of-day, returned by
System.currentTimeMillis()."

I'll compare it later on to the one ruby is using (Mersenne Twister
Algorithm, which is excellent for monte carlo-like simulations), just
to get my own bearings.

Stuart Halloway

unread,
Dec 2, 2008, 8:05:07 AM12/2/08
to clo...@googlegroups.com
nextDouble calls next, which according to Java 5 docs is:

synchronized protected int next(int bits) {
seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
return (int)(seed >>> (48 - bits));
}

This is exactly the kind of thing I think we shouldn't have to worry
about in Clojure.

Stuart

Michael Wood

unread,
Dec 2, 2008, 8:21:14 AM12/2/08
to clo...@googlegroups.com
On Tue, Dec 2, 2008 at 3:05 PM, Stuart Halloway
<stuart....@gmail.com> wrote:
>
> nextDouble calls next, which according to Java 5 docs is:
>
> synchronized protected int next(int bits) {
> seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
> return (int)(seed >>> (48 - bits));
> }
>
> This is exactly the kind of thing I think we shouldn't have to worry
> about in Clojure.

The docs for Math.random() say:

; When this method is first called, it creates a single
; new pseudorandom-number generator, exactly as if by
; the expression
;
; new java.util.Random
;
; This new pseudorandom-number generator is used
; thereafter for all calls to this method and is used
; nowhere else.
;
; This method is properly synchronized to allow correct
; use by more than one thread. However, if many threads
; need to generate pseudorandom numbers at a great rate,
; it may reduce contention for each thread to have its
; own pseudorandom-number generator.

So can't you just create multiple instances of java.util.Random if you
need that sort of thing?

--
Michael Wood <esio...@gmail.com>

Stuart Halloway

unread,
Dec 2, 2008, 8:42:16 AM12/2/08
to clo...@googlegroups.com
Even if you use a per-thread instance, you pay the synchronization
penalty. Because there is no contention, the penalty is lower, but it
is still not zero.

Is it big enough to matter? My intuition says "yes". That's worth
nothing, so I will write some tests when I have some spare time ...
but secretly I was hoping that this thread would goad someone else
into writing the tests and publishing their results. :-)

Stuart

Paul Barry

unread,
Dec 2, 2008, 8:57:53 AM12/2/08
to Clojure
Ah, I didn't see the call to next. The java docs do say that is the
implementation for that method, but they are lying:

protected int next(int bits) {
long oldseed, nextseed;
AtomicLong seed = this.seed;
do {
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
} while (!seed.compareAndSet(oldseed, nextseed));
return (int)(nextseed >>> (48 - bits));

Stuart Halloway

unread,
Dec 2, 2008, 9:17:03 AM12/2/08
to clo...@googlegroups.com
Cool, that's much better. Can we establish that all (or all important)
Java 5+ VMs use AtomicLong in next?

Lauri Pesonen

unread,
Dec 2, 2008, 10:25:48 AM12/2/08
to clo...@googlegroups.com
2008/12/2 Stuart Halloway <stuart....@gmail.com>:

>
> Cool, that's much better. Can we establish that all (or all important)
> Java 5+ VMs use AtomicLong in next?

While compare-and-swap is a lot better than a mutex, it's still not
free. IIRC a CAS is one (or maybe two) order(s) of magnitude more
expensive than a normal store. So a non-threadsafe PRNG should still
give you a performance boost compared to the CAS-version. But, as you said,
this is all speculation until someone writes some tests...

--
! Lauri

Paul Barry

unread,
Dec 2, 2008, 10:42:24 AM12/2/08
to Clojure
Since this is just pure java, shouldn't it be the same on all 1.5
JVMs? Would it be different on other JVM implementations? Just to
verify, I checked on my Mac OS X 10.5 and in the Sun JDK 1.5 on
Windows XP, and it does appear to be the same.

Dave Newton

unread,
Dec 2, 2008, 11:40:03 AM12/2/08
to clo...@googlegroups.com
--- On Tue, 12/2/08, Paul Barry wrote:
> Since this is just pure java, shouldn't it be the same on all 1.5
> JVMs? Would it be different on other JVM implementations? Just
> to verify, I checked on my Mac OS X 10.5 and in the Sun JDK
> 1.5 on Windows XP, and it does appear to be the same.

A Sun v. Sun comparison? What about JDK/JVM implementations from different vendors?

Dave

Christian Vest Hansen

unread,
Dec 3, 2008, 9:06:14 AM12/3/08
to clo...@googlegroups.com
For what it is worth, I implemented (for another project) George
Marsaglias XorShift algorithm (as recommended in Java Concurrency in
Practice) in a thread-safe contention free maner, by making the
thread-id part of the source entropy. This way, two threads will
generate different random numbers from the same seed. And when a
thread see a seed written by another thread, then that is just another
source of entropy.

The implementation has two drawbacks compared java.util.Random:
* It's a "medium quality" algorithm, whatever that means, and I
_presume_ that whatever is in java.util.Random is high quality. But I
have not had any problems with lack of randomness by the
implementation.
* If you want to share an instance between threads, then it's _your_
responsibility to guarantee safe publication.

Link: http://github.com/karmazilla/nanopool/tree/master/src/main/java/net/nanopool/CheapRandom.java
--
Venlig hilsen / Kind regards,
Christian Vest Hansen.

Mark H.

unread,
Dec 3, 2008, 11:45:36 AM12/3/08
to Clojure
On Dec 2, 5:42 am, Stuart Halloway <stuart.hallo...@gmail.com> wrote:
> Is it big enough to matter? My intuition says "yes".  That's worth  
> nothing, so I will write some tests when I have some spare time ...  
> but secretly I was hoping that this thread would goad someone else  
> into writing the tests and publishing their results. :-)

I've got a parallel (using Pthreads) Monte Carlo test program in C
that could be easily ported. If nobody wants to do my work for me ;-P
then I'll get to it as soon as I can. But I don't have PRNG quality
metrics -- those are the main things we should port if we're into
writing new parallel PRNGs. btw you probably don't have to implement
your own from scratch -- you could just port something like SPRNG:

http://sprng.fsu.edu/Version2.0/index.html

I've written a little about interfaces for parallel PRNGs here:

http://www.cs.berkeley.edu/~mhoemmen/cs194/Tutorials/prng.pdf

I have the feeling that the "right interface" for a single stream of
pseudorandom numbers is a seq, rather than a "function" that you
call.

mfh

Konrad Hinsen

unread,
Dec 3, 2008, 12:27:53 PM12/3/08
to clo...@googlegroups.com
On Dec 3, 2008, at 17:45, Mark H. wrote:

> I have the feeling that the "right interface" for a single stream of
> pseudorandom numbers is a seq, rather than a "function" that you
> call.

I'd provide two interfaces:

1) Low-level: (rng seed) yielding a pair [random-number, new-seed]
2) High-level: (random seed) yielding an infinite (obviously lazy)
seq of the random numbers obtained starting from the given seed.

What I consider important is the explicit specification of the seed,
which means that the code always says explicitly where its random
number sequence starts from.


I just happen to have those two functions in the example section of
my monad implementation, as rng works nicely in the state monad:

(defn rng [seed]
(let [m 259200
value (/ (float seed) (float m))
next (rem (+ 54773 (* 7141 seed)) m)]
(list value next)))

(defn random [seed]
(let [[value next] (rng seed)]
(lazy-cons value (random next))))

This particular implementation of rng is probably not a good one
(it's the first algorithm I found), but my point here is the
interface, not the algorithm.

Konrad.

Mark H.

unread,
Dec 3, 2008, 1:55:32 PM12/3/08
to Clojure
On Dec 3, 9:27 am, Konrad Hinsen <konrad.hin...@laposte.net> wrote:
> I'd provide two interfaces:
>
> 1) Low-level: (rng seed) yielding a pair [random-number, new-seed]
> 2) High-level: (random seed) yielding an infinite (obviously lazy)  
> seq of the random numbers obtained starting from the given seed.
>
> What I consider important is the explicit specification of the seed,  
> which means that the code always says explicitly where its random  
> number sequence starts from.

Some PRNGs make a distinction between "seed" and "state." For
example, the Mersenne Twister uses one integer as a seed, but an array
of integers as its state, which it modifies each time a new
pseudorandom number is requested. Of course the "seed" in the sense
of your interface could be that "state" (with the PRNG rewritten to
copy the state before updating it), since the state suffices to
continue the PRNG stream from any particular point. The MT uses a
rather large amount of state but you lose all the advantages of seqs
and immutability if you don't copy it.

Some parallel PRNGs have two different seeds: one to guarantee a
unique noncorrelated stream for each processor, and another to seed
the stream. This isn't much different than your interface, though.

btw (not to you, to another post I recall in this thread) seeding a
PRNG using the thread id isn't enough to guarantee that different
threads don't have correlated PRNG streams.

mfh

don.aman

unread,
Dec 4, 2008, 3:07:56 AM12/4/08
to Clojure

Since we're being all high-level, it'd be good for a random function
which allows us to specify the range of numbers, since % doesn't
promise an even spread of probabilities (especially for large ranges).

Kyle Schaffrick

unread,
Dec 4, 2008, 9:28:07 PM12/4/08
to clo...@googlegroups.com

If one plans to get very involved in the business of noise, I would
recommend exploring Boost's random library for C++. It provides an
array of entropy sources (MT, lagged fib, nondeterministic sources,
etc) and statistical distributions (uniform, normal, lognormal, and a
BUNCH more), and the user composes one of each to produce a generator.

I don't know how well the API would translate into a functional
programming style, but such a separation turns out to be fairly elegant.
Of course, it's probably also far too specialized to belong in the core!

As much flak as C++ gets, I feel like Boost goes a *long* way towards
making C++ productive and enjoyable for certain kinds of problems. There
are a lot of smart folks behind Boost and I personally wouldn't hesitate
to steal good ideas from them :)

-Kyle

bOR_

unread,
Dec 5, 2008, 9:56:57 AM12/5/08
to Clojure
Not sure if I understand the second part (% an even part), but I
realized that
a random function for a specific range of numbers can be written using
the
lazy example in this thread. (I earlier got stuck trying to use (cycle
(rand-int 10))
for that.

(defn lazy-rand-vect
"a lazy cycle of random elements taken from a vector."
[vector]
(lazy-cons (nth vector (rand-int (count vector))) (lazy-rand-vect
vector)))


example:
(def amino_acids 21)
(def virus_length 1000)

(defn make-virus
"Creates a virus of length virus_length"
[]
(apply str (take virus_length (lazy-rand-vect (apply vector (map #
(char (+ 65 %)) (range 10))))))
; old way also worked, but lazy-rand-vect I can use for a few other
things as
well.:
;(apply str (take virus_length (map #(char (+ (rand-int amino_acids)
65 %)) (cycle (list 0))))))

Mark H.

unread,
Dec 5, 2008, 5:32:04 PM12/5/08
to Clojure
Sure it does, as long as you get enough random bits that scaling the
output of the number to an arbitrary range doesn't lose more random
bits than it promises to give. It's not the PRNG's responsibility to
do that, it's the responsibility of the function doing the scaling.
Furthermore, if you're in floating-point arithmetic, then scaling by
powers of two doesn't lose any bits as long as the numbers don't
overflow or underflow. You'll lose a bit or two from going from the
powers of two to the range you want, but you can always get a few
extra random bits to account for that if you're concerned.

mfh

Robert Feldt

unread,
Dec 6, 2008, 8:58:13 PM12/6/08
to Clojure
On Dec 2, 10:42 am, bOR_ <boris.sch...@gmail.com> wrote:
> I'll compare it later on to the one ruby is using (Mersenne Twister
> Algorithm, which is excellent for monte carlo-like simulations), just
> to get my own bearings.
>

An alternative to MT that is as good, according to RNG expert George
Marsaglia, but simpler and faster is Marsaglias multiply-with-carry
(MWC). There are of course plenty of MT implementations in Java out
there that you could reuse but if you need a pure clojure PRNG the
simplest MWC1 below might fit the bill.

I have not tested this extensively but I implemented his simplest MWC
PRNG (interface is not so nice maybe but I was playing with
multimethods and STMs ;)), FWIW:

(defstruct *mwc-rng* :rng-type :seed1 :seed2 :z :w :constz :constw)

(defn make-rng-mwc1 [seed1 seed2]
(struct *mwc-rng* :marsaglia-mwc1 seed1 seed2
(bit-and seed1 0xffffffff) (bit-and seed2 0xffffffff) 36969
18000))

(defmulti update :rng-type)
(defmulti randint32 :rng-type)

(defn mwc1 [c v]
(+ (* c (bit-and v 0xffff)) (bit-shift-right v 16)))

(defmethod update :marsaglia-mwc1
[rng]
(merge rng {:z (mwc1 (:constz rng) (:z rng))
:w (mwc1 (:constw rng) (:w rng))}))

(defmethod randint32 :marsaglia-mwc1
[rng]
(+ (bit-shift-left (bit-and (:z rng) 0xffff) 16)
(bit-and (:w rng) 0xffff)))

(defstruct *autoupdating-rng* :rng-type :rng-ref)

(defn make-rng-autoupdating [corerng]
(struct *autoupdating-rng* :autoupdating (ref corerng)))

(defn update-then-apply
[autorng fn]
(let [refrng (:rng-ref autorng)]
(dosync
(alter refrng #(update %))
(fn @refrng))))

(defmethod randint32 :autoupdating [rng]
(update-then-apply rng #(randint32 %)))

Easy to turn into a lazy stream version if preferred and type hints
could speed it up...

For details check: http://www.ms.uky.edu/~mai/RandomNumber

Cheers,

Robert Feldt

Mark H.

unread,
Dec 7, 2008, 2:17:12 PM12/7/08
to Clojure
btw this suggests an interface analogous to Java's separation of
"streams" and "readers" for I/O. In this case, a "stream" would be a
source of uniformly distributed bits, and a "reader" would be the
particular range (addressing your concern above, which could be a
problem for bignum ranges if people don't know to take enough random
bits) and distribution (uniform, gaussian, etc.). Both Intel's MKL
and AMD's ACML use this style of interface in their PRNG libraries:

http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm
http://developer.amd.com/cpu/libraries/acml/Pages/default.aspx

A reader may take more or less elements from the stream than it
produces in this case: for example, Knuth's "reader" producing
Gaussian pseudorandom numbers alternates between taking two uniformly
distributed numbers and taking none. Thus the reader should tell
users its expected period, which could be different than the period of
the underlying stream.

Fun fact: in 2007, use of a linear congruential PRNG with an unlucky
choice of input size caused the LINPACK benchmark to crash
(unnecessarily) on a huge supercomputer after wasting 20 hours of
compute time:

http://www.netlib.org/lapack/lawnspdf/lawn206.pdf

(There's a nice analysis in the paper.)

mfh

Mark H.

unread,
Dec 14, 2008, 2:19:58 AM12/14/08
to Clojure
So I'm going to stop pretending like I'm an expert and actually post
some Clojure code. Be constructively critical 'cause I'm a n00b in
that regard ;-) This is a pseudorandom number generator for the
Gaussian (0,1) distribution.

(defn next-gaussrand-state [current-
state]
^{:doc "Given the current state (current-state) of the Gaussian-
(0,1)
distribution pseudorandom number generator, return the next state.
This
operation is nondestructive as long as current-state's underlying
uniform
pseudorandom number generator is nondestructive. next-gaussrand-state
is
a low-level routine out of which one can build a seq of
Gaussian-
distribution pseudorandom numbers. See notes in the text
below."}
(if (= (:phase current-state)
0)
(loop [state current-
state]
(let [prng (:uniform-prng-seq
state)
U1 (first
prng)
U2 (frest
prng)
prng-rrest (rrest
prng)
V1 (- (* 2.0 U1)
1.0)
V2 (- (* 2.0 U2)
1.0)
S (+ (* V1 V1) (* V2
V2))
X (* V1 (. Math sqrt (/ (* -2.0 (. Math log S))
S)))]
(if (or (>= S 1) (= S
0))
;; Reject the out-of-range sample. Recall that U1 and U2
have
;; been shifted from (0,1] to (-1, 1]. It's entirely
possible
;; for S to be > 1 -- no more than 2 in exact arithmetic,
and
I
;; suspect V1 and V2 are computed exactly for the extreme
case
;; of U1=1 and
U2=1.
(recur {:uniform-prng-seq prng-rrest :X X :V1 V1 :V2 V2 :S
S :phase (:phase
state)})
{:uniform-prng-seq prng-rrest :X X :V1 V1 :V2 V2 :S S :phase
(- 1 (:phase state))})))
(let [V1 (:V1 current-
state)
V2 (:V2 current-
state)
S (:S current-
state)]
{:uniform-prng-seq (:uniform-prng-seq current-
state)
:X (* V2 (. Math sqrt (/ (* -2.0 (. Math log S))
S)))
:V1 V1 :V2 V2 :S S :phase (- 1.0 (:phase current-
state))})))

defn first-gaussrand-state [uniform-prng-
seq]
^{:doc "Given a seq of uniformly distributed pseudorandom numbers on
(0,1),
return the initial state of a seq of Gaussian-(0,1) pseudorandom
numbers.
"}
(next-gaussrand-state {:X 0.0 :V1 0.0 :V2 0.0 :S 0.0 :phase
0 :uniform-prng-seq uniform-prng-
seq}))

(defn gaussrand-value [gaussrand-
state]
(:X gaussrand-
state))

(defn new-gaussrand-seq [uniform-prng-
seq]
^{:doc "Return a seq of Gaussian-(0,1) (mean 1 and standard
deviation
0)
pseudorandom numbers, given a seq of uniformly distributed (on
(0,1))
pseudorandom floating-point
numbers."}
(map gaussrand-value (iterate next-gaussrand-state (first-gaussrand-
state uniform-prng-
seq))))

(defn test-gaussrand
[n]
^{:doc "Simple test for new-gaussrand-seq. Should return a small
floating-
point
number that gets smaller as n (a positive integer) gets
bigger."}
(let [PRNG (new
java.util.Random)
;; Somewhat broken because java.util.Random is not stateless,
and
;; the nextDouble() method changes its state. However,
this
;; particular java.util.Random object is not shared outside of
this
;; routine, so we can guarantee that uniform-prng-seq is a
true
seq
;; in the context of this
method.
uniform-prng-seq (iterate (fn [_] (. PRNG nextDouble)) (. PRNG
nextDouble))
gaussrand-seq (new-gaussrand-seq uniform-prng-
seq)]
(/ (reduce + 0 (take n gaussrand-seq)) n)))

Notes on next-gaussrand-state:

This implementation was ported almost naively from the C FAQ
online:

http://c-faq.com/lib/gaussian.html

Knuth discusses this method in his classic tome ("The Art of Computer
Programming, Volume 2: Seminumerical Algorithms", section 3.4.1,
subsection C, algorithm P), in which he cites the following:

G. Marsaglia and T. A. Bray, "A convenient method for generating
normal variables," SIAM Review, volume 6, issue 3, pp. 260-4, July
1964.

After porting the C FAQ code, I discovered that the same method is
also used by java.util.Random.nextDouble(), at least as of Java
version 1.4.2: see

http://java.sun.com/j2se/1.4.2/docs/api/java/util/Random.html#nextGaussian()

There it is called "the polar method of G. E. P. Box, M. E. Muller,
and G. Marsaglia." (Where is this G. E. P. Box in the list of
authors in the SIAM Review article? I don't own a copy of Knuth so
I'll have to dig a little to find out.) Note also that Marsaglia and
others have written further works on the subject, even in the last few
years, so the problem is by no means dead. I haven't looked more
closely at this algorithm so I include it here merely as a useful
exercise for myself and not because I intend others to go and copy it
without questioning it. Anyway, the C FAQ makes a mistake multiple
times in the aforementioned article: it tries to get uniformly
distributed pseudorandom doubles on (0,1) by scaling the output of rand
(). rand() only gives you 32 (or 31?) bits of entropy but you need 53
bits for a double-precision floating-point number. That's why Java
calls nextDouble() (which according to Sun's Java documentation does
the Right Thing). So take note if you roll your own PRNG not to do
that ;-)

mfh

Randall R Schulz

unread,
Dec 14, 2008, 9:09:43 AM12/14/08
to clo...@googlegroups.com
On Saturday 13 December 2008 23:19, Mark H. wrote:
> So I'm going to stop pretending like I'm an expert and actually post
> some Clojure code. Be constructively critical 'cause I'm a n00b in
> that regard ;-) This is a pseudorandom number generator for the
> Gaussian (0,1) distribution.
>
> ... [ lots of unreadable stuff ] ...

Why don't you send it as an attachment?


> ...
>
> mfh


Randall Schulz

Konrad Hinsen

unread,
Dec 14, 2008, 12:40:13 PM12/14/08
to clo...@googlegroups.com
On 14.12.2008, at 08:19, Mark H. wrote:

> So I'm going to stop pretending like I'm an expert and actually post
> some Clojure code. Be constructively critical 'cause I'm a n00b in
> that regard ;-) This is a pseudorandom number generator for the
> Gaussian (0,1) distribution.

...

Isn't that the standard Box-Muller transform?

Anyway, I tried to rewrite your code into a function that takes an
input stream of uniformly distributed random numbers and transforms
it into a stream of Gaussian random numbers. The advantage is that
there is no need to keep track explicitly of the state of the
calculation. In the following, I construct my input stream by
repeatedly calling Clojure's (rand), but you could easily substitute
any other source, including java.Util.Random.

Konrad.

(defn rng-uniform
"Return an infinite lazy sequence of random numbers"
[]
(drop 1 (iterate (fn [_] (rand)) nil)))

(defn transform-to-gaussian
"Transform a sequence of uniform random number in the interval [0, 1)
into a sequence of Gaussian random numbers."
[uniform-seq]
(let [[U1 U2 & uniform-rest] uniform-seq


V1 (- (* 2.0 U1) 1.0)
V2 (- (* 2.0 U2) 1.0)
S (+ (* V1 V1) (* V2 V2))

LS (. Math sqrt (/ (* -2.0 (. Math log S)) S))
X1 (* V1 LS)
X2 (* V2 LS)]


(if (or (>= S 1) (= S 0))

(recur uniform-rest)
(lazy-cons X1 (lazy-cons X2 (transform-to-gaussian uniform-
rest))))))

; Get 10 Gaussian random numbers
(take 10 (rng-gaussian))

Mark H.

unread,
Dec 14, 2008, 8:11:32 PM12/14/08
to Clojure
Haha, wow, it is completely unreadable ;-)

Konrad's solution is way more elegant than mine so I'll omit attaching
the whole thing. The test might be useful though.

mfh

Mark H.

unread,
Dec 14, 2008, 8:24:31 PM12/14/08
to Clojure
On Dec 14, 9:40 am, Konrad Hinsen <konrad.hin...@laposte.net> wrote:
> Isn't that the standard Box-Muller transform?

Ah, yes, now I know what to call it ;-) Yes, it's a particular
implementation of the Box-Muller transform that is designed to avoid
trig functions.

http://en.wikipedia.org/wiki/Marsaglia_polar_method

> (defn transform-to-gaussian
>    "Transform a sequence of uniform random number in the interval [0, 1)
>     into a sequence of Gaussian random numbers."
>    [uniform-seq]
>    (let [[U1 U2 & uniform-rest] uniform-seq

Yes, much better ;-)

>        (lazy-cons X1 (lazy-cons X2 (transform-to-gaussian uniform-
> rest))))))

Also much better -- the structs were ugly. Thanks for the nice
revision!

Do you think type hints on the double-floats could give some
performance benefit? I'm curious where the type hints would go but
too lazy to try the "add them everywhere and remove one by one until
performance decreases" approach.

mfh

Konrad Hinsen

unread,
Dec 14, 2008, 4:23:44 PM12/14/08
to clo...@googlegroups.com
On 14.12.2008, at 18:40, Konrad Hinsen wrote:

> (defn rng-uniform
> "Return an infinite lazy sequence of random numbers"
> []
> (drop 1 (iterate (fn [_] (rand)) nil)))
>
> (defn transform-to-gaussian
> "Transform a sequence of uniform random number in the interval
> [0, 1)
> into a sequence of Gaussian random numbers."
> [uniform-seq]
> (let [[U1 U2 & uniform-rest] uniform-seq
> V1 (- (* 2.0 U1) 1.0)
> V2 (- (* 2.0 U2) 1.0)
> S (+ (* V1 V1) (* V2 V2))
> LS (. Math sqrt (/ (* -2.0 (. Math log S)) S))
> X1 (* V1 LS)
> X2 (* V2 LS)]
> (if (or (>= S 1) (= S 0))
> (recur uniform-rest)
> (lazy-cons X1 (lazy-cons X2 (transform-to-gaussian uniform-
> rest))))))

I forgot to copy one part:

(defn rng-gaussian
[]
(transform-to-gaussian (rng-uniform)))

Konrad Hinsen

unread,
Dec 15, 2008, 3:38:16 AM12/15/08
to clo...@googlegroups.com
On 15.12.2008, at 02:24, Mark H. wrote:

>> (lazy-cons X1 (lazy-cons X2 (transform-to-gaussian uniform-
>> rest))))))
>
> Also much better -- the structs were ugly. Thanks for the nice
> revision!

In fact, this is a good test-case for the interface design for random-
number generators. If you want to be able to implement rejection-
based ones (which are numerous), there is a lot to gain from an
interface in terms of lazy sequences. The Box-Muller transform is
even more difficult because it transforms pairs of numbers. With any
other interface that would require some bookkeeping for managing the
additional state (first/second member of the pair), whereas a
sequence-based interface makes it nearly trivial.

Long live laziness! ;-)

> Do you think type hints on the double-floats could give some
> performance benefit? I'm curious where the type hints would go but
> too lazy to try the "add them everywhere and remove one by one until
> performance decreases" approach.

I have no experience at all with type hints. For example, I wonder if
Clojure already uses the calls to Math.log etc. to determine that
certain numbers must be floats. It would be nice to have tools that
show what type information the compiler already knows.

Konrad.

Reply all
Reply to author
Forward
0 new messages