concurrency and rand-int

62 views
Skip to first unread message

Lee Spector

unread,
Mar 25, 2010, 10:35:14 PM3/25/10
to clo...@googlegroups.com

I'm trying to track down the reason that I sometimes see a lot of concurrency in my system (up to 1200% CPU utilization on a dual quadcore mac that also has some kind of hyperthreading, allegedly allowing a maximum of 1600% CPU) while other times it gets stuck at around 100-200%. My system (a genetic programming system) has a *lot* of randomness in it, so it's hard to repeat runs and get a firm handle on what's going on.

But after a bunch of testing I'm beginning to suspect that it might be the random number generator itself (clojure-core/rand-int in this case, which calls (. Math (random))). This seems at least somewhat plausible to me because I guess that the underlying Java random method must be accessing and updating a random number generator state, and so this would be a concurrency bottleneck. So if I'm in a condition in which lots of concurrent threads are all calling rand-int a lot then all of the accesses to the state have to be serialized and my concurrency suffers (a lot).

Does this sound plausible to you? If so, is there a straightforward way to avoid it? It is not important to me that the random numbers being generated in different threads be generated from the same generator or coordinated/seeded in any way. I just need lots of numbers that are "random enough." I guess I could roll my own random number generator(s) and either have a lot of them with independent states or maybe even make them stateless (always generating numbers by scrambling the clock?). But I would hope there would be something simpler.

Thanks,

-Lee

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359
lspe...@hampshire.edu, http://hampshire.edu/lspector/
Phone: 413-559-5352, Fax: 413-559-5438

Check out Genetic Programming and Evolvable Machines:
http://www.springer.com/10710 - http://gpemjournal.blogspot.com/

Per Vognsen

unread,
Mar 25, 2010, 10:50:08 PM3/25/10
to clo...@googlegroups.com
Clojure calls out to Java's java.lang.Math.Random:

"This method is properly synchronized to allow correct use by more
than one thread. However, if many threads need to generate
pseudorandom numbers at a great rate, it may reduce contention for
each thread to have its own pseudorandom-number generator."

Look at java.util.Random for local RNGs. In the long term, it might
make sense for core's rand to refer to a *random-number-generator* var
that defaults to a shared thread-safe RNG but can be rebound per
thread in cases like yours to reduce contention. In the short term,
you can just write your own wrapper functions.

-Per

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>
> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.
>

Chas Emerick

unread,
Mar 25, 2010, 10:55:19 PM3/25/10
to clo...@googlegroups.com
Hi Lee,

Indeed -- from the docs for Math.random():

"This method is properly synchronized to allow correct use by more
than one thread. However, if many threads need to generate
pseudorandom numbers at a great rate, it may reduce contention for
each thread to have its own pseudorandom-number generator."

At its root, java.util.Random uses an AtomicLong to store the last
dispensed pseduorandom number, so that is the fundamental point of
contention (all of your threads are blocking on a CAS on a single atom
in the Math class' Random instance).

You can certainly have a Random instance per-thread -- when you set up
each thread of execution (in a send to an agent, at the start of the
fn that you're pmap'ing across a dataset, whatever), just bind a new
java.util.Random to a var, and have all your code pull random numbers
from there.

- Chas

Anniepoo

unread,
Mar 25, 2010, 11:05:18 PM3/25/10
to Clojure
As for the problems debugging code that calls nextInt, call setSeed
with an arbitrary number at startup.
After setting setSeed the sequence of ints returned will be the same
every run.
You may have trouble if you don't have any synchronization between
threads.
It could still have race conditions for the nth nextInt call.
But otherwise, run it and see which of two conditions you have. Record
the number you seeded with,
change the number and run repeatedly until you get the other
condition. You now have a way to switch between
the two conditions at will.

Andrzej

unread,
Mar 26, 2010, 12:30:18 AM3/26/10
to clo...@googlegroups.com
As others have pointed out using per-thread java.util.Random objects
is probably the best way to go in this particular case. However, I'm
curious if the following code could give any speed gain on your
machine:

(defn rand-seq [] (repeatedly #(. Math (random))))

(def rand-seq-ref (ref (rand-seq)))
(nth @rand-seq-ref 100) ;; pre-cache random values; evaluate it every some time
;;btw, how to do it automatically?

(defn next-rand-val []
(dosync (commute rand-seq-ref next) (first @rand-seq-ref)))

user=> (next-random-val)
0.5558267606843464
user=> (next-random-val)
0.32353157456467474

Cheers,
Andrzej

Chas Emerick

unread,
Mar 26, 2010, 12:43:04 AM3/26/10
to clo...@googlegroups.com
I was going to suggest something similar using seque in an atom, but
in neither case (using an atom or a ref) is the contention going to be
minimized -- just shifted from the AtomicLong in java.util.Random to
the now-app-level atom or ref.

- Chas

mac

unread,
Mar 26, 2010, 3:30:38 AM3/26/10
to Clojure
There is a fast Java version of Mersenne Twister here, if you feel
like compiling a java file:
http://cs.gmu.edu/~sean/research/

> >> lspec...@hampshire.edu,http://hampshire.edu/lspector/


> >> Phone: 413-559-5352, Fax: 413-559-5438
>
> >> Check out Genetic Programming and Evolvable Machines:

> >>http://www.springer.com/10710-http://gpemjournal.blogspot.com/

Lee Spector

unread,
Mar 26, 2010, 8:14:38 AM3/26/10
to clo...@googlegroups.com

Thanks all for the quick and helpful responses on the issues with rand-int/rand and concurrency.

I probably have some other issues that are also gumming up some of my concurrency, but following the advice on of creating per-thread java.util.Random objects *seems* to have helped, although it's a little hard to tell and I'm not sure I did it right.

What I did was:

(def thread-local-random-state (new java.util.Random)) ;; just an initial global binding

(defn lrand-int
"Return a random integer using the thread-local random state."
[n]
(if (< n 1)
0
(. thread-local-random-state (nextInt n))))

(defn lrand
"Return a random float between 0 and 1 usng the thread-local random state."
[]
(. thread-local-random-state (nextFloat)))

And then I wrap the bodies of the the functions that I pass to send (which is the only way I launch threads) with:

(binding [thread-local-random-state (new java.util.Random)]
<everything else goes here, with calls that make calls that eventually call lrand-int and lrand>
)

Is that the right way to do it?

Thanks,

-Lee

> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359

lspe...@hampshire.edu, http://hampshire.edu/lspector/

Chas Emerick

unread,
Mar 26, 2010, 8:37:21 AM3/26/10
to clo...@googlegroups.com
Just as long as your binding form is within the function body, you're
good to go. e.g., you need to do this (or its equivalent):

(send some-agent #(binding [random-state (Random.)] (other-fns %&)))

*not* this:

(binding [random-state (Random.)]
(send some-agent other-fns))

The latter will cause the sent fn to be invoked outside the scope of
the binding.

I wrote up a little guide on some of the wrinkles of binding some
months ago:

http://muckandbrass.com/web/display/~cemerick/2009/11/03/Be+mindful+of
+Clojure%27s+binding

On a very minor note, these forms are more syntactically idiomatic:

(java.util.Random.)
(.nextFloat random-state)
(.nextInt random-state n)

Cheers,

- Chas

> Lee Spector, Professor of Computer Science
> School of Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359
> lspe...@hampshire.edu, http://hampshire.edu/lspector/
> Phone: 413-559-5352, Fax: 413-559-5438
>
> Check out Genetic Programming and Evolvable Machines:
> http://www.springer.com/10710 - http://gpemjournal.blogspot.com/
>

Per Vognsen

unread,
Mar 25, 2010, 11:05:25 PM3/25/10
to clo...@googlegroups.com
On Fri, Mar 26, 2010 at 9:55 AM, Chas Emerick <ceme...@snowtide.com> wrote:
> Hi Lee,
>
> Indeed -- from the docs for Math.random():
>
> "This method is properly synchronized to allow correct use by more than one
> thread. However, if many threads need to generate pseudorandom numbers at a
> great rate, it may reduce contention for each thread to have its own
> pseudorandom-number generator."
>
> At its root, java.util.Random uses an AtomicLong to store the last dispensed
> pseduorandom number

Nitpick: The state of a production-quality PRNG is never the last
number it generated. That would mean you could initialize your own
instance of the algorithm with the last number you saw it output and
immediately duplicate all its future output.

-Per

Lee Spector

unread,
Mar 26, 2010, 4:51:39 PM3/26/10
to clo...@googlegroups.com

Interesting blog post, Chas, which led me to doall some of my lazy sequences (not 100% sure I caught them all) and in the process I also found another source of contention that I removed (over a Date object).

But I'm still having contention somewhere that I can't find.

Is there a tool that will help me to see what is being contended for, and/or who is contending for it? Anything would be helpful.

Also, is it possible that there is contention over the reading, by many different threats (from 10000 agents), of the same parts of many (10000+) nested immutable lists with lots of shared structure? I know this immutable forrest is *safe* for reading by many threads, but I don't know if there's a performance implication. I've tried to make copies of list structures just to explore this, but it's not obvious how to do that in clojure and I'm not sure I've been doing it right.

Thanks,

-Lee

>>> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.


>>
>> --
>> Lee Spector, Professor of Computer Science
>> School of Cognitive Science, Hampshire College
>> 893 West Street, Amherst, MA 01002-3359
>> lspe...@hampshire.edu, http://hampshire.edu/lspector/
>> Phone: 413-559-5352, Fax: 413-559-5438
>>
>> Check out Genetic Programming and Evolvable Machines:
>> http://www.springer.com/10710 - http://gpemjournal.blogspot.com/
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clo...@googlegroups.com
>> Note that posts from new members are moderated - please be patient with your first post.
>> To unsubscribe from this group, send email to
>> clojure+u...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>>

>> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.


>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.

Lee Spector

unread,
Mar 26, 2010, 4:53:36 PM3/26/10
to clo...@googlegroups.com

mac: Thanks for that -- it's a pointer to one of my own former students, who I should be bugging about this stuff :-)

-Lee

On Mar 26, 2010, at 3:30 AM, mac wrote:

> To unsubscribe from this group, send email to clojure+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.

--
Lee Spector, Professor of Computer Science
School of Cognitive Science, Hampshire College
893 West Street, Amherst, MA 01002-3359

lspe...@hampshire.edu, http://hampshire.edu/lspector/

Scott

unread,
Mar 26, 2010, 7:44:29 PM3/26/10
to Clojure
does java.security.SecureRandom have the same problem?

I have been using it for evolutionary algorithms, but have noticed
bottlenecks (likely mostly because my fitness evaluations are
computationally expensive).

You can implement it as such:

(import '(java.security SecureRandom))

(defn srand
"Returns a random floating point number between 0 (inclusive) and n
(default 1) (exclusive)."
([]
(let [sr (SecureRandom/getInstance "SHA1PRNG")]
(.nextDouble sr)
)

)
([n] (* n (srand))))

(defn srand-int
"Returns a random integer between 0 (inclusive) and n (exclusive)."
[n] (int (srand n)))


then just replace rand-int with srand-int

Scott


On Mar 25, 10:35 pm, Lee Spector <lspec...@hampshire.edu> wrote:
> I'm trying to track down the reason that I sometimes see a lot of concurrency in my system (up to 1200% CPU utilization on a dual quadcore mac that also has some kind of hyperthreading, allegedly allowing a maximum of 1600% CPU) while other times it gets stuck at around 100-200%. My system (a genetic programming system) has a *lot* of randomness in it, so it's hard to repeat runs and get a firm handle on what's going on.
>
> But after a bunch of testing I'm beginning to suspect that it might be the random number generator itself (clojure-core/rand-int in this case, which calls (. Math (random))). This seems at least somewhat plausible to me because I guess that the underlying Java random method must be accessing and updating a random number generator state, and so this would be a concurrency bottleneck. So if I'm in a condition in which lots of concurrent threads are all calling rand-int a lot then all of the accesses to the state have to be serialized and my concurrency suffers (a lot).
>
> Does this sound plausible to you? If so, is there a straightforward way to avoid it? It is not important to me that the random numbers being generated in different threads be generated from the same generator or coordinated/seeded in any way. I just need lots of numbers that are "random enough." I guess I could roll my own random number generator(s) and either have a lot of them with independent states or maybe even make them stateless (always generating numbers by scrambling the clock?). But I would hope there would be something simpler.
>
> Thanks,
>
>  -Lee
>
> --
> Lee Spector, Professor of Computer Science
> School of Cognitive Science, Hampshire College
> 893 West Street, Amherst, MA 01002-3359

> lspec...@hampshire.edu,http://hampshire.edu/lspector/

Reply all
Reply to author
Forward
0 new messages