I recently integrated Clojure with two async messaging systems.
I wound up doing "send" operations through a Clojure agent.
I was curious how many agents I could spawn per second and found I
could spawn about 20K agents / second.
> I recently integrated Clojure with two async messaging systems.
> I wound up doing "send" operations through a Clojure agent.
> I was curious how many agents I could spawn per second and found I
> could spawn about 20K agents / second.
> (defn agent-speed-test []
>   (time (loop [b 1]
>           (send-off t tt b)
>           (if (> b 20000)
>             b
>             (recur (inc b))))))
> After a few iterations in the REPL (started with -server) my best time
> was:
> "Elapsed time: 1021.700789 msecs", which is pretty much exactly 20K/s.
> CPU: 2.4GHz Core2 Duo.
> Java 1.6.0_14-ea-b03
> Clojure from git/master around Sept/2009.
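For anyone who wants to rerun the test above: the post does not show how `t` and `tt` were defined, so the sketch below fills them in with hypothetical stand-ins (an agent holding a number and an action fn that adds to it). Note that `time` only wraps the loop that enqueues the actions, so the figure it prints is a dispatch rate.

```clojure
;; Hypothetical stand-ins for the `t` and `tt` used in the post;
;; the original definitions are not shown there.
(def t (agent 0))

(defn tt
  "Action fn: receives the agent's current state and the sent value."
  [state b]
  (+ state b))

(defn agent-speed-test
  "Sends n actions to the agent `t` and times only the dispatch loop."
  [n]
  (time
    (dotimes [i n]
      (send-off t tt (inc i)))))

(agent-speed-test 20000)
(await t)   ; block until every queued action has run
@t          ; sum of 1..20000
```

When run as a script rather than at the REPL, a final `(shutdown-agents)` is needed so the agent thread pools do not keep the JVM alive.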
On Mon, Oct 5, 2009 at 11:51 AM, MarkSwanson <mark.swanson...@gmail.com> wrote:
> On Oct 5, 2:45 am, ngocdaothanh <ngocdaoth...@gmail.com> wrote:
> > I think it is not "spawn about 20K agents / second", it is 20K message
> > passings / second. The number is about that of Erlang.
> As Clojure uses a thread pool for agents I agree 'spawn' was the wrong
> word. Thanks for the correction.
Some confusion here may also come from the subject line. It mentions send-off, which can actually spawn an unlimited number of new threads. Regular send does not; it uses a fixed thread pool instead.
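As a rough illustration of that distinction, here is a minimal sketch. The agents `counter` and `log` and the sleep standing in for blocking I/O are made up for the example.

```clojure
;; Hypothetical agents for illustration only.
(def counter (agent 0))
(def log (agent []))

;; Short, non-blocking action: `send` runs it on the fixed-size pool.
(send counter + 10)

;; Potentially blocking action (simulated here with a sleep): `send-off`
;; runs it on the growable pool, so it cannot tie up the fixed one.
(send-off log (fn [entries]
                (Thread/sleep 50)
                (conj entries :written)))

;; Wait for both agents to finish their queued actions.
(await counter log)
[@counter @log]
```

The usual rule of thumb is `send` for CPU-bound actions and `send-off` for anything that may block.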
Thanks John.
I was curious about the details so I took a dive into the source to
see for myself.
In case anyone else stumbles upon this here's what I found:
In Agent.java, the number of worker threads for (send) is defined
like this:
final public static ExecutorService pooledExecutor =
    Executors.newFixedThreadPool(2 + Runtime.getRuntime().availableProcessors());
Clojure's (send) calls the Java Agent's dispatch(), which winds up using
the pooledExecutor.
Clojure (send-off) follows the same path but winds up using the
soloExecutor - which can spawn (and temporarily cache) an unlimited
number of threads as required:
Random thought: Let's test (send) vs (send-off). New results using
(send):
(the -server jvm produces some wild results for a bit then I get
something crazy: 7.9ms):
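To see the difference between the two kinds of pool outside of agents, here is a small sketch that builds them directly through Java interop. It assumes that a cached thread pool from java.util.concurrent behaves like the soloExecutor described above. The helper `largest-pool-size` and the task count of 100 are made up for the example.

```clojure
(import '(java.util.concurrent Executors ThreadPoolExecutor CountDownLatch))

(defn largest-pool-size
  "Submits n tasks that all block until released, then returns the
  largest number of threads the pool ever held for them."
  [^ThreadPoolExecutor pool n]
  (let [release (CountDownLatch. 1)]
    (dotimes [_ n]
      (.execute pool (fn [] (.await release))))
    (let [size (.getLargestPoolSize pool)]
      (.countDown release)   ; let the blocked tasks finish
      (.shutdown pool)       ; queued tasks still run, then threads exit
      size)))

;; Same sizing rule as the pooledExecutor quoted above.
(def fixed-size (+ 2 (.availableProcessors (Runtime/getRuntime))))

;; Fixed pool: never more than fixed-size threads; extra tasks queue up.
(def fixed-result
  (largest-pool-size (Executors/newFixedThreadPool fixed-size) 100))

;; Cached pool: one thread per task while none is idle.
(def cached-result
  (largest-pool-size (Executors/newCachedThreadPool) 100))

[fixed-result cached-result]
```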
Since you are sending all actions to the same agent they are enqueued and
processed sequentially. That's why you are seeing only two active cores (one
for the main thread (repl) or GC and one for the agent).
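A minimal sketch of that point, with made-up agent names: actions sent to one agent are applied one at a time in the order they were sent, while separate agents each have their own queue and can be worked on in parallel.

```clojure
;; One agent: its actions are queued and applied strictly in order.
(def single (agent []))
(dotimes [i 5]
  (send single conj i))
(await single)
@single   ; => [0 1 2 3 4]

;; Four agents: each has its own queue, so the pool can run them
;; on different threads at the same time.
(def workers (vec (repeatedly 4 #(agent []))))
(dotimes [i 8]
  (send (nth workers (mod i 4)) conj i))
(apply await workers)
(mapv deref workers)   ; => [[0 4] [1 5] [2 6] [3 7]]
```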
On Tue, Oct 6, 2009 at 4:35 PM, MarkSwanson <mark.swanson...@gmail.com> wrote:
> Thanks John.
> I was curious about the details so I took a dive in to the source to
> see for myself.
> In case anyone else stumbles upon this here's what I found:
> In Agent.java, the number of worker threads for (send) is defined
> like this:
> final public static ExecutorService pooledExecutor =
>     Executors.newFixedThreadPool(2 + Runtime.getRuntime().availableProcessors());
> Clojure's (send) calls the Java Agent's dispatch(), which winds up using
> the pooledExecutor.
> Clojure (send-off) follows the same path but winds up using the
> soloExecutor - which can spawn (and temporarily cache) an unlimited
> number of threads as required:
> Random thought: Let's test (send) vs (send-off). New results using
> (send):
> (the -server jvm produces some wild results for a bit then I get
> something crazy: 7.9ms):
I did some more tests with 4 queues and found that there seemed to be
some contention going on that prevented all cores from being utilized
fully.
With the example provided 2 cores will pin at 100% (excellent). With 4
atoms and one test fn, one core stays around 20% and no other core
exceeds 70%.
When I used 4 separate worker fns along with 4 separate atoms things
improved and all 4 cores pinned at around 60%. It would be interesting
to know why having 4 separate fns makes a difference (STM?).
Wrt 60%: I simply think this was an artifact of my test. If I had
spawned another thread to feed another atom I'm sure I could have
easily pegged all CPUs to 100%.
Your worker fn executes too quickly: more time is spent managing queues
(which requires some synchronization) than executing actions.
If your worker fn were more realistic (more computationally heavy) I think
you'd see 4 cores humming at 100%.
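Something along these lines would be a heavier test. This is only a sketch: `busy-work`, the four agents, and the task sizes are made up, and how close the cores get to 100% will depend on the machine.

```clojure
(defn busy-work
  "Hypothetical CPU-bound action: adds the sum of 0..n-1 to the state."
  [state n]
  (+ state (reduce + (range n))))

;; Four independent agents, so four queues that can run in parallel.
(def agents (vec (repeatedly 4 #(agent 0))))

(time
  (do
    ;; 100 actions per agent, each doing a non-trivial amount of work.
    (doseq [a agents
            _ (range 100)]
      (send a busy-work 100000))
    ;; Include completion in the timing, not just dispatch.
    (apply await agents)))

(mapv deref agents)
```

Raising the second argument to `busy-work` shifts the balance further from queue management toward actual computation.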
(You wrote "atom" several times but I guess you meant "agent".)
On Tue, Oct 6, 2009 at 7:48 PM, MarkSwanson <mark.swanson...@gmail.com> wrote:
> I did some more tests with 4 queues and found that there seemed to be
> some contention going on that prevented all cores from being utilized
> fully.
> With the example provided 2 cores will pin at 100% (excellent). With 4
> atoms and one test fn, one core stays around 20% and no other core
> exceeds 70%.
> When I used 4 separate worker fns along with 4 separate atoms things
> improved and all 4 cores pinned at around 60%. It would be interesting
> to know why having 4 separate fns makes a difference (STM?).
> Wrt 60%: I simply think this was an artifact of my test. If I had
> spawned another thread to feed another atom I'm sure I could have
> easily pegged all CPUs to 100%.