Pulling threads...

12 views
Skip to first unread message

Brian Ericson

unread,
Oct 29, 2010, 12:58:47 AM10/29/10
to clo...@googlegroups.com
Please forgive the lengthy post...

New to Clojure and trying to get a friend excited about it by demonstrating how Clojure's functional/STM heritage makes it easy to write concurrent/parallel code, but I can't get Clojure to

I started with Java, where 256 threads are vying to increment a single counter 10,000 times each:
public class Concurrent {
   static final int NUM_THREADS = 256;
   static final int NUM_CALLS = 10000;

   volatile boolean started = false;

   volatile int counter = 0;
   synchronized void incrementCounter() { counter += 1; }

   public static void main(final String[] args) throws Exception {
      final Concurrent c = new Concurrent();
      final Thread[] threads = new Thread[NUM_THREADS];
      for (int i = 0; i < NUM_THREADS; i++) {
         threads[i] = new Thread(new Runnable() {
            public void run() {
               while (!c.started);
               for (int i = 0; i < NUM_CALLS; i++) c.incrementCounter();
            }
         });
         threads[i].start();
      }

      final long start = System.currentTimeMillis();
      c.started = true;
      for (int i = 0; i < NUM_THREADS; i++) threads[i].join();
      System.out.println("counter = "+c.counter+" and took "+(System.currentTimeMillis() - start)+" ms.");
      System.exit(0);
   }
}

You may wonder why such a high thread/increment count.  I needed numbers this substantial because Java and my W510 are simply that efficient at handling it:
$ time /opt/jdk1.6/bin/java Concurrent
counter = 2560000 and took 481 ms.

real    0m12.782s
user    1m36.286s
sys    0m0.251s

(Most of the "real" time is spent creating and starting the threads.  In fact, the purpose of "started" is because it takes less time to count 10,000 times than to create a new thread, so I wanted to force all 256 threads to "hold" until I could "start" them all simultaneously.)

This is a trivial example, but illustrates the practice of combining volatile fields with synchronized methods...  I wish that looked "harder", because I'm trying to sell Clojure here!  In any case, the following is the first Clojure program I've ever written to "demonstrate" how easily STM tackles this:
(def NUM_THREADS 256)
(def NUM_CALLS 10000)

(def started (ref false))

(def counter (ref 0))
(defn increment-counter [] (dosync (alter counter inc)))

(def threads
  (map #(Thread. %)
        (repeat NUM_THREADS
              #(do
                   (while (not @started))
                   (dotimes [_ NUM_CALLS] (increment-counter))))))
(map #(.start %) threads)

(let [start (System/currentTimeMillis)]
  (dosync (ref-set started true))
  (map #(.join %) threads)
  (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start))))

The only problem is that this DOESN'T WORK and I don't know why (I'll save a humorous tangent about the effects of forgetting the "@" in "(while (not @started))").  This is best illustrated by running it:
$ time /opt/jdk1.6/bin/java -jar ~/dev/lang/clojure-1.2.0/clojure.jar java_like.clj
counter = 0 and took 1 ms.

real    0m1.059s
user    0m1.280s
sys    0m0.058s

As you can see, it very quickly did nothing -- ignoring my "(map #(.join %) threads)".  Running the same code in the REPL produces interesting results.  The REPL, too, will print instantly and prompt me for more work.  However, it's clearly executing the threads, as repeated "@counter" invocations show an incrementing counter.  And it is absolutely dog slow:  my 1/2 seconds of spinning in native Java takes an eternal 2 1/2 minutes in Clojure.

I suspect the culprit is my "let":  getting rid of it causes "(map #(.join %) threads)" to behave as expected.  But, I don't get it.  Maybe "let" is executing in another thread I don't know about and it is _that_ thread that is waiting to join.  Thoughts?

Meanwhile, this probably isn't what I want, anyway.  I should be using Clojure's own mechanisms for parallelism: pmap.  My first attempt failed miserably:
(def NUM_THREADS 256)
(def NUM_CALLS 10000)

(def TOTAL (* NUM_THREADS NUM_CALLS))

(def counter (ref 0))
(defn increment-counter [_] (dosync (alter counter inc)))

(let [start (System/currentTimeMillis)]
  (pmap increment-counter (range TOTAL))
  (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start))))

This suffers the same problem as my first attempt: it prints before it completes.  The problem is that it once it completes "@counter" returns 32 (I believe that's the size of Clojure's thread pool).  I've not figured out how to force Clojure to actually get beyond this number (I thought maybe some element of laziness was coming into play here, but, if so, I've not figured out how to de-lazify it)...

Interestingly, my second attempt fared better:
(def NUM_THREADS 256)
(def NUM_CALLS 10000)

(def TOTAL (* NUM_THREADS NUM_CALLS))

(def counter (ref 0))
(defn increment-counter [] (dosync (alter counter inc)))

(let [start (System/currentTimeMillis)]
  (apply pcalls (repeat TOTAL increment-counter))
  (println (format "counter = %d and took %d ms." @counter (- (System/currentTimeMillis) start))))

Still the same problem with the printing.  However, this is the only Clojure program I can come up with that'll actually run to completion (in and outside of the REPL):
$ time /opt/jdk1.6/bin/java -jar clojure.jar ../../play/clojure_threading/clj_like.clj
counter = 2 and took 8 ms.

real    1m0.965s
user    0m1.184s
sys    0m0.069s

Still slow, though, but not as bad.  Not quite apples-to-apples, though.  (Something I'll need to ponder:  if everything were functional and transactional, couldn't the language automatically handle parallelism?)

What am i missing?  Why is this printing prior to join/pmap/pcalls?  Why won't pmap keep going until I reach TOTAL?

Meikel Brandmeyer

unread,
Oct 29, 2010, 8:07:07 AM10/29/10
to Clojure
Hi,

On 29 Okt., 06:58, Brian Ericson <li...@curvybits.org> wrote:

> (map #(.start %) threads)

map is not a loop. It creates a lazy sequences which does - nothing.
At least not until it is realised. Here you throw it away
immediatelly. Hence, no work is done. Use doseq instead when your main
objective are side-effects like .start or .join. It works in the repl,
because the repl prints the seq and thus realises it forcing the work
to be done.

Hope this helps.

Sincerely
Meikel

Brian Ericson

unread,
Oct 30, 2010, 10:59:51 PM10/30/10
to clo...@googlegroups.com
Oh.  The "P" in "REPL" makes it non-lazy and the "let" meant that the printer need only print the last line.  Ugh.  That is completely obvious in hindsight!  Thanks!

I surrounded a few expressions with "dorun" and now both of my Clojure implementations "run" -- or crawl, given the results of timing the first and the second:
(first)
counter = 2560000 and took 139949 ms.

real 2m37.599s
user 4m48.874s
sys 2m23.225s

(second)
counter = 2560000 and took 34895 ms.

real 0m35.868s
user 1m14.625s
sys 0m24.035s

Not that I should be reading *too* much into my micro-benchmark
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply all
Reply to author
Forward
0 new messages