Advice getting started with concurrency and parallelism in Clojure


Chris White

Apr 5, 2016, 9:24:08 PM
to Clojure
I was doing some reading of code recently to help me get up to speed with Clojure. One of the libraries I randomly came across dealt with parallelism and I had a hard time following along with it. To try and wrap my head around things I did a quick search and found this article:

http://www.thattommyhall.com/2014/02/24/concurrency-and-parallelism-in-clojure/

I'm not sure how authoritative this is given my current experience, but needless to say I was a bit overwhelmed. That said, is there any introductory material that list members have used to learn how Clojure deals with concurrency and parallelism? I also don't mind anything that isn't specific to Clojure, as long as it helps me understand the concepts behind how Clojure does it. Thanks again for any and all help!

- Chris White (@cwgem)

Timothy Baldridge

Apr 5, 2016, 9:51:59 PM
to clo...@googlegroups.com
If it all seems confusing, do not despair: there are two things that will handle the vast majority of the use cases you may have:

1) `future` - spawns a thread that runs the body of the future (https://clojuredocs.org/clojure.core/future)
2) `atom` and `swap!` - used to store data that needs to be shared between threads and updated concurrently (https://clojuredocs.org/clojure.core/atom). These are built on top of CAS, which is itself the foundation upon which most of concurrent programming is built (https://en.wikipedia.org/wiki/Compare-and-swap).

Those two primitives alone will handle 90% of the use cases you will run into as a new Clojure developer. The rest of the stuff (agents, thread pools, refs, vars, cps/core.async) can all come in time, but you will use them much less often than futures and atoms. So read up on those two and feel free to come back with any questions you may have.
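
To make the two primitives concrete, here is a minimal sketch that combines them. The names `hits` and `workers` are made up for illustration; only `future`, `atom`, `swap!` and `deref` come from clojure.core.

```clojure
;; A shared counter held in an atom.
(def hits (atom 0))

;; `future` runs its body on another thread and returns immediately.
;; doall forces all four futures to be created (and started) right away.
(def workers
  (doall (for [_ (range 4)]
           (future
             (dotimes [_ 1000]
               (swap! hits inc))))))

;; Dereferencing a future blocks until its body has finished.
(run! deref workers)

(println @hits) ;; 4000
```

If you run this as a standalone script rather than at the REPL, calling (shutdown-agents) at the very end lets the JVM exit promptly, since futures run on a pooled thread that otherwise lingers for a while.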

Timothy


--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Mars0i

Apr 6, 2016, 9:11:52 AM
to Clojure
Maybe people forget about pmap, pcalls, and pvalues because they're just too easy.
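
For reference, a small sketch of the three functions mentioned here. `slow-square` is a hypothetical stand-in for an expensive computation; the rest is clojure.core.

```clojure
;; Hypothetical stand-in for an expensive computation.
(defn slow-square [n]
  (Thread/sleep 100)
  (* n n))

;; Same shape as `map`, but elements are computed in parallel.
;; Results come back in input order.
(def squares (doall (pmap slow-square (range 8))))

;; `pcalls` takes zero-argument functions and runs them in parallel.
(def called (doall (pcalls #(slow-square 2) #(slow-square 3))))

;; `pvalues` does the same for plain expressions.
(def computed (doall (pvalues (slow-square 4) (slow-square 5))))

(println squares)  ;; (0 1 4 9 16 25 36 49)
(println called)   ;; (4 9)
(println computed) ;; (16 25)
```

All three return lazy sequences, hence the doall to force the work before printing.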

Chris White

Apr 7, 2016, 8:53:05 AM
to Clojure



Okay, I've been taking a look at these docs and some articles around them. I think most of my confusion arises from expecting to see some form of threading or process spawning. Instead I see something like:

user=> (def a (atom #{}))
#'user/a

user=> (swap! a conj :tag)
#{:tag}

user=> @a
#{:tag}

Which is showing a mutable value in a language that (from what I understand) values immutability. The thing that's throwing me off is that none of the examples I'm finding actually shows threaded code. I guess what I'm looking for is that kind of example to see how it all fits together.

 

Niels van Klaveren

Apr 7, 2016, 9:00:39 AM
to Clojure
The biggest problem I have with pmap is ordering, i.e. it will process in batches of (+ 2 (.. Runtime getRuntime availableProcessors)), and only take a new batch when the slowest of the old batch has been evaluated. With functions dependent on IO, parallel gains are only a fraction of what they could be. I used to solve this by writing my own code to process in futures and delays, but once I found the claypoole library, especially its unordered pmap and for, I never had to touch that code again.
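
A rough sketch of the hand-rolled futures approach, for illustration only. This is not claypoole's API; `fetch` and `future-map` are hypothetical names, and the sketch assumes the collection is small enough that starting one future per element is acceptable.

```clojure
;; Hypothetical stand-in for an I/O-bound call with uneven latency.
(defn fetch [id]
  (Thread/sleep (* 50 id))
  {:id id})

(defn future-map
  "Starts one future per element of coll right away, then waits for all
  of them. Results are returned in input order."
  [f coll]
  (let [tasks (mapv #(future (f %)) coll)] ; mapv is eager: all futures start now
    (mapv deref tasks)))

(def results (future-map fetch [3 1 2]))

(println results) ;; [{:id 3} {:id 1} {:id 2}]
```

Because every call is in flight at once, the total time is roughly that of the slowest call rather than the sum of slow batches. The trade-off is that nothing bounds the number of concurrent calls, which is the kind of thing a dedicated thread-pool library handles for you.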

Gary Verhaegen

Apr 7, 2016, 9:56:44 AM
to clo...@googlegroups.com
The two resources that helped me most with concurrency and parallelism
are "Java Concurrency in Practice" and "ZeroMQ — The Guide".
Introductory Go books are also enlightening.

Once you have a clear understanding of the underlying concepts in
general, understanding how they are accessible in Clojure is really
just about knowing what is available in the standard library. Remember
that that includes all of Java, too. Also keep in mind that a
0-argument Clojure fn is a java.lang.Runnable and a
java.util.concurrent.Callable.
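
As a small illustration of that last point, a zero-argument fn can be handed directly to a Java executor. This is only a sketch; `task` and `answer` are made-up names, and the pool size of 2 is arbitrary.

```clojure
(import '(java.util.concurrent Callable Executors))

;; A plain zero-argument function (hypothetical example task).
(def task (fn [] (+ 1 2)))

(println (instance? Runnable task)) ;; true
(println (instance? Callable task)) ;; true

(def answer
  (let [pool (Executors/newFixedThreadPool 2)
        ^Callable job task] ; the hint selects submit(Callable), which keeps the result
    (try
      (.get (.submit pool job))
      (finally
        (.shutdown pool)))))

(println answer) ;; 3
```

The type hint matters because a fn satisfies both interfaces, and the Runnable overload of submit would discard the return value.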

Here's an example of using an atom with multiple threads:

(def counter (atom 0))

(let [threads (repeatedly 10 (fn [] (Thread.
                                      #(dotimes [_ 100] (swap! counter inc)))))]
  (->> threads
       (map #(.start %))
       doall)
  (->> threads
       (map #(.join %))
       doall)
  (println @counter))

The point of the atom should be slightly clearer: it is indeed a
"mutable value" (we don't really use these two words together in
Clojure; things are either "mutable reference" or "immutable value")
like in any other language, except that you can change it from
multiple threads without any problem, because the updates are atomic —
hence the name.

The above code is using bare Java threads, which is not very
idiomatic. Usually, there are better options within Clojure itself
(for this simple example, using future instead of bare threads would
yield slightly more compact code), but it's hard to know which one to
suggest without more information on your use-case.
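
For comparison, a sketch of the same counter written with future instead of bare threads. The local name `tasks` is made up; the behaviour should match the Thread version.

```clojure
(def counter (atom 0))

;; doall forces all ten futures to start before we wait on any of them.
(let [tasks (doall (repeatedly 10 #(future (dotimes [_ 100] (swap! counter inc)))))]
  (run! deref tasks)   ; block until every future has finished
  (println @counter))  ;; 1000
```

Dereferencing a future plays the role of .join here: it blocks until that future's body has completed.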

You basically have two ways of coordinating threads: message-passing
or shared memory. The point of atoms is to support a safe way to share
memory between threads, by providing a reference to a (supposedly)
immutable value. If you're more of the message-passing inclination,
you can use core.async channels and >!! (blocking put) and <!!
(blocking take). If you're using real threads, you don't even need to
dig into the go macro.

Clojure also offers a lot of additional functions and types for more
specific concurrency use-cases. For example, the future-call function
takes a function of zero arguments, creates a thread, starts the
thread, and returns a reference that can be later dereferenced to get
the value returned by the function that runs in the separate thread;
it also implements caching, should the future value be dereferenced
multiple times. For example:

(defn slow-fn []
  (Thread/sleep 1000)
  (println "hey!")
  42)

(def fut (future-call slow-fn)) ;; Thread has started.

(deref fut) ;; Would be a blocking call

;; A second after the def, the message is printed

(deref fut) ;; is not a blocking call anymore, and returns 42 directly

You have lots of small functions and macros like that — delay,
promise, future, ref, agent, pmap, reducers, etc, but without more
information about either which function/concept you're trying to
understand or what problem you're trying to solve it's hard to help
you more than that.




Timothy Baldridge

Apr 7, 2016, 10:22:17 AM
to clo...@googlegroups.com
Exactly. Clojure's strength is constraining mutability. How each primitive constrains mutability is different. Note that many of these defaults can be overridden if the user really knows what they are doing.

atoms - given a mutable cell, provide a function to update the data, may be run multiple times if there are conflicts with other updates

agents - given a mutable cell, provide a function to update the data, this function is put into a queue and run once

vars - a cell with a root value shared by all threads, which can be rebound (with `binding`, for dynamic vars) only within the context of a single thread; other threads will see their own version of the same cell

refs - given a user defined set of cells, update them all via a single transaction, retrying as needed if conflicts occur

core.async channels - a first-in-first-out multi-reader multi-writer queue.

promise - define a cell, the value can be set later by any thread, but can only be set once. Readers may wait for the promise to be delivered. In current Clojure versions, a thread trying to deliver a value to an already delivered promise gets no exception: the call has no effect and returns nil.

Notice the pattern here. Each one of these deals with how updates happen, who can see the updates, and how to deal with conflicts. Each one of these is a different way of approaching the same issue.
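
A compact sketch of the core primitives from the list above, side by side. All names (`a`, `ag`, `from`, `to`, `*level*`, `p`) are made up for illustration; core.async is left out because it is a separate library.

```clojure
;; atom: the update function may be retried, so keep it free of side effects.
(def a (atom 0))
(swap! a + 5)

;; agent: the update is queued and applied once, asynchronously.
(def ag (agent 0))
(send ag + 5)
(await ag) ; wait until the queued update has run

;; ref: several cells change together inside one transaction.
(def from (ref 100))
(def to (ref 0))
(dosync
  (alter from - 30)
  (alter to + 30))

;; var: `binding` gives the current thread its own value.
(def ^:dynamic *level* :root)
(def seen-inside (binding [*level* :local] *level*))

;; promise: readers block until some thread delivers a value.
(def p (promise))
(deliver p :done)

(println @a @ag @from @to seen-inside *level* @p)
;; 5 5 70 30 :local :root :done
```

Every one of these is read with deref (or @), apart from the var, which is read by name; what differs is how a new value gets in.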

And this is actually something you'll see a lot in Clojure. When should I use maps, or reify, or defrecord, or deftype, or gen-class? Each one of these has its own set of tradeoffs and benefits.

Mars0i

Apr 7, 2016, 11:09:39 AM
to Clojure
Niels-- Ah, interesting.  My uses of pmap haven't been I/O bound.  I didn't know about the claypoole library.  Will keep that in mind.

Johannes Staffans

Apr 8, 2016, 8:37:34 AM
to Clojure
I found the introductory talk on Claypoole pretty informative with regard to parallelism in Clojure in general: https://www.youtube.com/watch?v=BzKjIk0vgzE