organizing multiple dosyncs and ref-set vs. commute

Stuart Halloway

unread,

Aug 19, 2008, 11:06:05 PM8/19/08

to clo...@googlegroups.com

Hi all,

In order to deeply appreciate how Clojure removes the pain of lock-
based concurrency, I am re-implementing the example code from Java
Concurrency in Practice [http://www.javaconcurrencyinpractice.com/] in
Clojure. Consider Listing 2.8, which demonstrates creating fine-
grained locks around a simple cached calculation:

// elided for brevity
public class CachedFactorizer extends GenericServlet implements
Servlet {
@GuardedBy("this") private BigInteger lastNumber;
@GuardedBy("this") private BigInteger[] lastFactors;
@GuardedBy("this") private long hits;
@GuardedBy("this") private long cacheHits;

public void service(ServletRequest req, ServletResponse resp) {
BigInteger i = extractFromRequest(req);
BigInteger[] factors = null;
synchronized (this) {
++hits;
if (i.equals(lastNumber)) {
++cacheHits;
factors = lastFactors.clone();
}
}
if (factors == null) {
factors = factor(i);
synchronized (this) {
lastNumber = i;
lastFactors = factors.clone();
}
}
encodeIntoResponse(resp, factors);
}
}

And in Clojure (stripping out the Servlet noise):

(def *last-number* (ref nil))
(def *last-factors* (ref nil))
(def *count* (ref 0))
(def *cache-hits* (ref 0))

(defn cached-factor [n]
(dosync (commute *count* inc))
(or (dosync (if (= n @*last-number*)
(do (commute *cache-hits* inc) @*last-factors*)))
(let [factors (factor n)]
(dosync
(do (ref-set *last-number* n)
(ref-set *last-factors* factors))))))

A few questions/comments:

(1) Are the ref-sets overly strict? Could commute be used instead? I
am happy with last-in wins, so long as *last-number* and *last-
factors* change in tandem.
(2) In the Java example there are separate synchronized blocks to
increase possible concurrency. I have tried to write the Clojure
transactions similarly, in particular making sure the "expensive"
calculation (factor) is outside a transaction. How can it be done
better?
(3) Bonus: How idiomatic is my Clojure?

Thanks! I am truly enjoying Clojure.

Stuart

Rich Hickey

unread,

Aug 25, 2008, 2:57:05 PM8/25/08

to Clojure

On Aug 19, 11:06 pm, Stuart Halloway <stuart.hallo...@gmail.com>
wrote:

This is the most efficient all-ref approach, from a contention
perspective:

(def hits (ref 0))
(def cache (ref {:n nil :factors nil}))
(def cache-hits (ref 0))

(defn cached-factor [n]
(dosync (commute hits inc))
(let [cached @cache]
(if (= n (cached :n))
(dosync (commute cache-hits inc)
(cached :factors))

(let [factors (factor n)]
(dosync

(commute cache (fn [_] {:n n :factors factors})))
factors))))

It leverages a few aspects of Clojure's refs:

'Wild' reads are supported, i.e. you can read a ref outside a
transaction as long as you don't need to see a coordinated view with
other refs.

So, in this case, it is better to use a composite object for the
cache, so that a wild read will return a coordinated n and its
factors. N.B. that the cache is read only once, in the let. Sprinkling
multiple wild @ reads around won't work since each deref could return
a different value.

The ref-sets were needed in your example, but with a composite cache,
you can use the commute trick above (commute with a fn that ignores
its arg) to get last-one-in-wins, since this is just a last-value-only
cache.

As written above, none of the transaction in cached-factor will ever
retry, and the cache check is non-transactional - cache readers will
not wait for each other (as they will with the synchronized blocks in
the original example).

Your code is fine, and what's nice is it's easy to write multiple
correct versions, we're just optimizing here. I'm not sure I like the
Lisp *blah* convention for Clojure if the var is not going to be used
for dynamic rebinding.

Note that Clojure's immutable data structures work well with
java.util.concurrent, and that for this simple last-value cache, this
is the fastest possible way:

(def #^AtomicLong hits (AtomicLong. 0))
(def #^AtomicReference cache (AtomicReference. {:n nil :factors nil}))
(def #^AtomicLong cache-hits (AtomicLong. 0))

(defn cached-factor [n]
(.incrementAndGet hits)
(let [cached (.get cache)]
(if (= n (cached :n))
(do (.incrementAndGet cache-hits)
(cached :factors))
(let [factors (factor n)]
(.set cache {:n n :factors factors})
factors))))

A Java programmer might not think at first of using an immutable
composite for the cache, but a Clojure programmer (eventually) should,
at which point you can go for the lock-free version above. Note how it
has the same structure as the transactional Clojure version, and the
same caveat on reading the cache only once.

Rich

Shawn Hoover

unread,

Aug 25, 2008, 3:33:58 PM8/25/08

to clo...@googlegroups.com

On Mon, Aug 25, 2008 at 11:57 AM, Rich Hickey <richh...@gmail.com> wrote:

(def hits (ref 0))
(def cache (ref {:n nil :factors nil}))

(def cache-hits (ref 0))

(defn cached-factor [n]

(dosync (commute hits inc))
(let [cached @cache]
(if (= n (cached :n))
(dosync (commute cache-hits inc)
(cached :factors))

(let [factors (factor n)]
(dosync

(commute cache (fn [_] {:n n :factors factors})))
factors))))

The ref-sets were needed in your example, but with a composite cache,
you can use the commute trick above (commute with a fn that ignores
its arg) to get last-one-in-wins, since this is just a last-value-only
cache.

Is there a stylistic or correctness advantage to using commute over ref-set on the composite cache? As in:
(dosync (ref-set cache {:n n :factors factors}))

--Shawn

Rich Hickey

unread,

Aug 25, 2008, 3:53:21 PM8/25/08

to Clojure

On Aug 25, 3:33 pm, "Shawn Hoover" <shawn.hoo...@gmail.com> wrote:

ref-set is more 'correct' in that the new value is not a function of
the old, and I recommend it over what I showed when that is the
general case. In this specific case, the wholesale replacement was
last-one-in-wins with no in-transaction dependent data (i.e. the basis
for the new state was not derived from refs in the transaction, so
commute was possible. commute is always just an optimization vs.
alter, and alter is preferred over ref-set when the new state is a
function of the old, which is normally the case.

Rich

Reply all

Reply to author

Forward