Atoms


Rich Hickey

Dec 4, 2008, 8:02:51 PM
to Clojure
I've added a new reference type - atom.

Docs here:

http://clojure.org/atoms

Feedback welcome,

Rich

Mark Engelberg

Dec 4, 2008, 9:26:05 PM
to clo...@googlegroups.com
Didn't commute essentially give this behavior for refs?  How is this different?

Larrytheliquid

Dec 4, 2008, 11:01:35 PM
to clo...@googlegroups.com
One difference would be if a ref is already inside of a bigger transaction that failed to commit for other reasons. With atoms it seems like the "transaction" is implicitly isolated to the atom (instead of explicitly wrapping around a ref.)
--
Respectfully,
Larry Diehl
www.larrytheliquid.com

Krukow

Dec 5, 2008, 1:51:16 AM
to Clojure
On Dec 5, 2:02 am, Rich Hickey <richhic...@gmail.com> wrote:
> I've added a new reference type - atom.

Looks useful as a kind of high-level interface to
java.util.concurrent.AtomicReference. Am I correct

Krukow

Dec 5, 2008, 1:59:04 AM
to Clojure


On Dec 5, 7:51 am, Krukow <karl.kru...@gmail.com> wrote:
> Looks useful as a kind of high-level interface to
> java.util.concurrent.AtomicReference. Am I correct
to think of this as being (semantically) equivalent to combining
send-off and await with agents?

E.g.,

(defn memoize [f]
  (let [mem (agent {})]
    (fn [& args]
      (if-let [e (find @mem args)]
        (val e)
        (let [ret (apply f args)]
          (send-off mem assoc args ret)
          (await mem)
          ret)))))

under the hood, the first is running off a queue in a separate thread
and the other is doing an in-thread spin-wait?

- Karl

(sorry for making two posts, I accidentally triggered a send-message
shortcut ;-))

bOR_

Dec 5, 2008, 5:51:49 AM
to Clojure
Are there any screencasts planned which will feature atoms? (I found
that the screencasts are an excellent way of learning clojure).

Parth Malwankar

Dec 5, 2008, 6:12:32 AM
to Clojure
Are the following equivalent or is one recommended over
the other? The first (using atoms) is definitely more convenient
and less verbose.

user=> (def a (ref {:a 1 :b 2 :c 3}))
#'user/a
user=> (def a (atom {:a 1 :b 2 :c 3}))
#'user/a
user=> (swap! a assoc :a 5)
{:c 3, :b 2, :a 5}
user=> a
#<AtomicReference$IRef {:c 3, :b 2, :a 5}>

OR

user=> (def b (ref {:a 1 :b 2 :c 3}))
#'user/b
user=> (dosync (ref-set b (assoc @b :a 5)))
{:c 3, :b 2, :a 5}
user=> b
#<Ref clojure.lang.Ref@1b6101e>
user=> @b
{:c 3, :b 2, :a 5}
user=>

Parth

Rich Hickey

Dec 5, 2008, 8:24:16 AM
to Clojure


On Dec 5, 5:51 am, bOR_ <boris.sch...@gmail.com> wrote:
> Are there any screencasts planned which will feature atoms? (I found
> that the screencasts are an excellent way of learning clojure).

Screencasts are generally a side-effect of a speaking engagement. I
imagine next time I give a talk, I'll talk about atoms too.

To address the general question about when to use atoms/refs/agents,
it helps to think of things this way:

First, note that in talking about atoms/refs/agents we are talking
about Clojure's reference types that allow changes to be seen by
multiple threads, so these three reference types are all shared
reference types.

There are two dimensions to the choice about using them. The first is
whether the changes will be synchronous or asynchronous; the second is
whether a change to this reference will ever need to be coordinated
with a change to another reference or references.

Chouser made a nice diagram after I described this on IRC:

http://clojure.googlegroups.com/web/clojure-conc.png

As you can see, for coordinated changes, refs + transactions are the
only game in town, and asynchrony (beyond a set of commutes) doesn't
make much sense, so no reference type is coming to replace the X.

For independent change, you have two choices, agents and atoms.

Atoms are synchronous, the change happens on the calling thread. They
are as close to a plain variable as you get from Clojure, with a
critical benefit - they are thread safe, in particular, they are not
subject to read-modify-write race conditions. Your writes don't happen
unless they are a function of what was read. But modifications to
atoms are side effects, and thus need to be avoided in transactions.
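
A minimal sketch of the read-modify-write point, assuming plain Java
threads are used to create contention. The names counter and workers
are made up for illustration:

```clojure
;; Hypothetical names: counter, workers.
(def counter (atom 0))

;; Eight threads, each incrementing the shared atom 1000 times.
(def workers
  (doall
    (for [_ (range 8)]
      (Thread. ^Runnable (fn [] (dotimes [_ 1000] (swap! counter inc)))))))

(doseq [^Thread t workers] (.start t))
(doseq [^Thread t workers] (.join t))

;; swap! re-applies inc whenever another thread changed the value first,
;; so no increment is lost.
@counter
;; => 8000
```

The same loop over an unsynchronized mutable field could lose updates,
because two threads can read the same old value before either writes.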

Agents are asynchronous, and that can have important benefits. In
particular, it means actions get queued, and the sender can proceed
immediately. They provide a transparent interface to the threading
system and thread pools. Agents also cooperate with transactions in
ways that atoms cannot - e.g. agent sends are allowed in transactions
and get held until commit.
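
A small sketch of that cooperation, with made-up names (balance,
audit-log) and an exception standing in for any reason a transaction
might fail to commit:

```clojure
;; Hypothetical names: balance, audit-log.
(def balance (ref 0))
(def audit-log (agent []))

;; Committed transaction: the send is dispatched only after the commit.
(dosync
  (alter balance + 100)
  (send audit-log conj [:deposit 100]))

;; Aborted transaction: the ref change is discarded and the held send
;; is never dispatched.
(try
  (dosync
    (alter balance + 50)
    (send audit-log conj [:deposit 50])
    (throw (Exception. "simulated failure")))
  (catch Exception _ nil))

(await audit-log)        ; block until the queued action has run
[@balance @audit-log]
;; => [100 [[:deposit 100]]]

;; Only needed when run as a script, so the agent thread pools let the JVM exit.
(shutdown-agents)
```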

What's nice is the unified model underlying the reference types. All
can be read via deref/@, all are designed to refer to an immutable
data value, and to model change as a function of that value. All
support validators.

What that means is that, if you build your state transformation
functions as pure functions, you can freely choose/switch between the
different reference types, even using the same logic for two different
reference types.
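
As a rough illustration, here is one pure function driving all three
reference types. The function deposit and the var names are invented
for this sketch:

```clojure
;; A pure state transformation: takes the old value, returns the new one.
;; Hypothetical names: deposit, as-atom, as-ref, as-agent.
(defn deposit [account amount]
  (update-in account [:balance] + amount))

(def as-atom  (atom  {:balance 0}))
(def as-ref   (ref   {:balance 0}))
(def as-agent (agent {:balance 0}))

(swap! as-atom deposit 50)             ; synchronous, independent
(dosync (alter as-ref deposit 50))     ; synchronous, coordinated
(send as-agent deposit 50)             ; asynchronous, independent
(await as-agent)                       ; wait for the queued action

[@as-atom @as-ref @as-agent]
;; => [{:balance 50} {:balance 50} {:balance 50}]

;; Only needed when run as a script, so the agent thread pools let the JVM exit.
(shutdown-agents)
```

Only the modification call differs; deposit itself knows nothing about
which kind of reference holds the account.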

However, they are different, and they have not been unified in the
areas in which they differ, in particular, they each have a unique
modification vocabulary - ref-set/alter/commute/send/send-off/swap!/
compare-and-set!.

In the end, Clojure is a tool, and will never be able to make
architectural decisions for you. Hopefully the above will help you
make informed choices.

The memoization example is a prime motivating case for atoms - a local
cache. It's also one that people routinely get multithread-wrong when
trying to implement with simple mutable variables.
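
For comparison with the agent-based version earlier in the thread, a
sketch of the same cache built on an atom. The names memoize-with-atom,
slow-square and call-count are chosen here for illustration:

```clojure
;; Hypothetical name, chosen to avoid shadowing clojure.core/memoize.
(defn memoize-with-atom [f]
  (let [mem (atom {})]
    (fn [& args]
      (if-let [e (find @mem args)]
        (val e)
        (let [ret (apply f args)]
          ;; Two threads may both compute ret for the same args; both
          ;; assoc the same entry, so the cache stays consistent.
          (swap! mem assoc args ret)
          ret)))))

;; Count how often the underlying function really runs.
(def call-count (atom 0))
(defn slow-square [x]
  (swap! call-count inc)
  (* x x))

(def fast-square (memoize-with-atom slow-square))

[(fast-square 4) (fast-square 4) @call-count]
;; => [16 16 1]
```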

Rich

Parth Malwankar

Dec 5, 2008, 8:38:04 AM
to Clojure
Thanks for taking the time for such a detailed explanation Rich.
This makes things much clearer. And thanks Chouser for the
pictorial representation.

Parth

Julian Morrison

Dec 5, 2008, 8:50:05 AM
to Clojure
It seems like a pure efficiency optimization - used alone it doesn't
change semantics from dosync and alter over one ref.

It makes me feel wary. What if I changed my design and wanted to do
more in the same transaction? What if I later wanted to call a
function that uses it in the scope of a wider transaction? Or a
transaction that might retry? What if this was library code beyond my
power to alter?

On the other hand I understand how it could be hugely simpler and
quicker, and sometimes an immediate synchronous change is the right
design.

Perhaps it ought to have a warning sticker. "This is a side-effecting,
transaction-breaking blunt instrument contagious to any function that
calls it. Prefer using a ref, especially in library code".

Rich Hickey

Dec 5, 2008, 8:57:03 AM
to Clojure


On Dec 5, 8:50 am, Julian Morrison <julian.morri...@gmail.com> wrote:
> It seems like a pure efficiency optimization - used alone it doesn't
> change semantics from dosync and alter over one ref.
>
> It makes me feel wary. What if I changed my design and wanted to do
> more in the same transaction? What if I later wanted to call a
> function that uses it in the scope of a wider transaction? Or a
> transaction that might retry? What if this was library code beyond my
> power to alter?
>
> On the other hand I understand how it could be hugely simpler and
> quicker, and sometimes an immediate synchronous change is the right
> design.
>
> Perhaps it ought to have a warning sticker. "This is a side-effecting,
> transaction-breaking blunt instrument contagious to any function that
> calls it. Prefer using a ref, especially in library code".
>

Well, for most of its intended uses, it isn't that, and that advice
doesn't hold. Take a memoization or similar cache - perfectly fine in
a transaction. Or a one-time init - also fine.

The ! is the sticker, I guess.

Rich

Rich Hickey

Dec 5, 2008, 9:02:24 AM
to Clojure
Julian's post highlighted a point I need to make clear about the
above:

"Modifications to atoms are side effects, and thus need to be avoided
in transactions"

should be qualified with:

"unless you are ok with having them run more than once. For many
intended uses of atoms, like memoization caches, that's perfectly
fine."
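
A deterministic sketch of why the qualification matters. A real retry
is hard to force on demand, so this assumes an exception that aborts
the transaction as a stand-in; the names hits and total are made up:

```clojure
;; Hypothetical names: hits, total.
(def hits (atom 0))
(def total (ref 0))

(try
  (dosync
    (alter total inc)     ; part of the transaction, discarded on failure
    (swap! hits inc)      ; side effect, takes place immediately
    (throw (Exception. "simulated failure")))
  (catch Exception _ nil))

[@total @hits]
;; => [0 1]
```

The ref is untouched because the transaction never committed, while the
atom has already changed. On a retry the swap! would simply run again,
which is harmless for a cache and wrong for something like a counter.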

Rich

Randall R Schulz

Dec 5, 2008, 9:09:57 AM
to clo...@googlegroups.com
On Friday 05 December 2008 05:24, Rich Hickey wrote:
> On Dec 5, 5:51 am, bOR_ <boris.sch...@gmail.com> wrote:
> > Are there any screencasts planned which will feature atoms? (I
> > found that the screencasts are an excellent way of learning
> > clojure).
>
> Screencasts are generally a side-effect of a speaking engagement. I
> imagine next time I give a talk, I'll talk about atoms too.

Are you ever going to get out to the Silicon Valley area to give a talk?


Randall Schulz

Rich Hickey

Dec 5, 2008, 9:33:47 AM
to Clojure
I hope to get a slot at Java One in SF this spring.

Rich

Randall R Schulz

Dec 5, 2008, 9:39:48 AM
to clo...@googlegroups.com
On Friday 05 December 2008 06:33, Rich Hickey wrote:
> On Dec 5, 9:09 am, Randall R Schulz <rsch...@sonic.net> wrote:
> > ...

> >
> > Are you ever going to get out to the Silicon Valley area to give a
> > talk?
>
> I hope to get a slot at Java One in SF this spring.

Anything less pricey?

Maybe you could swing by the SVJUG while you're out here?


> Rich


RRS

Stuart Sierra

Dec 5, 2008, 10:49:47 AM
to Clojure
On Dec 4, 8:02 pm, Rich Hickey <richhic...@gmail.com> wrote:
> I've added a new reference type - atom.

I like it; it greatly simplifies a common use for Refs.

"Clojure. Sometimes you just need to mutate."

"Clojure. Mutate safely."

-Stuart Sierra

Stuart Sierra

Dec 5, 2008, 11:01:49 AM
to Clojure
On Dec 4, 8:02 pm, Rich Hickey <richhic...@gmail.com> wrote:
> I've added a new reference type - atom.
> Feedback welcome,

A request, if it's possible: allow watchers to be set on atoms and
refs in addition to agents.

I'd like to experiment with "reactive" programming using the different
transaction models.

-Stuart Sierra

Rich Hickey

Dec 5, 2008, 11:07:41 AM
to clo...@googlegroups.com

I'm working on that.  It has utility even outside traditional reactive contexts, in moving the imperative part of your logic outside of your state transformation function. I think it's a good model.
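
A sketch of that separation, assuming the add-watch function found in
current Clojure (it postdates this thread, so treat the API as an
assumption here). The names next-value, cell and change-log are
invented:

```clojure
;; Pure state transformation: no side effects, easy to test alone.
(defn next-value [v]
  (inc v))

;; Hypothetical names: cell, change-log.
(def cell (atom 0))
(def change-log (atom []))

;; The imperative part lives in the watcher, not in next-value.
;; A watch function receives the key, the reference, the old state
;; and the new state.
(add-watch cell :log
           (fn [_key _ref old new]
             (when (not= old new)
               (swap! change-log conj [old new]))))

(swap! cell next-value)
(swap! cell next-value)

@change-log
;; => [[0 1] [1 2]]
```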

Chouser recently went through an interesting exercise doing just that, perhaps he'll chime in with his experiences.

Rich


Mark Engelberg

Dec 5, 2008, 2:42:07 PM
to clo...@googlegroups.com
So, earlier, I asked how atoms differ from using commute on refs.

It sounds like the answer is that if you use atoms in a larger
transaction, then as soon as the atom set is encountered, it actually
changes instantly, so if you rollback, and do the transaction again,
it's already been set, and will do so again, so your code surrounding
the atom set better not make assumptions about whether the atom
has/has not been set.

On the other hand, a ref participates in the larger transaction, so
any modification to the ref will rollback if the larger transaction is
rolled back, so when the larger transaction retries, code before the
ref set can safely assume that the ref has not yet been set.

Is this understanding correct?

--Mark

Chouser

Dec 5, 2008, 7:39:38 PM
to clo...@googlegroups.com
On Fri, Dec 5, 2008 at 11:07 AM, Rich Hickey <richh...@gmail.com> wrote:
>
> I'm working on that. It has utility even outside traditional reactive
> contexts, in moving the imperative part of your logic outside of your state
> transformation function. I think it's a good model.
>
> Chouser recently went through an interesting exercise doing just that,
> perhaps he'll chime in with his experiences.

Spoiler warning -- this is about a http://projecteuler.net/
problem. If you don't follow the links, I think you'll be
able to understand what I'm talking about without learning
anything specific enough to ruin any particular puzzle. If
you want to know which specific puzzle I'm discussing (to
see if you've already done it, for example) you can go to
http://tinyurl.com/6b528n My solutions are at
http://gist.github.com/32494 but the problem number isn't
mentioned there.

For this puzzle, I had a grid of cells, each of which had a
value that depends on the values of its neighbors in a way
that guaranteed a stable solution. The value of one cell
was given.

My initial solution [single-threaded.clj] represented the
grid as a vector of vectors of Integers, and maintained a
PersistentQueue of cells that needed to be updated, with a
single loop to work through the queue. For each iteration
of the loop, a cell would be popped off the queue, and a new
value for that cell computed. If the new value was
different from the old value, the 'recur' then updated the
cell in the vector and pushed the neighboring cells onto the
work queue. When the queue was empty, a stable state had
been reached and the answer value could be read.
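
The shape of that loop can be sketched as follows. This is not the
actual puzzle or the code from the gist: the update rule (each cell
becomes one more than its smallest neighbor, never increasing) and the
names neighbors, new-value, settle and start-grid are assumptions made
up for illustration.

```clojure
;; In-bounds orthogonal neighbors of a [row col] cell.
(defn neighbors [grid [r c]]
  (filter #(get-in grid %)
          [[(dec r) c] [(inc r) c] [r (dec c)] [r (inc c)]]))

;; Hypothetical rule: one more than the smallest neighbor, never
;; increasing, so repeated updates must settle.
(defn new-value [grid cell]
  (min (get-in grid cell)
       (inc (apply min (map #(get-in grid %) (neighbors grid cell))))))

;; Work through a queue of cells that may need updating. When a cell
;; changes, its neighbors are pushed back onto the queue.
(defn settle [grid start]
  (loop [grid  grid
         queue (into clojure.lang.PersistentQueue/EMPTY
                     (neighbors grid start))]
    (if-let [cell (peek queue)]
      (let [old (get-in grid cell)
            new (new-value grid cell)]
        (if (= old new)
          (recur grid (pop queue))
          (recur (assoc-in grid cell new)
                 (into (pop queue) (neighbors grid cell)))))
      grid)))   ; empty queue: stable state reached

;; The value of one cell (top left) is given; 1000 means "unknown".
(def start-grid
  [[0    1000 1000]
   [1000 1000 1000]
   [1000 1000 1000]])

(settle start-grid [0 0])
;; => [[0 1 2] [1 2 3] [2 3 4]]
```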

This ran fairly fast, used no mutable state, and the
implementation seemed relatively clean to me. I was quite
pleased with myself.

But my friend Aaron Brooks who had already solved the
problem was encouraging me to create a multi-threaded
solution. Note the solution I had was already thread-safe,
but only used one of my two processor cores.

My second solution [using-agents.clj] represented the grid
as a vector of vectors of agents. The action function
computed a new value for a given agent, and then used 'send'
to queue up the same action for neighboring agents. It also
maintained other shared state to keep track of how many cell
agents were running so that it could detect when a stable
state for the whole grid had been reached.

Despite the complexity one might expect from all that, the
agent solution was only 3 lines longer than the
single-threaded solution. It also ran about 30% faster.

But it had a bug -- often returning the right answer, but
sometimes returning an incorrect number. It also seemed
more imperative than the first solution, because of the
'send' calls, updating shared counters, etc. When I
mentioned this on IRC, it was recommended I try watchers.

So I added a watcher to every agent before kicking off the
computation process. The watcher did no computation as
such, but it was the perfect place to 'send' to neighboring
agents, update the running count, etc. Moving this code to
the watcher also meant the action function was now pure,
with no side-effects.

It was during the process of separating the stateless and
state-management code that I discovered my bug -- the code
to manage global state had obscured an error in the
computation logic. With them completely separate, it was
easier to think about the specific responsibilities of each.

It was also now easy to see that the pure computation
function was almost exactly the same whether I was using
agents or just a simple loop. I finished factoring out this
duplication and ended up with code that could use either
mechanism to solve the same problem. [with-watchers.clj]

--Chouser
