STM - a request for "war stories"

Paul Butcher

Dec 2, 2012, 11:03:53 AM

to clo...@googlegroups.com

All,

I have a request which I hope the members of this group are uniquely positioned to help with. I have recently started working on a new book for The Pragmatic Programmers with the working title "Seven Concurrency Models in Seven Weeks" (it follows on from their existing "Seven Languages" and "Seven Databases" titles).

One of the approaches that I'll be covering is STM, and I'll be presenting it in Clojure.

What I'd like to solicit are "war stories" about problems you've solved using STM, which demonstrate the strengths of the technique over and above (say) threads and locks.

I'm looking for real-world examples instead of presenting yet another hackneyed atomically-make-a-bank-account-withdrawal :-)

Very many thanks in advance for your help!

--
paul.butcher->msgCount++

Snetterton, Castle Combe, Cadwell Park...
Who says I have a one track mind?

http://www.paulbutcher.com/
LinkedIn: http://www.linkedin.com/in/paulbutcher
MSN: pa...@paulbutcher.com
AIM: paulrabutcher
Skype: paulrabutcher

Marko Topolnik

Dec 10, 2012, 5:39:02 AM

to clo...@googlegroups.com

The very fact that there has been no reply to this for five days may mean something. I can personally attest to STM being very difficult to put to real-life use because there is always that one thing you absolutely need for your problem, that is mutable and not transactional. Most of the time it will have to do with an existing Java library, JDK not excluded. The property of STM that it is an all-or-nothing commitment has been a show-stopper for me every time I tried to use it.

My guess is, if your task is something purely computational and amenable to massive parallelization, you may have a go with STM; if it's just about business logic accessible concurrently by many clients, you won't find it workable.

Chas Emerick

Dec 10, 2012, 7:56:08 AM

to clo...@googlegroups.com

On Dec 10, 2012, at 5:39 AM, Marko Topolnik wrote:

> The very fact that there has been no reply to this for five days may mean something. I can personally attest to STM being very difficult to put to real-life use because there is always that one thing you absolutely need for your problem, that is mutable and not transactional. Most of the time it will have to do with an existing Java library, JDK not excluded. The property of STM that it is an all-or-nothing commitment has been a show-stopper for me every time I tried to use it.

I'd be surprised if Paul doesn't hear from people directly; people aren't always keen to talk about their work publicly (and in many cases, they are simply barred from doing so), so one shouldn't presume that on-list responses (or not) are representative.

I personally have never used STM in nontrivial ways (AFAIC), but that's due more to the demands of the problems I run into than to anything else. On the other hand, I have used, abused, and benefitted from agents in umpteen ways. Actually, I have often done things using agents that might otherwise have been done using STM or other similar approaches, simply to ensure that:

(a) the processing involved can be readily parallelized, and
(b) if necessary, the system can be partitioned/distributed with minimal impact to the architecture, since — if you're careful about things — it doesn't matter whether a send is evaluated in an in-process agent or one housed in a different server/VM/whatever

Yes, only a subset of the things you can do with STM can be done safely with agents, etc. (See: monotonic logic, and the increasingly-popular concepts of lattices, semilattices, and CRDTs.) But, I've been lucky to be able to characterize many problems within that subset.

It's true that STM is "all or nothing", but it is so over the scope of refs you choose. If there's some side-effecting bit you need to do somewhere, then clearly that's not going to fit within a transaction…but that bit will often fit just fine in a send-off to an agent provoked _by_ a transaction. And, if you can implement e.g. 2 of the 5 parts of your system using refs and STM, you just cut your thread-and-locking problems by 40%. :-P
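A minimal sketch of a side effect "provoked by" a transaction, assuming nothing beyond clojure.core. The names `inventory`, `audit-log`, and `reserve!` are hypothetical, and the agent merely stands in for whatever side-effecting resource is involved. It relies on the documented behaviour that sends made inside a transaction are held until the transaction commits.

```clojure
;; Hypothetical names throughout: inventory, audit-log, reserve!.
(def inventory (ref {:widgets 3}))
(def audit-log (agent []))   ; stands in for a side-effecting resource

(defn reserve!
  "Decrement stock inside a transaction and queue a log entry."
  [item]
  (dosync
    (when (pos? (get @inventory item 0))
      (alter inventory update-in [item] dec)
      ;; Sends made inside a transaction are held until it commits,
      ;; so a retried transaction does not queue the entry twice.
      (send-off audit-log conj [:reserved item])
      true)))

(reserve! :widgets)
(await audit-log)            ; wait for the queued action to run
@inventory                   ; => {:widgets 2}
@audit-log                   ; => [[:reserved :widgets]]
```

The limitation is that the agent action runs after the commit, so its result is not available inside the transaction that triggered it.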

> My guess is, if your task is something purely computational and amenable to massive parallelization, you may have a go with STM; if it's just about business logic accessible concurrently by many clients, you won't find it workable.

If your task is purely computational and amenable to massive parallelization, you _should_ use agents whenever possible. STM provides for coordination in order to enforce consistency; unless all of your operations are commutative (in which case, you should probably be using agents anyway), a program using STM _will_ provoke retries and other means to route around ref contention. This is acceptable because STM is all about maintaining correctness in the face of concurrent mutation, and not necessarily about performance, aggregate throughput, and so on. On the other hand, ref readers are _never_ blocked (regardless of what's going on on the write side), so the data in such refs is always accessible. This sounds like an ideal combination for "business logic" (as nebulous a term as that is) to me.

Cheers,

- Chas

--
http://cemerick.com
[Clojure Programming from O'Reilly](http://www.clojurebook.com)

Marko Topolnik

Dec 10, 2012, 8:37:08 AM

to clo...@googlegroups.com

On Monday, December 10, 2012 1:56:08 PM UTC+1, Chas Emerick wrote:

> On Dec 10, 2012, at 5:39 AM, Marko Topolnik wrote:

> I personally have never used STM in nontrivial ways (AFAIC), but that's due more to the demands of the problems I run into than to anything else. On the other hand, I have used, abused, and benefitted from agents in umpteen ways. Actually, I have often done things using agents that might otherwise have been done using STM or other similar approaches, simply to ensure that:

> (a) the processing involved can be readily parallelized, and
> (b) if necessary, the system can be partitioned/distributed with minimal impact to the architecture, since, if you're careful about things, it doesn't matter whether a send is evaluated in an in-process agent or one housed in a different server/VM/whatever

The argument (b) is an even better fit (or, should we say, perfect fit) for Actors, as implemented in Erlang.

> It's true that STM is "all or nothing", but it is so over the scope of refs you choose. If there's some side-effecting bit you need to do somewhere, then clearly that's not going to fit within a transaction…but that bit will often fit just fine in a send-off to an agent provoked _by_ a transaction.

send-off fails to be useful whenever you need the results within the transaction (quite often, that is).

>> My guess is, if your task is something purely computational and amenable to massive parallelization, you may have a go with STM; if it's just about business logic accessible concurrently by many clients, you won't find it workable.

> If your task is purely computational and amenable to massive parallelization, you _should_ use agents whenever possible. STM provides for coordination in order to enforce consistency; unless all of your operations are commutative (in which case, you should probably be using agents anyway), a program using STM _will_ provoke retries and other means to route around ref contention. This is acceptable because STM is all about maintaining correctness in the face of concurrent mutation, and not necessarily about performance, aggregate throughput, and so on.

But concurrency is all about performance and throughput. So where is the benefit of using correct, slow concurrent mutation? I guess in a write-seldom, read-often scenario.

> On the other hand, ref readers are _never_ blocked (regardless of what's going on on the write side), so the data in such refs is always accessible. This sounds like an ideal combination for "business logic" (as nebulous a term as that is) to me.

Business logic almost always involves communication with outside systems (since it's usually about integration of many existing systems). Even if not, a scalable solution must be stateless (a prerequisite for cluster deployment) and any durable state must go into a single datasource common to all cluster nodes. Again, these datasources don't participate in an STM transaction. Maybe this would be a major route of improvement: integrate the STM with external datasource transactions. But this is still quite removed from the present.

Paul Butcher

Dec 10, 2012, 9:08:27 AM

to clo...@googlegroups.com

On 10 Dec 2012, at 12:56, Chas Emerick <ch...@cemerick.com> wrote:

> I'd be surprised if Paul doesn't hear from people directly

I wish that that were true, but no, I've not had anyone get in touch off-list.

Many thanks, Marko, for resurrecting the thread - I'm still definitely keen to hear of first-hand experiences!

Paul Butcher

Dec 10, 2012, 9:15:04 AM

to clo...@googlegroups.com

On 10 Dec 2012, at 13:37, Marko Topolnik <marko.t...@gmail.com> wrote:

> But concurrency is all about performance and throughput. So where is the benefit of using correct, slow concurrent mutation? I guess in a write-seldom, read-often scenario.

I'm not at all sure that that's true. There are plenty of occasions where concurrency is about being able to do more than one thing at a time, and not necessarily about making something faster.

For example, your mobile 'phone is concurrent because, while it's playing music to you, it also wants to notice when you poke the screen and listen for incoming calls/messages from the network. And your IDE is concurrent so that it can check the syntax of your code in the background while the UI remains responsive.

I'm not, of course, saying that performance isn't important - even in cases such as the above. It would be a major problem if everything were an order of magnitude slower just because I tried to do two things at the same time. But there are certainly plenty of occasions where we might choose to write concurrent code without our focus being on performance per se.

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Chas Emerick

Dec 10, 2012, 9:17:27 AM

to clo...@googlegroups.com

On Dec 10, 2012, at 8:37 AM, Marko Topolnik wrote:

>> It's true that STM is "all or nothing", but it is so over the scope of refs you choose. If there's some side-effecting bit you need to do somewhere, then clearly that's not going to fit within a transaction…but that bit will often fit just fine in a send-off to an agent provoked _by_ a transaction.

> send-off fails to be useful whenever you need the results within the transaction (quite often, that is).

I'm not aware of any system that provides transactional semantics in the face of in-transaction side-effecting actions. If you can refer me to any, that'd be great.

>>> My guess is, if your task is something purely computational and amenable to massive parallelization, you may have a go with STM; if it's just about business logic accessible concurrently by many clients, you won't find it workable.

>> If your task is purely computational and amenable to massive parallelization, you _should_ use agents whenever possible. STM provides for coordination in order to enforce consistency; unless all of your operations are commutative (in which case, you should probably be using agents anyway), a program using STM _will_ provoke retries and other means to route around ref contention. This is acceptable because STM is all about maintaining correctness in the face of concurrent mutation, and not necessarily about performance, aggregate throughput, and so on.

> But concurrency is all about performance and throughput. So where is the benefit of using correct, slow concurrent mutation? I guess in a write-seldom, read-often scenario.

Fundamentally, concurrency is about simultaneous independent computation. Depending on the domain and computations involved, single-thread performance and aggregate throughput can vary significantly.

Anyway, read-heavy applications are still the norm in most industrial settings, despite the rise in popularity of write-scalable architectures.

>> On the other hand, ref readers are _never_ blocked (regardless of what's going on on the write side), so the data in such refs is always accessible. This sounds like an ideal combination for "business logic" (as nebulous a term as that is) to me.

> Business logic almost always involves communication with outside systems (since it's usually about integration of many existing systems). Even if not, a scalable solution must be stateless (a prerequisite for cluster deployment) and any durable state must go into a single datasource common to all cluster nodes. Again, these datasources don't participate in an STM transaction. Maybe this would be a major route of improvement: integrate the STM with external datasource transactions. But this is still quite removed from the present.

I'm certain that particular set of requirements holds in certain settings, but they are hardly universal.

If I may make a tenuous inference, it sounds like you're trying to fit every state transition within an application into a transaction. If so, I'd recommend the opposite: decomposing applications and their processes into modular bags of state and treating them separately will lead to big wins — including potentially being able to use e.g. STM in one place, and agents in another, each interacting with the other as necessary.

Re: getting disparate datasources to participate in transactions, you might want to take a look at Avout:

http://avout.io

I can't say I've used it, but it is at least an existence proof of the ability of the Clojure STM model to be distributable.

Cheers,

- Chas

Marko Topolnik

Dec 10, 2012, 9:37:00 AM

to clo...@googlegroups.com

On Monday, December 10, 2012 3:15:04 PM UTC+1, Paul Butcher wrote:

> On 10 Dec 2012, at 13:37, Marko Topolnik <marko.t...@gmail.com> wrote:

>> But concurrency is all about performance and throughput. So where is the benefit of using correct, slow concurrent mutation? I guess in a write-seldom, read-often scenario.

> I'm not at all sure that that's true. There are plenty of occasions where concurrency is about being able to do more than one thing at a time, and not necessarily about making something faster.

> For example, your mobile 'phone is concurrent because, while it's playing music to you, it also wants to notice when you poke the screen and listen for incoming calls/messages from the network. And your IDE is concurrent so that it can check the syntax of your code in the background while the UI remains responsive.

My thinking always assumes the existence---and prevalence---of lock-based concurrency. Problems where performance is not critical are usually not too hard to solve with locks: just use some simple, coarse locking scheme. I'd need a quite convincing case where locks are an obvious disaster and an STM-based approach comes to the rescue. These are hard to find, and personally I have tried several times to start out with STM, only to end up falling back to locks.

Marko Topolnik

Dec 10, 2012, 9:55:48 AM

to clo...@googlegroups.com

On Monday, December 10, 2012 3:17:27 PM UTC+1, Chas Emerick wrote:

> On Dec 10, 2012, at 8:37 AM, Marko Topolnik wrote:

>>> It's true that STM is "all or nothing", but it is so over the scope of refs you choose. If there's some side-effecting bit you need to do somewhere, then clearly that's not going to fit within a transaction…but that bit will often fit just fine in a send-off to an agent provoked _by_ a transaction.

>> send-off fails to be useful whenever you need the results within the transaction (quite often, that is).

> I'm not aware of any system that provides transactional semantics in the face of in-transaction side-effecting actions. If you can refer me to any, that'd be great.

I am comparing this with a mutex-based solution, which is still the default way to implement thread safety. Obviously, no problems with side effects there.

>> But concurrency is all about performance and throughput. So where is the benefit of using correct, slow concurrent mutation? I guess in a write-seldom, read-often scenario.

> Fundamentally, concurrency is about simultaneous independent computation. Depending on the domain and computations involved, single-thread performance and aggregate throughput can vary significantly.

> Anyway, read-heavy applications are still the norm in most industrial settings, despite the rise in popularity of write-scalable architectures.

So again, I would like to see the benefit of an STM over a lock-based solution. Read-heavy scenarios behave well with read/write locks and if there's not much writing around, it's usually not too complex to be kept under control with locks. So an STM-based solution would have to offer a) less complexity due to no locks and b) not incur its own complexity while dealing with side effects.

>> Business logic almost always involves communication with outside systems (since it's usually about integration of many existing systems). Even if not, a scalable solution must be stateless (a prerequisite for cluster deployment) and any durable state must go into a single datasource common to all cluster nodes. Again, these datasources don't participate in an STM transaction. Maybe this would be a major route of improvement: integrate the STM with external datasource transactions. But this is still quite removed from the present.

> I'm certain that particular set of requirements holds in certain settings, but they are hardly universal.

> If I may make a tenuous inference, it sounds like you're trying to fit every state transition within an application into a transaction. If so, I'd recommend the opposite: decomposing applications and their processes into modular bags of state and treating them separately will lead to big wins, including potentially being able to use e.g. STM in one place, and agents in another, each interacting with the other as necessary.

Usually you have a unit of work to complete. If any part of it involves side effects, you'll need a mutex around it, and at that point the STM brings nothing. Another frequent problem is having any kind of time-heavy action, which you must make sure to execute only once (even if it is retryable by nature). Every new problem I start to work with, I first think long and hard how STM could fit into the picture; I have failed every time. Mind that my company is an early Clojure adopter ("we remember when #clojure channel had only 6 people in it" :)

> Re: getting disparate datasources to participate in transactions, you might want to take a look at Avout:

> http://avout.io

> I can't say I've used it, but it is at least an existence proof of the ability of the Clojure STM model to be distributable.

I'll definitely check it out, maybe it gives me good ideas for the future. Thanks!

Marko Topolnik

Dec 11, 2012, 5:25:58 AM

to clo...@googlegroups.com

To give the full story, I should add that atoms are very natural to use and many concurrent use cases are covered by them alone. The combination of atom and immutable vector/map goes a long way and they are also useful even with mutable data, such as lazy-initialized singletons, resources that need to be re-acquired after failure, and other similar cases.
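As a rough illustration of the atom-plus-immutable-map combination, here is a minimal sketch using only clojure.core. The `sessions` atom and the `log-in!` and `record-hit!` functions are hypothetical names chosen for the example.

```clojure
;; Hypothetical state: user id -> session data, held as one immutable map.
(def sessions (atom {}))

(defn log-in! [user-id]
  ;; swap! applies a pure function to the current value and retries on
  ;; contention, so the function passed in should have no side effects.
  (swap! sessions assoc user-id {:hits 0}))

(defn record-hit! [user-id]
  (swap! sessions update-in [user-id :hits] inc))

(log-in! :alice)
(record-hit! :alice)
(record-hit! :alice)
(get-in @sessions [:alice :hits])   ; => 2
```

Readers simply deref the atom and get a consistent, immutable snapshot; no lock is taken on either side.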

In general I can say that STM transactions of very fine granularity are easy to work with because they effortlessly intertwine with mutable data and side effects.

Patrick Logan

Dec 11, 2012, 1:34:08 PM

to clo...@googlegroups.com

I am unsure whether you are writing about STM in general or in Clojure specifically.

I worked for Gemstone Systems for five years on the object engine as well as applications of the distributed, multi-user, garbage-collected STM that is the centerpiece of Gemstone Smalltalk. During that time I worked with several customer applications where STM had both positive and negative contributions.

If this is of interest, you can contact me directly.

I can say briefly that Gemstone Smalltalk and its multi-user STM was and is being used for:

1. tracking nearly every container shipped across the Pacific ocean.

2. quickly developing cutting-edge financial trading instruments.

3. quickly developing mobile communications billing policies.

4. controlling and monitoring large semiconductor fabs.

5. dispatching utility repair equipment throughout the southeastern U.S.

6. pharmaceuticals, ...

7. insurance policies, ...

8. ad hoc, distributed workflows, ...

-Patrick

Stuart Halloway

Dec 11, 2012, 2:41:06 PM

to clo...@googlegroups.com

Hi Paul,

If it isn't too late to change your chapter title, I would encourage emphasizing Clojure's model of references and values in general, and the option of implementing a variety of different reference semantics that all conform to the same basic API shape.

That general approach has been game-changing for me, and the STM occupies a rather small niche in the overall space.

Datomic stores the entire database in an atom (not an STM ref), and updates it with a call to swap! It is literally no more complex than a trivial hackneyed book example. :-)

Cheers,

Stu

Stuart Halloway

Dec 11, 2012, 2:53:40 PM

to clo...@googlegroups.com

Hi Paul,

Here is a real-world, production software example of the advantage of values+refs over mutable objects and locks. A Datomic user reported the following stack trace as a potential bug:

12:45:43.480 [qtp517338136-84] WARN c.v.a.s.p.e.UnknownExceptionHandler - UnknownExceptionHandler: null
java.util.ConcurrentModificationException: null
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:819) ~[na:1.7.0_07]
at java.util.ArrayList$Itr.next(ArrayList.java:791) ~[na:1.7.0_07]
at clojure.core.protocols$fn__5871.invoke(protocols.clj:76) ~[clojure-1.4.0.jar:na]
at clojure.core.protocols$fn__5828$G__5823__5841.invoke(protocols.clj:13) ~[clojure-1.4.0.jar:na]
at clojure.core$reduce.invoke(core.clj:6030) ~[clojure-1.4.0.jar:na]

I immediately had 99% confidence that the bug was in user code, and even a pretty good idea what went wrong. A call to "reduce" is a functional transformation, and it expects to be passed values. The exception clearly indicates a violation of that contract, and is caused by cross-thread aliasing and mutation in the calling code.

Regards,

Stu


Timothy Baldridge

Dec 11, 2012, 4:11:36 PM

to clo...@googlegroups.com

I want to +1 what Stuart said. In my research on the subject, almost every implementation of STM that allows for mutable-by-default data has ended up as a miserable failure.

Specifically see the results from Microsoft's research: http://www.infoq.com/news/2010/05/STM-Dropped

Clojure's implementation of STM hinges on the fact that most data is immutable. Thus a transaction that reads 100 items from one ref and writes the results to two other refs needs to track updates to only 3 pointers instead of 103.
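A small sketch of that shape of transaction, assuming only clojure.core; the ref names `readings`, `total`, and `maximum` are hypothetical. The 100 items live in one immutable vector behind a single ref, so the transaction touches three refs in all.

```clojure
(def readings (ref (vec (range 100))))  ; 100 immutable items behind one ref
(def total    (ref 0))
(def maximum  (ref nil))

(defn summarize! []
  (dosync
    (let [xs @readings]                 ; one consistent snapshot of the vector
      (ref-set total (reduce + xs))
      (ref-set maximum (apply max xs)))))

(summarize!)
[@total @maximum]                       ; => [4950 99]
```

A plain deref of `readings` does not stop another transaction from changing it before this one commits; `ensure` could be used instead if that matters, at some cost in concurrency.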

The PyPy STM model is a bit different (http://morepypy.blogspot.com/2012/05/stm-update-back-to-threads.html). Here they assume that all threads are serial (global lock) and then selectively enable STM in places where it is known that threads are unlikely to clobber each other's data. It has resulted in slight success, but that's mostly due to the fact that Python currently is locked to a single core, so any concurrency is better than nothing in that regard. That being said, the PyPy guys have a 10+ year history of proving scoffers wrong, so there's hope.

So as Stuart said, it's worth pointing out that in an OO, mutable-by-default language, STM is basically worthless.

Timothy Baldridge

On Tue, Dec 11, 2012 at 1:30 PM, Marko Topolnik <marko.t...@gmail.com> wrote:

> Just curious, how did you immediately eliminate the possibility that the reducing function was mutating the list that is being reduced? No concurrency involved. In regular Java the 95% leading cause of CME is precisely that.

> Anyway, this applies to immutable structures per se, whether combined with atoms, refs, or none. But a full wartime story must also cover how the solution avoids the pitfalls of retryable transactions. This is the real sore point in my experience, and the one which makes STM an all-or-nothing enterprise.


--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Stuart Halloway

Dec 11, 2012, 4:53:08 PM

to clo...@googlegroups.com

Is it possible to write a reducing function that mutates a list in Clojure? Sure. But I think it is absurdly unlikely that it would happen by accident. My 1% chance wasn't hedging against that case -- I was hedging against a bug in reduce itself.

I don't really see even a 1% likelihood of either of those, but I played D&D as a kid and learned that even the most unlikely things happen one time in twenty. :-)

Stu

On Tue, Dec 11, 2012 at 3:30 PM, Marko Topolnik <marko.t...@gmail.com> wrote:

> Just curious, how did you immediately eliminate the possibility that the reducing function was mutating the list that is being reduced? No concurrency involved. In regular Java the 95% leading cause of CME is precisely that.

> Anyway, this applies to immutable structures per se, whether combined with atoms, refs, or none. But a full wartime story must also cover how the solution avoids the pitfalls of retryable transactions. This is the real sore point in my experience, and the one which makes STM an all-or-nothing enterprise.


Marko Topolnik

Dec 12, 2012, 2:24:31 AM

to clo...@googlegroups.com

Yes, upon second thought I saw exactly what you mean.

I think you make an important point: when talking about the STM we need to look at the wider picture. Where a mutable-by-default language needs the STM, Clojure gets by with just atoms because a single swap! call can do any number of mutations at once. We naturally tend to design things so that everything related fits into a single structure and the result simply doesn't need the STM. Therefore the threshold of complexity beyond which you need an STM solution in Clojure is quite a bit higher than usual.
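To make the "any number of mutations at once" point concrete, here is a minimal sketch using only clojure.core. The `board` atom and the `complete` function are hypothetical; the point is that three related changes happen in one atomic step because they live in one immutable structure.

```clojure
;; Hypothetical task board kept in a single immutable map.
(def board (atom {:pending #{:a :b} :done #{} :completed 0}))

(defn complete
  "Pure function: move task from :pending to :done and bump the counter."
  [state task]
  (if (contains? (:pending state) task)
    (-> state
        (update-in [:pending] disj task)
        (update-in [:done] conj task)
        (update-in [:completed] inc))
    state))

(swap! board complete :a)
;; => {:pending #{:b}, :done #{:a}, :completed 1}
```

No observer can see the task removed from `:pending` but not yet in `:done`, which is the kind of invariant that would otherwise call for several refs and a transaction.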

An important angle to my experience with Clojure that I now see must be added is that I never got frustrated with Clojure's STM because it failed to deliver: I just found that my solutions worked perfectly well without it.

john

Dec 12, 2012, 3:50:31 AM

to clo...@googlegroups.com

So is the bottom line that STM should not have been added to Clojure (because it is not practical)?

Many greetings
John

Marko Topolnik

Dec 12, 2012, 4:46:06 AM

to clo...@googlegroups.com

I wouldn't say that it should not have been added, since its presence isn't harming anything. You could say, though, that hardly anyone would realize something was missing if Clojure didn't have the STM.

Warren Lynn

Dec 12, 2012, 11:21:49 AM

to clo...@googlegroups.com

On Wednesday, December 12, 2012 4:46:06 AM UTC-5, Marko Topolnik wrote:

> I wouldn't say that it should not have been added since its presence isn't harming anything. You could say, though, that rarely anyone would realize something was missing if Clojure didn't have the STM.

Although I am convinced that STM can solve things that locks cannot (see the claim "lock-based programs do not compose" on the Wikipedia page http://en.wikipedia.org/wiki/Software_transactional_memory), I feel this feature is heavily oversold. Whenever you read someone raving about Clojure on the web, they mention "STM" as a key feature and how wonderful it is. My own experience is similar to yours: atoms work most of the time, and I also need to use locks. I benefit more from the fact that Clojure clearly marks what is mutable and what is not than from any of those "advanced" features.

Paul Butcher

Dec 12, 2012, 12:42:35 PM

to clo...@googlegroups.com

Hey Stuart,

Thanks for the response.

What I'm trying to do is keep each chapter focussed on an approach, rather than a language. For example, in the chapter on Actors, I'll be showing examples in Scala, but the discussion won't be (I hope!) particularly Scala-specific. I hope to leave the reader with general lessons which could be applied to Scala, Erlang, or any other language with Actor support. Similarly, when talking about Threads and Locks, I'll be showing examples in Java, but the lessons should be equally applicable to C/C++, etc.

I completely take your point about Clojure's approach being a great deal more than STM. I guess that I chose STM as the title because it's got visibility - people are talking about it, and there will be an expectation on the part of the reader that any book that covers concurrency will spend some time talking about it.

I'd be very interested to hear any suggestions for an alternative chapter title. I guess what best sums up Clojure's approach is that it separates state from identity - but "Separating State from Identity" isn't exactly pithy, and I fear won't mean much at first glance to most readers.

Ryan Kelker

Dec 13, 2012, 4:14:46 PM12/13/12

to clo...@googlegroups.com

@Paul Butcher

I would argue that Clojure's STM implementation is very similar to, or based on, the design of Apache CouchDB's Multi-Version Concurrency Model.

1. Immutable by default.
2. You can't corrupt a completed transaction.
3. Conflict resolution essentially gives the previous revision before the conflict occurred and then gives the latest revision from there on out.
4. Depending on the type of transaction, the transaction will attempt to restart on failure.

Let me know if this is no longer accurate.

Ben Mabey

Dec 13, 2012, 4:47:32 PM12/13/12

to clo...@googlegroups.com

> Datomic stores the entire database in an atom (not an STM ref), and
> updates it with a call to swap! It is literally no more complex than a
> trivial hackneyed book example. :-)
>

A lot of my systems have evolved into something similar and I've
wondered what the implications of this approach are. As more and more
state is added to this single atom, with multiple threads performing
swap!s (CASs), how will performance be affected? I.e., how will write
contention play out in a system designed like this? I'm sure the answer
to this depends on the details of the system, but at what point does this
become a problem?
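
For readers who want to see the mechanics behind that question, here is a minimal sketch in Java of a single "atom" updated by a compare-and-set retry loop. Java's AtomicReference stands in for a Clojure atom; the names SingleAtomSketch, State, swap and run are made up for illustration, and this is not a description of how any particular system is built. A writer that loses the race discards its result and recomputes, and the sketch counts how often that happens.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

final class SingleAtomSketch {

    /** Immutable snapshot standing in for "all of the application state" (hypothetical). */
    static final class State {
        final long counter;
        State(long counter) { this.counter = counter; }
        State increment() { return new State(counter + 1); }
    }

    /** One swap!-style update. Returns how many attempts were discarded. */
    static long swap(AtomicReference<State> atom) {
        long discarded = 0;
        while (true) {
            State current = atom.get();
            State next = current.increment();      // pure function of the snapshot
            if (atom.compareAndSet(current, next)) {
                return discarded;                  // nobody else wrote in between
            }
            discarded++;                           // lost the race: recompute and retry
        }
    }

    /** Returns { final counter, total discarded attempts }. */
    static long[] run(int threads, int updatesPerThread) {
        AtomicReference<State> atom = new AtomicReference<>(new State(0));
        AtomicLong discarded = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < updatesPerThread; j++) {
                    discarded.addAndGet(swap(atom));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new long[] { atom.get().counter, discarded.get() };
    }

    public static void main(String[] args) {
        long[] result = run(8, 100_000);
        System.out.println("counter=" + result[0] + " discarded attempts=" + result[1]);
    }
}
```

No update is ever lost, so the final counter always equals threads times updates per thread. The number of discarded attempts varies from run to run; it is the wasted work that write contention on a single cell produces.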

In the epic thread about the STM between Rich and Cliff Click[1] the
main argument against the STM was that it didn't help solve the problem
of where to place guards around the data. From one of Cliff's arguments:

In a trivial example I can say "go up one call level and atomic there",
but in the Real Program – I can't do that.
Go up how many layers and add atomic? 1 layer? 10 layers? 100 layers?
Yes, I see unique call-stacks
with >100 layers. I can't put atomic around main because that makes my
program single-threaded.

I believe Cliff is arguing here that when a program pushes all of the
state into a single atom where a lot of writes occur that app
effectively is single-threaded. (Please correct me if I am
misunderstanding!) Thoughts?

-Ben

1. http://www.azulsystems.com/blog/cliff/2008-05-27-clojure-stms-vs-locks

Raoul Duke

Dec 13, 2012, 4:58:54 PM12/13/12

to clo...@googlegroups.com

> the design of Apache CouchDB's Multi-Version Concurrency Model.

because haskell got it from apache, i'm sure ;-)

Wei Hsu

Dec 13, 2012, 6:45:11 PM12/13/12

to Clojure

To add to the conversation, I wrote an agent-based website load tester
earlier this year for work. Happy to share my thoughts with Paul
offline if he thinks it's useful, although I wouldn't be able to share
the code itself.

Patrick Logan

Dec 13, 2012, 7:22:31 PM12/13/12

to clo...@googlegroups.com

Paul,

Another concurrency model I've used a great deal is the tuplespace model, specifically javaspaces. This is an often forgotten model that has a lot to offer with a high expressiveness to complexity ratio.

Not Clojure-specific, so feel free to contact me again directly if you're interested.

kovas boguta

Dec 13, 2012, 7:30:27 PM12/13/12

to clo...@googlegroups.com

My recommendation is either "Persistent Datastructures" or "Database as a Value"

It's shocking and amazing that an entire database (e.g., the most
concurrent stateful thing you can imagine) requires just a handful of
atoms. Check out
http://www.infoq.com/presentations/Datomic-Database-Value

Persistent datastructures are really the core of Clojure's
concurrency. If you can't incorporate novelty without cloning the
entire datastructure, that's not that useful.

Rick Moynihan

Dec 13, 2012, 9:21:15 PM12/13/12

to clo...@googlegroups.com

On 12 December 2012 16:21, Warren Lynn <wrn....@gmail.com> wrote:

Although I am convinced that STM can solve things that locks cannot (see the claim "lock-based programs do not compose" on the Wikipedia page http://en.wikipedia.org/wiki/Software_transactional_memory), I feel this feature is heavily oversold. Whenever you read someone raving about Clojure on the web, they mention "STM" as a key feature and how wonderful it is. My own experience is similar to yours: atoms work most of the time, and I also need to use locks. I benefit more from the fact that Clojure clearly marks what is mutable and what is not than from any of those "advanced" features.

Whilst I don't have any strong use-cases, I always felt the composability and flexibility of refs is the important feature, and is something to consider on a case by case basis.

When you choose an atom you are effectively saying that nobody else will ever need to ensure consistency between this identity and another. This is fine if you're writing an application, where you can stake a claim and create an atom for global state, but what about when a library you use does that? Or maybe you're writing a library? If you choose an atom, might you be forcing a decision that belongs with the application?

Yes, I know Clojure libraries typically don't expose identities and state in this manner, as typically you'd just want to provide the functional stuff and let the application manage the state, but surely there are some libraries where this is done and perhaps desirable?

In these cases (do they exist?), where you want to expose an identity to others, should you not be using a ref?
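
To make the consistency point concrete, here is a small sketch in Java, which has no built-in STM, so only the atom side of the trade-off is shown. The names CoordinationSketch, Balances, totalSeenBetweenUpdates and totalAfterSingleCellTransfer are hypothetical. Two independently atomic cells cannot be changed together, so another thread can observe the moment between the two updates. Folding both values into one immutable value behind a single cell fixes that, but only whoever owns both cells can make that choice.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

final class CoordinationSketch {

    /** Immutable pair of balances (hypothetical example data). */
    static final class Balances {
        final long from;
        final long to;
        Balances(long from, long to) { this.from = from; this.to = to; }
        Balances transfer(long amount) { return new Balances(from - amount, to + amount); }
        long total() { return from + to; }
    }

    /** Two separate cells: returns the total a reader could see between the two updates. */
    static long totalSeenBetweenUpdates(long start, long amount) {
        AtomicLong from = new AtomicLong(start);
        AtomicLong to = new AtomicLong(0);
        from.addAndGet(-amount);                   // atomic on its own
        long observed = from.get() + to.get();     // what another thread could read right now
        to.addAndGet(amount);                      // atomic on its own
        return observed;
    }

    /** One cell holding both values: every reader sees a consistent pair. */
    static long totalAfterSingleCellTransfer(long start, long amount) {
        AtomicReference<Balances> cell = new AtomicReference<>(new Balances(start, 0));
        Balances after = cell.updateAndGet(b -> b.transfer(amount));
        return after.total();
    }

    public static void main(String[] args) {
        System.out.println("two cells, mid-transfer total: " + totalSeenBetweenUpdates(100, 30));
        System.out.println("one cell, total after transfer: " + totalAfterSingleCellTransfer(100, 30));
    }
}
```

If a library has already put its state in its own cell, the application cannot merge it with anything else after the fact, which is the decision the post above is worried about.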

Rick.

Paul Butcher

Dec 14, 2012, 12:51:50 AM12/14/12

to clo...@googlegroups.com

On 14 Dec 2012, at 00:22, Patrick Logan <patric...@gmail.com> wrote:

Another concurrency model I've used a great deal is the tuplespace model, specifically javaspaces. This is an often forgotten model that has a lot to offer with a high expressiveness to complexity ratio.

Ah! That brings back memories :-) I wrote my PhD thesis on Linda back in the early 90s.

I agree that it's a cute model (I thought that it was cute enough to spend 3 years of my life on it!) but I'm not aware of anyone using it in anger now? I'd be very interested to hear about live projects.

I know about Javaspaces/JINI/River, but I've not really heard about anyone using them?

Paul Butcher

Dec 14, 2012, 12:55:39 AM12/14/12

to clo...@googlegroups.com

On 14 Dec 2012, at 00:30, kovas boguta <kovas....@gmail.com> wrote:

My recommendation is either "Persistent Datastructures" or "Database as a Value"

Interesting. I'd be interested to hear others' thoughts on this, in particular Rich's.

Rich - what is the "soundbite description" of Clojure's concurrency model you're happiest with?

If you can't incorporate novelty without cloning the
entire datastructure, that's not that useful.

I'm not 100% sure what you mean by this - can you expand?

Ryan Kelker

Dec 14, 2012, 2:42:16 AM12/14/12

to clo...@googlegroups.com

I don't really care where good ideas come from. Feel free to expand your mind.

On Friday, December 14, 2012 at 6:58:54 AM UTC+9, raould wrote:

kovas boguta

Dec 14, 2012, 3:57:51 AM12/14/12

to clo...@googlegroups.com

On Fri, Dec 14, 2012 at 12:55 AM, Paul Butcher <pa...@paulbutcher.com> wrote:
> On 14 Dec 2012, at 00:30, kovas boguta <kovas....@gmail.com> wrote:

> If you can't incorporate novelty without cloning the
> entire datastructure, that's not that useful.
>
>
> I'm not 100% sure what you mean by this - can you expand?

The principal failing of locks is that they don't compose. The principal
failing of objects w.r.t. concurrency is that locking recipes don't
compose.

Being able to do concurrent operations on composite objects is a big
win for Clojure. Very often we want concurrent operations on something
that isn't just a simple number, but a map, array, or some more
complex nested structure.

Values ensure correctness, but the performance would be terrible if
every time you changed 1 element in the array, we had to create a new
array. Persistent datastructures allow for structural sharing, thus
giving you correctness and good performance. Without that, you could
never build a database as a value, for example.
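
As a rough illustration of structural sharing, here is a sketch in Java of a tiny persistent binary search tree that uses path copying. The names PersistentTreeSketch, Node, assoc and get are made up for this example, and Clojure's own collections use much wider trees, so treat it as the idea only. An update copies just the nodes on the path from the root to the changed key; everything else is shared with the previous version, which stays valid.

```java
final class PersistentTreeSketch {

    /** Immutable tree node; fields never change after construction. */
    static final class Node {
        final int key;
        final String value;
        final Node left;
        final Node right;
        Node(int key, String value, Node left, Node right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** Returns a new root containing key -> value; the input tree is never modified. */
    static Node assoc(Node node, int key, String value) {
        if (node == null) {
            return new Node(key, value, null, null);
        }
        if (key < node.key) {
            // Copy this node, rebuild the left path, share the right subtree as-is.
            return new Node(node.key, node.value, assoc(node.left, key, value), node.right);
        }
        if (key > node.key) {
            // Copy this node, share the left subtree as-is, rebuild the right path.
            return new Node(node.key, node.value, node.left, assoc(node.right, key, value));
        }
        return new Node(key, value, node.left, node.right);
    }

    /** Looks up a key; returns null when it is absent. */
    static String get(Node node, int key) {
        while (node != null) {
            if (key < node.key) node = node.left;
            else if (key > node.key) node = node.right;
            else return node.value;
        }
        return null;
    }

    /** True when an update on the left side shares the right subtree and leaves the old version intact. */
    static boolean updateSharesUntouchedSubtree() {
        Node v1 = assoc(assoc(assoc(null, 50, "root"), 25, "left"), 75, "right");
        Node v2 = assoc(v1, 25, "left, updated");
        return v1.right == v2.right
                && "left".equals(get(v1, 25))
                && "left, updated".equals(get(v2, 25));
    }

    public static void main(String[] args) {
        System.out.println("right subtree shared, old version intact: " + updateSharesUntouchedSubtree());
    }
}
```

Because old versions are never modified, a reader holding v1 needs no lock while a writer produces v2.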

I recommend "The Value of Values", "The Database as a Value", and any early
Clojure talk.

Also this thread
https://groups.google.com/forum/?hl=en&fromgroups#!topic/clojure/XHqWLMcsH-c

mentions the same themes. It's interesting because Rich makes multiple
comparisons between clojure's concurrency model and database systems.

Marko Topolnik

Dec 14, 2012, 7:04:37 AM12/14/12

to clo...@googlegroups.com

In the epic thread about the STM between Rich and Cliff Click[1] the
main argument against the STM was that it didn't help solve the problem
of where to place guards around the data. From one of Cliff's arguments:

In a trivial example I can say "go up one call level and atomic there",
but in the Real Program – I can't do that.
Go up how many layers and add atomic? 1 layer? 10 layers? 100 layers?
Yes, I see unique call-stacks
with >100 layers. I can't put atomic around main because that makes my
program single-threaded.

I believe Cliff is arguing here that when a program pushes all of the
state into a single atom where a lot of writes occur that app
effectively is single-threaded. (Please correct me if I am
misunderstanding!) Thoughts?

Actually, this is one of Cliff's weaker points. Note that when he says "atomic", he really means "dosync". He speaks from experience with HotSpot where this was a constant source of bugs. It doesn't translate directly into the same pitfalls with Clojure's STM because 1) Clojure has strict control over what needs a transaction to mutate and 2) the points of mutation are much more focused when you are dealing with immutable structures.

Cliff's strongest argument comes from experience with MPI, where he raises very valid points against any implementation of a high-performance concurrent library:

This is exactly the trap MPI fell into; and you have to do it anyways. Double-unsmiley. :-( :-(
Here’s the deal:
I write a Large Complex Program, one that Really Needs STM to get it right.
But performance sucks.
So I do a deep dive into the STM runtime, and discover it has warts.
So I hack my code to work around the warts in the STM.
Crap like: at an average of 5 cpus in this atomic block the STM ‘works’, but at an average of 7 cpus in the same atomic block I get a continuous fail/retry rate that’s so bad I might as well not bother. So I guard the STM area with a “5-at-a-time” lock and a queue of threads waiting to enter. Bleah (been there; done that – for a DB not an STM but same-in-principle situation). A thousand thousand variations of the same crap happens, each requiring a different hack to my code to make it performant.
Meanwhile the STM author (You: Rich) hacks out some warts & hands me a beta version.
I hack my code to match the STM’s new behavior, and discover some new warts.
Back & Forth we go – and suddenly: my app’s “performance correctness” is intimately tied to the exact STM implementation. Any change to the STM’s behavior kills my performance – and you, Rich, have learned a lot about the building of a robust good STM. You (now) know the mistakes you made and know it’s time to restart the STM implementation from scratch.
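
The "5-at-a-time" guard Cliff describes can be sketched in Java with a fair counting semaphore placed in front of the contended section. This is only an illustration of the workaround, not code from his post; the names AdmissionGuardSketch and peakConcurrency are hypothetical, and Thread.yield() is a stand-in for the real transactional work.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class AdmissionGuardSketch {

    /** Runs the workers through the guarded section and returns the peak number inside at once. */
    static int peakConcurrency(int threads, int permits, int roundsPerThread) {
        Semaphore gate = new Semaphore(permits, true);  // fair: waiting threads queue in FIFO order
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int round = 0; round < roundsPerThread; round++) {
                    gate.acquireUninterruptibly();      // wait for one of the permits
                    try {
                        int now = inside.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.yield();                 // placeholder for the contended work
                    } finally {
                        inside.decrementAndGet();
                        gate.release();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak threads inside the guarded section: " + peakConcurrency(12, 5, 200));
    }
}
```

The permit count (5 here) is exactly the kind of tuning constant the quote complains about: it encodes knowledge of how one particular implementation behaves under load.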

On the other hand, this is a problem occurring only at the most demanding level of load on the code. People may still benefit from the STM to write simple, correct concurrent programs. As I already explained, in my experience atoms cover 98% of that need and locks are still unavoidable. If I wrote a whole system from scratch and based everything on the STM, then it could replace locks. This will never happen in a JDK-based Clojure.

Rich Hickey

Dec 14, 2012, 8:52:07 AM12/14/12

to clo...@googlegroups.com

On Dec 14, 2012, at 12:55 AM, Paul Butcher wrote:

> On 14 Dec 2012, at 00:30, kovas boguta <kovas....@gmail.com> wrote:
>
>> My recommendation is either "Persistent Datastructures" or "Database as a Value"
>
> Interesting. I'd be interested to hear others' thoughts on this, in particular Rich's.
>
> Rich - what is the "soundbite description" of Clojure's concurrency model you're happiest with?

Ah, soundbites, the foundation of modern programmer education :)

How about this:

"Clojure doesn't need a concurrency model. It has a state model that can be realized in multiple concurrency-safe ways."

For the state model:

"Separate identities and values"

The best thing about Clojure's reference types is that they exist. The best thing about Clojure is that you rarely need them. Certainly, pigeonholing Clojure as STM-based is way off the mark.

Also, a chapter on STM that doesn't distinguish Clojure and Haskell's functional STM approach from the ordinary "wrap your old imperative code in transactions" approach is going to miss the biggest point. People reading a chapter on the 'STM model' independent of the functional approach of the languages in which it has succeeded are bound to be disappointed and ill-informed.

I'd argue that the 'concurrency model' of Clojure and Haskell is 'functional programming' + reference types. Their STMs are a subset of that. As Kovas has pointed out, the symbiosis of persistent data structures and this reference approach is fundamental.

I guess this would be my alternative chapter title proposal:

Functional Programming + Reference Types

Rich

Paul Butcher

Dec 14, 2012, 10:04:10 AM12/14/12

to clo...@googlegroups.com

On 14 Dec 2012, at 13:52, Rich Hickey <richh...@gmail.com> wrote:

On Dec 14, 2012, at 12:55 AM, Paul Butcher wrote:
Rich - what is the "soundbite description" of Clojure's concurrency model you're happiest with?

Ah, soundbites, the foundation of modern programmer education :)

Maybe I should have said "least unhappy"? :-)

Certainly, pigeonholing Clojure as STM-based is way off the mark.

Agreed 100%.

Also, a chapter on STM that doesn't distinguish Clojure and Haskell's functional STM approach from the ordinary "wrap your old imperative code in transactions" approach is going to miss the biggest point.

Also agreed 100%. That's a (very) large part of why I've chosen Clojure as the language for this chapter.

I guess this would be my alternative chapter title proposal:

Functional Programming + Reference Types

I can see that logic. Unfortunately, as well as this chapter, I also plan to have a chapter on functional programming using Haskell's Par and Eval monads, so I need some kind of title that draws the distinction. I know that there are similarities between the approaches, but they also have very distinct flavours (as you say in your "The Database as a Value" talk, it's thoroughly unclear how to represent an immutable value that performs IO as a monad).

For the avoidance of doubt, I don't for one second disagree that Clojure is more than STM. Nor do I disagree that it's important that I cover the separation of values from identity, atoms and agents as well as refs/STM. I'm just trying to come up with a pithy chapter title.

How about "transactional state"? "Immutable state"?

Marko Topolnik

Dec 14, 2012, 10:48:47 AM12/14/12

to clo...@googlegroups.com

When you choose an atom you are effectively saying that nobody else will ever need to ensure consistency between this identity and another. This is fine if you're writing an application, where you can stake a claim and create an atom for global state, but what about when a library you use does that? Or maybe you're writing a library? If you choose an atom, might you be forcing a decision that belongs with the application?

Yes, I know Clojure libraries typically don't expose identities and state in this manner, as typically you'd just want to provide the functional stuff and let the application manage the state, but surely there are some libraries where this is done and perhaps desirable?

In these cases (do they exist?), where you want to expose an identity to others, should you not be using a ref?

Using refs throughout a library only makes sense if the library has absolutely no non-transactional side-effects (such as using any mutable Java API). There is a very narrow space where both 1) there are side effects and 2) all of them are transactional. One such example is the lamina library, which optionally supports fully transactional objects. Do note that the primary reason why the library supports both transactional and non-transactional modes is performance. It seems like you can't have both performance and transactions in today's Clojure.

Patrick Logan

Dec 14, 2012, 7:52:11 PM12/14/12

to clo...@googlegroups.com

Contact the Apache River folks for details. There have been several published accounts, but it is definitely the case that many Jini/JavaSpaces users felt it was in their own interest not to draw attention to this technology, as it was determined to be a competitive advantage.

http://river.apache.org/

You can also contact the GigaSpaces commercial effort, where they are very willing to talk: http://www.gigaspaces.com/