Are explicit locking designs free from such anomalies? I wouldn't think
so.
Randall Schulz
paraphrase: the behaviour under load gets weird and tricky.
well, in some ways maybe that doesn't happen with locks: depending on
which STM system we're looking at, figuring out what caused all the
rollbacks can be tricky, whereas with locks you can at least generally
see pretty quickly that a given lock has 90% of all threads waiting on
it. with STM you don't necessarily know which variable in the venn
diagram of overlapping transactions caused the retry.
tho i believe Rich previously said that Clojure would actually have a
way of finding that out!
http://groups.google.com/group/clojure/browse_thread/thread/2f73b80fdd91705a
I'm chipping in out of interest and to improve my understanding,
apologies if I sound like an idiot:
cliffc wrote:
> On May 24, 10:51 am, Rich Hickey <richhic...@gmail.com> wrote:
>> As we add locks, performance of both approaches improves, but a new
>> source of errors comes into play for the manual approach - lock
>> acquisition order and the resultant deadlock risk. At this point STM
>> becomes dramatically less complex than manual locking.
>
> Not interesting *in practice*. Because in practice, deadlocks are
> easy to find & fix.
From personal experience, this is far from the truth in complex
systems. Deadlocks that only show up on systems "in the wild", appearing
in the form of heisenbugs, etc. Not fun at all. There's too much in the
way of implicit contracts going on, which blows up in your face when
you're trying to extend undocumented software written by someone who
left the company before you arrived. (And we all know the "well, then
document it" approach just doesn't happen in practice.)
Maybe it's just the implementations I've used (pthreads, Win32, OpenMP)
and others give you higher-level diagnostics, but at the level I was
working it could get very painful.
> You get a crash dump, crawl the stacks, discover the lock cycle &
> reorganize.
> Sure the situation could be better, but deadlocks are a 'crash once'
> kind of bug.
You need a reasonable amount of infrastructure in place for that, plus
you're relying on absence of evidence rather than proof that it can't
deadlock.
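(For anyone following along who hasn't hit one: the "lock cycle" being
described is the classic ordering inversion. A minimal sketch of my own,
in Clojure, using the 'locking' macro on two arbitrary monitor objects -
not anything from the original post:

;; Two monitors acquired in opposite orders by two threads -- the
;; classic cycle that deadlocks under the wrong interleaving.
(def lock-a (Object.))
(def lock-b (Object.))

(defn worker-1 []
  (locking lock-a
    (Thread/sleep 100)          ; widen the race window
    (locking lock-b :done)))

(defn worker-2 []
  (locking lock-b
    (Thread/sleep 100)
    (locking lock-a :done)))

;; Run each on its own thread and they can each end up holding one
;; monitor while waiting forever for the other:
;; (future (worker-1))
;; (future (worker-2))

Each acquisition looks perfectly reasonable in isolation; the cycle only
exists across the two call paths, which is why it shows up in the stacks
of a crash dump rather than in the code in front of you.)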
> Dev's don't like 'em, but they don't lose sleep over them either.
The people who lose sleep over software quality are probably the kind
who try to avoid complex locking schemes like the plague in the first
place. :)
> Ok, the long-sought-after counter-example: STMs do not compose w/
> correctness.
> Bad Java syntax follows, because I don't 'speak' Clojure. I'm using
> the classic example.
>
> class Account {
>   long _money;
>   add( long m ) { atomic { _money = _money + m; } }
> }
>
> transfer( Account a1, Account a2, long m ) {
>   a1.add( m);
>   a2.add(-m);
> }
>
> While 'add' alone is STM'd & safe, the two calls to 'add' that need to
> be wrapped in an 'atomic' are not - so there is a race where the
> accounts do not balance. Deciding at what level to add the next
> 'atomic' (should it be around 'transfer' or at a higher level or not
> at all?) demands the programmer have domain knowledge not already
> encapsulated in the Account class. THIS is the source of vast numbers
> of bugs.
My understanding is that this is exactly the kind of situation where STM
excels: you wrap the two add calls in a single transaction rather than
making them individually atomic. The way Clojure handles this (I've been
spending 99.9% of my time in Clojure on non-threaded things, so I could
easily have missed something) is that your _money would be a ref, and
any attempt at modifying it will fail unless you're in a transaction.
Wrapping the transaction around just the 'add' would be the anti-pattern
here; you want to make the whole 'transfer' a transaction.
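Roughly, in Clojure that would look something like the sketch below
(the 'make-account' helper is just mine for illustration):

(defn make-account [initial]
  (ref initial))

(defn add [account amount]
  ;; alter only works inside a transaction; calling add on its own
  ;; throws an IllegalStateException ("No transaction running").
  (alter account + amount))

(defn transfer [from to amount]
  ;; The transaction goes around the whole transfer, so both updates
  ;; commit together or retry together -- the accounts always balance.
  (dosync
    (add from (- amount))
    (add to amount)))

;; e.g.
;; (def a1 (make-account 100))
;; (def a2 (make-account 0))
;; (transfer a1 a2 40)
;; [@a1 @a2] ;=> [60 40]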
> Less trivial examples quickly spiral out of control. Sure I know
> Ref's A, B & C are all Ref's and thus only updated in 'dosync' blocks
> - but can I split any collection of Ref updates into more than one
> atomic region? The ad-absurdum answer is to wrap a dosync around
> *all* Ref's - which means the whole program, and that means a single-
> threaded program. Imagine the program unrolled in front of you, as an
> endless sequence of actions (all control flow flattened out)...
> interspersed with other actions are updates to Ref's. Now: dissect
> this list into groups with 'atomic' or 'lock' as you see fit,
> preserving the program meaning as another unrolled thread of execution
> is interspersed. How do you decide where it's OK to split the stream
> and where it's not OK?
>
> Why, in the Grand Scheme of things is: "... { read _money ... write
> _money } ... " a correct splitting of the action stream,
> while "... { read _money ... write _money } ... { read _money ...
> write _money } ..." is broken,
> but "... {{ read _money ... write _money } ... { read _money ... write
> _money }} ..." is correct again?
Okay, you kind of lost me with what you're trying to say here.
I get the impression you're mixing up atomic operations at the low level
with a high-level STM implementation like Clojure's. The latter
presumably won't be efficient unless the implementation uses the former,
but my understanding is that the programmer using an STM implementation
is largely supposed to be isolated from such details, as long as he/she
is able to determine which operations should be wrapped in a single
transaction.
~phil
it would be pretty nifty if you two had the chance to get together to
try out & blog about Clojure STM on an Azul box :-)
Thanks
>> (and we all know the "well then document it" approach just doesn't happen in practice)
>
> Yup, but you could make a difference. HotSpot probably has the
> highest level of 'asserts' per line of code (and a pretty darned high
> level of comments per line of code) of anything Out There. Docs are
> cheaper than debugging. But it's an aside.... back to deadlocks-in-
> practice:
Yeah, as far as I can tell, the concurrency infrastructure of the
JVM/Java is quite robust.
>> Maybe it's just the implementations I've used (pthreads, Win32, OpenMP) and others give you higher-level diagnostics, but at the level I was
>> working it could get very painful.
>>
>>> You get a crash dump, crawl the stacks, discover the lock cycle & reorganize.
>>> Sure the situation could be better, but deadlocks are a 'crash once' kind of bug.
>> You need a reasonable amount of infrastructure in place for that,
>
> Crash dump? Core file? 'ctrl-backslash' to a JVM to get a thread-
> dump?
> This stuff is commonly available.
> If you don't have a proper tool chain, then Job One for you should be
> to go get one - and I'm very serious here. Drop everything else until
> you get a functional 'path' (protocol, implementation, business
> process, etc) that allows you to debug a crash from a customer in the
> field. You might need a willing beta-partner for this, hand him a
> broken program & let it crash, figure out he needs disk-space to
> capture the core & logs, needs training to hit 'ctrl-backslash' when
> suspecting a deadlock, etc - plus FTP the files back to Home Base,
> plus you need to be able to capture the files per-customer-per-bug
> (ala Bugzilla), then decrypt the file (gdb for core, eyeballs or a
> little perl for stack-trace logs), etc, etc, etc, .... No Rocket
> Science, but annoying bits of Engineering along with some Customer
> Management.
I absolutely agree with you there, and yeah, some environments make this
a lot easier than others. 'Environments' can be in the context of either
programming or work environment of course...
I guess most of my "I've been bitten by this too often" cases in
concurrency have been those where the environment has been unhelpful in
some way, and I've mostly been debugging other people's code.
The stuff I'm doing right now isn't actually using any concurrency
features yet, although it probably will at some point, so I'm interested
in learning how to do it right when I do get round to introducing it.
>> plus you're relying on absence of evidence rather than proof that it can't deadlock.
>
> Err.... No.
>
> I'm not shooting for perfection here; I'm shooting for "good enough to
> ship". Which means that if - In Practice - each possible deadlock
> happens once, then gets fixed - that's almost always Good Enough.
> Maybe not to run a nuclear reactor, but definitely good enough to run
> large businesses. In practice, most possible deadlocks never happen -
> but a few that do require several go-'rounds before getting fixed
> properly. And then the deadlock is fixed, and remains fixed.
This is true; there's a reason proper scientific program verification
isn't exactly mainstream. Although with an STM system you probably
stand a better chance of getting it right than with explicit locking.
The question then becomes what you're giving up in return for that
safety. Which is what this thread is all about, right?
>> Okay, you kind of lost me with what you're trying to say here.
>
> Sigh - we mentally missed here.
>
> Trivial examples are.... trivial. They can be fixed in a thousand
> obvious ways. We need to extrapolate to the non-trivial, because
> that's the only place where the STM-vs-Locks argument becomes
> interesting.
Agreed.
Okay, that makes a lot more sense. So really, what you want is a way to
formalise, say, invariants, and to detect violations as early as possible.
I think the Clojure STM has a couple of training wheels hinting at this
kind of thing already. We would probably also want some other kinds of
"ref relationship" primitives, because invariants are awful at
describing most things. Is there maybe something comparable in database
systems?
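(One example of the kind of training wheel I mean is the ref validator:
you can attach a predicate to a ref, and any transaction that would
leave the ref in a state violating it fails. A minimal sketch, with the
non-negative-balance invariant as my own example:

;; Attach an invariant to the ref itself: balances may never go negative.
(def account (ref 100 :validator (fn [balance] (>= balance 0))))

(defn withdraw [amount]
  (dosync
    (alter account - amount)))

;; (withdraw 30)   ; fine, balance is now 70
;; (withdraw 500)  ; throws IllegalStateException and the whole
;;                 ; transaction is rolled back, so the violation is
;;                 ; caught at the exact update that would break it

That only covers single-ref invariants, though, which is why I wonder
about richer "ref relationship" primitives.)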
I wonder, though, how well this can work. When laying down the rules
comes close to being as much work as describing what to do within those
rules, the temptation will always be not to formalise any rules. Complex
systems will probably breed complex rules, so how do you make the
specification of the rules straightforward?
If it's possible to pull this off effectively, maybe it could even aid
the performance issue. Given some kind of relationship graph of the
various parts of mutable data, the STM runtime could attempt to arrange
and represent things optimally in memory. You know, the "more
information, more scope for optimisation" argument that never really
worked when it was applied to high-level languages. ;)
~phil
all things have a range of options; there are some approaches (e.g.
'skeletons', i think) where you have a very limited way to do
concurrency, the supposed advantage being that it is much harder to do
something broken in the long run. they won't let you go 1000 mph, so to
speak, but maybe they help prevent you from going 1000 mph into a
concrete wall.
http://www.youtube.com/watch?v=AB4IEa7jTJw