tay...@template.com wrote:
> In article <360B74AF.90715...@zko.dec.com>,
> Dave Butenhof <buten...@zko.dec.com> wrote:
> > As long as your code is otherwise correct, however, you can use a recursive
> > mutex anywhere you can use a normal mutex, and everything will work just
> > fine. I used to think this wasn't true, and argued against the UNIX98
> > language that allows an implementation to make mutexes recursive
> > BY DEFAULT, because, in particular, waiting on a condition variable with a
> > recursively locked mutex results in a deadlock. However, it was correctly
> > pointed out to me by others on the committee that this is only the case if
> > the mutex is actually locked recursively, which you can't legally do unless
> > you ASKED for a recursive mutex. So it's not a problem for correctly coded
> > programs: either they wanted a recursive mutex, (and can be assumed to
> > "know what they're doing"), or they won't ever lock it recursively.
> I don't understand exactly what you're saying about the semantics of doing
> a wait with a recursively locked mutex. I'm not sure what they should be,
> and the Single UNIX spec seems to avoid the question. (I'm looking at
> http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_timedwai...)
> What you seem to be saying is that it is an error to do a wait if a mutex is
> recursively locked. But it seems like it might also be valid to define it
> to work the way you're saying that Java locks work: when you wait, it saves
> the current lock count for this thread, completely unlocks the lock, then
> waits, then on wakeup, restores the previous lock count. I can see that
> this would probably be a very dangerous thing to do, as it essentially
> ignores the outer locks, but is it possibly useful and valid if you're
> sure you know what you're doing? Also, if implementing it this way is
> incorrect, are the keepers of Java aware of this? Should Java instead
> throw an exception if a wait is done on a recursively locked object?

The UNIX98 spec doesn't "avoid the question", though it doesn't address it
specifically. A condition wait unlocks the mutex; if the mutex is recursively
locked, the nesting count is reduced by one, and then increased when the wait
continues. To "spin out" the nesting count by more than one level would be
extraordinary, and could not reasonably be assumed without explicit wording to
that effect. (And, as I was there at the time, I can assure you that no such odd
behavior was intended.)

One holds a mutex when one is evaluating or changing a shared data predicate.
The point of holding the mutex is to prevent any other thread from evaluating or
changing that predicate at the same time, which could lead to data corruption
or, at best, inaccurate program decisions.

And you're saying that it's OK to violate that most basic definition of what it
means "to hold a mutex" by arbitrarily UNLOCKING the mutex without the knowledge
or consent of the code that locked the mutex? Hmm. Does that really make sense
to you?

Either the calling code IS making an assumption of atomicity with respect to the
shared data predicate, or it shouldn't be holding the mutex.

As I've said before, the name "mutex" is misleading, because it doesn't have
nearly strong enough negative connotations, and people use them far too
casually. We should have called them "bottlenecks". Because that's what they
are. The purpose of a mutex is to prevent parallelism. The purpose of threads is
to allow parallelism. All parallel code needs synchronization, somewhere -- but
good threaded code keeps synchronization to a minimum, and holds locks over the
shortest possible regions of code. You should (almost) never make any call while
holding a mutex to begin with, because (in general) when you make a call you're
giving up control, and potentially dragging out your program's bottleneck to an
arbitrary, uncontrollable, (and likely unknowable) length.

BUT when you do (or must) make a call while holding a mutex, you are making an
explicit statement that the current state of the shared data predicates is
important to you, and must be protected. I don't care whether you INTENDED to
make that statement; it's true all the same. You can easily retract the
statement simply by releasing the mutex.
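
As a minimal sketch of "hold locks over the shortest possible regions of code" (written in Python purely for illustration; `pending`, `submit`, and `drain` are hypothetical names, not anything from this thread): take what you need while holding the mutex, release it, and only then make the call.

```python
import threading

lock = threading.Lock()   # plain (non-recursive) mutex
pending = []              # shared data guarded by lock


def submit(item):
    """Add an item to the shared list; the lock is held only for the append."""
    with lock:
        pending.append(item)


def drain(handler):
    """Hand every pending item to handler without holding the lock during the calls."""
    # Hold the mutex just long enough to take a private copy and reset the shared list.
    with lock:
        batch = list(pending)
        pending.clear()
    # The mutex is released here, so a slow or blocking handler
    # does not stretch the bottleneck for other threads.
    for item in batch:
        handler(item)
    return len(batch)


submit("a")
submit("b")
drain(print)
```

Because `handler` runs after the lock is released, it can block or take as long as it likes without making any statement about the shared data.
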
The real danger of the Java model is this. First, there is no documentation
warning programmers of this contradiction to the basic concepts of
synchronization. And, second, even if there were, Java does not provide a strong
enough syntactic representation of a "monitor" to detect programmer errors. The
synchronized keyword tells the language that you're "in a monitor", but the
concept is loose and fuzzy, and gives no indication of which data is protected,
or what states represent broken predicates, or whether the code is making
invalid assumptions about atomicity of predicates across a call. So if you
happen to erroneously call something that happens to wait, you're busted. You
don't know it, and the language doesn't know it. The same error in C/C++ with
UNIX98 threads (and a recursive mutex) will result in a deadlock, because the
wait will leave the mutex locked, and nobody else can ever change the
controlling predicate in order to have a reason to awake the waiter. Deadlock is
the best possible multithreaded program failure mode, because all the state sits
around waiting for you to observe and analyze. Java's violation of basic
synchronization etiquette results in a race, instead, which is the worst
possible failure mode.
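
The race can be sketched in Python, whose `threading.Condition` built on an `RLock` releases every nesting level in `wait()` and restores the count afterwards, which is the same "spin out" behavior attributed to Java above. This is an illustration under that assumption rather than Java code; `balance`, `outer`, `inner`, and `intruder` are hypothetical names.

```python
import threading

cond = threading.Condition(threading.RLock())  # re-entrant lock plus condition
balance = 100      # shared data guarded by cond
ready = False      # predicate that inner() waits for
waiting = False    # set once inner() is about to wait
observed = []      # (before, after) values seen by outer()


def inner():
    """Helper that happens to wait on the same lock its caller already holds."""
    global waiting
    with cond:                  # nesting level 2
        waiting = True
        cond.notify_all()
        while not ready:
            cond.wait()         # releases BOTH levels, then restores them on wakeup


def outer():
    with cond:                  # nesting level 1
        before = balance        # caller reads shared state under the lock...
        inner()                 # ...and assumes it cannot change across this call
        after = balance
        observed.append((before, after))


def intruder():
    global balance, ready
    with cond:                  # gets in only because wait() fully unlocked
        while not waiting:
            cond.wait()
        balance -= 60           # changes data the outer block believed was protected
        ready = True
        cond.notify_all()


threads = [threading.Thread(target=outer), threading.Thread(target=intruder)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(observed)                 # prints [(100, 40)]
```

`outer` never released the lock itself, yet the value it read changed across the call to `inner`. With a wait that dropped only one nesting level, `intruder` could never get in, and the program would hang where you could inspect it.
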
Is it "possibly valid if you know what you're doing"? Yeah, and if you know what
you're doing, you'd want a real mutex instead of a recursive mutex, and you
wouldn't be making calls while holding the mutex anyway except in very rare
instances, and if you did, the locking protocol would be carefully designed and
verified. The problem is that Java doesn't provide enough syntax to allow the
language (or JVM) to understand the semantics, and doesn't provide the
programmer with information about the risks, or any tools to avoid them.

If Java's going to do this, it should have the semantic/syntactic strength to
know what predicates are broken, and what atomicity assumptions are made, by the
callers. Of course, if it could do this, there would be no need for recursive
mutex semantics, and the issue of "single unlock" vs "spin out" would be
irrelevant. Given, though, that it doesn't, and can't, know, then it MUST trust
that the programmer understands what she's doing, and refuse to violate the
atomicity implied by the mutex. Without either knowledge or trust, the model is
broken. I don't care that it "can be made to work" if the programmers are all
careful to avoid Java's synchronization traps: it's still broken. Period.
/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation buten...@zko.dec.com |
| 110 Spit Brook Rd ZKO2-3/Q18 http://members.aol.com/drbutenhof |
| Nashua NH 03062-2698 http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----------------[ Better Living Through Concurrency ]----------------/