pthread_mutex_lock recursive


Jeremy Levine

Sep 24, 1998, 3:00:00 AM

Under DEC unix 4.0d it is possible to set a pthread_mutex_lock to allow
the same thread to relock a locked mutex. I need some way to do this
under Solaris 2.6. I would accept a return value but at the moment all
that seems to happen is that the program deadlocks.

Casper H.S. Dik - Network Security Engineer

Sep 24, 1998, 3:00:00 AM

[[ PLEASE DON'T SEND ME EMAIL COPIES OF POSTINGS ]]

Jeremy Levine <lev...@amarex.com> writes:

Recursive mutexes are not supported in Solaris.

It's now also held by the DEC engineers to be a "bad" thing.

If you need them, your code is broken. Really.

Ask over in comp.programming.threads.

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

William King

Sep 24, 1998, 3:00:00 AM

Casper H.S. Dik - Network Security Engineer wrote:
> Recursive mutexes are not supported in Solaris.

UNIX98 extended POSIX96 with a couple of different mutex
types, including recursive mutexes. UNIX98 branded
implementations, including Solaris, support this mutex
type.

Recursive mutexes are created by assigning the
PTHREAD_MUTEX_RECURSIVE value to the type argument
in a call to pthread_mutexattr_settype(). See the
UNIX98 man page for pthread_mutexattr_settype() at
http://www.unix-systems.org/online.html for more info.
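As a rough illustration of the call sequence described above, here is a minimal C sketch for a system that implements the UNIX98 mutex types. The helper names init_typed_mutex() and relock_result() are made up for this example. It also tries PTHREAD_MUTEX_ERRORCHECK, another UNIX98 type, which reports a relock by the owning thread as an error (EDEADLK) rather than deadlocking; that may be closer to the "return value" the original poster asked for.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request the X/Open pthread interfaces */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *m with the given UNIX98 mutex type,
 * e.g. PTHREAD_MUTEX_RECURSIVE or PTHREAD_MUTEX_ERRORCHECK.
 * Returns 0 on success or an errno-style code. */
static int init_typed_mutex(pthread_mutex_t *m, int type)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Hypothetical helper: lock a mutex of the given type twice from the
 * calling thread and return the result of the second lock attempt.
 * Do not call this with PTHREAD_MUTEX_NORMAL: the relock would hang. */
static int relock_result(int type)
{
    pthread_mutex_t m;
    int rc = init_typed_mutex(&m, type);

    if (rc != 0)
        return rc;
    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);    /* the relock under test */
    if (rc == 0)
        pthread_mutex_unlock(&m);   /* undo the nested lock */
    pthread_mutex_unlock(&m);       /* undo the outer lock */
    pthread_mutex_destroy(&m);
    return rc;
}
```

Calling relock_result(PTHREAD_MUTEX_RECURSIVE) should give 0, and relock_result(PTHREAD_MUTEX_ERRORCHECK) should give EDEADLK.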

-W

Erik GOLLOT

Sep 25, 1998, 3:00:00 AM

to Jeremy Levine

Recursive mutex is a well-known design pattern.

If you need it, you should create a class (if you're in C++) RMutex that
stores and checks the thread_id of the caller of its lock() method.
The first time the RMutex is locked, the thread_id is stored and the
pthread_mutex_lock function is called. The second time, if the caller has
the same thread_id, the pthread_mutex_lock function is not called;
otherwise it is.

Dave Butenhof

Sep 25, 1998, 3:00:00 AM

"Casper H.S. Dik - Network Security Engineer" wrote:

> Jeremy Levine <lev...@amarex.com> writes:
>
> >Under DEC unix 4.0d it is possible to set a pthread_mutex_lock to allow
> >the same thread to relock a locked mutex. I need some way to do this
> >under Solaris 2.6. I would accept a return value but at the moment all
> >that seems to happen is the program deadlocks ..
>
> Recursive mutexes are not supported in Solaris.

At least, not until Solaris supports UNIX98, which has, (unfortunately),
standardized my intellectual hackery.

> It's now also held by the DEC engineers to be a "bad" thing.

Oh, heck, I never really thought it was a "good" thing. It is, however,
often a convenient thing. The reason I regret recursive mutexes is that,
like even more dangerous constructs such as suspend/resume, recursive
mutexes are "seductive". They seem to be a real solution to so many
problems, when in fact they're not really a "solution" to anything; and are
more likely to be the mechanism of worse problems.

In the early days of POSIX, there was resistance to the flexibility
provided by mutex creation attributes. Mutexes should be fast, with support
for "inline" locking, rather than forcing each mutex lock and unlock into a
"slow path" just because SOME mutex attributes might require complicated
semantics. We were sure this wasn't true, and I decided to prove it by
building a "heavyweight" mutex type (recursive) that allowed "normal"
mutexes to be locked "inline" without generating separate locking code for
each type. The original implementation was simplistic: I just kept the
"heavy" mutexes locked all the time and managed the actual lock using extra
hidden state. The result was that any lock or unlock attempt for a
recursive mutex would be forced into the "slow (contention) path" to handle
all the bookkeeping.

Once I had it coded, there seemed "no reason not to ship it". (I've made
the same mistake in other instances, though my irrational and destructive
tendencies towards foolish "generosity" have been slightly tempered over
the years by hard experience supporting, or fighting to have killed,
various "expedient monsters" that should never have been released.) Thus,
the capability became part of DCE threads, as well as our DECthreads
product. And, predictably, (though regrettably), people liked it, and
became dependent on it, and the consortium of vendors that gathered to form
the basis for the UNIX98 thread extensions almost "automatically" included
my mutex type attribute. That's life. It's often one's worst accidents that
make history. ;-)

> If you need them, your code is broken. Really.

I used to say that, too, but I've (slightly) tempered that extremism. No,
your code isn't necessarily broken if you use recursive mutexes. However,
any code that uses them cannot ever hope to perform anywhere near as well
as properly written code that doesn't require a recursive mutex. It's a
performance issue, not a correctness issue. (Well, not entirely "not a
correctness issue" -- code that relies on recursive mutexes will usually be
harder to maintain, and harder to understand in general, posing future
complications that may outweigh the initial "convenience".)

(This issue, by the way, is one of my big gripes with Java, because its
synchronization uses recursive locks, and you can't avoid them even if you
write properly modular code. It doesn't even use recursive locking
correctly, because a wait ignores the recursion and unlocks completely.
Sure, one should never make a call with broken invariants, or depend on
them being the same when the call returns -- but that's just another way of
saying that recursive locks are pointless and wasteful, and, worse, code
that depends on them is broken. [Ooops, I said I wasn't going to say that,
didn't I?])

As long as your code is otherwise correct, however, you can use a recursive
mutex anywhere you can use a normal mutex, and everything will work just
fine. I used to think this wasn't true, and argued against the UNIX98
language that allows an implementation to make mutexes recursive
BY DEFAULT, because, in particular, waiting on a condition variable with a
recursively locked mutex results in a deadlock. However, it was correctly
pointed out to me by others on the committee that this is only the case if
the mutex is actually locked recursively, which you can't legally do unless
you ASKED for a recursive mutex. So it's not a problem for correctly coded
programs: either they wanted a recursive mutex, (and can be assumed to
"know what they're doing"), or they won't ever lock it recursively.

The main issue is performance, because a recursive mutex is always going to
be slower than a "simple mutex" ("normal", in UNIX98 terms, or "fast" in
DCE thread terms). Because thread-safe code typically locks and unlocks
mutexes a lot, the performance of individual lock operations can have an
often unexpectedly high impact on the entire application. For example, I
"made a boo boo" once and shipped Digital UNIX "fast mutex" code that
turned the statically initialized mutex used by libc's malloc() into a
"slow mutex", calling into the block/unblock path on each malloc/free
rather than locking inline. The problem was noticed when the malloc()
maintainer ran a profile of a heap-intensive application, the developers of
which had complained bitterly about terrible malloc performance. Fixing the
bug [adding a "(" and a ")" to an &/&& expression] resulted in a
surprisingly enormous improvement.

I also think that using a recursive mutex is "lazy". There's never any real
need, because you can always structure your code in a modular fashion that
doesn't require recursive locks. Such code will be "cleaner", easier to
maintain and understand, and of course also faster. Anyone who uses a
recursive mutex in NEW code is not just asking for, but INSISTING on,
serious performance problems later on. There's really no excuse. Anyone is
(of course) welcome to disagree with me... but please don't bother arguing,
because I've heard enough on the subject!
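One common way to get the modular structure described above is to split each operation into a public function that takes the lock and an internal helper that assumes the lock is already held. The following is only a sketch of that idea; struct counter and all the function names are invented for the example and do not come from any real library.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shared counter protected by one normal (non-recursive) mutex. */
struct counter {
    pthread_mutex_t lock;
    long value;
};

static void counter_init(struct counter *c)
{
    pthread_mutex_init(&c->lock, NULL);   /* default attributes */
    c->value = 0;
}

/* Internal helper: the caller must already hold c->lock. */
static void counter_add_locked(struct counter *c, long n)
{
    c->value += n;
}

/* Public entry point: takes the lock once, then uses the helper. */
static void counter_add(struct counter *c, long n)
{
    pthread_mutex_lock(&c->lock);
    counter_add_locked(c, n);
    pthread_mutex_unlock(&c->lock);
}

/* A second public entry point that needs two updates under one lock.
 * It calls the _locked helper rather than counter_add(), so the mutex
 * is never relocked by the thread that owns it. */
static long counter_add_twice(struct counter *c, long a, long b)
{
    long result;

    pthread_mutex_lock(&c->lock);
    counter_add_locked(c, a);
    counter_add_locked(c, b);
    result = c->value;
    pthread_mutex_unlock(&c->lock);
    return result;
}
```

With this split, the temptation to call counter_add() from inside another locked region (the usual reason for reaching for a recursive mutex) goes away, because the locked variant is always available.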

Finally, and "on the other hand", remember that the first rule of
optimization is to avoid wasting your time optimizing code that's not on
the critical path. If you're dealing with existing code that's badly
structured, where it's difficult to manage (or keep track of) a mutex
locking model that doesn't rely on recursive mutexes, AND if that code is
well off the critical path of the application... then you might as well use
the recursive mutex if it's available. Otherwise, you're wasting your time
rewriting the original code to support a real locking model. But if any of
that code ever moves into the critical path, absolutely the FIRST thing you
should do is get rid of those CPU-suckers.

> Ask over in comp.programming.threads.

(Actually, he did, although he apparently ALSO asked in comp.unix.solaris
;-) )

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation bute...@zko.dec.com |
| 110 Spit Brook Rd ZKO2-3/Q18 http://members.aol.com/drbutenhof |
| Nashua NH 03062-2698 http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----------------[ Better Living Through Concurrency ]----------------/

Alan Coopersmith

Sep 27, 1998, 3:00:00 AM

In article <360AC9FE...@fc.net>, William King <w...@fc.net> wrote:
>UNIX98 extended POSIX96 with a couple of different mutex
>types, including recursive mutexes. UNIX98 branded
>implementations, including Solaris, support this mutex
>type.

No currently released version of Solaris is UNIX98 branded.
(Solaris 2.7 is listed as being the first UNIX98 version of Solaris
on http://www.opengroup.org/regproducts/sun.htm, but it's not out yet.)

--
________________________________________________________________________
Alan Coopersmith al...@godzilla.EECS.Berkeley.EDU
Univ. of California at Berkeley http://soar.Berkeley.EDU/~alanc/
aka: alanc@{CSUA,OCF,CS,BMRC,ucsee.eecs,cory.eecs,server}.Berkeley.EDU

Robert Garskof

Sep 29, 1998, 3:00:00 AM

[SNIP]

Casper H.S. Dik - Network Security Engineer wrote:
> It's now also held by the DEC engineers to be a "bad" thing.
>
> If you need them, your code is broken. Really.
>
> Ask over in comp.programming.threads.
>
> Casper

Why?

--
Robert Garskof
rgar...@snet.net

tay...@template.com

Oct 1, 1998, 3:00:00 AM

In article <360B74AF...@zko.dec.com>,

Dave Butenhof <bute...@zko.dec.com> wrote:
> (This issue, by the way, is one of my big gripes with Java, because its
> synchronization uses recursive locks, and you can't avoid them even if you
> write properly modular code. It doesn't even use recursive locking
> correctly, because a wait ignores the recursion and unlocks completely.
> Sure, one should never make a call with broken invariants, or depend on
> them being the same when the call returns -- but that's just another way of
> saying that recursive locks are pointless and wasteful, and, worse, code
> that depends on them is broken. [Ooops, I said I wasn't going to say that,
> didn't I?])
>
> As long as your code is otherwise correct, however, you can use a recursive
> mutex anywhere you can use a normal mutex, and everything will work just
> fine. I used to think this wasn't true, and argued against the UNIX98
> language that allows an implementation to make mutexes recursive
> BY DEFAULT, because, in particular, waiting on a condition variable with a
> recursively locked mutex results in a deadlock. However, it was correctly
> pointed out to me by others on the committee that this is only the case if
> the mutex is actually locked recursively, which you can't legally do unless
> you ASKED for a recursive mutex. So it's not a problem for correctly coded
> programs: either they wanted a recursive mutex, (and can be assumed to
> "know what they're doing"), or they won't ever lock it recursively.

I don't understand exactly what you're saying about the semantics of doing
a wait with a recursively locked mutex. I'm not sure what they should be,
and the Single UNIX spec seems to avoid the question. (I'm looking at
http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_timedwait.html)
What you seem to be saying is that it is an error to do a wait if a mutex is
recursively locked. But it seems like it might also be valid to define it
to work the way you're saying that Java locks work: when you wait, it saves
the current lock count for this thread, completely unlocks the lock, then
waits, then on wakeup, restores the previous lock count. I can see that
this would probably be a very dangerous thing to do, as it essentially
ignores the outer locks, but is it possibly useful and valid if you're
sure you know what you're doing? Also, if implementing it this way is
incorrect, are the keepers of Java aware of this? Should Java instead
throw an exception if a wait is done on a recursively locked object?


Dave Butenhof

Oct 2, 1998, 3:00:00 AM

tay...@template.com wrote:

> In article <360B74AF...@zko.dec.com>,
> Dave Butenhof <bute...@zko.dec.com> wrote:

> > As long as your code is otherwise correct, however, you can use a recursive
> > mutex anywhere you can use a normal mutex, and everything will work just
> > fine. I used to think this wasn't true, and argued against the UNIX98
> > language that allows an implementation to make mutexes recursive
> > BY DEFAULT, because, in particular, waiting on a condition variable with a
> > recursively locked mutex results in a deadlock. However, it was correctly
> > pointed out to me by others on the committee that this is only the case if
> > the mutex is actually locked recursively, which you can't legally do unless
> > you ASKED for a recursive mutex. So it's not a problem for correctly coded
> > programs: either they wanted a recursive mutex, (and can be assumed to
> > "know what they're doing"), or they won't ever lock it recursively.
>

> I don't understand exactly what you're saying about the semantics of doing
> a wait with a recursively locked mutex. I'm not sure what they should be,
> and the Single UNIX spec seems to avoid the question. (I'm looking at
> http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_timedwait.html)
> What you seem to be saying is that it is an error to do a wait if a mutex is
> recursively locked. But it seems like it might also be valid to define it
> to work the way you're saying that Java locks work: when you wait, it saves
> the current lock count for this thread, completely unlocks the lock, then
> waits, then on wakeup, restores the previous lock count. I can see that
> this would probably be a very dangerous thing to do, as it essentially
> ignores the outer locks, but is it possibly useful and valid if you're
> sure you know what you're doing? Also, if implementing it this way is
> incorrect, are the keepers of Java aware of this? Should Java instead
> throw an exception if a wait is done on a recursively locked object?

The UNIX98 spec doesn't "avoid the question", though it doesn't address it
specifically. A condition wait unlocks the mutex; if the mutex is recursively
locked, the nesting count is reduced by one, and then increased when the wait
continues. To "spin out" the nesting count by more than one level would be
extraordinary, and could not reasonably be assumed without explicit wording to
that effect. (And, as I was there at the time, I can assure you that no such odd
behavior was intended.)
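The "reduced by one" behavior can be observed without a condition variable at all. The sketch below (helper names probe(), probe_from_other_thread() and nesting_demo() are invented here) locks a recursive mutex twice, releases one level, and has a second thread try the lock. It deliberately avoids calling pthread_cond_wait() on the doubly locked mutex, since under the semantics described above that wait would never be satisfiable by another thread.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request the X/Open pthread interfaces */
#endif
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t rm;   /* recursive mutex under test */

/* Helper thread body: report whether the mutex can be taken right now. */
static void *probe(void *arg)
{
    int *result = arg;

    *result = pthread_mutex_trylock(&rm);
    if (*result == 0)
        pthread_mutex_unlock(&rm);
    return NULL;
}

/* Run probe() in a fresh thread and return its trylock result. */
static int probe_from_other_thread(void)
{
    pthread_t t;
    int result = -1;

    if (pthread_create(&t, NULL, probe, &result) != 0)
        return -1;
    pthread_join(t, NULL);
    return result;
}

/* Lock twice, then release one level at a time, probing after each
 * release. out[0] holds the probe result after the first unlock and
 * out[1] the result after the second. Returns 0 if setup succeeded. */
static int nesting_demo(int out[2])
{
    pthread_mutexattr_t attr;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0 ||
        pthread_mutex_init(&rm, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&rm);             /* nesting count 1 */
    pthread_mutex_lock(&rm);             /* nesting count 2 */

    pthread_mutex_unlock(&rm);           /* back to 1: still owned */
    out[0] = probe_from_other_thread();

    pthread_mutex_unlock(&rm);           /* 0: actually released */
    out[1] = probe_from_other_thread();

    pthread_mutex_destroy(&rm);
    return 0;
}
```

After a single unlock the other thread should see EBUSY; only after the matching second unlock should its trylock succeed. A condition wait that drops just one level leaves the mutex in that first, still-owned state.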

One holds a mutex when one is evaluating or changing a shared data predicate.
The point of holding the mutex is to prevent any other thread from evaluating or
changing that predicate at the same time, which could lead to data corruption
or, at best, inaccurate program decisions.
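For readers less familiar with the idiom, this is roughly what "holding the mutex while evaluating or changing a predicate" looks like with a normal mutex and a condition variable. It is a generic sketch, not code from any poster; the variable ready and the functions wait_until_ready(), set_ready(), setter() and run_once() are illustrative names only.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static int ready = 0;   /* the shared data predicate */

/* Block until ready is nonzero. The predicate is only examined while
 * the mutex is held. */
static void wait_until_ready(void)
{
    pthread_mutex_lock(&lock);
    while (!ready)                           /* re-check after every wakeup */
        pthread_cond_wait(&changed, &lock);  /* unlocked while blocked, relocked on return */
    pthread_mutex_unlock(&lock);
}

/* Change the predicate under the same mutex, then wake any waiters. */
static void set_ready(void)
{
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_mutex_unlock(&lock);
    pthread_cond_broadcast(&changed);
}

/* Thread body that simply changes the predicate. */
static void *setter(void *arg)
{
    (void)arg;
    set_ready();
    return NULL;
}

/* Start a setter thread, wait for the predicate, and return its value. */
static int run_once(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, setter, NULL) != 0)
        return -1;
    wait_until_ready();
    pthread_join(t, NULL);
    return ready;
}
```

The waiter can only make progress because the wait really does release the mutex, letting set_ready() acquire it and change the predicate.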

And you're saying that it's OK to violate that most basic definition of what it
means "to hold a mutex" by arbitrarily UNLOCKING the mutex without the knowledge
or consent of the code that locked the mutex? Hmm. Does that really make sense
to you?

Either the calling code IS making an assumption of atomicity with respect to the
shared data predicate, or it shouldn't be holding the mutex.

As I've said before, the name "mutex" is misleading, because it doesn't have
nearly strong enough negative connotations, and people use them far too
casually. We should have called them "bottlenecks". Because that's what they
are. The purpose of a mutex is to prevent parallelism. The purpose of threads is
to allow parallelism. All parallel code needs synchronization, somewhere -- but
good threaded code keeps synchronization to a minimum, and holds locks over the
shortest possible regions of code. You should (almost) never make any call while
holding a mutex to begin with, because (in general) when you make a call you're
giving up control, and potentially dragging out your program's bottleneck to an
arbitrary, uncontrollable, (and likely unknowable) length.

BUT when you do (or must) make a call while holding a mutex, you are making an
explicit statement that the current state of the shared data predicates is
important to you, and must be protected. I don't care whether you INTENDED to
make that statement; it's true all the same. You can easily retract the
statement simply by releasing the mutex.

The real danger of the Java model is this. First, there is no documentation
warning programmers of this contradiction to the basic concepts of
synchronization. And, second, even if there was, Java does not provide a strong
enough syntactic representation of a "monitor" to detect programmer errors. The
synchronized keyword tells the language that you're "in a monitor", but the
concept is loose and fuzzy, and gives no indication of which data is protected,
or what states represent broken predicates, or whether the code is making
invalid assumptions about atomicity of predicates across a call. So if you
happen to erroneously call something that happens to wait, you're busted. You
don't know it, and the language doesn't know it. The same error in C/C++ with
UNIX98 threads (and a recursive mutex) will result in a deadlock, because the
wait will leave the mutex locked, and nobody else can ever change the
controlling predicate in order to have a reason to awake the waiter. Deadlock is
the best possible multithreaded program failure mode, because all the state sits
around waiting for you to observe and analyze. Java's violation of basic
synchronization etiquette results in a race, instead, which is the worst
possible failure mode.

Is it "possibly valid if you know what you're doing"? Yeah, and if you know what
you're doing, you'd want a real mutex instead of a recursive mutex, and you
wouldn't be making calls while holding the mutex anyway except in very rare
instances, and if you did, the locking protocol would be carefully designed and
verified. The problem is that Java doesn't provide enough syntax to allow the
language (or JVM) to understand the semantics, and doesn't provide the
programmer with information about the risks, or any tools to avoid them.

If Java's going to do this, it should have the semantic/syntactic strength to
know what predicates are broken, and what atomicity assumptions are made, by the
callers. Of course, if it could do this, there would be no need for recursive
mutex semantics, and the issue of "single unlock" vs "spin out" would be
irrelevant. Given, though, that it doesn't, and can't, know, then it MUST trust
that the programmer understands what she's doing, and refuse to violate the
atomicity implied by the mutex. Without either knowledge or trust, the model is
broken. I don't care that it "can be made to work" if the programmers are all
careful to avoid Java's synchronization traps: it's still broken. Period.

sa...@bear.com

Oct 3, 1998, 3:00:00 AM

In article <3614B477...@zko.dec.com>,
Dave Butenhof <bute...@zko.dec.com> wrote:
[snipped]

> The real danger of the Java model is this. First, there is no documentation
> warning programmers of this contradiction to the basic concepts of
> synchronization. And, second, even if there was, Java does not provide a strong
> enough syntactic representation of a "monitor" to detect programmer errors. The
> synchronized keyword tells the language that you're "in a monitor", but the
> concept is loose and fuzzy, and gives no indication of which data is protected,
> or what states represent broken predicates, or whether the code is making
> invalid assumptions about atomicity of predicates across a call.

[snipped]

I do not understand your statement that 'synchronized' gives no indication
of which data is protected. A mutex also does not indicate which data are
protected. So are you implying something else?

Thank you,
Saroj Mahapatra

tay...@template.com

Oct 5, 1998, 3:00:00 AM

In article <3614B477...@zko.dec.com>,
Dave Butenhof <bute...@zko.dec.com> wrote:

> tay...@template.com wrote:
> > I don't understand exactly what you're saying about the semantics of doing
> > a wait with a recursively locked mutex. I'm not sure what they should be,
> > and the Single UNIX spec seems to avoid the question. (I'm looking at
> > http://www.opengroup.org/onlinepubs/7908799/xsh/pthread_cond_timedwait.html)
> > What you seem to be saying is that it is an error to do a wait if a mutex is
> > recursively locked. But it seems like it might also be valid to define it
> > to work the way you're saying that Java locks work: when you wait, it saves
> > the current lock count for this thread, completely unlocks the lock, then
> > waits, then on wakeup, restores the previous lock count. I can see that
> > this would probably be a very dangerous thing to do, as it essentially
> > ignores the outer locks, but is it possibly useful and valid if you're
> > sure you know what you're doing? Also, if implementing it this way is
> > incorrect, are the keepers of Java aware of this? Should Java instead
> > throw an exception if a wait is done on a recursively locked object?
>
> The UNIX98 spec doesn't "avoid the question", though it doesn't address it
> specifically. A condition wait unlocks the mutex; if the mutex is recursively
> locked, the nesting count is reduced by one, and then increased when the wait
> continues. To "spin out" the nesting count by more than one level would be
> extraordinary, and could not reasonably be assumed without explicit wording to
> that effect. (And, as I was there at the time, I can assure you that no such odd
> behavior was intended.)

The spec says that pthread_cond_wait will "release" the mutex. To me,
"release" implies that another thread would then be able to acquire the
mutex, and thus, it is quite reasonable to assume a complete "spin out"
of a multiply-nested lock, regardless of how "odd" or "extraordinary"
this might seem. The spec is ambiguous, and it would cause no harm to
condescend to those of us who might be confused by clarifying it.

> One holds a mutex when one is evaluating or changing a shared data predicate.
> The point of holding the mutex is to prevent any other thread from evaluating or
> changing that predicate at the same time, which could lead to data corruption
> or, at best, inaccurate program decisions.
>
> And you're saying that it's OK to violate that most basic definition of what it
> means "to hold a mutex" by arbitrarily UNLOCKING the mutex without the knowledge
> or consent of the code that locked the mutex? Hmm. Does that really make sense
> to you?

I'm not saying that anything's OK. I'm asking what the behavior is and why.
I understand the semantics of simple non-recursive mutexes and condition
waits, but it was not obvious to me what the semantics would be for recursive
mutexes. In a way (and it seems you would agree), allowing a mutex to be
recursive at all is a violation of the basic concept of a mutex, since it
allows the owning thread to re-enter a critical section with (possibly)
broken invariants. Since we're allowing the danger of making mutexes
recursive, how am I to assume that we won't also allow the danger of
completely unlocking multiple levels of recursive locking on a wait? If
we're letting people who "know what they're doing" use recursive mutexes,
why not let these same people "fully" release these mutexes on a wait? I
don't see why this would necessarily mean that the mutex is "arbitrarily
unlocked without the knowledge or consent of the code" that locked it. If
the person who wrote the code is quite clear on where they are allowing
recursive locks to occur, then they should have full knowledge of how waits
would affect these locks.

> Java does not provide a strong
> enough syntactic representation of a "monitor" to detect programmer errors. The
> synchronized keyword tells the language that you're "in a monitor", but the
> concept is loose and fuzzy, and gives no indication of which data is protected,

I don't see how a POSIX mutex is any less "fuzzy" about what it's protecting.
Do you mean that all of the data in a Java object is protected by a single
lock associated with that object? That's not necessarily the case if instead
you synchronize on separate objects associated with each data member.

> Deadlock is
> the best possible multithreaded program failure mode, because all the state sits
> around waiting for you to observe and analyze. Java's violation of basic
> synchronization etiquette results in a race, instead, which is the worst
> possible failure mode.

Agreed, and this is the justification I was seeking for why the UNIX98
semantics are what you say they are. Although it would certainly help
to guarantee that pthread_cond_wait returns an error code indicating that
you've tried to wait on a recursively locked mutex instead of just quietly
deadlocking. By choosing a recursive mutex, the programmer has indicated
that the best performance is not important, so why not add the expense
of doing additional error checking?

As for Java, I certainly don't intend to try to defend Java's synchronization
model because I agree that there are problems with it. However, I'm
wondering if you are aware of any OO languages or modifications to Java,
proposed or otherwise, whose synchronization model would have the kind
of power and clarity you'd prefer? If so, I'd be interested in reading
about them.

Thanks for your comments!

Charles Stephens

Oct 5, 1998, 3:00:00 AM

>>>>> "JL" == Jeremy Levine <lev...@amarex.com> writes:

JL> Under DEC unix 4.0d it is possible to set a pthread_mutex_lock to
JL> allow the same thread to relock a locked mutex. I need some way
JL> to do this under Solaris 2.6. I would accept a return value but
JL> at the moment all that seems to happen is the program deadlocks
JL> ..

Ewww..

Although there is no direct support, you can simulate them by wrapping
the existing mutex calls with ones that record the owner of the lock.
Then if the thread requesting the lock matches the owner, you just
return w/ success. You will probably need a lock for the lock so that
you can safely record thread information.
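A minimal sketch of such a wrapper follows, assuming only the basic POSIX mutex and condition variable calls. The type struct rmutex and its functions are hypothetical names for this example. The guard mutex plays the role of the "lock for the lock"; the use of a condition variable to park waiting threads is one possible design, not necessarily what any poster had in mind.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical simulated recursive mutex. */
struct rmutex {
    pthread_mutex_t guard;   /* the "lock for the lock" */
    pthread_cond_t freed;    /* signalled when count drops to zero */
    pthread_t owner;         /* meaningful only while count > 0 */
    unsigned count;          /* nesting depth; 0 means unowned */
};

static int rmutex_init(struct rmutex *r)
{
    int rc = pthread_mutex_init(&r->guard, NULL);

    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&r->freed, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&r->guard);
        return rc;
    }
    r->count = 0;
    return 0;
}

static void rmutex_lock(struct rmutex *r)
{
    pthread_t self = pthread_self();

    pthread_mutex_lock(&r->guard);
    if (r->count > 0 && pthread_equal(r->owner, self)) {
        r->count++;                      /* relock by the owner: just count it */
    } else {
        while (r->count > 0)             /* owned by another thread: wait */
            pthread_cond_wait(&r->freed, &r->guard);
        r->owner = self;
        r->count = 1;
    }
    pthread_mutex_unlock(&r->guard);
}

/* Returns 0 on success, or EPERM if the caller does not own the lock. */
static int rmutex_unlock(struct rmutex *r)
{
    int rc = 0;

    pthread_mutex_lock(&r->guard);
    if (r->count == 0 || !pthread_equal(r->owner, pthread_self()))
        rc = EPERM;
    else if (--r->count == 0)
        pthread_cond_signal(&r->freed);  /* outermost unlock: wake one waiter */
    pthread_mutex_unlock(&r->guard);
    return rc;
}
```

Every lock and unlock here costs a guard acquisition plus bookkeeping, which is the "messy, slow" part: it is strictly more work than locking a plain mutex.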

Messy, slow and probably not worth it (I agree with Casper, it is
evil).

cfs
--
Charles F. Stephens = cfs AT eng.sun.com
Software Psychic and Illuminary =
Solaris Network Sustaining = "We're the phone company: we don't care,
Solaris Software = because we don't have to." -- AT&T
Sun Microsystems, Inc. =
Menlo Park, California, USA =

Dave Butenhof

Oct 5, 1998, 3:00:00 AM

sa...@bear.com wrote:

> In article <3614B477...@zko.dec.com>,
> Dave Butenhof <bute...@zko.dec.com> wrote:
> [snipped]

>
> > The real danger of the Java model is this. First, there is no documentation
> > warning programmers of this contradiction to the basic concepts of
> > synchronization. And, second, even if there was, Java does not provide a strong
> > enough syntactic representation of a "monitor" to detect programmer errors. The
> > synchronized keyword tells the language that you're "in a monitor", but the
> > concept is loose and fuzzy, and gives no indication of which data is protected,
> > or what states represent broken predicates, or whether the code is making
> > invalid assumptions about atomicity of predicates across a call.
>

> [snipped]
>
> I do not understand your statement that 'synchronized' gives no indication
> of which data is protected. A mutex also does not indicate which data are
> protected. So are you implying something else?

Nothing in the behavior of POSIX (or UNIX98) implies, or requires, that the
implementation know anything about what data is protected by a mutex. The programmer
is always completely responsible. POSIX doesn't provide any mechanism by which a
routine might easily get itself into trouble by unlocking the caller's mutex;
there's no supported way to multiply unlock a recursively locked mutex. Even with a
recursive mutex, the code that locked it is always responsible for unlocking it. And
that's how it must be, because only that code knows what the lock really means to
that code.

Java, however, takes on the responsibility of multiply unlocking a recursively
locked mutex on a wait, without any understanding at all of what the lock means to
the caller. My point is that, because it doesn't and can't know, the behavior is
broken. It is assuming that callers hold the lock for no reason at all. (In other
words, that the programmer doesn't know what she is doing.) While that may, in many
cases, be a reasonable assumption, ignoring the consequences isn't a good solution.
It should either fail explicitly, or deadlock, rather than subject the program to
race conditions that may easily result in serious data corruption and will, at best,
be difficult to diagnose.

Michael Dalpee

Oct 6, 1998, 3:00:00 AM

Charles Stephens <dev...@dobbs.eng.sun.com> wrote in message
afkzpba...@dobbs.eng.sun.com...

Hi,

I am developing threaded code to run under both Digital Unix 4.0 and Solaris
2.6. I have created a small portable library of C++ encapsulations of basic
POSIX thread capabilities as well as some higher-level enhancements, such as
a RecursiveMutex class. My RecursiveMutex class is simulated using a
pthread_mutex and a pthread_cond, so it is not as efficient as if it were
directly coded into the kernel, but if blazing fast performance is not an
issue for you, it is quite serviceable. Let me know if you are interested,
and I'll try to make it available to you, with the caveat that, since
the code currently resides in a secure installation, I may only be able to
snail-mail you some listings.

Cheers,

Mike

David Holmes

Oct 7, 1998, 3:00:00 AM

Dave Butenhof <bute...@zko.dec.com> wrote in article
<36189C36...@zko.dec.com>...

> Java, however, takes on the responsibility of multiply unlocking a recursively
> locked mutex on a wait, without any understanding at all of what the lock
> means to the caller.

Java is not just a procedural language with some built in synchronisation
primitives - it is a concurrent object-oriented language with concurrency
support integrated within the object model. That model is similar to the
Monitor concept but is not exactly the same. This concurrency model
provides the "meaning" that Dave seems to think is lacking - locking and
synchronisation are packaged within objects.

For those not familiar with the details of Java:
In Java every object has associated with it a lock (a private mutex if you
like). Locks can only be interacted with using synchronized statements or
methods. A synchronized statement has the form:

    synchronized (foo) {
        // code
    }

When a thread tries to enter a synchronized statement it must acquire the
lock of the object foo. When execution comes to the end of the synchronized
statement the lock is automatically released (this is true even if an
exception causes the synchronized block to be left). A method declared as
synchronized is a syntactic shorthand for putting the entire method body in
a synchronized statement where 'this' is the object whose lock must be
acquired.
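
As a small illustration of the rules just described, here is a hedged Java sketch. The class Counter and its method names are invented for this example. It shows a synchronized method, the equivalent synchronized statement on 'this', and a check (using the standard Thread.holdsLock call) that the lock is no longer held once an exception has left the block.

```java
class Counter {
    private int value;

    // Synchronized method: holds the lock on 'this' for the whole body.
    public synchronized int increment() {
        return ++value;
    }

    // The same thing written as a synchronized statement on 'this'.
    public int incrementExplicit() {
        synchronized (this) {
            return ++value;
        }
    }

    // Leaves the synchronized block by throwing.
    public void failWhileLocked() {
        synchronized (this) {
            throw new IllegalStateException("simulated failure");
        }
    }

    // True if the lock was released when the exception propagated.
    public static boolean lockReleasedAfterException() {
        Counter c = new Counter();
        try {
            c.failWhileLocked();
        } catch (IllegalStateException expected) {
            // the exception has already unwound the synchronized block
        }
        return !Thread.holdsLock(c);
    }
}
```

Note that the sketch says nothing about whether the object's data is consistent after the exception; it only shows that the lock itself is gone.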

Every Java object also has an associated wait-set which is like a private
condition variable that is always associated with the lock of that object.
A conditional wait is performed by invoking the wait() method on the object
(with that object's lock held) and when you invoke wait() the object's lock
gets released. Signalling is done using the notify() and notifyAll()
methods, which also require you to hold the object's lock when calling them
(this is different to POSIX condvars).
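
The requirement to hold the lock when signalling can be checked directly. In this minimal sketch (the class and method names are made up for illustration), calling notify() without the object's lock is rejected with IllegalMonitorStateException, while the same call inside a synchronized statement is accepted.

```java
class WaitSetRules {
    // True if notify() without the object's lock is rejected.
    static boolean notifyWithoutLockFails() {
        Object foo = new Object();
        try {
            foo.notify();                // lock on foo is NOT held here
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // True if notify() with the lock held completes normally.
    static boolean notifyWithLockSucceeds() {
        Object foo = new Object();
        synchronized (foo) {
            foo.notify();                // no waiters, so this is a no-op
        }
        return true;
    }
}
```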

One of the key aspects of object-oriented programming is the notion of
incremental modification via inheritance. We define classes with methods
and then derive new classes that provide new functionality through new
methods, specialise functionality by replacing (overriding) methods and
enhance functionality by performing the base method plus some extra work.
This is what is supposed to make OO so good.

In a concurrent OO language you have to consider concurrency issues when
designing classes and methods. If a base class implements a particular
synchronisation policy then the derived class must adhere to that policy.
This means that the policy must be known and the mechanisms by which to
enforce the policy must be accessible. If a base class enforces a
monitor-like mutual exclusion policy then the derived class must ensure
that policy is maintained. For new methods and replacement methods this is
easy to do - just make them synchronized. But what about methods that
invoke the base and then do more work? We would like to just make them
synchronized too, but that means a synchronized method would be calling a
synchronized method. And the same situation arises if a method can be
called both publicly and internally.

Now we have seen from other posts that by refactoring the methods into
public synchronized parts and internal unsynchronised ones we can achieve
the right semantics. And this is what must be done in C++. But Java
integrates concurrency with the object model - doing this sort of thing
should be much simpler (at least in the simple case). So it is. The way to
view a synchronized statement in Java is not as an outright instruction to
acquire a lock, but as a declaration that the following code must execute
as a critical section with respect to the given object. This is the same as
the "required transaction" semantics of some transaction systems: the
following must execute in the context of a transaction - if no transaction
is active then start one, else continue in the current transaction context.
This is what a synchronised statement means: if the current thread is not
in a critical section with respect to this object then start one, else
continue in the current critical section.
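
A short sketch of the "extend the critical section" case, with invented class names Account and AuditedAccount: the derived deposit() is synchronized and calls the synchronized base deposit() on the same object. Because Java's object locks can be re-entered by the thread that holds them, the inner call does not deadlock.

```java
class Account {
    protected int balance;

    public synchronized void deposit(int amount) {
        balance += amount;
    }

    public synchronized int balance() {
        return balance;
    }
}

class AuditedAccount extends Account {
    private int deposits;

    @Override
    public synchronized void deposit(int amount) {
        super.deposit(amount);   // re-enters the lock on 'this'; no deadlock
        deposits++;              // extra work inside the same critical section
    }

    public synchronized int deposits() {
        return deposits;
    }
}
```

With a non-recursive lock the call to super.deposit() would block forever on the lock its own thread already holds.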

So how can we implement this ability to extend a critical section? Two ways
have been mentioned. The first is obviously recursive locks - a lock that can
be acquired many times by the same thread. The second option described by Kaz
was to have extra parameters which indicated whether a method was actually
required to grab the lock or not. Such an approach works for a language that
implements a strict monitor model but Java does not fit that model. In Java
not all methods on an object need to be synchronized; synchronisation does
not have to be at the method level; the lock involved does not have to be
that of the current object; multiple locks can be used by a single method.
With all this flexibility the extra argument approach simply isn't viable as
you would need to analyse the call chain at compile time to see which locks
were already held. The only practical solution is recursive locks.

If you don't buy the above then stop reading. Otherwise hopefully you see
that recursive locks are an appropriate tool within Java. But now we have to
consider what some term the "broken" part of Java - the use of wait-sets.

In POSIX the cond_var only releases the associated mutex once - even if it
is a recursive mutex. In Java when you invoke wait() the object's lock is
released completely: the lock count is ignored for unlocking purposes but is
restored when the lock is reacquired on the return from wait(). Which
semantics are correct? Both! But not for each other.
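
The Java half of that comparison can be observed with a small experiment. In this sketch (NestedWaitDemo and its members are names chosen for illustration) a thread takes the same lock twice, starts a second thread, and then waits. The second thread can only set the flag if wait() gave up both levels of the lock; if only one level were released, run() would never return.

```java
class NestedWaitDemo {
    private final Object lock = new Object();
    private boolean signalled;

    // Returns true once another thread has managed to take the lock
    // while this thread was waiting with the lock acquired twice.
    boolean run() {
        synchronized (lock) {                // first acquisition
            synchronized (lock) {            // second, nested acquisition
                Thread notifier = new Thread(() -> {
                    synchronized (lock) {    // blocks until wait() frees the lock
                        signalled = true;
                        lock.notifyAll();
                    }
                });
                notifier.start();
                try {
                    while (!signalled) {
                        lock.wait();         // releases the lock completely
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
                // On return from wait() both acquisitions are in force again.
            }
        }
        return signalled;
    }
}
```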

Suppose a base method is synchronized and has a conditional wait at the
start, then performs some state changes. Now suppose a derived method
extends that base method to perform additional state changes. The base
method is synchronized and the derived method is synchronized - the wait()
must release the lock otherwise no other method of the object could cause
the conditional wait to complete. If wait() only released the lock once
then the ability to extend concurrent methods would be lost and Java would
be broken.

But POSIX is not about concurrent objects; it doesn't support
extension of critical sections (directly) - it is a set of powerful, and
hopefully very fast primitives. You don't need POSIX cond_vars to
completely release a recursive mutex because if you need that functionality
you can build it on top of the primitives - just as you can build recursive
mutexes. So POSIX semantics in Java would be wrong and Java semantics in
POSIX are unnecessary.

But what about the flip side - are there circumstances in Java where
releasing a recursively held mutex, in a wait(), would be wrong? It's
possible (you can always construct a pathological example), but very
unlikely.

The example cited (this issue has been debated before) is as follows.
Suppose we have the following code sequence:

    someMethod() {
        lock our mutex
        do some stuff that leaves invariants broken
        foo()
        other stuff that fixes invariants
        unlock mutex
    }

If foo() does a conditional wait and uses the *same* mutex/lock as we do
then the data will be exposed while invariants are broken. In POSIX the
scenario would simply lock up the waiting thread as no one would be able to
signal us, and possibly the signalling threads. Thus in Java you get an
exposed invalid 'object' whilst in POSIX you get a locked up thread. The
latter is certainly easier to observe and safer. But how did you get into
this predicament?

Did you know that foo() was going to do a conditional wait()?
  If yes, then did you know that foo() uses the same mutex as you do?
    If yes, then it was an error to invoke foo() whilst the invariants were
    broken.
    If no, then how was that possible? How could foo() share our mutex
    without us knowing about it? If foo() is in the same class then we
    should certainly know; if it's not in the same class then at some point
    we had to cause the sharing to come about! You can't accidentally share
    a lock/mutex across objects!
  If you didn't know that foo() was going to do a conditional wait() - why
  not?
    If foo() does a conditional wait internally that forms no part of the
    visible action of foo() then that is fine - but such an invisible action
    can't use a shared mutex otherwise it is not invisible. Any class that
    allowed a shared lock to be given to it but didn't declare that one of
    its methods was going to do a wait() involving that lock is purely and
    simply broken.
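
To make the someMethod()/foo() scenario concrete, here is a hedged Java sketch. The class Pair, its invariant (a + b == 100) and the observer thread are all invented for illustration. someMethod() breaks the invariant, then calls foo(), which waits on the same lock; an observer thread gets in during the wait and records what it sees.

```java
class Pair {
    // Invariant whenever the lock is free: a + b == 100.
    private int a = 50, b = 50;
    private boolean done;
    private boolean observerSawBrokenInvariant;

    // Plays the role of foo(): a conditional wait on the same lock.
    private synchronized void foo() {
        boolean interrupted = false;
        while (!done) {
            try {
                wait();                      // gives up the lock completely
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    // Another thread looking at the object while the caller is waiting.
    private synchronized void observe() {
        observerSawBrokenInvariant = (a + b != 100);
        done = true;
        notifyAll();
    }

    // Returns what the observer saw while this method was "in progress".
    public synchronized boolean someMethod() {
        a -= 10;                             // invariant broken: a + b == 90
        new Thread(this::observe).start();   // blocks until the lock is free
        foo();                               // waits while invariant is broken
        b += 10;                             // invariant restored
        return observerSawBrokenInvariant;
    }
}
```

In this sketch someMethod() returns true: the observer ran inside the supposedly atomic update and saw a + b == 90.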

In short the example we gave indicated a programmer error at some level.
In the two years (plus some) that I have been involved with Java, teaching
its concurrency model, presenting tutorials at conferences, talking to
practitioners, conversing on newsgroups - I have not once come across this
problem. From a purist perspective we would like that error to lead to a
locked up thread rather than an exposed invalid object - so Java is not
perfect.
If you think that Java is broken because wait() unlocks a recursively
held lock then the fix is not to make wait() unlock it only once as that
really would break things. The real fix is to go back and find another
mechanism that allows for the extension of critical sections in a nice
neat concurrent object-oriented manner.

And if you do I'd love to hear about it - once it's fully thought out.

That's my view - take it or leave it. :-)

David

Dave Butenhof

Oct 7, 1998, 3:00:00 AM

David Holmes wrote:

> Dave Butenhof <bute...@zko.dec.com> wrote in article
> <36189C36...@zko.dec.com>...
> > Java, however, takes on the responsibility of multiply unlocking a
> > recursively locked mutex on a wait, without any understanding at all of
> > what the lock means to the caller.
>
> Java is not just a procedural language with some built in synchronisation
> primitives - it is a concurrent object-oriented language with concurrency
> support integrated within the object model. That model is similar to the
> Monitor concept but is not exactly the same. This concurrency model
> provides the "meaning" that Dave seems to think is lacking - locking and
> synchronisation are packaged within objects.

On the contrary, Java's concurrency support is nothing but thin "syntactic
sugar" on top of something vaguely resembling the POSIX synchronization model.
It adds nothing but a few methods that directly map POSIX features (thread
create, wait, etc.), and a "synchronized" language keyword that locks an
anonymous mutex within the block/method using the keyword. There is no
meaning.

Don't get me wrong -- that's a fine beginning towards concurrency support. Of
course, it doesn't much impress me that the anonymous mutex gets automatically
released on an exception, because you'll usually need to deal with the
exception directly anyway in order to restore shared data invariants. (The
problem is that the implied automatic recovery dangerously tempts programmers
to ignore their invariants.) With a REAL "integrated concurrency" OO model,
the language could handle all that for you, but, yeah, Java makes a decent
first baby step.

I'm not criticizing Java because it hasn't solved the entire problem. Rather,
I'm criticizing the implication (and, in some cases, explicit statements!)
that it HAS solved anything at all. Java has made it slightly easier for
programmers to use basic concurrency features; but it hasn't made it any
easier to use it correctly or safely. The trivial support makes it easier for
beginners to get themselves into trouble... and a little harder for experts to
avoid trouble. And all of this cruft about "integrated concurrency support" is
doing nobody any favors.

> When a thread tries to enter a synchronized statement it must acquire the
> lock of the object foo. When execution comes to the end of the synchronized
> statement the lock is automatically released (this is true even if an
> exception causes the synchronized block to be left). A method declared as
> synchronized is a syntactic shorthand for putting the entire method body in
> a synchronized statement where 'this' is the object whose lock must be
> acquired.

Right. It's a "syntactic shorthand" -- nothing more. And, as I said,
automatically unlocking a mutex on an exception isn't necessarily a favor. You
still need to be aggressively aware of broken invariants, and those need to be
repaired before the mutex is unlocked. The language can't help you with this,
because it doesn't know anything about invariants or even shared data. That's
nothing new -- anyone using C and POSIX threads is familiar with those
problems. The point is that Java implies that it handles this for you, while
it actually leaves you entirely on your own.
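
A minimal Java sketch of that point, using an invented Ledger class whose invariant is debit + credit == 0: the unsafe method lets the exception release the lock with the invariant broken, while the safe one repairs the data by hand before the exception leaves the synchronized method. The rollback strategy is just one possible repair, chosen for brevity.

```java
class Ledger {
    // Invariant whenever the lock is free: debit + credit == 0.
    private int debit, credit;

    // No repair: the lock is released, the invariant stays broken.
    public synchronized void transferUnsafe(int amount, boolean fail) {
        debit -= amount;
        if (fail) {
            throw new IllegalStateException("simulated failure");
        }
        credit += amount;
    }

    // Explicit repair while the lock is still held.
    public synchronized void transferSafe(int amount, boolean fail) {
        int oldDebit = debit, oldCredit = credit;
        try {
            debit -= amount;
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            credit += amount;
        } catch (RuntimeException e) {
            debit = oldDebit;        // roll back before the lock is released
            credit = oldCredit;
            throw e;
        }
    }

    public synchronized boolean invariantHolds() {
        return debit + credit == 0;
    }
}
```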

> Every Java object also has an associated wait-set which is like a private
> condition variable that is always associated with the lock of that object.
> A conditional wait is performed by invoking the wait() method on the object
> (with that object's lock held) and when you invoke wait() the object's lock
> gets released. Signalling is done using the notify() and notifyAll()
> methods, which also require you to hold the object's lock when calling them
> (this is different to POSIX condvars).

POSIX recommends that you signal/broadcast with the mutex locked, but, yeah,
it's not required. (I suspect it's not actually in Java, either, unless it
uses the object's mutex to manage dynamic allocation from the condition
variable pool... which it might.) You just get much less predictable
scheduling behavior if the mutex isn't locked, and far more "spurious wakeups"
due to "predicate stealing" (another thread that locks the mutex and processes
the predicate before the awakened thread can).
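
Whatever the signalling rules, the defence against a stolen or spurious wakeup is the same in both models: re-test the predicate in a loop after every return from the wait. A hedged Java sketch, using an invented one-slot Mailbox class and a hypothetical helper awaitChange() that turns an interrupt into an unchecked exception to keep the example short:

```java
class Mailbox {
    private Integer slot;                 // null means "empty"

    public synchronized void put(int value) {
        while (slot != null) {            // predicate re-checked after each wakeup
            awaitChange();
        }
        slot = value;
        notifyAll();                      // signalled with the lock held
    }

    public synchronized int take() {
        while (slot == null) {            // another taker may have emptied it first
            awaitChange();
        }
        int value = slot;
        slot = null;
        notifyAll();
        return value;
    }

    // Caller must hold the lock on 'this'.
    private void awaitChange() {
        try {
            wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

Replacing either while with an if would let a thread proceed on a predicate that some other thread had already consumed.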

There's nothing significantly (or interestingly) different in the Java model
here. It's not even integrated into the language, just function calls on the object. In
other words, more syntactic sugar. (And, again, that's not a criticism, but
rather a response to people who seem to want to consider it some sort of
magic.)

> One of the key aspects of object-oriented programming is the notion of
> incremental modification via inheritance. We define classes with methods
> and then derive new classes that provide new functionality through new
> methods, specialise functionality by replacing (overriding) methods and
> enhance functionality by performing the base method plus some extra work.
> This is what is supposed to make OO so good.
>
> In a concurrent OO language you have to consider concurrency issues when
> designing classes and methods. If a base class implements a particular
> synchronisation policy then the derived class must adhere to that policy.
> This means that the policy must be known and the mechanisms by which to
> enforce the policy must be accessible. If a base class enforces a
> monitor-like mutual exclusion policy then the derived class must ensure
> that policy is maintained. For new methods and replacement methods this is
> easy to do - just make them synchronized. But what about methods that
> invoke the base and then do more work? We would like to just make them
> synchronized too, but that means a synchronized method would be calling a
> synchronized method. And the same situation arises if a method can be
> called both publicly and internally.

And if the base class changes, all the derived classes are broken. In a clean
OO programming model, you shouldn't need to know anything about the base
classes. You should be encouraged NOT to know anything about them, beyond the
interface. (Remember "reusable components"? I should be able to derive from a
generic interface, and continue functioning even if the base object
implementation is replaced by someone's new & improved version.) But, as you
say, this is not the case in Java. Too bad. Again, not a criticism... but
maybe a lamentation over missed opportunity. If it HAD been designed as an
"integrated concurrent OO language", there'd have been no need for recursive
mutexes just to avoid a locking model, and there'd be no need to "completely
unlock" those recursive mutexes for a wait. The language could automatically
restore invariants on an exception, and all sorts of other things would become
easy. Now that's a lot to ask from one language, and it'd be really hard to
solve all of the problems. The result would be worthwhile. It's not Java.

> Now we have seen from other posts that by refactoring the methods into
> public synchronized parts and internal unsynchronised ones we can achieve
> the right semantics. And this is what must be done in C++. But Java
> integrates concurrency with the object model - doing this sort of thing
> should be much simpler (at least in the simple case). So it is. The way to
> view a synchronized statement in Java is not as an outright instruction to
> acquire a lock, but as a declaration that the following code must execute
> as a critical section with respect to the given object. This is the same as
> the "required transaction" semantics of some transaction systems: the
> following must execute in the context of a transaction - if no transaction
> is active then start one, else continue in the current transaction context.
> This is what a synchronised statement means: if the current thread is not
> in a critical section with respect to this object then start one, else
> continue in the current critical section.

Yeah, and thinking in terms of "critical (code) sections" is almost always a
bad way to design for concurrency. You should be thinking about shared data
invariants, instead. Java does not promote this healthier and more powerful
view. Too bad. But, again, it's because Java does not have "integrated
concurrency" support... just a "synchronized" keyword. The keyword makes a
critical section, and that's all Java can know, so it forces programmers into
that narrow and limiting view.

It SHOULD have "transactions", but all it has is "synchronized". It should
provide syntax to define shared data invariants, and how to recover those
invariants on an exception. This would allow dynamic analysis to avoid the
need for recursive locks. Waiting on an object would always either be safe, or
would expose broken invariants -- in which case it could fail or (perhaps,
given proper syntax), automatically recover invariants before the wait. Or
even provide ways to simply deal dynamically with the changed invariants when
the wait returns. Lots of opportunities for research here!

> So how can we implement this ability to extend a critical section? Two ways
> have been mentioned. The first is obviously recursive locks - a lock that
> can be acquired many times by the same thread. The second option described
> by Kaz was to have extra parameters which indicated whether a method was
> actually required to grab the lock or not. Such an approach works for a
> language that implements a strict monitor model but Java does not fit that
> model. In Java not all methods on an object need to be synchronized;
> synchronisation does not have to be at the method level; the lock involved
> does not have to be that of the current object; multiple locks can be used
> by a single method. With all this flexibility the extra argument approach
> simply isn't viable as you would need to analyse the call chain at compile
> time to see which locks were already held. The only practical solution is
> recursive locks.

Except that the real root problem is that Java has only syntactic sugar to
declare a "critical section", and no support for managing shared data
invariants. You're right, recursive mutexes are an easy workaround for the
mess into which Java gets itself, where "synchronization" is just an anonymous
block of code with no decipherable meaning to the language system.

Up above, you talked about "starting a new" critical section or "continuing
the current section". That's untrue. Java ALWAYS starts a new (possibly
nested) critical section. It has to, because it doesn't know what the current
section means. If it understood the invariants, the meaning of the shared
data, then it could do the analysis you describe, and make the decisions you
suggest. It can't, though, because it has only trivial syntax to suggest the
programmer's intentions.

> If you don't buy the above then stop reading. Otherwise hopefully you see
> that recursive locks are an appropriate tool within Java. But now we have to
> consider what some term the "broken" part of Java - the use of wait-sets.

There's nothing wrong with your factual descriptions. You're just assuming a
lot that's not there. That's not The Great And Powerful Oz... just a little
man behind the curtain. He may be a very great man, but he's a very poor
wizard.

> In POSIX the cond_var only releases the associated mutex once - even if it
> is a recursive mutex. In Java when you invoke wait() the object's lock is
> released completely: the lock count is ignored for unlocking purposes but is
> restored when the lock is reacquired on the return from wait(). Which
> semantics are correct? Both! But not for each other.

Java can be described as correct. But when you make that description, you're
saying that the programmer is responsible for never making a call that might
wait unless all invariants are stable and ready for external consumption, and
that no assumptions will be made about the state on return. In such case, the
mutex should be unlocked. Holding mutexes for extra cycles reduces
concurrency, and that's bad. You're holding it across a call when you have no
need at all to be holding the mutex. That's a bad concurrent design; by
definition, because you're reducing concurrency with no benefit.

I've joked that we should have called mutexes "bottlenecks", because it would
give everyone a better mind set about how and when to use them. For the same
reason, Java's keyword should probably have been "bottleneck" or "reduce
concurrency" rather than the misleading "synchronized". Java encourages you to
hold mutexes when you don't need them, and one consequence of this is the
fabricated necessity of using recursive locks. (Maybe we should call it an
"integrated anti-concurrency OO language"?)

You're defending the wrong features of the language, for the wrong reasons.

> Suppose a base method is synchronized and has a conditional wait at the
> start, then performs some state changes. Now suppose a derived method
> extends that base method to perform additional state changes. The base
> method is synchronized and the derived method is synchronized - the wait()
> must release the lock otherwise no other method of the object could cause
> the conditional wait to complete. If wait() only released the lock once
> then the ability to extend concurrent methods would be lost and Java would
> be broken.
>
> But POSIX is not about concurrent objects; it doesn't support
> extension of critical sections (directly) - it is a set of powerful, and
> hopefully very fast primitives. You don't need POSIX cond_vars to
> completely release a recursive mutex because if you need that functionality
> you can build it on top of the primitives - just as you can build recursive
> mutexes. So POSIX semantics in Java would be wrong and Java semantics in
> POSIX are unnecessary.

First, POSIX doesn't have recursive mutexes. But UNIX98 does, so we'll let it
go at that. POSIX doesn't allow "complete release" of a recursively locked
mutex because that would be wrong. Period. It's not right in Java. Sure, Java
can blindly announce that it does so, and require that all programmers deal
with the consequences of that. Java is its own specification, and deserves
full autonomy. Still, that doesn't make it objectively defensible in any
technical terms. It's wrong. At best, this whole thing is silly hackery to
support trivial concurrency syntax without all of the (really substantial, and
extremely challenging) effort to make a truly "integrated concurrency OO
language". Nothing wrong with that... but let's recognize the truth.

> But what about the flip side - are there circumstances in Java where
> releasing a recursively held mutex, in a wait(), would be wrong? It's
> possible (you can always construct a pathological example), but very
> unlikely.

Let's put it this way, as I already did in an earlier post. If there's any
justifiable need in the caller to have the mutex locked, then unlocking it for
the wait is an error. Java requires that any call to a method that might
invoke a "wait" method must exist outside of a "synchronized" block. At the
same time, though, it goes to great lengths to allow it to exist within a
"synchronized" block.

The simple fact is that either the programmer "lied" (the lock isn't
necessary, and really, in the name of concurrency, shouldn't be held), or else
the unlock is wrong. That's it. That's the root of this whole argument. Yes,
Java encourages the programmer to "lie", and makes it difficult to tell the
truth -- you'd have to almost always avoid using the "synchronized" keyword on
a method, and structure your "synchronized" blocks in awkward and unnatural
ways. Still, recursive locks and implicit "complete unlocks" are not the
concurrent programmer's friends. But, if you're very careful and follow all of
the implicit, subtle, and undocumented rules, that hackery allows Java to
provide simplistic concurrency support without forcing it to understand
anything about concurrency.

> In short the example we gave indicated a programmer error at some level.
> In the two years (plus some) that I have been involved with Java, teaching
> its concurrency model, presenting tutorials at conferences, talking to
> practitioners, conversing on newsgroups - I have not once come across this
> problem. From a purist perspective we would like that error to lead to a
> locked up thread rather than an exposed invalid object - so Java is not
> perfect.

Exactly. It's a programmer error that this supposedly "inherently concurrent
OO language" cannot detect or prevent. To avoid it requires knowing the
detailed implementation of all methods you call that might involve a wait on
the calling object. And, to bring this back around to the topic, I might
cynically point out that if you know enough to be sure that the problem cannot
occur in your particular call tree, then you also know enough to avoid the
overhead and complication of recursive mutexes.
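
One common way to act on that knowledge is to split each operation into a public method that takes the lock and a private helper that assumes it is already held. The sketch below is illustrative only; the Inventory class and the "Locked" naming suffix are assumptions, not something proposed in this thread. No method ever acquires the lock a second time, so a non-recursive lock would do just as well.

```java
class Inventory {
    private int stock;

    // Public entry points take the lock exactly once.
    public synchronized void add(int n) {
        addLocked(n);
    }

    public synchronized void addTwice(int n) {
        addLocked(n);        // no second acquisition of the lock
        addLocked(n);
    }

    // Internal helper. Caller must already hold the lock on 'this'.
    private void addLocked(int n) {
        assert Thread.holdsLock(this);
        stock += n;
    }

    public synchronized int stock() {
        return stock;
    }
}
```

The price is the one Butenhof names: the author has to know the call tree well enough to say which helpers run with the lock held.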

Sure, it doesn't happen all the time. That doesn't change the fact that it's a
design bug and can have severe consequences unless every Java programmer is
aware of this complication that the language chooses to completely ignore. I
don't like that. If you do, fine. Enjoy.

> If you think that Java is broken because wait() unlocks a recursively
> held lock then the fix is not to make wait() unlock it only once as that
> really would break things. The real fix is to go back and find another
> mechanism that allows for the extension of critical sections in a nice
> neat concurrent object-oriented manner.

Uh huh. A real monitor (where the syntax specified what the lock MEANT, not
merely where it was taken) would be a good start. But that would be a new
language, that supported "integrated concurrency".

> And if you do I'd love to hear about it - once it's fully thought out.

Yeah, if I was inclined to go into pure research, this would be a really
interesting project that could probably keep me busy for 5 or 10 years.

> That's my view - take it or leave it. :-)

I guess I'll leave it.

Dave Butenhof

Oct 7, 1998, 3:00:00 AM

tay...@template.com wrote:

> In article <3614B477...@zko.dec.com>,
> Dave Butenhof <bute...@zko.dec.com> wrote:
> > tay...@template.com wrote:
>
> The spec says that pthread_cond_wait will "release" the mutex. To me,
> "release" implies that another thread would then be able to acquire the
> mutex, and thus, it is quite reasonable to assume a complete "spin out"
> of a multiply-nested lock, regardless of how "odd" or "extraordinary"
> this might seem. The spec is ambiguous, and it would cause no harm to
> condescend to those of us who might be confused by clarifying it.

Unfortunately, it's hard to catch every possible inference when writing something
as big as UNIX98. Things change, and a full review of every word and nuance is
impractical; and as the authors become more and more familiar with what it means to
say, they become less capable of recognizing that it actually doesn't always say
that very well. Too bad, but that's the way people work.

Lots of things should be clarified. Doesn't mean they will be. I'm afraid I don't
have the time or inclination to champion this particular cause.

> I'm not saying that anything's OK. I'm asking what the behavior is and why.
> I understand the semantics of simple non-recursive mutexes and condition
> waits, but it was not obvious to me what the semantics would be for recursive
> mutexes. In a way (and it seems you would agree), allowing a mutex to be
> recursive at all is a violation of the basic concept of a mutex, since it
> allows the owning thread to re-enter a critical section with (possibly)
> broken invariants. Since we're allowing the danger of making mutexes
> recursive, how am I to assume that we won't also allow the danger of
> completely unlocking multiple levels of recursive locking on a wait? If
> we're letting people who "know what they're doing" to use recursive mutexes,
> why not let these same people "fully" release these mutexes on a wait? I
> don't see why this would necessarily mean that the mutex is "arbitrarily
> unlocked without the knowledge or consent of the code" that locked it. If
> the person who wrote the code is quite clear on where they are allowing
> recursive locks to occur, then they should have full knowledge of how waits
> would affect these locks.

People like recursive mutexes, and standards are often more a reflection of what
people like than of what they should have. Right or wrong.

Recursive mutexes aren't "a danger", except in that they encourage loose and sloppy
solutions to synchronization issues. They're slow and awkward, like trying to run
over an ant with a tank. You can do it, but why would you want to, unless you just
happen to have a tank handy and feel like showing you can do it? (As I said
already, this is basically why the recursive mutex attribute exists, because I just
wanted to show that I could do it.)

The only time it's ever valid to "fully release" a recursive mutex on a wait is
when you know that it shouldn't be locked. (Because it's not protecting anything.)
Given that holding a lock prevents concurrency; that locking is relatively
expensive, and recursive mutexes are more expensive than normal locks; this sounds
pretty dumb to me. The programmer has prevented her concurrent program from running
concurrently, and wasted CPU time doing it, and now is gleefully admitting to wait
that she knew it all along. Does that mean it's "wrong"? It's a model that makes
it easy to write broken code, and hard to detect the failures. I don't see any
possible good in that.

> > The synchronized keyword tells the language that you're "in a monitor", but the
> > concept is loose and fuzzy, and gives no indication of which data is protected,
>
> I don't see how a POSIX mutex is any less "fuzzy" about what it's protecting.

I didn't say POSIX is "less fuzzy". POSIX provided primitive synchronization
mechanisms without any intent at all to do more. My whole point about Java is that,
despite advertising, IT doesn't do any more than POSIX. Nothing wrong with that,
but I'd prefer the advertising to be honest. (On the other hand, we all know that
honest advertising doesn't sell -- in computers or politics.)

> Agreed, and this is the justification I was seeking for why the UNIX98
> semantics are what you say they are. Although it would certainly help
> to guarantee that pthread_cond_wait returns an error code indicating that
> you've tried to wait on a recursively locked mutex instead of just quietly
> deadlocking. By choosing a recursive mutex, the programmer has indicated
> that the best performance is not important, so why not add the expense
> of doing additional error checking?

I had the same idea. Unfortunately, I can't recall why the working group refused
the idea of an error return for this. Oh well. Life goes on. I was involved in
POSIX for too long to ever again believe that standards are, can be, or even
necessarily SHOULD be perfect. Like Java, "it's a start".
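
The "error instead of silent deadlock" idea can at least be sketched at user level. The Java example below is an analogue only, not the POSIX API: it uses java.util.concurrent.locks.ReentrantLock, whose getHoldCount() reports how many times the calling thread holds the lock, and a hypothetical helper awaitChecked() that refuses to wait when that count is not exactly one. (ReentrantLock itself releases every hold on await, so the check here is an added policy, not something the library enforces.)

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class CheckedWait {
    // Hypothetical helper: fail loudly rather than wait on a lock
    // that the calling thread holds recursively (or not at all).
    static void awaitChecked(ReentrantLock lock, Condition cond) {
        int holds = lock.getHoldCount();
        if (holds != 1) {
            throw new IllegalStateException(
                "wait attempted with lock hold count " + holds);
        }
        try {
            cond.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // True if a wait with the lock held twice is rejected.
    static boolean rejectsRecursiveHold() {
        ReentrantLock lock = new ReentrantLock();
        Condition cond = lock.newCondition();
        lock.lock();
        lock.lock();                     // recursive acquisition
        try {
            awaitChecked(lock, cond);    // expected to throw before waiting
            return false;
        } catch (IllegalStateException e) {
            return true;
        } finally {
            lock.unlock();
            lock.unlock();
        }
    }
}
```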

tay...@template.com

Oct 7, 1998, 3:00:00 AM

> Dave Butenhof <bute...@zko.dec.com> wrote:
> David Holmes wrote:

Gentlemen, I'd like to thank you both for your lengthy comments.
This debate has been very educational. Now if only James Gosling
would join in...

Patrick Taylor

Bil Lewis

Oct 7, 1998, 3:00:00 AM

Yow! You guys are brutal! "Syntactic Sugar" is hitting below the API.

I find that I agree with both of you at different times and
am not quite sure what I really think. (You realize that you guys
are down at a pretty abstract level of argument and most hackers
will never think about this at all.)

One detail: We started talking about recursive mutexes in 'pthrerads'
and have gotten a bit off. All of us (myself included) talk about
Java using recursive mutexes, but that's not REALLY the case.
Java assures multiple synchronized sections will nest. The most
obvious implementation is recursive mutexes, but a *really* smart
Java compiler *could* analyze the code and give correct results with
simple mutexes.
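
A minimal Java sketch of that nesting guarantee. Counter and NestingDemo are invented names for illustration, not part of any library. The only point is that a thread already inside one synchronized method can enter another on the same object without deadlocking itself, however the runtime chooses to implement that.

```java
// Illustrative only: Counter and NestingDemo are made-up names.
class Counter {
    private int value = 0;

    // Acquires this object's monitor.
    synchronized void increment() {
        value++;
    }

    // Already holds the monitor when it calls increment(); the nested
    // entry succeeds because the same thread owns the monitor.
    synchronized void incrementTwice() {
        increment();
        increment();
    }

    synchronized int get() {
        return value;
    }
}

class NestingDemo {
    public static void main(String[] args) {
        Counter c = new Counter();
        c.incrementTwice();
        System.out.println(c.get()); // prints 2
    }
}
```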

The underlying question of how mutexes are related to the data
protected is a very interesting one. (Is there much good work in
this area? references?)

Personally, I never believed that Java would ever make it out of the
lab. So I'm content to use it as being 90% of what I want in a language.

-Bil

David Holmes

Oct 8, 1998, 3:00:00 AM

Dave Butenhof <bute...@zko.dec.com> wrote in article

<361B4FBB...@zko.dec.com>...

> On the contrary, Java's concurrency support is nothing but thin "syntactic
> sugar" on top of something vaguely resembling the POSIX synchronization
> model.

In that sense the whole Monitor concept is nothing but syntactic sugar on
top of something vaguely resembling the POSIX synchronisation model. But
then all synchronisation issues come down to a few key concepts.

Java does lack the features you mention - it has no notion of invariants
etc, nor how to restore them. Did you expect it to? Have you read somewhere
that Java is a new magic bullet for all concurrent programming's ills? I
certainly haven't and I never push Java as being such. To me it's a step in
the right direction from C++, and adds some basic concurrency support. It's
an integration of threading primitives with synchronisation primitives and
memory synchronisation all bundled to support a notion of concurrent
objects in the form of Monitors.

That is what Java's synchronisation primitives support: object-oriented
monitors. But Java does not restrict the programmer to thinking only in
terms of monitors as that would be far too restrictive. So you have the
flexibility to do other things - things that you could do if dealing with
raw mutexes and condition variables (some things easier than others). That
does shift responsibility back onto the programmer, but no more so than
using any other language/thread-API combination.

> course, it doesn't much impress me that the anonymous mutex gets automatically
> released on an exception, because you'll usually need to deal with the
> exception directly anyway in order to restore shared data invariants.

Actually I agree with you here. But what I find just as odd is that
every time I see some C++ code posted here it always hides mutexes behind
smart pointers that acquire the mutex on construction and release the mutex
when the smart pointer goes out of scope - of course that too implies that
exceptions cause mutexes to be released. 'C' of course does not have to
worry about this. Perhaps the problem here is that the OO folk haven't
really worked out how exceptions and synchronisation/threading really fit
together.

> There's nothing significantly, (or interestingly), different in the Java model
> here. It's not even integrated language, but function calls on the object.

I believe the term I used was "integrated within the object model" which is
different to being integrated within the language (which implies keyword
support).

> In other words, more syntactic sugar. (And, again, that's not a criticism, but
> rather a response to people who seem to want to consider it some sort of
> magic.)

I agree it is syntactic sugar. The synchronized keyword simply makes it
impossible to forget to release a lock. That's the only advantage it has
over lock()/unlock() method calls. Who is suggesting anything magic?
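
A small Java sketch of that difference. SyncTally and LockTally are hypothetical classes; java.util.concurrent.locks.ReentrantLock is used here as a stand-in for "lock()/unlock() method calls" and is an assumption of this sketch, not something the thread refers to. With synchronized the release on an exception comes for free; with an explicit lock it has to be written by hand in a finally block.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical classes for illustration; only ReentrantLock is a real API.
class SyncTally {
    private int total = 0;

    // The monitor is released on every exit path, including the throw.
    synchronized void add(int n) {
        if (n < 0) throw new IllegalArgumentException("negative amount");
        total += n;
    }

    synchronized int total() {
        return total;
    }
}

class LockTally {
    private final ReentrantLock lock = new ReentrantLock();
    private int total = 0;

    // The unlock must be written by hand; without the finally block the
    // throw below would leave the lock held.
    void add(int n) {
        lock.lock();
        try {
            if (n < 0) throw new IllegalArgumentException("negative amount");
            total += n;
        } finally {
            lock.unlock();
        }
    }

    int total() {
        lock.lock();
        try {
            return total;
        } finally {
            lock.unlock();
        }
    }

    // True if any thread currently holds the lock.
    boolean isLocked() {
        return lock.isLocked();
    }
}
```

Neither version says anything about whether the data is consistent when the exception escapes; both only guarantee that the lock itself is let go.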

> Yeah, and thinking in terms of "critical (code) sections" is almost always a
> bad way to design for concurrency. You should be thinking about shared data
> invariants, instead.

Perhaps a misleading choice of words. I look at which sections of code
within the methods of the object deal with shared data that would be
exposed in an invalid state, if those sections of code were to execute
concurrently. Such sections of code are thus critical sections.

> Up above, you talked about "starting a new" critical section or "continuing
> the current section". That's untrue. Java ALWAYS starts a new (possibly
> nested) critical section. It has to, because it doesn't know what the current
> section means.

In that sense "nested critical sections" are simply the mechanism by which
Java determines if it is already in a critical section or needs to start a
new one.

> There's nothing wrong with your factual descriptions. You're just assuming a
> lot that's not there. That's not The Great And Powerful Oz... just a little
> man behind the curtain. He may be a very great man, but he's a very poor
> wizard.

But I don't expect a wizard - though apparently you do. ;-)

> You're holding it across a call when you have no
> need at all to be holding the mutex. That's a bad concurrent design; by
> definition, because you're reducing concurrency with no benefit.

You've lost me there - where are we holding a mutex across a call that we
don't need to hold?

> I've joked that we should have called mutexes "bottlenecks", because it would
> give everyone a better mind set about how and when to use them.

All synchronisation control is about reducing concurrency.

> Let's put it this way, as I already did in an earlier post. If there's any
> justifiable need in the caller to have the mutex locked, then unlocking it for
> the wait is an error.

Why does wait release the mutex in the first place? Because someone else
will need that mutex to cause the wait to terminate. So the caller of wait
has to hold a mutex to protect that state that it inspects to determine
whether or not it will need to wait. Now if the calling method is part of
the same object, using the same mutex, then it holds the mutex to protect
access to state that needs to occur as an atomic action with the method
that will wait. The caller knows/expects/requires the mutex to be released
during the wait. These are methods of the same object - we know what these
methods do, we probably know how they do it; in this circumstance doing a
wait is not a little private implementation detail that nobody needs to
know about - it is a critical part of the design of this object to operate
in a concurrent environment.

As I have said before the circumstances you describe only occur if you
share locks and don't know about the wait. My argument is that you can't
share locks by accident and any wait that involves a shared lock is one you
must know about by definition.

> The simple fact is that either the programmer "lied" (the lock isn't
> necessary, and really, in the name of concurrency, shouldn't be held), or else
> the unlock is wrong. That's it. That's the root of this whole argument.

The lock is necessary to form an atomic action and the unlock is right
because we need another thread to make the state change that releases the
wait. All methods are defined as monitor methods and a wait releases the
monitor. That's the way it needs to work.

> Yeah, if I was inclined to go into pure research, this would be a really
> interesting project that could probably keep me busy for 5 or 10 years.

At least. The past 40 years haven't really come up with much.

> I guess I'll leave it.

I knew you would. :-)

Regards,
David

David Holmes

Oct 8, 1998, 3:00:00 AM

Bil Lewis <B...@LambdaCS.com> wrote in article
<361C05...@LambdaCS.com>...

> I find that I agree with both of you at different times and
> not quite sure what I really think. (You realize that you guys
> are down at a pretty abstract level of argument and most hackers
> will never think about this at all.)

Ahh but that's the whole crux of concurrent programming - most hackers
never think of half the problems they will run into :-)

> The underlying question of how mutexes are related to the data
> protected is a very interesting one. (Is there much good work in
> this area? references?)

Some languages support expression of invariants, e.g. Eiffel, but I've never
seen anything that tries to relate it back to concurrency issues.
Interestingly enough, one of Meyer's key arguments for not allowing
intra-object concurrency is that it makes invariants impossible to check.
Eiffel's concurrency model is strictly active-object. You never do a wait;
rather, you indicate that you want exclusive access to an object when it is
in a particular state - internally this can be implemented with the usual
mutexes and condvars but the programming model doesn't expose this.

Other models simply disallow shared data so synchronisation is purely a
matter of inter-thread communication.

Concurrent OO languages which allow you to create threads arbitrarily typically
go for some form of monitor concept or simply provide locking and
suspension primitives. Active object systems are just another form of
monitor though again with no shared data in many cases. Such systems tend
to preclude internal suspension. They also give you very little control
(often none!) over synchronisation policies.

David

Xiaochun

Oct 8, 1998, 3:00:00 AM

I have some problems in Solaris 2.5-2.6:
(1) the DNS is damaged; do I need to reinstall the whole system? Or how
do I restore it step by step?
(2) How do I change the default web browser (say, from Netscape to Hot
Java)? Where is the directory for the default web browser?
(3) Is there a special discussion group for HP OpenView?

Thanks a lot.

Xiaochun

Chris Smith

Oct 8, 1998, 3:00:00 AM

Hi,

[I snipped comp.unix.solaris as a newsgroup because I'm gonna talk about
Java, not Solaris. Apologies to any Solaris people who are interested.]

Dave Butenhof wrote:
>
> Java, however, takes on the responsibility of multiply unlocking a recursively
> locked mutex on a wait, without any understanding at all of what the lock means to
> the caller. My point is that, because it doesn't and can't know, the behavior is
> broken. It is assuming that callers hold the lock for no reason at all. (In other
> words, that the programmer doesn't know what she is doing.) While that may, in many
> cases, be a reasonable assumption, ignoring the consequences isn't a good solution.
> It should either fail explicitly, or deadlock, rather than subject the program to
> race conditions that may easily result in serious data corruption and will, at best,
> be difficult to diagnose.

Someone let me know if I'm making an idiot of myself here and I'll
stop. Okay, if we're hypothetically rewriting Java, I don't see why it
has to do any of the above (fail or deadlock or cause a race
condition). Seems to me that Java's problem is that there are two
conceptual sets of data and only one mutex. The object's lock is meant
to synchronize access to the object data. I completely agree that
synchronizing with respect to invariants would be a good thing, but
unfortunately, as David Holmes points out, Java essentially thinks along
the lines of critical sections... I'm accepting that for the purposes of
this discussion. But then the same lock is used to synchronize access
to the object's associated condition variable. I don't see how these
are really the same data at all. What I would do to fix Java is to
abolish any dependence on the object's automatic data lock to manipulate
the condition variable. wait and notify and notifyAll should use their
own internal lock (which would not need to be recursive) for the purpose
of synchronizing access to the condition variable.

So I agree that Java is very confusing and misleading as is (that is,
you have to think about what the methods do that you call, which
counteracts information hiding... and the syntax indicates that an
operation is atomic even when it's not). But I disagree on how it ought
to be fixed... it needs to be fixed by removing the "there's a lock...
why don't we use that?" theory and replacing it with a clear delineation
of what an object's lock is supposed to protect.

Chris Smith (cd_s...@ou.edu)
Sophomore, Computer Science, University of Oklahoma
DigitalThink, Inc. Java Tutor

Jerry Leichter

Oct 8, 1998, 3:00:00 AM

to Chris Smith

| Okay, if we're hypothetically rewriting Java, I don't see why it
| has to do any of the above (fail or deadlock or cause a race
| condition). Seems to me that Java's problem is that there are two
| conceptual sets of data and only one mutex. The object's lock is
| meant to synchronize access to the object data.... But then the same
| lock is used to synchronize access to the object's associated
| condition variable. I don't see how these are really the same data at
| all. What I would do to fix Java is to abolish any dependence on the
| object's automatic data lock to manipulate the condition variable.
| wait and notify and notifyAll should use their own internal lock
| (which would not need to be recursive) for the purpose of
| synchronizing access to the condition variable.

The problem with that is that it misses the whole point of having wait
and friends!

Let's take a trivial example: We have a class containing a one-
character buffer, and a boolean indicating whether the buffer is full or
empty. There is a thread that periodically fills the buffer if it's
empty, and there is a thread that periodically reads the buffer if it's
full. Since the buffer and the boolean are accessed from multiple
threads - and, further, since they only make sense *together* - the
methods called by the two threads have to be synchronized. Each of them
will check to see if the boolean is in the "right" state, waiting
otherwise. When they see the boolean is "right", they access the
buffer, flip the state, and do a notify in case the other thread is
waiting for the boolean to change. (This is a simplified version of the
classic bounded-buffer problem; in general, you have an n-byte buffer,
and an indication of how many bytes are currently stored.)
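
A minimal Java sketch of the one-character buffer just described. OneSlotBuffer and BufferDemo are invented names. Two details are choices of this sketch rather than part of the description: the wait sits in a while loop so the state is re-checked after every wakeup, and notifyAll is used where a plain notify would do for a single reader and a single writer.

```java
// Illustrative only: OneSlotBuffer and BufferDemo are made-up names.
class OneSlotBuffer {
    private char slot;
    private boolean full = false;

    // Writer side: wait until the slot is empty, then fill it.
    synchronized void put(char c) throws InterruptedException {
        while (full) {
            wait(); // releases this object's monitor while waiting
        }
        slot = c;
        full = true;
        notifyAll(); // wake the reader if it is waiting
    }

    // Reader side: wait until the slot is full, then empty it.
    synchronized char take() throws InterruptedException {
        while (!full) {
            wait();
        }
        full = false;
        notifyAll(); // wake the writer if it is waiting
        return slot;
    }
}

class BufferDemo {
    // Pushes every character of input through the buffer from a producer
    // thread and returns what the calling thread read back.
    static String transfer(String input) {
        OneSlotBuffer buf = new OneSlotBuffer();
        Thread producer = new Thread(() -> {
            try {
                for (char c : input.toCharArray()) {
                    buf.put(c);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        StringBuilder out = new StringBuilder();
        try {
            for (int i = 0; i < input.length(); i++) {
                out.append(buf.take());
            }
            producer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(transfer("abc")); // prints abc
    }
}
```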

Now suppose wait() had its own internal lock. So the reading thread
enters the synchronized method, locking the object; finds there's
nothing in the buffer; and calls wait(). What can ever wake it? The
object is still locked, so the writing thread can't enter *its*
synchronized method - and it's the only thing in the system that can
possibly change the boolean. Deadlock.

The whole *point* of wait() is to let you wait for *someone else* to
modify the state of the object. If it doesn't release the object's
lock, that will never happen. (Conversely, if *you* release the
object's lock before waiting, you can miss notifies. I'll leave that
one "as an exercise to the reader".)
-- Jerry

Chris Smith

Oct 8, 1998, 3:00:00 AM

Jerry Leichter wrote:
> The whole *point* of wait() is to let you wait for *someone else* to
> modify the state of the object. If it doesn't release the object's
> lock, that will never happen. (Conversely, if *you* release the
> object's lock before waiting, you can miss notifies. I'll leave that
> one "as an exercise to the reader".)

Oops. That last bit was my misconception. I was thinking in Windows
"events" rather than condition variables, which is what wait/notify
really are. So I missed the point that waiting after the condition
variable has been notified will block. I still lapse into Windows here
and there... seems I would have developed that mental block of a
traumatic experience by now, but oh well. So please disregard the
above. I realize now that the mutex protecting a condition variable and
the associated data need to be the same.

So I guess that means I *was* making an idiot of myself... :-(

Chris Smith

tay...@template.com

Oct 8, 1998, 3:00:00 AM

In article <01bdf273$79874170$1bf56f89@dholmes>,

"David Holmes" <dho...@mri.mq.edu.au> wrote:
> Dave Butenhof <bute...@zko.dec.com> wrote in article

> > You're holding it across a call when you have no
> > need at all to be holding the mutex. That's a bad concurrent design; by
> > definition, because you're reducing concurrency with no benefit.
>
> You've lost me there - where are we holding a mutex across a call that we
> don't need to hold?

This goes back to your example:

> Suppose we have the following code sequence:
>
> someMethod() {
> lock our mutex
> do some stuff that leaves invariants broken
> foo()
> other stuff that fixes invariants
> unlock mutex
> }
> If foo() does a conditional wait and uses the *same* mutex/lock as we do
> then the data will be exposed while invariants are broken.

If you fix this error by making sure that the invariants are not
broken when you call foo(), then you don't need to hold the mutex
when you call foo. You only need to hold the mutex when the
invariants are broken. You could change your code to something
like this:

someMethod() {
    synchronized (this) {
        do stuff that breaks but then fixes invariants
    }
    foo();
    synchronized (this) {
        do more stuff that breaks but then fixes invariants
    }
}

Bil Lewis

Oct 8, 1998, 3:00:00 AM

> > course, it doesn't much impress me that the anonymous mutex gets
> automatically
> > released on an exception, because you'll usually need to deal with the
> > exception directly anyway in order to restore shared data invariants.
>
> Actually I agree with you here. But what I find just as odd is that
> every time I see some C++ code posted here it always hides mutexes behind
> smart pointers that acquire the mutex on construction and release the mutex
> when the smart pointer goes out of scope - of course that too implies that
> exceptions cause mutexes to be released. 'C' of course does not have to
> worry about this. Perhaps the problem here is that the OO folk haven't
> really worked out how exceptions and synchronisation/threading really fit
> together.

I'm not at all surprised to see that. In some fashion I certainly will want
to unlock when exiting that critical section.

For what it's worth, this really doesn't have anything to do with threads.
If you're exiting a closure of any kind, via return, named throw, or exception,
you need the state of the world to be consistent. Somehow this has to be
accomplished. By some kind of "finally" (unwind-protect) clause? I can't
imagine a higher-level construct working.
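
A rough Java sketch of what such a "finally" clause might look like. Ledger is a made-up class and the invariant (debits equal credits) is invented for illustration. The point is only that the rollback has to be written by the programmer and has to run before the monitor is released; the language releases the lock, not the inconsistency.

```java
// Illustrative only: Ledger and its invariant are made up for this sketch.
class Ledger {
    private int debits = 0;
    private int credits = 0; // invariant: debits == credits

    synchronized void transfer(int amount, boolean failMidway) {
        int savedDebits = debits;
        boolean done = false;
        try {
            debits += amount; // invariant broken here
            if (failMidway) {
                throw new IllegalStateException("simulated failure");
            }
            credits += amount; // invariant restored here
            done = true;
        } finally {
            if (!done) {
                // Roll back while the monitor is still held, so no other
                // thread can observe the half-finished update.
                debits = savedDebits;
            }
        }
    }

    synchronized boolean consistent() {
        return debits == credits;
    }
}
```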

-Bil
--
================
B...@LambdaCS.com

http://www.LambdaCS.com
Lambda Computer Science
555 Bryant St. #194
Palo Alto, CA,
94301

Phone/FAX: (650) 328-8952

David Holmes

Oct 9, 1998, 3:00:00 AM

tay...@template.com wrote in article <6vj10u$f9a$1...@nnrp1.dejanews.com>...

> This goes back to your example:

The example showed an erroneous situation where the programmer does not
know that foo() will release the mutex - hence to them the whole method is
one big critical section.
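
A minimal Java sketch of that erroneous situation. Pair and WaitDemo are invented names; someMethod() and foo() follow the example quoted above. The outer method breaks an invariant (a == b), calls foo(), and foo() waits on the same monitor. Because wait() gives up the monitor completely, even though it was entered twice, a second thread gets in and sees the broken invariant. The demo polls Thread.getState() to know when the first thread is parked in wait(); that is a convenience for the sketch, not a recommended synchronisation technique.

```java
// Illustrative only: Pair and WaitDemo are made-up names.
class Pair {
    private int a = 0;
    private int b = 0; // invariant: a == b
    private boolean ready = false;

    // Outer critical section: breaks the invariant, calls foo(), repairs it.
    synchronized void someMethod() throws InterruptedException {
        a++;   // invariant broken
        foo(); // nested entry into the same monitor
        b++;   // invariant restored
    }

    // Waits on the same monitor; wait() releases every nested hold.
    synchronized void foo() throws InterruptedException {
        while (!ready) {
            wait();
        }
    }

    // Run by a second thread: reports whether the invariant held when it
    // got the monitor, then lets the waiter continue.
    synchronized boolean signalAndCheckInvariant() {
        boolean intact = (a == b);
        ready = true;
        notifyAll();
        return intact;
    }
}

class WaitDemo {
    // Returns true if the second thread saw the invariant intact.
    static boolean run() {
        Pair p = new Pair();
        Thread t = new Thread(() -> {
            try {
                p.someMethod();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        try {
            // Poll until t is parked inside wait().
            while (t.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            boolean intact = p.signalAndCheckInvariant();
            t.join();
            return intact;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("invariant seen intact: " + run()); // prints false
    }
}
```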

If they know that foo() does a wait and releases the mutex then there are
various "corrections" to avoid the problem.

David

Patrick TJ McPhee

Oct 9, 1998, 3:00:00 AM

In article <01bdf273$79874170$1bf56f89@dholmes>,
David Holmes <dho...@mri.mq.edu.au> wrote:

% Why does wait release the mutex in the first place? Because someone else
% will need that mutex to cause the wait to terminate. So the caller of wait
% has to hold a mutex to protect that state that it inspects to determine
% whether or not it will need to wait. Now if the calling method is part of
% the same object, using the same mutex, then it holds the mutex to protect
% access to state that needs to occur as an atomic action with the method
% that will wait. The caller knows/expects/requires the mutex to be released
% during the wait. These are methods of the same object - we know what these

I have a problem with this idea. It seems to me that you don't have to
be part of the same object -- you could be holding a mutex on some other
object, and call a method of that object. Perhaps when you write the
code that does this, the method doesn't do a wait, but later it ends up
having to do a wait for some reason. This could be done after the
calling method is written and compiled, and the original programmer dies
in a car accident. You essentially have to document every method that
gets called from a synchronised block of code, and ensure that you
either never do a wait that affects any synchronisation, or that a wait
won't break the calling code.

The alternative is for the wait to raise an exception if it's called
while the mutex is recursively locked. This seems to me to be very much
in keeping with the Java way of doing things. If nothing else, it prevents
you from adding a wait without changing the interface.
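
A sketch of how such a checked wait could look, built on java.util.concurrent.locks.ReentrantLock because it exposes the hold count. StrictMonitor and strictAwait() are hypothetical names, and the choice of IllegalMonitorStateException is an assumption of this sketch. Built-in Java monitors offer no such check; this only illustrates the proposal.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: StrictMonitor is a made-up class.
class StrictMonitor {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();

    void enter() {
        lock.lock();
    }

    void exit() {
        lock.unlock();
    }

    // Refuses to wait unless the calling thread holds the lock exactly
    // once, so a nested caller finds out immediately instead of having
    // its outer critical section silently opened up.
    void strictAwait() {
        int holds = lock.getHoldCount();
        if (holds != 1) {
            throw new IllegalMonitorStateException(
                "wait attempted with hold count " + holds);
        }
        changed.awaitUninterruptibly();
    }

    // Caller must hold the lock.
    void signalAll() {
        changed.signalAll();
    }

    public static void main(String[] args) {
        StrictMonitor m = new StrictMonitor();
        m.enter();
        m.enter(); // nested acquisition, hold count is now 2
        try {
            m.strictAwait();
        } catch (IllegalMonitorStateException e) {
            System.out.println("refused: " + e.getMessage());
        } finally {
            m.exit();
            m.exit();
        }
    }
}
```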
--

Patrick TJ McPhee
East York Canada
pt...@interlog.com

sa...@bear.com

Oct 9, 1998, 3:00:00 AM

In article <361B556F...@zko.dec.com>,

Dave Butenhof <bute...@zko.dec.com> wrote:
> tay...@template.com wrote:
>

There has been a lot of discussion about "concurrent object oriented"
features of Java. I understand Java thread-model and pthreads fairly
well (I hope), so let me express my opinions (hopefully short).

The 'synchronized' keyword in Java is a syntactic convenience, but it
does not make it a "concurrent object oriented" language. Do not fall
for Sun's marketing hype.

In fact, I prefer to keep the objects separate
from the mutex and threads (they are orthogonal). That is why my
favorite is pthreads with C++. pthreads provides very simple, but powerful
mechanisms for mutual exclusion, wait and threads. They are abstract in
the sense that a mutex does not care what you want to do in the critical
section. Cond. vars. do not care what the predicates are. Now this
integrates very nicely with the object-oriented features of C++. You can
also get the convenience of 'synchronized' by using a MutexLock object
which locks a Mutex in the constructor and unlocks in the destructor.

Sometimes I think that 'synchronized' encourages the misuse, because you
have an anonymous lock and you can make a method synchronized by just adding
a keyword. Look at how java.util.Vector has many methods synchronized.
There may be other reasons such as security (against rogue applets ..),
but a good thread programmer will not make those methods synchronized.
Instead, he will leave it to the user of the container to lock it if needed.

Dave Butenhof has already said why recursive locks in Java show a confused
model. So I will not dwell upon it (by saying objects are orthogonal to
locking, hopefully I have made my preference clear).

In C++, I routinely use a MutexLock object (similar to synchronized, but
with explicit lock) to lock and unlock automatically. Some people have
mentioned why it may be dangerous (it may not leave the invariant true on
exception). I am aware of it and when you use such 'automatic unlock', you
have to make sure that all the invariants are left true. In C++, you have
to use class destructors to do clean-up on exception and in Java you have
to use 'finally' to do the clean-up. It takes some time getting used to
this way of resource acquisition and release, but once you get used to it, you
will never want to go back to the C model of checking every return code
and doing the clean up on failure. An interesting exercise for the reader
will be to take Dave Butenhof's book and convert the C programs to C++ with
Mutex, MutexLock object (with exceptions on error).

Hope it helps,
Saroj Mahapatra

tay...@template.com

Oct 9, 1998, 3:00:00 AM

In article <01bdf335$837178f0$1bf56f89@dholmes>,

But don't all of them involve restoring the invariants before calling foo()?
If so, then the mutex does not need to be locked, which was Butenhof's point.

David Holmes

Oct 12, 1998, 3:00:00 AM

tay...@template.com wrote in article <6vlnio$9d0$1...@nnrp1.dejanews.com>...

> But don't all of them involve restoring the invariants before calling foo()?
> If so, then the mutex does not need to be locked, which was Butenhof's point.

I don't see what is being argued here. If there really is no need to hold
the lock when calling foo() then the "wrong behaviour" will never arise.
The assumption is that for some reason the lock either must be held over
the call to foo(), or the lock() is held because there is no obvious need
to release it, and thus because foo() releases the lock the "wrong
behaviour" would occur. My whole point is that such wrong behaviour is very
unlikely to occur.

David

David Holmes

Oct 12, 1998, 3:00:00 AM

Patrick TJ McPhee <pt...@interlog.com> wrote in article
<6vk0eq$7c0$1...@news.interlog.com>...

> I have a problem with this idea. It seems to me that you don't have to
> be part of the same object -- you could be holding a mutex on some other
> object, and call a method of that object. Perhaps when you write the
> code that does this, the method doesn't do a wait, but later it ends up
> having to do a wait for some reason. This could be done after the
> calling method is written and compiled, and the original programmer dies
> in a car accident. You essentially have to document every method that
> gets called from a synchronised block of code, and ensure that you
> either never do a wait that affects any synchronisation, or that a wait
> won't break the calling code.

You can always change a class in such a way that existing clients must be
modified to not be broken by the change. The scenario you describe can be
applied to both Java and C++/POSIX. In the former you would expose an
invalid object, whilst in the latter you would deadlock.

It seems unlikely that a method that did not need to wait, now needs to do
so, but let's take it that it is so. Does the method that holds the lock reside
in the object whose lock we hold? If yes then the change to the first
method should be examined for its impact on all others in the object. If
no, then why is the second object's lock being held? If we are imposing a
protocol that states that sequences of methods can be applied to the object
atomically if its lock is held, then changing one of those methods to do a
wait breaks that protocol and consequently will require modification of the
clients.

> The alternative is for the wait to raise an exception if it's called
> while the mutex is recursively locked. This seems to me to be very much
> in keeping with the Java way of doing things. If nothing else, it prevents
> you from adding a wait without changing the interface.

In the context of implementing the Monitor concept you do not want a
wait(), whilst holding the lock more than once, to raise an exception. When
inside the monitor a wait means release the monitor. If it didn't then
super calls, recursive calls and other self calls would all be broken if a
wait was involved.

David

David Holmes

Oct 12, 1998, 3:00:00 AM

sa...@bear.com wrote in article <6vl6un$dmb$1...@nnrp1.dejanews.com>...

> but a good thread programmer will not make those methods synchronized.
> Instead, he will leave it to the user of the container to lock it if needed.

When synchronising access to shared data/objects there are two basic
approaches:

1. Synchronise the activities/threads with respect to the shared data

This is what you have advocated above and it's perfectly valid. However it
relies on all clients cooperating. Using this approach you can come up with
some simple protocols for a range of synchronisation problems.

You can even encapsulate this more by storing the lock within the object
(and allow a specific lock to be associated with an object) - that way
there's less chance of forgetting which lock you need for which object, you
simply ask the object.

The emphasis here is that threads/activities are responsible for
synchronising access to the objects they share.

2. Define objects which protect themselves from concurrent access.

Active objects (ala Eiffel and a host of other research concurrent OO
languages) do this automatically usually by having a single thread per
object (not all objects) which is responsible for message processing.
Concurrency is introduced through asynchronous calls.

Monitors provide the same capability - in fact there is little difference
between active objects and Monitors in this regard, the key difference is
how threads are created/used for invocation.

There is nothing wrong with an object having methods that enforce the
synchronisation policies for that object. You can create new activities to
interact with those objects without even thinking about protocols. But even
here an object can still export a protocol that tells the client how to use
the object in a way not supported directly (such as invoking a sequence of
methods atomically).

The emphasis here is on objects coordinating the actions of the threads
that invoke their methods.

Of course a good separation of functionality and synchronisation is
important, so you provide synchronised wrapper classes for the purely
functional ones -templates are really good for doing this.

From reading this newsgroup the former approach is mostly advocated here
and the latter hardly ever considered. I don't advocate an either or
approach, both have their merits in different circumstances - but if you
don't accept that both views exist then you're not really thinking
concurrent objects in my view.

To take a simple example. Consider a producer/consumer situation. Most of
the code I see here puts synchronisation logic in the producer code and
consumer code. I would generally put the synchronisation logic in the
buffer that the producer and consumer share. When multiple types of
producers/consumers are involved it is much easier to make changes when the
sync code exists only in the buffer, not spread throughout all producer and
consumer types.

Cheers,
David

sa...@bear.com

Oct 12, 1998, 3:00:00 AM

In article <01bdf596$68b58cf0$1bf56f89@dholmes>,

I agree that there are situations where it makes sense to make an object
methods synchronized. Your buffer example is one such case. I also use
such objects such as 'Manager' object or 'MyDatabase' object where the
problem involves one or one set of resources (I could not word it better).
But I think it is a bad design to make java.util.Vector methods synchronized,
because typically when you acquire the lock, you want to invoke more
than one method, such as enquiring the size, adding more than one element,
iterating over all the elements etc.
In such a case, you want to protect all the operations together(but not
individually). In other words, the design has wrong granularity.

Thanks,
Saroj Mahapatra

David Holmes

Oct 14, 1998, 3:00:00 AM

sa...@bear.com wrote in article <6vtnok$221$1...@nnrp1.dejanews.com>...

> But I think it is a bad design to make java.util.Vector methods synchronized,

Vector is written such that any client thread can invoke any single method
in a thread-safe way without having to concern themselves with
synchronisation - the Vector takes care of its own synchronisation. What is
needed is a means by which any client can invoke a series of Vector methods
as an atomic action. No such means is documented but if you look at the
code you see that Vector always uses 'this' as the object whose lock is
acquired. Thus to perform a sequence of methods atomically wrap the
sequence in a synchronized block that acquires the vector's lock. Protocols
such as this should be clearly documented and at least with the new
collection classes they are partly documented. Of course to use such a
protocol you must know that none of the methods may do a wait() *or* that
they will never be called such that they would do a wait().
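
A short Java sketch of that protocol. VectorProtocol and removeLastIfAny() are invented names. The check and the removal are two separate Vector calls; wrapping them in a block synchronized on the vector itself makes the pair atomic with respect to other threads that go through the vector's own synchronized methods.

```java
import java.util.Vector;

// Illustrative only: VectorProtocol and removeLastIfAny are made-up names.
class VectorProtocol {
    // Removes and returns the last element, or returns null if the vector
    // is empty, as one atomic action.
    static <T> T removeLastIfAny(Vector<T> v) {
        synchronized (v) { // the same lock Vector's own methods acquire
            if (v.isEmpty()) {
                return null;
            }
            return v.remove(v.size() - 1);
        }
    }

    public static void main(String[] args) {
        Vector<String> v = new Vector<>();
        v.add("a");
        v.add("b");
        System.out.println(removeLastIfAny(v)); // prints b
        System.out.println(removeLastIfAny(v)); // prints a
        System.out.println(removeLastIfAny(v)); // prints null
    }
}
```

Without the outer block, another thread could empty the vector between isEmpty() and remove(), and the remove would throw.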

From a performance perspective if you think that multiple methods are the
most common way of using a vector then this design is inefficient - it
would be quicker to not synchronise the methods but rely on the caller to
call the methods whilst the lock is held. Java's library designers don't
seem to like relying on the caller to do the right thing in such cases.
Given that Java has spread concurrent programming to the ignorant (of
concurrent programming) masses I think they have made the right design
choice. But that simply means Vector is not suitable for all applications.
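
The alternative design mentioned above might look roughly like the following sketch: the container itself is unsynchronized (here java.util.ArrayList from the new collection classes), and the calling code takes one lock around each whole batch of operations. CallerLockedList, its lock field, and its methods are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class CallerLockedList {
    // Unsynchronized container: its methods take no lock of their own.
    private final List<Integer> items = new ArrayList<>();
    // The calling code is responsible for holding this lock around every access.
    private final Object lock = new Object();

    // One lock acquisition covers the whole batch of add() calls.
    void addRange(int from, int to) {
        synchronized (lock) {
            for (int i = from; i < to; i++) {
                items.add(i);
            }
        }
    }

    // The iteration is atomic with respect to addRange().
    int sum() {
        synchronized (lock) {
            int total = 0;
            for (int x : items) {
                total += x;
            }
            return total;
        }
    }
}
```

The cost is the one the post points out: nothing stops a careless caller from touching the list without the lock, so correctness depends on every caller following the protocol.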

Cheers,
David

Dave Butenhof

Oct 21, 1998, 3:00:00 AM

David Holmes wrote:

OK, if the lock "must" be held over the call, either

1. The caller has atomicity requirements, or holds broken invariants, such
that no asynchronous changes to the data can be allowed across the call.
In this case, releasing the lock is an error.
2. The caller knows that it is responsible for the callee's synchronization;
i.e., the callee is unsynchronized and is explicitly required to run
within an outer "critical section". Assuming that this is one of the
terms of the contract, and assuming that no callers ever try to
"piggyback" additional semantics (e.g., a broken invariant) on the lock
that must be held anyway, then it would be OK to unlock.

The problem is there's no way the language can distinguish between these two
cases. There's no way for the language (or the callee) to know whether there's
a broken invariant, or the intentions of the caller. The "synchronized"
keyword, unfortunately, seems to have proven itself an invitation to assume
that the language has solved the problems. People using Java rarely even
consider the assumptions inherent in the language's convenient-seeming
synchronization syntax, and the resulting complacency can be dangerous.

In an earlier post, you said "all synchronization control is about reducing
concurrency". Yeah, right. However, the point of threading is to get
concurrency. Synchronization is not good; it's a NECESSARY EVIL. You should
always hold a lock over the smallest possible section of code, and
acquire/release a lock as infrequently as possible. To do otherwise reduces
the usefulness of your threads.
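
As a rough illustration of that advice in Java (NarrowCriticalSection and its members are hypothetical names): the per-element work runs without any lock, and the shared total is updated under a single short acquisition per batch.

```java
class NarrowCriticalSection {
    private final Object lock = new Object();
    private long total = 0; // shared state, guarded by lock

    // The expensive part works only on thread-local data, so it needs no lock.
    // The lock is taken once per batch, just long enough to publish the result.
    void addSquares(int[] values) {
        long local = 0;
        for (int v : values) {
            local += (long) v * v;
        }
        synchronized (lock) {
            total += local;
        }
    }

    long total() {
        synchronized (lock) {
            return total;
        }
    }
}
```

Marking addSquares itself as synchronized would also be correct, but it would hold the lock for the whole loop and serialize threads that have no need to wait for each other.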

Making it easy and convenient to unnecessarily hold a mutex across a call
doesn't do anyone any favors. "Completely releasing" a recursively locked
mutex is nice & friendly only when the mutex shouldn't have been locked in the
first place -- and it's erroneous and dangerous if the mutex needed to be
locked. You can't have it both ways unless the language understands enough of
the program's semantics to distinguish between the two cases. Java doesn't,
and can't.
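
The hazard can be reproduced with a small sketch. It relies on the documented behaviour of Object.wait(), which releases the monitor completely no matter how many times the current thread has reentered it. The class, fields, and the invariant a == b are all hypothetical, made up for this illustration.

```java
class WaitReleasesAllHolds {
    private final Object lock = new Object();
    private int a = 0, b = 0;              // intended invariant: a == b
    private boolean calleeWaiting = false;
    private boolean signalled = false;

    // Callee: re-acquires the monitor (nesting depth 2) and waits.
    // wait() gives up the monitor entirely, including the caller's hold.
    private void awaitSignal() throws InterruptedException {
        synchronized (lock) {
            calleeWaiting = true;
            lock.notifyAll();
            while (!signalled) {
                lock.wait();
            }
        }
    }

    // Caller: breaks the invariant, calls out, then restores it.
    void update() throws InterruptedException {
        synchronized (lock) {
            a++;               // invariant broken here
            awaitSignal();     // monitor is released inside this call
            b++;               // invariant restored
        }
    }

    // Another thread: enters the monitor while update() is only half done.
    boolean observeThenSignal() throws InterruptedException {
        synchronized (lock) {
            while (!calleeWaiting) {
                lock.wait();
            }
            boolean broken = (a != b);
            signalled = true;
            lock.notifyAll();
            return broken;
        }
    }

    // Returns true if the observer saw the broken invariant.
    static boolean demo() {
        WaitReleasesAllHolds w = new WaitReleasesAllHolds();
        Thread updater = new Thread(() -> {
            try {
                w.update();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        updater.start();
        try {
            boolean broken = w.observeThenSignal();
            updater.join();
            return broken;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Here demo() returns true: the observer can only see calleeWaiting set while it holds the monitor, which means the updater has given the monitor up inside wait() with a incremented and b not yet incremented.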

If you do everything "Java's way" with full understanding of what that means
for your predicates, and if you completely re-evaluate all calls to any method
to which you add or change a wait, then you can certainly write correct code
with Java. But that doesn't mean it's GOOD, or EFFICIENT, or CONCURRENT. Many
programmers aspire to all three of those conditions, and Java is leading many
of them up the wrong path. Furthermore, and of utmost importance, Java has not
done nearly enough to explain the somewhat odd & unnatural assumptions of its
synchronization model, ensuring that many well-meaning programmers will be
ignorant of their danger. You can be sure that some of them ARE waiting while
a caller has a broken predicate, or is making some assumptions about data
atomicity across the call. They're wrong, of course. But Java itself is partly
to blame.
