memory visibility between threads


Lie-Quan Lee

Jan 11, 2001, 12:34:42 PM

I have been reading the book written by David Butenhof, Programming with
POSIX Threads. Thanks, David, it is a wonderful book and I think it is a
must-read for everyone who is doing multithreaded programming. The book
taught me a lot of valuable points I never knew before.

The memory visibility rules are insightful. They dug out the in-depth
things behind mutexes and condition variables and showed me the real meaning
of "synchronization". For example, the rules mean that "any data one puts
in register or auto variables can be read at a later time with no more
complication than in a completely synchronous program." Namely, for a
shared variable, if I use a mutex to protect access to it, I do not need to
worry about whether it is cached in a processor's register or not.
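
(A minimal sketch of that point in POSIX terms - the names here are
invented for illustration: the mutex alone is enough, with no volatile
anywhere.)

#include <pthread.h>

int counter = 0;                 /* shared variable; note: not volatile */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void increment(void)
{
    pthread_mutex_lock(&lock);   /* memory synchronized on lock... */
    counter++;
    pthread_mutex_unlock(&lock); /* ...and again on unlock, so the new
                                    value is visible to the next locker */
}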

One question is whether those memory visibility rules are applicable to
other thread systems such as Solaris UI threads, Win32 threads, or Java
threads ...? If yes, we can follow the same spirit. Otherwise, it will
make a big difference. (For example, all shared variables might have to be
defined as volatile even with mutex protection.)

Any comments will be appreciated.

***************************************************
Rich Lee (Lie-Quan)
Lab for Scientific Computing
University of Notre Dame
Email : ll...@lsc.nd.edu
Tel : 219-631-3906 (Office)
HomePage: http://www.lsc.nd.edu/~llee1/
***************************************************


Kaz Kylheku

Jan 11, 2001, 12:58:14 PM
On Thu, 11 Jan 2001 12:34:42 -0500, Lie-Quan Lee <ll...@lsc.nd.edu> wrote:
>One question is whether those memory visibility rules are applicable to
>other thread systems such as Solaris UI threads, Win32 threads, or Java
>threads ...?

They do not apply to Win32 threads. Win32 simply isn't as
intellectually thorough as something like POSIX. It's a moving target.

The synchronization rules are basically whatever the latest Visual C++
compiler happens to do, and whatever MSDN example source code does.
Because Microsoft's compiler synchronizes registers with memory around calls to
external functions like EnterCriticalSection, there's no need to use
volatile. And no flavor of Windows runs on a multiprocessor system that
reorders memory accesses, or poses other kinds of difficulties to
threaded programs. Many Win32 programmers blatantly ignore such
possibilities anyway, so most Win32 applications would be in serious
trouble on such hardware.

Win32 is a one-vendor job, and it shows! Already one has to do a ton
of #ifdef's to get a codebase going on NT, 2000, CE, 98 and 95 due to
differences in the interface availability or the semantics of
interfaces that are supposed to be common.

Dave Butenhof

Jan 12, 2001, 8:32:40 AM
Lie-Quan Lee wrote:

> I have been reading the book written by David Butenhof, Programming with
> POSIX Threads. Thanks, David, it is a wonderful book and I think it is a
> must-read for everyone who is doing multithreaded programming. The book
> taught me a lot of valuable points I never knew before.

Thank you!

> The memory visibility rules are insightful. They dug out the in-depth
> things behind mutexes and condition variables and showed me the real meaning
> of "synchronization". For example, the rules mean that "any data one puts
> in register or auto variables can be read at a later time with no more
> complication than in a completely synchronous program." Namely, for a
> shared variable, if I use a mutex to protect access to it, I do not need to
> worry about whether it is cached in a processor's register or not.

As long as you're writing in C (or, with care, in C++) and you're using a
system and compiler that conform to POSIX 1003.1-1996, of course...

> One question is whether those memory visibility rules are applicable to
> other thread systems such as Solaris UI threads, Win32 threads, or Java
> threads ...? If yes, we can follow the same spirit. Otherwise, it will
> make a big difference. (For example, all shared variables might have to
> be defined as volatile even with mutex protection.)

Don't ever use the C/C++ language volatile in threaded code. It'll kill your
performance, and the language definition has nothing to do with what you want
when writing threaded code that shares data. If some OS implementation tells
you that you need to use it anyway on their system, (in the words of a child
safety group), "run, yell, and tell". That's just stupid, and they shouldn't
be allowed to get away with it.

As Kaz already said, Win32 has no memory model. It's a proprietary closed
system rooted strongly and deeply in the archaic X86 hardware model. You do
whatever they expect you to do, and whenever they expect something different,
you change your code. (You may or may not be told when the expectations
change, but that's irrelevant.) This all should have been highly and
expensively embarrassing to Microsoft during the period when they were
pretending to support Windows NT on other architectures such as MIPS,
PowerPC, and Alpha, but instead the failings of the Win32 architecture were
blamed on the hardware vendors (presumably because the machines were not
X86). Memory model ought to be a major issue on IA-64, and I don't know if
(or how thoroughly) Win64 addresses it. If they ignore it, fail to make the
issues visible, or fail to thoroughly and quickly educate the vast throngs of
Win32 programmers, I suspect that IA-64 will also fail (at least as a Windows
machine).

UI threads was based loosely on POSIX (or rather, while it began somewhat
independently, it was "co-developed" with POSIX threads and began to be
targeted essentially as a "bridge" standard on a faster track). The memory
model, last I saw, was essentially the same. Both are based entirely on API
operations that imply (or are) some form of explicit synchronization between
cooperating threads.

Java has its own memory model. While it includes the POSIX aspect of explicit
synchronization, it goes beyond that to specify semantics for access that
isn't explicitly synchronized. There are a few problems and ambiguities in
the current version, and it's being substantially reworked. In Java,
"volatile" is supposed to be sufficient, because it doesn't mean at all what
the C volatile means. (I've never been happy that they chose the same word...
but then, it may well be that Java volatile means very much what most C
programmers have erroneously believed the C volatile to mean, so maybe it's
excusable.)

/------------------[ David.B...@compaq.com ]------------------\
| Compaq Computer Corporation POSIX Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----[ http://home.earthlink.net/~anneart/family/dave.html ]-----/

John Hickin

Jan 12, 2001, 9:09:53 AM
Dave Butenhof wrote:
> Java has its own memory model. While it includes the POSIX aspect of explicit
> synchronization, it goes beyond that to specify semantics for access that
> isn't explicitly synchronized. There are a few problems and ambiguities in
> the current version, and it's being substantially reworked. In Java,
> "volatile" is supposed to be sufficient, because it doesn't mean at all what

There is a wonderful discussion of this at
http://www.cs.umd.edu/~pugh/java/memoryModel/

I remember a protracted discussion of Double Checked Locking (in a C/C++
setting) in this newsgroup and came to appreciate how dangerous it is to
try to optimize synchronization and maintain portability. The Java folks
are about to learn the same lesson (see the discussion in The
"Double-Checked Locking is Broken" Declaration part of the document).


Regards, John.

sa...@bear.com

Jan 12, 2001, 10:10:37 AM
In article <3A5F0778...@compaq.com>,
Dave Butenhof <David.B...@compaq.com> wrote:

> Java has its own memory model. While it includes the POSIX aspect of
> explicit synchronization, it goes beyond that to specify semantics for
> access that isn't explicitly synchronized. There are a few problems and
> ambiguities in the current version, and it's being substantially
> reworked. In Java, "volatile" is supposed to be sufficient, because it
> doesn't mean at all what the C volatile means. (I've never been happy
> that they chose the same word... but then, it may well be that Java
> volatile means very much what most C programmers have erroneously
> believed the C volatile to mean, so maybe it's excusable.)

In Java, volatile means:

1) every access to a `volatile' variable is a memory access,
2) program-ordered accesses to `volatile' variables cannot be
reordered with respect to each other,
but there is no restriction on the ordering between a `volatile'
variable access and a non-volatile variable access.

You cannot use `volatile' and still get compiler optimizations
in Java (you have to use `synchronized' for that purpose), because
a `volatile' "flag" variable can be reordered with respect to
non-volatile 'shared' data.

In C or C++, `volatile' inhibits compiler optimizations, and the
standard does not say much about the precise meaning of `volatile'.
For this reason, I think the Java designers' decision to use the
same term (`volatile') is completely defensible. It is C or C++
programmers' mistake if they do not understand `volatile' completely
(which the standard more or less leaves completely to the implementors
to define).

Thank you,
Saroj Mahapatra


Martin Berger

Jan 12, 2001, 9:27:29 PM
Dave Butenhof wrote:

> > One question is whether those memory visibility rules are applicable to
> > other thread systems such as Solaris UI threads, Win32 threads, or Java
> > threads ...? If yes, we can follow the same spirit. Otherwise, it will
> > make a big difference. (For example, all shared variables might have to
> > be defined as volatile even with mutex protection.)
>
> Don't ever use the C/C++ language volatile in threaded code. It'll kill your
> performance, and the language definition has nothing to do with what you want
> when writing threaded code that shares data. If some OS implementation tells
> you that you need to use it anyway on their system, (in the words of a child
> safety group), "run, yell, and tell". That's just stupid, and they shouldn't
> be allowed to get away with it.
>

well, in the c/c++ users journal, Andrei Alexandrescu recommends using
"volatile" to help avoiding race conditions. can the experts please slug it out?
(note the cross posting)

martin


David Schwartz

Jan 13, 2001, 5:56:51 AM

Martin Berger wrote:

> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> "volatile" to help avoiding race conditions. can the experts please slug it out?
> (note the cross posting)
>
> martin

What race conditions? If you have a race condition, you need to _FIX_
it. Something that might help "avoid" it isn't good enough. If I told my
customers I added some code that helped avoid race conditions, they'd
shoot me. Code shouldn't _have_ race conditions.

DS

Kaz Kylheku

Jan 13, 2001, 1:14:30 PM
On 12 Jan 2001 21:27:29 -0500, Martin Berger
<martinb@--remove--me--dcs.qmw.ac.uk> wrote:
>well, in the c/c++ users journal, Andrei Alexandrescu recommends using
>"volatile" to help avoiding race conditions. can the experts please slug it out?

The C/C++ Users Journal is a comedy of imbeciles.

The article you are talking about completely ignores issues of memory
coherency on multiprocessor systems. It is very geared toward Windows;
the author seems to have little experience with multithreaded
programming, and especially cross-platform multithreading. Luckily, he
provides a disclaimer by openly admitting that some code that he wrote
suffers from occasional deadlocks.

Martin Berger

Jan 13, 2001, 3:01:23 PM
David Schwartz wrote:

> > well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> > "volatile" to help avoiding race conditions. can the experts please slug it out?
> > (note the cross posting)
> >
> > martin
>
> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.


maybe i should have given more details: the idea is to use certain
properties of the type system of c++ with respect to "volatile",
"const_cast", overloading and method invocation, to the effect that
race conditions become type errors or at least generate compiler
warnings (see

http://www.cuj.com/experts/1902/alexandr.html

for details). this is quite nifty, provided we ignore the problems dave has
pointed out.

interestingly, andrei's suggestions do not at all depend on the
intended semantics of "volatile", only on how the type system
checks it and handles const_cast and method invocation in this
case. it would be possible to introduce a new c++ qualifier, say
"blob", to the same effect but without the shortcomings (it would
even, in contradistinction to the "volatile"-based proposal, handle
built-in types correctly).
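
a minimal sketch of the mechanism (not andrei's actual code; LockedPtr
and the other names here are invented stand-ins for the article's
locking smart pointer):

#include <pthread.h>

class Counter {
public:
    Counter() : n_(0) {}
    void inc() { ++n_; }        // non-volatile member: needs the lock
private:
    int n_;
};

template <typename T>
class LockedPtr {               // locks for the lifetime of the access
public:
    LockedPtr(volatile T& obj, pthread_mutex_t& m)
        : p_(const_cast<T*>(&obj)), m_(&m) { pthread_mutex_lock(m_); }
    ~LockedPtr() { pthread_mutex_unlock(m_); }
    T* operator->() const { return p_; }
private:
    T* p_;
    pthread_mutex_t* m_;
};

volatile Counter shared;        // shared objects are declared volatile
pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

void worker()
{
    // shared.inc();            // type error: inc() is not volatile
    LockedPtr<Counter>(shared, mtx)->inc();  // ok: lock held for the call
}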

martin

Rajanish Calisa

Jan 13, 2001, 5:26:26 PM

Kaz Kylheku <k...@ashi.footprints.net> wrote in message
news:slrn96173...@ashi.FootPrints.net...

The author is not suggesting that using volatile somehow means that
synchronisation is not necessary. He has come up with a convention
where one would qualify shared objects as "volatile", and then go on
to define member functions as "volatile" and non-volatile. The former
are safe to call from multiple threads, not as a consequence of the
magic keyword "volatile", but because an internal mutex protects
the data; for the latter, the caller must provide explicit locking
through a shared mutex.

I remember reading in Dave Butenhof's book that "volatile" usage is
not recommended, because it would kill performance. I am not sure if
that applies to constructed types like classes. If so, the technique
described by the author will probably degrade performance. Another
problem could be casting away the volatile-ness and invoking the
non-volatile functions; this may be undefined by the C++ standard. I
am not an expert in C++, others can probably shed more light on this.

I am not sure if there was any obvious error relating to violation
of memory visibility rules, etc. It will be helpful if you can point
out specific errors. CUJ is a widely read magazine and readers
might adopt the techniques/suggestions in the published articles
without knowing the associated problems.

rajanish...


Martin Berger

Jan 13, 2001, 5:30:45 PM

Rajanish Calisa <rca...@ozemail.com.au> wrote in message
news:Wv486.734$FC1....@ozemail.com.au...

> I am not sure if there was any obvious error relating to violation
> of memory visibility rules, etc. It will be helpful if you can point
> out specific errors. CUJ is a widely read magazine and readers
> might adopt the techniques/suggestions in the published articles
> without knowing the problems associated.

i second rajanish's suggestion but add the request to cross post
replies to comp.lang.c++.moderated

martin


James Moe

Jan 13, 2001, 10:59:46 PM
Martin Berger wrote:
>
>
> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> "volatile" to help avoiding race conditions. can the experts please slug it out?
> (note the cross posting)
>
Use mutexes or semaphores to control access to common data areas. It
is what they are for.
"volatile" is meant for things like hardware access where a device
register can change at any time.

--
sma at sohnen-moe dot com

Joerg Faschingbauer

Jan 13, 2001, 11:01:04 PM
>>>>> "Martin" == Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk>
>>>>> writes:

>> Don't ever use the C/C++ language volatile in threaded code. It'll
>> kill your performance, and the language definition has nothing to
>> do with what you want when writing threaded code that shares
>> data. If some OS implementation tells you that you need to use it
>> anyway on their system, (in the words of a child safety group),
>> "run, yell, and tell". That's just stupid, and they shouldn't be
>> allowed to get away with it.
>>

Martin> well, in the c/c++ users journal, Andrei Alexandrescu
Martin> recommends using "volatile" to help avoiding race
Martin> conditions. can the experts please slug it out? (note the
Martin> cross posting)

(I am not an expert, but there's a few things I understood :-)

You use a mutex to protect data against concurrent access.

#include <pthread.h>

int i;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void f(void) {
    pthread_mutex_lock(&mutex);
    i++; // or something
    pthread_mutex_unlock(&mutex);
    // some lengthy epilogue goes here
}

Looking at it more paranoidly, one might argue that an optimizing
compiler will probably want to keep i in a register for some reason,
and that it might want to keep it in that register until the function
returns.

If that was the case the usage of the mutex would be completely
pointless. At the time the function returns (a long time after the
mutex was unlocked) the value of i is written back to memory. It then
overwrites the changes that another thread may have made to the value
in the meantime.

This is where volatile comes in. The common understanding is that
volatile disables any optimization on a variable, so if you declare i
volatile, the compiler won't keep it in a register, and all is well (if
that were the definition of volatile - my understanding is that
volatile is just a hint to the compiler, so nothing is guaranteed even
if you use it legally). Except that this performs badly - imagine you
didn't have just a simple increment in the critical section, but
instead lots more usage of i.

Now what does a compiler do? It compiles modules independently. In
doing so it performs optimizations (fortunately). Inside the module
that is being compiled the compiler is free to assign variables to
registers as it likes, and every function has a certain register set
that it uses.

The compiler also generates code for calls to functions in other
modules that it has no idea of. Having no idea of a module, among
other things means that the compiler does not know the register set
that particular function uses. Especially, the compiler cannot know
for sure that the callee's register set doesn't overlap with the
caller's - in which case the caller would see useless garbage in the
registers on return of the callee.

Now that's the whole point: the compiler has to take care that the
code it generates spills registers before calling a function.

So, provided that unlock() is a function and not a macro, there is no
need to declare i volatile.


Hope this helps,
Joerg

Martin Berger

Jan 14, 2001, 5:09:29 AM

Kaz Kylheku <k...@ashi.footprints.net>

> >well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> >"volatile" to help avoiding race conditions. can the experts please slug it out?
>
> The C/C++ Users Journal is a comedy of imbeciles.
>
> The article you are talking about completely ignores issues of memory
> coherency on multiprocessor systems. It is very geared toward Windows;
> the author seems to have little experience with multithreaded
> programming, and especially cross-platform multithreading.

would you care to elaborate how "volatile" causes problems with memory
consistency on multiprocessors?

> Luckily, he
> provides a disclaimer by openly admitting that some code that he wrote
> suffers from occasional deadlocks.

are you suggesting that these deadlocks are a consequence of using
"volatile"? if so, how? i cannot find indications of deadlock-inducing
behavior of "volatile" in kernighan & ritchie

James Dennett

Jan 14, 2001, 5:54:52 AM
David Schwartz wrote:
>
> Martin Berger wrote:
>
> > well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> > "volatile" to help avoiding race conditions. can the experts please slug it out?
> > (note the cross posting)
> >
> > martin
>
> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.

True, and Andrei's claim (which seems reasonable to me, though I've not
verified it in depth) is that his techniques, if used consistently,
will detect all race conditions *at compile time*. If you want to
ensure, absolutely, that no race conditions remain, then you could
try looking into Andrei's technique as a second line of defense.

-- James Dennett <jden...@acm.org>

David Schwartz

Jan 14, 2001, 5:58:21 AM

Joerg Faschingbauer wrote:

> Now that's the whole point: the compiler has to take care that the
> code it generates spills registers before calling a function.

This is really not so. It's entirely possible that the compiler might
have some way of assuring that the particular value cached in the
register isn't used by the called function, and hence it can keep it in
a register.

For example:

extern void bar(void);

void foo(void)
{
    int i;
    i = 3;
    bar();
    i--;
}

The compiler in this case might optimize 'i' away to nothing.
Fortunately, any possible way another thread could get its hands on a
variable is a way that a function in another compilation unit could get
its hands on the variable. Not only is there no legal C way 'bar' could
access 'i', there is no legal C way another thread could.

Consider:

extern void bar(void);
extern void qux(int *);

void foo(void)
{
    int i;
    i = 3;
    while (i < 10)
    {
        i++;
        bar();
        i++;
        qux(&i);
    }
}

For all the compiler knows, 'qux' stores the pointer passed to it and
'bar' uses it. Think about:

#include <stdio.h>

int *ptr = NULL;

void bar(void)
{
    if (ptr != NULL) printf("i=%d\n", *ptr);
}

void qux(int *j)
{
    ptr = j;
}

So the compiler would have to treat 'i' as if it was volatile in 'foo'
anyway.

So most compilers don't need any special help to compile multithreaded
code. Non-multithreaded code can do the same things.

DS

Kaz Kylheku

Jan 14, 2001, 5:59:02 AM
On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
<jfa...@jfasch.faschingbauer.com> wrote:
>So, provided that unlock() is a function and not a macro, there is no
>need to declare i volatile.

Even if a compiler implements sophisticated global optimizations that
cross module boundaries, the compiler can still be aware of
synchronization functions and do the right thing around calls to those
functions.

Joerg Faschingbauer

Jan 14, 2001, 2:18:53 PM
>>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

David> Joerg Faschingbauer wrote:

>> Now that's the whole point: the compiler has to take care that the
>> code it generates spills registers before calling a function.

David> This is really not so. It's entirely possible that the
David> compiler might have some way of assuring that the particular
David> value cached in the register isn't used by the called function,
David> and hence it can keep it in a register.

Of course you may once target a system where everything is tightly
coupled, and where the compiler you use to compile your module knows
about the very internals of the runtime - register allocations for
example. Then it could keep variables in registers even across runtime
function calls.

Even though such a thing is possible, it is quite unlikely - consider
the management effort of the people making (and upgrading!) such a
system. And even if people dared doing such a beast, this wouldn't be
POSIX - at least not with the functions that involve locking and
such. (There was a discussion here recently where Dave Butenhof made
this plausible - and I believe him :-}.)

(Of course there are compilers that do interprocedural and
intermodular (what a word!) optimization, involving such things as not
spilling registers before calling an external function. But usually
you have to compile the calling module and the callee module in one
swoop then - you pass more than one C file on the command line or some
such. But it is not common for you to compile your module together
with the mutex locking function modules of the C runtime.)

David> For example:

David> extern void bar(void);

David> void foo(void)
David> {
David> int i;
David> i=3;
David> bar();
David> i--;
David> }

David> The compiler in this case might optimize 'i' away to nothing.
David> Fortunately, any possible way another thread could get its
David> hands on a variable is a way that a function in another
David> compilation unit could get its hands on the variable. Not only
David> is there no legal C way 'bar' could access 'i', there is no
David> legal C way another thread could.

I don't understand the connection of this example to your statement
above.

Joerg

Kenneth Chiu

Jan 14, 2001, 2:19:18 PM
In article <93qjt6$mm2$1...@lure.pipex.net>,
Martin Berger <martin...@orange.net> wrote:
>
>Kaz Kylheku <k...@ashi.footprints.net>
>
>> >well, in the c/c++ users journal, Andrei Alexandrescu recommends using
>> >"volatile" to help avoiding race conditions. can the experts please slug it out?
>>
>> The C/C++ Users Journal is a comedy of imbeciles.
>>
>> The article you are talking about completely ignores issues of memory
>> coherency on multiprocessor systems. It is very geared toward Windows;
>> the author seems to have little experience with multithreaded
>> programming, and especially cross-platform multithreading.
>
>would you care to elaborate how "volatile" causes problems with memory
>consistency on multiprocessors?

It's not that volatile itself causes memory problems. It's that it
is not sufficient (and under POSIX should not even be used).

He gives an example which will work in practice, but which would fail
with two shared variables. Code like the following, for example, would
be incorrect on an MP with a relaxed memory model. The write to flag_
could occur before the write to data_, despite the order in which the
assignments are written.

class Gadget {
public:
    void Wait() {
        while (!flag_) {
            Sleep(1000); // sleeps for 1000 milliseconds
        }
        do_some_work(data_);
    }
    void Wakeup() {
        data_ = ...;
        flag_ = true;
    }
    ...
private:
    volatile bool flag_;
    volatile int data_;
};
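
For reference, a sketch of the portable repair (assuming POSIX, and
assuming do_some_work takes the int as above): let a mutex, rather than
volatile, order the two writes.

#include <pthread.h>
#include <unistd.h>

extern void do_some_work(int);   // assumed signature, as in the example

class Gadget {
public:
    Gadget() : flag_(false), data_(0) { pthread_mutex_init(&m_, 0); }
    void Wait() {
        pthread_mutex_lock(&m_);
        while (!flag_) {         // polling; a condition variable is better
            pthread_mutex_unlock(&m_);
            sleep(1);
            pthread_mutex_lock(&m_);
        }
        int d = data_;
        pthread_mutex_unlock(&m_);
        do_some_work(d);
    }
    void Wakeup(int d) {
        pthread_mutex_lock(&m_);
        data_ = d;               // both stores become visible together,
        flag_ = true;            // no later than the unlock
        pthread_mutex_unlock(&m_);
    }
private:
    pthread_mutex_t m_;
    bool flag_;                  // no volatile needed under the lock
    int data_;
};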

Joerg Faschingbauer

Jan 14, 2001, 2:23:39 PM
>>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

David> Joerg Faschingbauer wrote:

>> Now that's the whole point: the compiler has to take care that the
>> code it generates spills registers before calling a function.

David> This is really not so. It's entirely possible that the compiler might
David> have some way of assuring that the particular value cached in the
David> register isn't used by the called function, and hence it can keep it in
David> a register.

David> For example:

David> extern void bar(void);

David> void foo(void)
David> {
David> int i;
David> i=3;
David> bar();
David> i--;
David> }

David> The compiler in this case might optimize 'i' away to nothing.
David> Fortunately, any possible way another thread could get its hands on a
David> variable is a way that a function in another compilation unit could get
David> its hands on the variable. Not only is there no legal C way 'bar' could
David> access 'i', there is no legal C way another thread could.

David> ConsideR:

David> extern void bar(void);
David> extern void qux(int *);

David> void foo(void)
David> {
David> int i;
David> i=3;

David> while(i<10)
David> {
David> i++;
David> bar();
David> i++;
David> qux(&i);
David> }
David> }

David> For all the compiler knows, 'qux' stores the pointer passed to it and
David> 'bar' uses it. Think about:

David> int *ptr=NULL;
David> void bar(void)
David> {
David> if(ptr!=NULL) printf("i=%d\n", *ptr);
David> }

David> void qux(int *j)
David> {
David> ptr=j;
David> }

David> So the compiler would have to treat 'i' as if it was volatile in 'foo'
David> anyway.

Yes, I believe this (exporting the address of a variable) is called
taking an alias in compilerology. The consequence of this is that it
inhibits holding it in a register.

David> So most compilers don't need any special help to compile multithreaded
David> code. Non-multithreaded code can do the same things.

Joerg

Kaz Kylheku

Jan 14, 2001, 2:20:42 PM
On 14 Jan 2001 05:09:29 -0500, Martin Berger <martin...@orange.net> wrote:
>
>Kaz Kylheku <k...@ashi.footprints.net>
>
>> >well, in the c/c++ users journal, Andrei Alexandrescu recommends using
>> >"volatile" to help avoiding race conditions. can the experts please slug it out?
>>
>> The C/C++ Users Journal is a comedy of imbeciles.
>>
>> The article you are talking about completely ignores issues of memory
>> coherency on multiprocessor systems. It is very geared toward Windows;
>> the author seems to have little experience with multithreaded
>> programming, and especially cross-platform multithreading.
>
>would you care to elaborate how "volatile" causes problems with memory
>consistency on multiprocessors?

The point is that it doesn't *solve* these problems, not that it causes
them. It's not enough to ensure that load and store instructions are
*issued* in some order by the processor, but also that they complete in
some order (or at least partial order) that is seen by all other
processors. At best, volatile defeats access optimizations at the
compiler level; in order to synchronize memory you need to do it at the
hardware level as well, which is often done with a special ``memory
barrier'' instruction.

In other words, volatile is not enough to eliminate race conditions,
at least not on all platforms.
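
To illustrate (memory_barrier() below is a hypothetical stand-in for
whatever instruction the platform provides - it is not a standard
function):

extern void memory_barrier(void);  // hypothetical full hardware barrier

volatile bool flag = false;        // volatile keeps it out of a register...
int data;

void producer(void)
{
    data = 42;
    memory_barrier();              // ...but without this, another CPU may
    flag = true;                   // observe flag == true before data == 42
}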

>> Luckily, he
>> provides a disclaimer by openly admitting that some code that he wrote
>> suffers from occasional deadlocks.
>
>are you suggesting that these deadlocks are a consequence of using
>"volatile"?

The point is: are you going to take multithreading advice from
someone who admittedly cannot eradicate known deadlocks from his
code? But good points for the honesty, clearly.

>if so, how? i cannot find indications of deadlock-inducing behavior of
>"volatile" in kernighan & ritchie

This book says very little about volatile and contains no discussion of
threads; this is squarely beyond the scope of K&R.

Dylan Nicholson

Jan 14, 2001, 6:31:59 PM
In article <slrn963vb...@ashi.FootPrints.net>,
k...@ashi.footprints.net wrote:
> On 14 Jan 2001 05:09:29 -0500, Martin Berger <martin...@orange.net> wrote:
> >
> >Kaz Kylheku <k...@ashi.footprints.net>
> >
>
> The point is: are you going to take multithreading advice from
> someone who admittedly cannot eradicate known deadlocks from his
> code? But good points for the honesty, clearly.
>
Well I consider myself pretty well experienced in at least Win32
threads, and I'm working on a project now using POSIX threads (and a
POSIX wrapper for Win32 threads). I thought I had a perfectly sound
design that used only ONE mutex object, only ever used a stack-based
locker/unlocker to ensure it was never left locked, and yet I still got
deadlocks! The reason was simple: a) Calling LeaveCriticalSection on
an unowned critical section causes a deadlock in Win32 (this I consider
a bug, considering how trivial it is to test one member of the critical
section to avoid it), and b) I didn't realise that by default POSIX
mutexes only allowed one lock per thread (i.e. they were non-
recursive). To me these are quirks of the thread library, not design
faults in my code, so they don't necessarily indicate a lack of multi-
threaded knowledge. I don't pretend to know what deadlocks Andrei had,
but I wouldn't be surprised if it was a problem of that nature.
Although I haven't used the technique he described in his library, if I
had read it before I started coding my latest multi-threaded project, I
almost definitely would have given it a go.
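
(For what it's worth, a sketch of asking POSIX for recursive behaviour
explicitly, assuming an implementation that supports the mutex-type
attribute:)

#include <pthread.h>

pthread_mutex_t mutex;

void init_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&mutex, &attr);   /* relocking by the owning thread
                                            now just bumps a count */
    pthread_mutexattr_destroy(&attr);
}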

Dylan




Martin Berger

Jan 14, 2001, 6:34:01 PM
Kaz Kylheku wrote:

> >would you care to elaborate how "volatile" causes problems with memory
> >consistency
> >on multiprocessors?
>
> The point is that it doesn't *solve* these problems, not that it causes
> them. It's not enough to ensure that load and store instructions are
> *issued* in some order by the processor, but also that they complete in
> some order (or at least partial order) that is seen by all other
> processors. At best, volatile defeats access optimizations at the
> compiler level; in order to synchronize memory you need to do it at the
> hardware level as well, which is often done with a special ``memory
> barrier'' instruction.
>
> In other words, volatile is not enough to eliminate race conditions,
> at least not on all platforms.

either you or i don't quite understand the point of the article. the
semantics of "volatile" is irrelevant for his stuff to work. all that
matters is how c++ typechecks classes and methods annotated with
"volatile", together with the usual rules for overloading and casting
away volatile.

if we'd change c++ to include a modifier "blob", add the ability
to cast away blobness, and make "blob" behave like "volatile" w.r.t.
typechecking, overloading etc., then his scheme would work just the
same way with "volatile" replaced by blob. that's at least how
i understand it.

> The point is: are you going to take multithreading advice from
> someone who admittedly cannot eradicate known deadlocks from his
> code? But good points for the honesty, clearly.


concurrency is an area where i trust no one, including myself.



> >if so, how? i cannot find indications of deadlock-inducing behavior of
> >"volatile" in kernighan & ritchie
>
> This book says very little about volatile and contains no discussion of
> threads; this is squarely beyond the scope of K&R.

well, it should be part of the semantics of the language and hence covered.

David Schwartz

Jan 14, 2001, 8:32:54 PM

Dylan Nicholson wrote:

> deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
> an unowned critical section causes a deadlock in Win32 (this I consider
> a bug, considering how trivial it is to test one member of the critical
> section to avoid it),

How would you "avoid it"? IMO, the best way to avoid it is for the
library to fail or crash when you do it. The more spectacularly it fails,
the better. If you break an interface contract, there is nothing sane
the library can do about it.

> and b) I didn't realise that by default POSIX
> mutexes only allowed one lock per thread (i.e. they were non-
> recursive).

Since recursive locks are more expensive and almost never needed, this
makes perfect sense. To assume a lock would be recursive by default is
to assume inferior design.

DS

Dylan Nicholson

Jan 14, 2001, 10:45:10 PM

> How would you "avoid it"? IMO, the best way to avoid it is for the
> library to fail or crash when you do it. The more spectacularly it fails,
> the better. If you break an interface contract, there is nothing sane
> the library can do about it.
I agree, I just think it's a poor interface contract. The critical
section has a member indicating the owning thread; all you need to do
is compare it against the current thread id and ignore the call if
the section is not owned. I rely on this because the destructor for my
Mutex class always unlocks the mutex. I would have to add and maintain
another member variable to avoid unlocking in the case that the mutex
was not locked on destruction, but I don't see why this should be
necessary.

> > and b) I didn't realise that by default POSIX
> > mutexes only allowed one lock per thread (i.e. they were non-
> > recursive).
>
> Since recursive locks are more expensive and almost never needed, this
> makes perfect sense. To assume a lock would be recursive by default is
> to assume inferior design.
>

Well I agree they can be a little more expensive, but from my
experience they are worth the effort - I don't like to put
preconditions on widely used functions that mutexes must have been
acquired before entry - instead I have the function attempt to acquire
the mutex, and if it is already owned by the calling thread, simply
increase the lock count. That way I can write code like the following:

Mutex TheMutex;
int SharedValue;
string SharedString;

int GetSharedValue()
{
    AutoLock lock(TheMutex);
    return SharedValue;
}

void Foo()
{
    int i = GetSharedValue();
    DoSomething(i);
}

void Bar()
{
    AutoLock lock(TheMutex);
    DoWhatever(SharedString);
    int i = GetSharedValue();
    DoSomethingElse(i);
}

Without having to worry about the state of the Mutex before calling
GetSharedValue. Without this, my current project would probably require
4-5 times the number of explicit locks (i.e. GetSharedValue() is
called far more from unprotected code than from protected code).
Maybe there is a better design, but it's what I've been using for years
without any serious performance problems.
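
(AutoLock itself isn't shown above; a minimal sketch of what such a
scoped locker could look like, assuming Mutex exposes Lock()/Unlock()
and is recursive as described:)

class AutoLock {
public:
    explicit AutoLock(Mutex& m) : m_(m) { m_.Lock(); }  // acquire on entry
    ~AutoLock() { m_.Unlock(); }             // release when scope exits
private:
    Mutex& m_;
    AutoLock(const AutoLock&);               // not copyable
    AutoLock& operator=(const AutoLock&);
};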

dale

Jan 14, 2001, 10:59:01 PM
David Schwartz wrote:

> Code shouldn't _have_ race conditions.

Well, that's not entirely correct. If you have a number of
threads writing logging data to a file, which is protected
by a mutex, then the order in which they write -is- subject
to race conditions. This may or may not matter however.


Dale

Michiel Salters

Jan 15, 2001, 8:45:20 AM
Joerg Faschingbauer wrote:

> >>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

> David> Joerg Faschingbauer wrote:
>
> >> Now that's the whole point: the compiler has to take care that the
> >> code it generates spills registers before calling a function.

> David> This is really not so. It's entirely possible that the
> David> compiler might have some way of assuring that the particular
> David> value cached in the register isn't used by the called function,
> David> and hence it can keep it in a register.

> Of course you may once target a system where everything is tightly
> coupled, and where the compiler you use to compile your module knows
> about the very internals of the runtime - register allocations for
> example. Then it could keep variables in registers even across runtime
> function calls.

> Even though such a thing is possible, it is quite unlikely - consider
> the management effort of the people making (and upgrading!) such a
> system.

I don't think things are that hard - especially in C++, which already
has name mangling. For each translation unit, it is possible to determine
which functions use which registers and what functions are called outside
that translation unit.
Now encode in the name mangling of a function which registers it uses,
except of course for functions imported from another translation unit.
Just add to the name something like Reg=EAX_EBX_ECX. The full set of
registers used by a function is the set of registers it uses itself, plus
the registers used by the functions it calls. This will even work for
mutually recursive functions across translation units.

With this information the linker can, for each function call, determine
which registers need to be saved. And that is closely related to the
linker's main task: creating correct function calls across translation
units.

--
Michiel Salters
Michiel...@cmg.nl
sal...@lucent.com

James Kanze

Jan 15, 2001, 8:45:01 AM
Martin Berger wrote:

> Dave Butenhof wrote:

> > Don't ever use the C/C++ language volatile in threaded code. It'll
> > kill your performance, and the language definition has nothing to
> > do with what you want when writing threaded code that shares
> > data. If some OS implementation tells you that you need to use it
> > anyway on their system, (in the words of a child safety group),
> > "run, yell, and tell". That's just stupid, and they shouldn't be
> > allowed to get away with it.

This is correct up to a point. The problem is that the C++ language
has no other way of signaling that a variable may be accessed by
several threads (and thus ensuring e.g. that it is really written
before the lock is released). The problem *isn't* with the OS; it is
with code movement within the optimizer of the compiler. And while I
agree with the sentiment: volatile isn't the solution, I don't know
how many compilers offer another one. (Of course, some compilers
don't optimize enough for there to be a problem:-).)

> well, in the c/c++ users journal, Andrei Alexandrescu recommends
> using "volatile" to help avoiding race conditions. can the experts
> please slug it out? (note the cross posting)

The crux of Andrei's suggestions really just exploits the compiler
type-checking with regards to volatile, and not the actual semantics
of volatile. If I've understood the suggestion correctly, it would
even be possible to implement it without ever accessing the individual
class members as if they were volatile (although in his examples, I
think he is also counting on volatile to inhibit code movement).

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

Andrei Alexandrescu

Jan 15, 2001, 8:52:18 AM
"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
news:93sptr$o5s$1...@flotsam.uits.indiana.edu...

> He gives an example which will work in practice, but which would fail
> with two shared variables. Code like the following, for example, would
> be incorrect on an MP with a relaxed memory model. The write to flag_
> could occur before the write to data_, despite the order in which the
> assignments are written.
>
> class Gadget {
> public:
>     void Wait() {
>         while (!flag_) {
>             Sleep(1000); // sleeps for 1000 milliseconds
>         }
>         do_some_work(data_);
>     }
>     void Wakeup() {
>         data_ = ...;
>         flag_ = true;
>     }
>     ...
> private:
>     volatile bool flag_;
>     volatile int data_;
> };

Your statement is true. However, that is only an _introduction_ to the
meaning of volatile in multithreaded code. I guess I should have thought of
a more elaborate example that would work on any machine. Anyway, the focus
of the article is different. It's about using the type system to detect race
conditions.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:56 AM
"James Dennett" <jden...@acm.org> wrote in message
news:3A611BF5...@acm.org...

> True, and Andrei's claim (which seems reasonable to me, though I've not
> verified it in depth) is that his techniques, if used consistently,
> will detect all race conditions *at compile time*.

By the way, I maintain that.

Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:51:41 AM
"Martin Berger" <martinb@--remove--me--dcs.qmw.ac.uk> wrote in message
news:3A620AB8.835DD808@--remove--me--dcs.qmw.ac.uk...

> Kaz Kylheku wrote:
> > The point is that it doesn't *solve* these problems, not that it causes
> > them. It's not enough to ensure that load and store instructions are
> > *issued* in some order by the processor, but also that they complete in
> > some order (or at least partial order) that is seen by all other
> > processors. At best, volatile defeats access optimizations at the
> > compiler level; in order to synchronize memory you need to do it at the
> > hardware level as well, which is often done with a special ``memory
> > barrier'' instruction.
> >
> > In other words, volatile is not enough to eliminate race conditions,
> > at least not on all platforms.
>
> either you or i don't quite understand the point of the article.

Or maybe me :o).

> the
> semantics of "volatile" is irrelevant for his stuff to work. all that
> matters is how c++ typechecks classes and methods annotated with
> "volatile", together with the usual rules for overloading and casting
> away volatile.
>
> if we'd change c++ to include a modifier "blob", add the ability
> to cast away blobness, and make "blob" behave like "volatile" w.r.t.
> typechecking, overloading etc., then his scheme would work just the
> same way with "volatile" replaced by blob. that's at least how
> i understand it.

This is exactly what the point of the article was.

> > The point is: are you going to take multithreading advice from
> > someone who admittedly cannot eradicate known deadlocks from his
> > code? But good points for the honesty, clearly.

To Mr. Kylheku: There is a misunderstanding here, and a rather gross one. I
wonder what text made you believe I *couldn't* eradicate known
deadlocks. I simply said that all threading-related runtime errors of our
program were deadlocks only and not race conditions, which precisely proves
the point that the article tried to make. Of course we fixed the deadlocks.
The point is that the compiler fixed the race conditions.

It's clear that you have a great deal of experience in multithreading code
on many platforms, and I would be glad to expand my knowledge in the area. If
you would be willing to discuss this in more civil terms, I would be glad to.
Also, if you would like to expand the discussion *beyond* the Gadget example
in the opening section of the article, and point out possible reasoning errors
that I might have made, that would help the C++ community define the
"volatile correctness" term with precision. For now, I maintain the
conjectures I made.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:00 AM
"David Schwartz" <dav...@webmaster.com> wrote in message
news:3A5FCDC3...@webmaster.com...

> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.

There is a misunderstanding here - actually, quite a few in this and the
following posts.

Of course code must not _have_ race conditions. So that's why you must
eliminate them, which is what I meant by "avoid". Maybe I didn't use a word
that's strong enough.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:38 AM
"Martin Berger" <martin...@orange.net> wrote in message
news:93qjt6$mm2$1...@lure.pipex.net...
>
> Kaz Kylheku <k...@ashi.footprints.net>
[snip]

> > The C/C++ Users Journal is a comedy of imbeciles.

Yay, the original message was moderated out.

> > The article you are talking about completely ignores issues of memory
> > coherency on multiprocessor systems. It is very geared toward Windows;
> > the author seems to have little experience with multithreaded
> > programming, and especially cross-platform multithreading.

I'm afraid Mr. Kylheku completely ignores the gist of the article. I know
only Windows, Posix and ACE threads, but that's beside the point - what the
article tries to say is different. The article uses the volatile modifier as
a device for helping the type system detect race conditions at compile time.

> > Luckily, he
> > provides a disclaimer by openly admitting that some code that he wrote
> > suffers from occasional deadlocks.

This is a misunderstanding. What I said is that the technique described
can't help with deadlocks. At the end of the article I mentioned some
concrete experience with the technique. Indeed there were deadlocks - _only_
deadlocks - in the multithreaded code, simply because all race conditions
were weeded out by the compiler.

> are you suggesting that these deadlocks are a consequence of using
> "volatile"? if so, how? i cannot find indications of deadlock-inducing
> behavior of "volatile" in kernighan & ritchie

I guess this is yet another misunderstanding :o).


Andrei

James Kanze

Jan 15, 2001, 9:35:31 AM
James Moe wrote:

> Martin Berger wrote:

> > well, in the c/c++ users journal, Andrei Alexandrescu recommends
> > using "volatile" to help avoiding race conditions. can the experts
> > please slug it out? (note the cross posting)

> Use mutexes or semaphores to control access to common data
> areas. It is what they are for. "volatile" is meant for things like
> hardware access where a device register can change at any time.

You still need some way of preventing the optimizer from deferring
writes until after the lock has been released. Ideally, the compiler
will understand the locking system (mutex, or whatever), and generate
the necessary write guards itself. Off hand, I don't know of any
compiler which meets this ideal.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:36:16 AM
Joerg Faschingbauer wrote:

> int i;

I've used more than one compiler that does this. In fact, most do, at
least with optimization turned on.

> If that was the case the usage of the mutex would be completely
> pointless. At the time the function returns (a long time after the
> mutex was unlocked) the value of i is written back to memory. It
> then overwrites the changes that another thread may have made to the
> value in the meantime.

You guessed it.

> This is where volatile comes in. The common understanding is that
> volatile disables any optimization on a variable, so if you declare i
> volatile, the compiler won't keep it in a register, and all is well (if
> that were the definition of volatile - my understanding is that
> volatile is just a hint to the compiler, so nothing is guaranteed even
> if you use it legally). Except that this performs badly - imagine you
> didn't have just a simple increment in the critical section, but
> instead lots more usage of i.

Volatile is more than just a hint, but it does have a lot of
implementation defined aspects. The *intent* (according to the C
standard, to which the C++ standard refers) is roughly what you
describe.

> Now what does a compiler do? It compiles modules independently. In
> doing so it performs optimizations (fortunately). Inside the module
> that is being compiled the compiler is free to assign variables to
> registers as it likes, and every function has a certain register set
> that it uses.

There is no requirement that a compiler compile modules independently;
at least one major compiler has a final, post-link optimization phase
in which the optimizer looks beyond the module limits.

> The compiler also generates code for calls to functions in other
> modules that it has no idea of. Having no idea of a module, among
> other things means that the compiler does not know the register set
> that particular function uses.

What set a function can use without restoring is usually defined by
the calling conventions. The compiler not only can know it, it must
know it.

> Especially, the compiler cannot know for sure that the callee's
> register set doesn't overlap with the caller's - in which case the
> caller would see useless garbage in the registers on return of the
> callee.

This depends entirely on the compiler. And the hardware -- on a
Sparc, there are four banks of registers, two of which are
systematically saved and restored by the hardware. So each function
basically has 16 registers in which it can do anything it wishes.

> Now that's the whole point: the compiler has to take care that the
> code it generates spills registers before calling a function.

Not necessarily.

> So, provided that unlock() is a function and not a macro, there is
> no need to declare i volatile.

Not necessarily. It all depends on the compiler.

If the variable is global, and the compiler cannot analyse the unlock
function, it will have to assume that unlock may access the variable,
and so must ensure that the value is up to date. In practice, this IS
generally sufficient -- at some level, unlock resolves to a system
call, and the compiler certainly has no access to the source code of
the system call. So either 1) the compiler makes no assumption about
the system call, must assume that it might access the variable, and so
ensures the correct value, or 2) the compiler knows about system
requests, and which ones can access global variables. In the latter
case, of course, the compiler *should* also know that it needs a write
barrier after unlock. But unless this is actually documented in the
compiler documentation, I'd be leery about counting on it.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:34:46 AM
Martin Berger wrote:

> David Schwartz wrote:

> > > well, in the c/c++ users journal, Andrei Alexandrescu recommends
> > > using "volatile" to help avoiding race conditions. can the
> > > experts please slug it out? (note the cross posting)

> > > martin

> > What race conditions? If you have a race condition, you
> > need to _FIX_ it. Something that might help "avoid" it isn't good
> > enough. If I told my customers I added some code that helped avoid
> > race conditions, they'd shoot me. Code shouldn't _have_ race
> > conditions.

> maybe i should have given more details: the idea is to use certain
> properties of the type system of c++ with respect to "volatile",
> "const_cast", overloading and method invocation, to the effect that
> race conditions become type errors or at least generate compiler
> warnings (see

> http://www.cuj.com/experts/1902/alexandr.html

> for details). this is quite nifty, provided we ignore the problems
> dave has pointed out.

> interestingly, andrei's suggestions do not at all depend on the
> intended semantics of "volatile", only on how the type system
> checks it and handles const_cast and method invocation in this
> case. it would be possible to introduce a new c++ qualifier, say
> "blob", to the same effect but without the shortcomings (it would
> even, in contradistinction to the "volatile"-based proposal, handle
> built-in types correctly).

That's not totally true, at least not in his examples. I think he
also counts on volatile to some degree to inhibit code movement;
i.e. to prevent the compiler from moving some of the writes to after
the lock has been freed.

James Kanze

Jan 15, 2001, 9:36:58 AM
Joerg Faschingbauer wrote:

> >>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

> David> Joerg Faschingbauer wrote:

> >> Now that's the whole point: the compiler has to take care that
> >> the code it generates spills registers before calling a function.

> David> This is really not so. It's entirely possible that the
> David> compiler might have some way of assuring that the particular
> David> value cached in the register isn't used by the called function,
> David> and hence it can keep it in a register.

> Of course you may once target a system where everything is tightly
> coupled, and where the compiler you use to compile your module knows
> about the very internals of the runtime - register allocations for
> example. Then it could keep variables in registers even across
> runtime function calls.

The compiler always knows about the internals of the runtime register
allocations, since it is the compiler which defines them (at least
partially).

> Even though such a thing is possible, it is quite unlikely -
> consider the management effort of the people making (and upgrading!)
> such a system. And even if people dared doing such a beast, this
> wouldn't be POSIX - at least not with the functions that involve
> locking and such. (There was a discussion here recently where Dave
> Butenhof made this plausible - and I believe him :-}.)

I'm not sure I understand your point. It sounds like you are saying
that it is possible for the compiler not to know which registers it
can use, which is manifestly ridiculous.

> (Of course there are compilers that do interprocedural and
> intermodular (what a word!) optimization, involving such things as
> not spilling registers before calling an external function. But
> usually you have to compile the calling module and the callee module
> in one swoop then - you pass more than one C file on the command
> line or some such. But it is not common for you to compile your
> module together with the mutex locking function modules of the C
> runtime.)

Usually (well, in the one case I actually know of :-)), the compiler
generates extra information in the object file, which is used by the
linker.

About all you can hope for is that a compiler this intelligent also
knows about threads, and can recognize a mutex request when it sees
one.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

unread,
Jan 15, 2001, 9:40:18 AM1/15/01
to
Joerg Faschingbauer wrote:

[...]


> Yes, I believe this (exporting the address of a variable) is called
> taking an alias in compilerology. The consequence of this is that it
> inhibits holding it in a register.

Correct. The entire issue is called the aliasing problem, and it
makes good optimization extremely difficult. Note well: extremely
difficult, not impossible. In recent years, a few compilers have
gotten good enough to track uses of the variable through aliases and
across module boundaries, and eventually to keep aliased variables in
a register when it will improve performance.
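A tiny illustration of the problem (hypothetical code; the function
names are made up):

void publish(int*);   // defined elsewhere; the address escapes here
void opaque();        // defined elsewhere; might touch the published int

int sum_twice() {
    int x = 1;
    publish(&x);      // x is now aliased outside this function
    opaque();         // could modify x through the stored alias
    return x + x;     // a conservative compiler must re-read x from
                      // memory; a smarter one proves it can stay in
                      // a register
}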

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

unread,
Jan 15, 2001, 9:39:16 AM1/15/01
to
Kaz Kylheku wrote:

> On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
> <jfa...@jfasch.faschingbauer.com> wrote:
> >So, provided that unlock() is a function and not a macro, there is no
> >need to declare i volatile.

> Even if a compiler implements sophisticated global optimizations
> that cross module boundaries, the compiler can still be aware of
> synchronization functions and do the right thing around calls to
> those functions.

It can be. It should be. Is it? Do today's compilers actually do the
right thing, or are we just lucking out because most of them don't
optimize very aggressively anyway?

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

unread,
Jan 15, 2001, 9:46:20 AM1/15/01
to
Martin Berger wrote:

> Kaz Kylheku <k...@ashi.footprints.net>

> > >well, in the c/c++ users journal, Andrei Alexandrescu recommends
> > >using "volatile" to help avoid race conditions. can the
> > >experts please slug it out?

> > The C/C++ Users Journal is a comedy of imbeciles.

And the moderator's let this through?

> > The article you are talking about completely ignores issues of
> > memory coherency on multiprocessor systems. It is very geared
> > toward Windows; the author seems to have little experience with
> > multithreaded programming, and especially cross-platform
> > multithreading.

> would you care to elaborate how "volatile" causes problems with
> memory consistency on multiprocessors?

Volatile doesn't cause problems of memory consistency. It's not
guaranteed to solve them, either.

Andrei's article didn't address the problem. Not because Andrei
didn't know the solution. (He may, or he may not. I don't know.)
But because that wasn't the subject of the article.

It might be worth pointing out the exact subject, since the poster you
are responding to obviously missed the point. Andrei basically
"overloads" the keyword volatile in a way that allows the compiler to
verify whether we will use locked access or not when accessing an
object. It offers an additional tool to simplify the writing (and the
verification) of multi-threaded code.

The article does NOT address the question of when locks are needed and
when they aren't. The article doesn't address the question of what is
actually required when locks are used, e.g. to ensure memory
coherency. These are other issues, and would require another article
(or maybe even an entire book). About the only real criticism I would
make about the article is that it isn't clear enough that he is
glossing over major issues, because they aren't relevant to that
particular article.

Martin Berger

unread,
Jan 15, 2001, 11:06:17 AM1/15/01
to
Andrei Alexandrescu wrote:

> > if we'd change c++ to include a modifier "blob" and add the ability
> > to cast away blobness and make "blob" behave like "volatile" w.r.t.
> > typechecking, overloading ... then his scheme would work just the
> > same way when "volatile" is replaced by "blob". that's at least how
> > i understand it.
>
> This is exactly what the point of the article was.

this makes me think that c++ *should* be expanded to include something
like "blob" as a modifier. or maybe even user defined modifiers.

the problem with modifiers like "shared" and the like is that compilers
cannot effectively guarantee the absence of race conditions, as would be
suggested by using a name like "shared". with "blob" on the other
hand, all the compiler guarantees is that "blobness" is preserved,
which is basically an easy typechecking problem, and it is up to the
programmer to use this feature in whatever way she thinks appropriate
(eg for the prevention of race conditions). i also think that user-defined
modifier semantics has uses beyond preventing race conditions. how about it?

martin

Martin Berger

unread,
Jan 15, 2001, 11:05:59 AM1/15/01
to
Andrei Alexandrescu wrote:

> To Mr. Kylheku: [...]


> Also, if you would like to expand the discussion *beyond* the Gadget example
> in the opening section of the article, and point possible reasoning errors
> that I might have done, that would help the C++ community define the
> "volatile correctness" term with precision. For now, I maintain the
> conjectures I made.


that would be a worthwhile contribution.

martin

Charles Bryant

unread,
Jan 15, 2001, 11:10:21 AM1/15/01
to
In article <93t924$2vi$1...@nnrp1.deja.com>,

Dylan Nicholson <dn...@my-deja.com> wrote:
>In article <slrn963vb...@ashi.FootPrints.net>,
> k...@ashi.footprints.net wrote:
>>
>> The point is that are you going to take multithreading advice from
>> someone who admittedly cannot eradicate known deadlocks from his
>> code? But good points for the honesty, clearly.
>>
>Well I consider myself pretty well experienced in at least Win32
>threads, and I'm working on a project now using POSIX threads (and a
>POSIX wrapper for Win32 threads). I thought I had a perfectly sound
>design that used only ONE mutex object, only ever used a stack-based
>locker/unlocker to ensure it was never left locked, and yet I still got
>deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
>an unowned critical section causes a deadlock in Win32 (this I consider
>a bug, considering how trivial it is to test one member of the critical
>section to avoid it)

You have a fundamental misunderstanding of the nature of programming.
Programming does not require speculation about how something might be
implemented and certainly does not involve writing one's own code
such that it depends on that speculation. Programming involves
determining the guaranteed and documented behaviour of the components
that will be needed and then relying solely on that documented
behaviour.

Calling LeaveCriticalSection() without entering the critical section would
be just as much a bug even if causing it to fail required a huge
amount of very slow code in the library which implements
LeaveCriticalSection. The *only* thing relevant to whether it's a bug
or not is whether the documentation permits it or not.

--
Eppur si muove

Balog Pal

unread,
Jan 15, 2001, 11:14:56 AM1/15/01
to
"Dylan Nicholson" <dn...@my-deja.com> wrote

> deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
> an unowned critical section causes a deadlock in Win32 (this I consider
> a bug,

I consider doing illegal stuff a programming error. For critical sections
I'd go somewhat further: you should wrap those in classes anyway, and
use lock guards like CSingleLock. Then doing something odd is pretty hard.
But if you manage it, it must be a logic error in the program, and a sign
to look out for other errors too.

> considering how trivial it is to test one member of the critical
> section to avoid it),

That is IMHO irrelevant.

> and b) I didn't realise that by default POSIX
> mutexes only allowed one lock per thread (i.e. they were non-
> recursive).

Yep, they are. But you can implement your own recursive mutex (working
like the Win32 critical section); a sketch follows below.
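A minimal sketch of such a wrapper over plain POSIX primitives
(illustrative names, error checking omitted; note that some
implementations also offer PTHREAD_MUTEX_RECURSIVE directly via
pthread_mutexattr_settype):

#include <pthread.h>

class RecursiveMutex {
public:
    RecursiveMutex() : count_(0) {
        pthread_mutex_init(&m_, 0);
        pthread_cond_init(&cv_, 0);
    }
    ~RecursiveMutex() {
        pthread_cond_destroy(&cv_);
        pthread_mutex_destroy(&m_);
    }
    void lock() {
        pthread_t self = pthread_self();
        pthread_mutex_lock(&m_);
        if (count_ > 0 && pthread_equal(owner_, self)) {
            ++count_;                  // re-entry by the owning thread
        } else {
            while (count_ > 0)         // someone else owns it: wait
                pthread_cond_wait(&cv_, &m_);
            owner_ = self;
            count_ = 1;
        }
        pthread_mutex_unlock(&m_);
    }
    void unlock() {
        pthread_mutex_lock(&m_);
        if (--count_ == 0)
            pthread_cond_signal(&cv_); // hand off to one waiter
        pthread_mutex_unlock(&m_);
    }
private:
    pthread_mutex_t m_;
    pthread_cond_t cv_;
    pthread_t owner_;                  // valid only while count_ > 0
    int count_;
};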

> To me these are quirks of the thread library, not design
> faults in my code,

Your way of looking is somewhat nonstandard. ;-)

> so they don't necessarily indicate a lack of multi-
> threaded knowledge.

Maybe they indicate ignorance. A system API works as it is described, not
along your thoughts or your expectations. You must write your user code
to fit the API as it is provided, not the other way around. (You can
claim yourself innocent if the docs are in error, say if they specified a
POSIX mutex as recursive and it turned out to be fast. But that is not the case.)

Paul

John Mullins

unread,
Jan 15, 2001, 11:15:49 AM1/15/01
to

"James Kanze" <James...@dresdner-bank.com> wrote in message
news:3A62CF96...@dresdner-bank.com...

> That's not totally true, at least not in his examples. I think he
> also counts on volatile to some degree to inhibit code movement;
> i.e. to prevent the compiler from moving some of the writes to after
> the lock has been freed.

But his examples also rely on undefined behaviour so he can't really
count on anything.

JM

Kenneth Chiu

unread,
Jan 15, 2001, 11:16:27 AM1/15/01
to
In article <3A62CF03...@dresdner-bank.com>,

James Kanze <James...@dresdner-bank.com> wrote:
>Martin Berger wrote:
>
>> Dave Butenhof wrote:
>
>> > Don't ever use the C/C++ language volatile in threaded code. It'll
>> > kill your performance, and the language definition has nothing to
>> > do with what you want when writing threaded code that shares
>> > data. If some OS implementation tells you that you need to use it
>> > anyway on their system, (in the words of a child safety group),
>> > "run, yell, and tell". That's just stupid, and they shouldn't be
>> > allowed to get away with it.
>
>This is correct up to a point. The problem is that the C++ language
>has no other way of signaling that a variable may be accessed by
>several threads (and thus ensuring e.g. that it is really written
>before the lock is released). The problem *isn't* with the OS; it is
>with code movement within the optimizer of the compiler.

At this point it isn't really a C++ language issue anymore. However,
if a vendor claims that their compiler is compatible with POSIX
threads, then it is up to them to insure that memory is written before
the unlock.

Kenneth Chiu

unread,
Jan 15, 2001, 12:21:02 PM1/15/01
to
In article <93u77g$bqakt$3...@ID-14036.news.dfncis.de>,

Andrei Alexandrescu <andre...@hotmail.com> wrote:
>"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
>news:93sptr$o5s$1...@flotsam.uits.indiana.edu...
>> He gives an example, which will work in practice, but if he had two
>> shared variables, would fail. Code like this, for example, would
>> be incorrect on an MP with a relaxed memory model. The write to flag_
>> could occur before the write to data_, despite the order in which the
>> assignments are written.
>>
>> ...

>
>Your statement is true. However, that is only an _introduction_ to the
>meaning of volatile in multithreaded code. I guess I should have thought of
>a more elaborate example that would work on any machine. Anyway, the focus
>of the article is different. It's about using the type system to detect race
>conditions.

Yes, I have no quibble with the rest of the article, and in fact thought
it was an interesting idea.

However, I find some of the statements in the introduction to be overly
general.

Basically, without volatile, either writing multithreaded programs
becomes impossible, or the compiler wastes vast optimization
opportunities.

This may be true for some thread standards, but if the vendor claims
that they support POSIX threads with their C++ compiler, then shared
variables should not be declared volatile when using POSIX threads.

Andrei Alexandrescu

unread,
Jan 15, 2001, 1:20:38 PM1/15/01
to
"John Mullins" <John.M...@crossprod.co.uk> wrote in message
news:93v2gj$4fs$1...@newsreaderg1.core.theplanet.net...

>
> "James Kanze" <James...@dresdner-bank.com> wrote in message
> news:3A62CF96...@dresdner-bank.com...
>
> > That's not totally true, at least not in his examples. I think he
> > also counts on volatile to some degree to inhibit code movement;
> > i.e. to prevent the compiler from moving some of the writes to after
> > the lock has been freed.
>
> But his examples also rely on undefined behaviour so he can't really
> count on anything.

Are you referring to const_cast? Strictly speaking, indeed. But then, all MT
programming in C/C++ has undefined behavior.

Andrei

Andrei Alexandrescu

unread,
Jan 15, 2001, 2:31:08 PM1/15/01
to
"James Kanze" <James...@dresdner-bank.com> wrote in message
news:3A62CF03...@dresdner-bank.com...

> The crux of Andrei's suggestions really just exploits the compiler
> type-checking with regards to volatile, and not the actual semantics
> of volatile. If I've understood the suggestion correctly, it would
> even be possible to implement it without ever accessing the individual
> class members as if they were volatile (although in his examples, I
> think he is also counting on volatile to inhibit code movement).

Thanks James for all your considerations before and after the article
appeared.

There is a point about the use of volatile proposed by the article. If you
write volatile correct code as prescribed by the article, you _never_
*never* NEVER use volatile variables. You _always_ *always* ALWAYS lock a
synchronization object, cast the volatile away, operate on the so-obtained
non-volatile alias, let the alias go, and unlock the synchronization object,
in this order.

Maybe I should have made it clearer that in volatile-correct code, you never
operate on volatile data - you always cast volatile away and more
specifically, you cast it away when it is *semantically correct* to do so
because you locked the associated synchronization object.

I would be glad if someone explained to me in what situations a compiler can
rearrange instructions to the extent that it would invalidate the idiom that
the article proposes. OTOH, such compilers invalidate a number of idioms
anyway, such as the Double-Check Locking Pattern, used by Doug Schmidt in
ACE.
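
For readers who haven't seen it, the classic Double-Checked Locking
sketch looks like this; the comment marks where reordering breaks it
(illustrative code, not taken from ACE):

#include <pthread.h>

struct Widget {
    int data;
    Widget() : data(42) {}
};

static pthread_mutex_t initLock = PTHREAD_MUTEX_INITIALIZER;
static Widget* instance = 0;

Widget* getInstance() {
    if (instance == 0) {              // first check, no lock taken
        pthread_mutex_lock(&initLock);
        if (instance == 0) {          // second check, under the lock
            // The store to instance can become visible to another
            // processor before the stores to the Widget's members,
            // so a concurrent caller may use a half-built object.
            instance = new Widget;
        }
        pthread_mutex_unlock(&initLock);
    }
    return instance;
}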


Andrei

Andrei Alexandrescu

unread,
Jan 15, 2001, 2:37:05 PM1/15/01
to
"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
news:93v8d2$3aa$1...@flotsam.uits.indiana.edu...

> However, I find some of the statements in the introduction to be overly
> general.
>
> Basically, without volatile, either writing multithreaded programs
> becomes impossible, or the compiler wastes vast optimization
> opportunities.
>
> This may be true for some thread standards, but if the vendor claims
> that they support POSIX threads with their C++ compiler, then shared
> variables should not be declared volatile when using POSIX threads.

I understand your point and agree with it. That statement of mine was a
mistake.

Andrei

Kaz Kylheku

unread,
Jan 15, 2001, 2:39:50 PM1/15/01
to
On 14 Jan 2001 22:59:01 -0500, dale <da...@cs.rmit.edu.au> wrote:
>David Schwartz wrote:
>
>> Code shouldn't _have_ race conditions.
>
>Well, that's not entirely correct. If you have a number of
>threads writing logging data to a file, which is protected
>by a mutex, then the order in which they write -is- subject
>to race conditions. This may or may not matter however.

If it doesn't matter, it's hardly a race condition! A race condition
occurs when the program fails to compute one of the possible correct
results due to a fluctuation in the execution order.

David Schwartz

unread,
Jan 15, 2001, 3:39:11 PM1/15/01
to

Dylan Nicholson wrote:

> I agree, I just think it's a poor interface contract. The critical
> section has a member indicating the owning thread, all you need to do
> is you compare it against the current thread id and ignore the call if
> it's not owned. I use this because in the destructor for my Mutex
> class, it always unlocks it. I would have to add and maintain another
> member variable to avoid unlocking in the case that the mutex was not
> locked on destruction, but I don't see why this should be necessary.

In the debug library, sure. In the release library, no way. Every cycle
LeaveCriticalSection takes is performance-critical. The critical
section is the highest-performance synchronization object Win32 offers
(not counting Interlocked*, which isn't usually useful).

> > Since recursive locks are more expensive and almost never needed,
> > this makes perfect sense. To assume a lock would be recursive by
> > default is to assume inferior design.

> Well I agree they can be a little more expensive, but from my
> experience they are worth the effort - I don't like to put
> preconditions on widely used functions that mutexes must have been
> acquired before entry - instead I have the function attempt to acquire
> the mutex, and if it is already owned by the calling thread, you simply
> increase the lock count. That way I can write code like the following:
[snip]
> Without having to worry about the state of the Mutex before calling
> GetSharedValue. Without this my current project would probably require
> 4-5 times the numbers of explicit locks (i.e. GetSharedValue() is
> called far more from unprotected code than from protected code).
> Maybe there is a better design, but it's what I've been using for years
> without any serious performance problems.

Performance isn't so much the issue. It's failure-free and logical
operation.

Consider the following:

SomeFunction()
{
    LockMutex();
    DoSomeWork();
    UnlockMutex();      /* with a recursive mutex this may not actually
                           release the lock if the caller already held it... */
    WaitForSomething(); /* ...so a wait that depends on another thread
                           taking the mutex can never be satisfied */
}

This function will deadlock if the mutex used is recursive. And
associating a condition variable with a recursive mutex can cause some
very surprising results.

Here's another one:

SomeFunction()
{
    LockMutexA();    /* if the caller already holds recursive mutex B,
                        the effective order here is B before A... */
    DoStuff();
    LockMutexB();    /* ...while everyone else locks A before B:
                        a latent deadlock */
    DoMoreStuff();
    UnlockMutexB();
    UnlockMutexA();
}

If all other code locks A before B, this will never deadlock. But what
if B is recursive and this function is entered while holding mutex B?
Now you have code that seems to work, but will sometimes deadlock
unpredictably.

In order to avoid these types of pitfalls, all the code that uses a
recursive mutex _must_ be engineered together. And in that case, it's
trivial to add your own code to track the ownership of the mutex. And
unless you do this horribly wrong, this code will be more efficient than
the equivalent code in the library.

DS

Konrad Schwarz

unread,
Jan 15, 2001, 4:15:11 PM1/15/01
to
James Kanze wrote:
>
> Martin Berger wrote:
>
> > Dave Butenhof wrote:
>
> > > Don't ever use the C/C++ language volatile in threaded code. It'll
> > > kill your performance, and the language definition has nothing to
> > > do with what you want when writing threaded code that shares
> > > data. If some OS implementation tells you that you need to use it
> > > anyway on their system, (in the words of a child safety group),
> > > "run, yell, and tell". That's just stupid, and they shouldn't be
> > > allowed to get away with it.
>
> This is correct up to a point. The problem is that the C++ language
> has no other way of signaling that a variable may be accessed by
> several threads (and thus ensuring e.g. that it is really written
> before the lock is released). The problem *isn't* with the OS; it is
> with code movement within the optimizer of the compiler. And while I
> agree with the sentiment: volatile isn't the solution, I don't know
> how many compilers offer another one. (Of course, some compilers
> don't optimize enough for there to be a problem:-).)

So the optimization of keeping variables in registers across
function calls is illegal in general (and thus must not be performed)
if the compiler can prove neither that the code will not be linked into
a multi-threaded program nor that those variables will never be shared.

However, the C language has a way of signaling that local variables
cannot be accessed by other threads, namely by placing them in
the register storage class. I don't know about C++; if I remember
correctly, C++ degrades register to a mere "efficiency hint".

Konrad Schwarz

unread,
Jan 15, 2001, 4:16:41 PM1/15/01
to

James Kanze wrote:
>
> Kaz Kylheku wrote:
>
> > On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
> > <jfa...@jfasch.faschingbauer.com> wrote:
> > >So, provided that unlock() is a function and not a macro, there is no
> > >need to declare i volatile.
>
> > Even if a compiler implements sophisticated global optimizations
> > that cross module boundaries, the compiler can still be aware of
> > synchronization functions and do the right thing around calls to
> > those functions.
>
> It can be. It should be. Is it? Do todays compilers actually do the
> right thing, or are we just lucking out because most of them don't
> optimize very aggressively anyway?
>

If the compiler supports multi-threading (at least POSIX
multi-threading), then it *must*, since POSIX does not require shared
variables to be volatile qualified. If the compiler decides to keep
values in registers across function calls, it must be able to prove
that
* either these variables are never shared by another thread
* or the functions in question never perform inter-thread operations

Kaz Kylheku

unread,
Jan 15, 2001, 4:17:44 PM1/15/01
to
On 15 Jan 2001 08:52:18 -0500, Andrei Alexandrescu

<andre...@hotmail.com> wrote:
>"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
>news:93sptr$o5s$1...@flotsam.uits.indiana.edu...
>> He gives an example, which will work in practice, but if he had two
>> shared variables, would fail. Code like this, for example, would
>> be incorrect on an MP with a relaxed memory model. The write to flag_
>> could occur before the write to data_, despite the order in which the
>> assignments are written.
>>
>> class Gadget {
>> public:
>> void Wait() {
>> while (!flag_) {
>> Sleep(1000); // sleeps for 1000 milliseconds
>> }
>> do_some_work(data_);
>> }
>> void Wakeup() {
>> data_ = ...;
>> flag_ = true;
>> }
>> ...
>> private:
>> volatile bool flag_;
>> volatile int data_;
>> };

>
>Your statement is true. However, that is only an _introduction_ to the
>meaning of volatile in multithreaded code. I guess I should have thought of
>a more elaborate example that would work on any machine.

You cannot come up with such an example without resorting to
platform-and compiler specific techniques, such as inline assembly language
to insert memory barrier instructions.

In the above example, if one thread writes to data_ and then sets flag_
there is absolutely no assurance that another thread running on another
processor will see these updates in the same order. It is possible for
flag_ to appear to flip true, but data_ to not have been updated yet!

Moreover, there is no assurance that data_ is updated atomically, so
that a processor can either see its old value or its new value, never
any half-baked value in between.
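
By contrast, here is what the same exchange looks like when written to
the POSIX rules, using pthreads rather than standard C++ (which is
exactly the point). A sketch: error handling is omitted, and the value
written is made a parameter here where the original example elided it.

#include <pthread.h>

void do_some_work(int);  // assumed to exist, as in the original example

class Gadget {
public:
    Gadget() : flag_(false), data_(0) {
        pthread_mutex_init(&lock_, 0);
        pthread_cond_init(&ready_, 0);
    }
    void Wait() {
        pthread_mutex_lock(&lock_);
        while (!flag_)                 // re-check after every wakeup
            pthread_cond_wait(&ready_, &lock_);
        int snapshot = data_;          // read under the lock
        pthread_mutex_unlock(&lock_);
        do_some_work(snapshot);
    }
    void Wakeup(int value) {
        pthread_mutex_lock(&lock_);
        data_ = value;                 // both writes ordered by the mutex
        flag_ = true;
        pthread_cond_signal(&ready_);
        pthread_mutex_unlock(&lock_);
    }
private:
    pthread_mutex_t lock_;
    pthread_cond_t ready_;
    bool flag_;                        // no volatile needed under POSIX
    int data_;
};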

Resolving these issues can't be done in standard C++, so there is no
one example that can fit all C++ platforms. This makes sense, since
threads are not currently part of the C++ language. (What I don't
understand is why the moderator of comp.lang.c++.moderated is even
allowing this discussion, which clearly belongs in
comp.programming.threads only).

>Anyway, the focus
>of the article is different. It's about using the type system to detect race
>conditions.

This thread was started in comp.programming.threads by Lie-Quan Lee
<ll...@lsc.nd.edu> who was specifically interested in knowing whether
rules similar to the POSIX memory visibility rules apply to other
multithreading platforms.

-> One question is, whether those memory visibility rules are applicable
-> for other thread system such as Solaris UI threads, or win32 threads,
-> or JAVA threads ...? If yes, we can follow the same spirit. Otherwise,
-> it will be a big difference. (For example, all shared variables might
-> have to be difined as volatile even with mutex protection.)

Dave Butenhof then replied:

-> Don't ever use the C/C++ language volatile in threaded code. It'll
-> kill your performance, and the language definition has nothing to do
-> with what you want when writing threaded code that shares data. If
-> some OS implementation tells you that you need to use it anyway on
-> their system, (in the words of a child safety group), "run, yell, and
-> tell". That's just stupid, and they shouldn't be allowed to get away
-> with it.

To which Martin Berger replied (and added, for some strange reason,
comp.lang.c++.moderated to the Newsgroups: header). This is the first
time the CUJ article was mentioned, clearly in the context of a
comp.programming.threads debate about memory visibility rules,
not in the context of a debate about C++ or qualifier-correctness:

-> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
-> "volatile" to help avoiding race conditions. can the experts please
-> slug it out?(note the cross posting)

So it appears that the article does create some confusion, at least in
the minds of some readers, between volatile used as a request for
special access semantics and volatile used as a constraint-checking
access control for class member function calls.

Incidentally, I believe that the second property can be exploited
without dragging in the semantics of volatile. Simply do something like
this:

#ifdef RACE_CHECK
#define VOLATILE volatile
#else
#define VOLATILE
#endif

When producing production object code, do not define RACE_CHECK; define
it only when you want to create extra semantic checks for the compiler
to diagnose. Making up a name other than ``VOLATILE'' might be useful
to clarify that what is being done has nothing to do with defeating
optimization.
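
A hypothetical use of such a macro (the class and names here are made
up for illustration):

struct Account {
    void deposit(int amount);  // deliberately not VOLATILE-qualified
};

VOLATILE Account shared;       // shared between threads

void f() {
    // shared.deposit(10);     // compile error when RACE_CHECK is
                               // defined, steering all access through
                               // a lock-and-cast helper; in production
                               // builds VOLATILE expands to nothing
}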

Joerg Faschingbauer

unread,
Jan 15, 2001, 4:19:30 PM1/15/01