volatile, was: memory visibility between threads

Martin Berger

Jan 12, 2001, 9:27:29 PM
Dave Butenhof wrote:

> > One question is whether those memory visibility rules are applicable for
> > other thread systems such as Solaris UI threads, or win32 threads, or JAVA
> > threads ...? If yes, we can follow the same spirit. Otherwise, it will be
> > a big difference. (For example, all shared variables might have to be
> > defined as volatile even with mutex protection.)
>
> Don't ever use the C/C++ language volatile in threaded code. It'll kill your
> performance, and the language definition has nothing to do with what you want
> when writing threaded code that shares data. If some OS implementation tells
> you that you need to use it anyway on their system, (in the words of a child
> safety group), "run, yell, and tell". That's just stupid, and they shouldn't
> be allowed to get away with it.
>

well, in the c/c++ users journal, Andrei Alexandrescu recommends using
"volatile" to help avoiding race conditions. can the experts please slug it out?
(note the cross posting)

martin


David Schwartz

Jan 13, 2001, 5:56:51 AM

Martin Berger wrote:

> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> "volatile" to help avoiding race conditions. can the experts please slug it out?
> (note the cross posting)
>
> martin

What race conditions? If you have a race condition, you need to _FIX_
it. Something that might help "avoid" it isn't good enough. If I told my
customers I added some code that helped avoid race conditions, they'd
shoot me. Code shouldn't _have_ race conditions.

DS

Martin Berger

Jan 13, 2001, 3:01:23 PM
David Schwartz wrote:

> > well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> > "volatile" to help avoiding race conditions. can the experts please slug it out?
> > (note the cross posting)
> >
> > martin
>
> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.


maybe i should have given more details: the idea is to use certain properties
of the c++ type system with respect to "volatile", "const_cast", overloading
and method invocation, to the effect that race conditions become type errors
or at least generate compiler warnings (see

http://www.cuj.com/experts/1902/alexandr.html

for details). this is quite nifty, provided we ignore the problems dave has
pointed out.

interestingly, andrei's suggestions do not depend at all on the intended
semantics of "volatile", only on how the type system checks it and handles
const_cast and method invocation in this case. it is possible to introduce a
new c++ qualifier, say "blob", to the same effect but without the
shortcomings (it would even, in contradistinction to the "volatile"-based
proposal, handle built-in types correctly).
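to make this concrete, here is roughly the kind of helper the article builds,
as i understand it (my reconstruction, so take the names and details as
illustrative only; error handling omitted):

#include <pthread.h>

// a thin wrapper so the sketch is self-contained
class Mutex {
public:
    Mutex()  { pthread_mutex_init(&m_, 0); }
    ~Mutex() { pthread_mutex_destroy(&m_); }
    void lock()   { pthread_mutex_lock(&m_); }
    void unlock() { pthread_mutex_unlock(&m_); }
private:
    pthread_mutex_t m_;
};

// locks the mutex, then casts volatile away for its own lifetime
template <class T>
class LockingPtr {
public:
    LockingPtr(volatile T& obj, Mutex& m)
        : obj_(const_cast<T*>(&obj)), m_(m) { m_.lock(); }
    ~LockingPtr() { m_.unlock(); }
    T& operator*()  const { return *obj_; }
    T* operator->() const { return obj_; }
private:
    T* obj_;
    Mutex& m_;
};

class Widget {
public:
    void Modify() { /* touch internal state */ }  // ordinary (non-volatile) member
    // ...
};

Mutex widgetMutex;            // protects sharedWidget
volatile Widget sharedWidget; // shared between threads, hence declared volatile

void worker() {
    // sharedWidget.Modify();                         // error: Modify() is not a
    //                                                // volatile member function
    LockingPtr<Widget> p(sharedWidget, widgetMutex);  // lock + un-volatile
    p->Modify();                                      // now it type-checks
}                                                     // destructor unlocks

the point is that forgetting to go through LockingPtr (i.e. forgetting to take
the lock) shows up as a type error rather than as a race at run time.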

martin

James Moe

Jan 13, 2001, 10:59:46 PM
Martin Berger wrote:
>
>
> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> "volatile" to help avoiding race conditions. can the experts please slug it out?
> (note the cross posting)
>
Use mutexes or semaphores to control access to common data areas. It
is what they are for.
"volatile" is meant for things like hardware access where a device
register can change at any time.

--
sma at sohnen-moe dot com

Joerg Faschingbauer

Jan 13, 2001, 11:01:04 PM
>>>>> "Martin" == Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk>
>>>>> writes:

>> Don't ever use the C/C++ language volatile in threaded code. It'll
>> kill your performance, and the language definition has nothing to
>> do with what you want when writing threaded code that shares
>> data. If some OS implementation tells you that you need to use it
>> anyway on their system, (in the words of a child safety group),
>> "run, yell, and tell". That's just stupid, and they shouldn't be
>> allowed to get away with it.
>>

Martin> well, in the c/c++ users journal, Andrei Alexandrescu
Martin> recommends using "volatile" to help avoiding race
Martin> conditions. can the experts please slug it out? (note the
Martin> cross posting)

(I am not an expert, but there's a few things I understood :-)

You use a mutex to protect data against concurrent access.

extern some_mutex_t mutex;  // whatever the mutex type is; lock()/unlock()
                            // stand in for e.g. pthread_mutex_lock()/unlock()
int i;

void f(void) {
    lock(mutex);
    i++; // or something
    unlock(mutex);
    // some lengthy epilogue goes here
}

Looking at it more paranoidly, one might argue that an optimizing
compiler will probably want to keep i in a register for some reason,
and that it might want to keep it in that register until the function
returns.

If that was the case the usage of the mutex would be completely
pointless. At the time the function returns (a long time after the
mutex was unlocked) the value of i is written back to memory. It then
overwrites the changes that another thread may have made to the value
in the meantime.

This is where volatile comes in. Common understanding is that volatile
disables any optimization on a variable, so if you declare i volatile,
the compiler won't keep it in a register, and all is well (if that were
the definition of volatile - my understanding is that volatile is just
a hint to the compiler, so strictly speaking nothing is well). Except
that this performs badly - imagine you didn't have a simple increment
in the critical section, but lots more use of i instead.

Now what does a compiler do? It compiles modules independently. In
doing so it performs optimizations (fortunately). Inside the module
that is being compiled the compiler is free to assign variables to
registers as it likes, and every function has a certain register set
that it uses.

The compiler also generates code for calls to functions in other
modules that it has no idea of. Having no idea of a module, among
other things means that the compiler does not know the register set
that particular function uses. Especially, the compiler cannot know
for sure that the callee's register set doesn't overlap with the
caller's - in which case the caller would see useless garbage in the
registers on return of the callee.

Now that's the whole point: the compiler has to take care that the
code it generates spills registers before calling a function.

So, provided that unlock() is a function and not a macro, there is no
need to declare i volatile.


Hope this helps,
Joerg

Martin Berger

Jan 14, 2001, 5:09:29 AM

Kaz Kylheku <k...@ashi.footprints.net> wrote:

> >well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> >"volatile" to help avoiding race conditions. can the experts please slug it out?
>

> The C/C++ Users Journal is a comedy of imbeciles.
>
> The article you are talking about completely ignores issues of memory
> coherency on multiprocessor systems. It is very geared toward Windows;
> the author seems to have little experience with multithreaded
> programming, and especially cross-platform multithreading.

would you care to elaborate how "volatile" causes problems with memory
consistency on multiprocessors?

> Luckily, he
> provides a disclaimer by openly admitting that some code that he wrote
> suffers from occasional deadlocks.

are you suggesting that these deadlocks are a consequence of using
"volatile"? if so, how? i cannot find any indication of deadlock-inducing
behavior of "volatile" in kernighan & ritchie.

James Dennett

Jan 14, 2001, 5:54:52 AM
David Schwartz wrote:
>
> Martin Berger wrote:
>
> > well, in the c/c++ users journal, Andrei Alexandrescu recommends using
> > "volatile" to help avoiding race conditions. can the experts please slug it out?
> > (note the cross posting)
> >
> > martin
>
> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.

True, and Andrei's claim (which seems reasonable to me, though I've not
verified it in depth) is that his techniques, if used consistently,
will detect all race conditions *at compile time*. If you want to
ensure, absolutely, that no race conditions remain, then you could
try looking into Andrei's technique as a second line of defense.

-- James Dennett <jden...@acm.org>

David Schwartz

Jan 14, 2001, 5:58:21 AM

Joerg Faschingbauer wrote:

> Now that's the whole point: the compiler has to take care that the
> code it generates spills registers before calling a function.

This is really not so. It's entirely possible that the compiler might
have some way of assuring that the particular value cached in the
register isn't used by the called function, and hence it can keep it in
a register.

For example:

extern void bar(void);

void foo(void)
{
    int i;
    i=3;
    bar();
    i--;
}

The compiler in this case might optimize 'i' away to nothing.
Fortunately, any possible way another thread could get its hands on a
variable is a way that a function in another compilation unit could get
its hands on the variable. Not only is there no legal C way 'bar' could
access 'i', there is no legal C way another thread could.

Consider:

extern void bar(void);
extern void qux(int *);

void foo(void)
{
    int i;
    i=3;
    while(i<10)
    {
        i++;
        bar();
        i++;
        qux(&i);
    }
}

For all the compiler knows, 'qux' stores the pointer passed to it and
'bar' uses it. Think about:

#include <stdio.h>   /* for printf() and NULL */

int *ptr=NULL;

void bar(void)
{
    if(ptr!=NULL) printf("i=%d\n", *ptr);
}

void qux(int *j)
{
    ptr=j;
}

So the compiler would have to treat 'i' as if it was volatile in 'foo'
anyway.

So most compilers don't need any special help to compile multithreaded
code. Non-multithreaded code can do the same things.

DS

Kaz Kylheku

Jan 14, 2001, 5:59:02 AM
On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer <jfa...@jfasch.faschingbauer.com> wrote:
>So, provided that unlock() is a function and not a macro, there is no
>need to declare i volatile.

Even if a compiler implements sophisticated global optimizations that
cross module boundaries, the compiler can still be aware of
synchronization functions and do the right thing around calls to those
functions.

Joerg Faschingbauer

Jan 14, 2001, 2:18:53 PM
>>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

David> Joerg Faschingbauer wrote:

>> Now that's the whole point: the compiler has to take care that the
>> code it generates spills registers before calling a function.

David> This is really not so. It's entirely possible that the
David> compiler might have some way of assuring that the particular
David> value cached in the register isn't used by the called function,
David> and hence it can keep it in a register.

Of course you may once target a system where everything is tightly
coupled, and where the compiler you use to compile your module knows
about the very internals of the runtime - register allocations for
example. Then it could keep variables in registers even across runtime
function calls.

Even though such a thing is possible, it is quite unlikely - consider
the management effort of the people making (and upgrading!) such a
system. And even if people dared doing such a beast, this wouldn't be
POSIX - at least not with the functions that involve locking and
such. (There was a discussion here recently where Dave Butenhof made
this plausible - and I believe him :-}.)

(Of course there are compilers that do interprocedural and
intermodular (what a word!) optimization, involving such things as not
spilling registers before calling an external function. But usually
you have to compile the calling module and the callee module in one
swoop then - you pass more than one C file on the command line or some
such. But it is not common for you to compile your module together
with the mutex locking function modules of the C runtime.)

David> For example:

David> extern void bar(void);

David> void foo(void)
David> {
David> int i;
David> i=3;
David> bar();
David> i--;
David> }

David> The compiler in this case might optimize 'i' away to nothing.
David> Fortunately, any possible way another thread could get its
David> hands on a variable is a way that a function in another
David> compilation unit could get its hands on the variable. Not only
David> is there no legal C way 'bar' could access 'i', there is no
David> legal C way another thread could.

I don't understand the connection of this example to your statement
above.

Joerg

Kenneth Chiu

Jan 14, 2001, 2:19:18 PM
In article <93qjt6$mm2$1...@lure.pipex.net>, Martin Berger <martin...@orange.net> wrote:
>
>Kaz Kylheku <k...@ashi.footprints.net>
>
>> >well, in the c/c++ users journal, Andrei Alexandrescu recommends using
>> >"volatile" to help avoiding race conditions. can the experts please slug
>> >it out?
>>
>> The C/C++ Users Journal is a comedy of imbeciles.
>>
>> The article you are talking about completely ignores issues of memory
>> coherency on multiprocessor systems. It is very geared toward Windows;
>> the author seems to have little experience with multithreaded
>> programming, and especially cross-platform multithreading.
>
>would you care to elaborate how "volatile" causes problems with memory
>consistency on multiprocessors?

It's not that volatile itself causes memory problems. It's that
it's not sufficient (and, under POSIX, should not even be used).

He gives an example, which will work in practice, but if he had two
shared variables, would fail. Code like this, for example, would
be incorrect on an MP with a relaxed memory model. The write to flag_
could occur before the write to data_, despite the order in which the
assignments are written.

class Gadget {
public:
    void Wait() {
        while (!flag_) {
            Sleep(1000); // sleeps for 1000 milliseconds
        }
        do_some_work(data_);
    }
    void Wakeup() {
        data_ = ...;
        flag_ = true;
    }
    ...
private:
    volatile bool flag_;
    volatile int data_;
};
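For contrast, here is a minimal sketch of the same hand-off done with POSIX
primitives instead of volatile and polling (the Gadget2 name, the int payload
and the do_some_work() declaration are mine, purely for illustration; error
handling omitted). The mutex and condition variable provide the ordering and
visibility guarantees that volatile does not:

#include <pthread.h>

void do_some_work(int);                    // assumed to exist elsewhere

class Gadget2 {
public:
    Gadget2() : flag_(false), data_(0) {
        pthread_mutex_init(&m_, 0);
        pthread_cond_init(&c_, 0);
    }
    void Wait() {
        pthread_mutex_lock(&m_);
        while (!flag_)                     // re-check after every wakeup
            pthread_cond_wait(&c_, &m_);
        int d = data_;                     // read while the lock is held
        pthread_mutex_unlock(&m_);
        do_some_work(d);
    }
    void Wakeup(int d) {
        pthread_mutex_lock(&m_);
        data_ = d;                         // both writes happen under the lock,
        flag_ = true;                      // so Wait() sees them together
        pthread_mutex_unlock(&m_);
        pthread_cond_signal(&c_);          // wake the waiter
    }
private:
    pthread_mutex_t m_;
    pthread_cond_t  c_;
    bool flag_;                            // note: no volatile needed
    int  data_;
};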

Joerg Faschingbauer

Jan 14, 2001, 2:23:39 PM
>>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

David> Joerg Faschingbauer wrote:

>> Now that's the whole point: the compiler has to take care that the
>> code it generates spills registers before calling a function.

David> This is really not so. It's entirely possible that the compiler might
David> have some way of assuring that the particular value cached in the
David> register isn't used by the called function, and hence it can keep it in
David> a register.

David> For example:

David> extern void bar(void);

David> void foo(void)
David> {
David> int i;
David> i=3;
David> bar();
David> i--;
David> }

David> The compiler in this case might optimize 'i' away to nothing.
David> Fortunately, any possible way another thread could get its hands on a
David> variable is a way that a function in another compilation unit could get
David> its hands on the variable. Not only is there no legal C way 'bar' could
David> access 'i', there is no legal C way another thread could.

David> Consider:

David> extern void bar(void);
David> extern void qux(int *);

David> void foo(void)
David> {
David> int i;
David> i=3;

David> while(i<10)
David> {
David> i++;
David> bar();
David> i++;
David> qux(&i);
David> }
David> }

David> For all the compiler knows, 'qux' stores the pointer passed to it and
David> 'bar' uses it. Think about:

David> int *ptr=NULL;
David> void bar(void)
David> {
David> if(ptr!=NULL) printf("i=%d\n", *ptr);
David> }

David> void qux(int *j)
David> {
David> ptr=j;
David> }

David> So the compiler would have to treat 'i' as if it was volatile in 'foo'
David> anyway.

Yes, I believe this (exporting the address of a variable) is called
taking an alias in compilerology. The consequence of this is that it
inhibits holding it in a register.

David> So most compilers don't need any special help to compile multithreaded
David> code. Non-multithreaded code can do the same things.

Joerg

Dylan Nicholson

Jan 14, 2001, 6:31:59 PM
In article <slrn963vb...@ashi.FootPrints.net>,
k...@ashi.footprints.net wrote:
> On 14 Jan 2001 05:09:29 -0500, Martin Berger <martin...@orange.net> wrote:
> >
> >Kaz Kylheku <k...@ashi.footprints.net>
> >
>
> The point is that are you going to take multithreading advice from
> someone who admittedly cannot eradicate known deadlocks from his
> code? But good points for the honesty, clearly.
>
Well I consider myself pretty well experienced in at least Win32
threads, and I'm working on a project now using POSIX threads (and a
POSIX wrapper for Win32 threads). I thought I had a perfectly sound
design that used only ONE mutex object, only ever used a stack-based
locker/unlocker to ensure it was never left locked, and yet I still got
deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
an unowned critical section causes a deadlock in Win32 (this I consider
a bug, considering how trivial it is to test one member of the critical
section to avoid it), and b) I didn't realise that by default POSIX
mutexes only allowed one lock per thread (i.e. they were non-
recursive). To me these are quirks of the thread library, not design
faults in my code, so they don't necessarily indicate a lack of multi-
threaded knowledge. I don't pretend to know what deadlocks Andrei had,
but I wouldn't be surprised if it was a problem of that nature.
Although I haven't used the technique he described in his library, if I
had read it before I started coding my latest multi-threaded project, I
almost definitely would have given it a go.
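For readers who haven't seen the idiom: a stack-based locker/unlocker is just
a tiny class along these lines - a generic sketch over POSIX mutexes with
invented names, not the actual code from my project:

#include <pthread.h>

class Guard {
public:
    explicit Guard(pthread_mutex_t& m) : m_(m) { pthread_mutex_lock(&m_); }
    ~Guard() { pthread_mutex_unlock(&m_); }   // runs on every exit path, so the
                                              // mutex is never left locked
private:
    pthread_mutex_t& m_;
    Guard(const Guard&);                      // not copyable
    Guard& operator=(const Guard&);
};

Even with that in place, the two library quirks above can still bite you.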

Dylan



Martin Berger

Jan 14, 2001, 6:34:01 PM
Kaz Kylheku wrote:

> >would you care to elaborate how "volatile" causes problems with memory
> >consistency on multiprocessors?
>

> The point is that it doesn't *solve* these problems, not that it causes
> them. It's not enough to ensure that load and store instructions are
> *issued* in some order by the processor, but also that they complete in
> some order (or at least partial order) that is seen by all other
> processors. At best, volatile defeats access optimizations at the
> compiler level; in order to synchronize memory you need to do it at the
> hardware level as well, which is often done with a special ``memory
> barrier'' instruction.
>
> In other words, volatile is not enough to eliminate race conditions,
> at least not on all platforms.

either you or i don't quite understand the point of the article. the
semantics of "volatile" is irrelevant for his stuff to work. all that
matters is how c++ typechecks classes and methods annotated with
"volatile", together with the usual rules for overloading and casting
away volatile.

if we'd change c++ to include a modifier "blob" and add the ability
to cast away blobness and make "blob" behave like "volatile" w.r.t.
typechecking, overloading ... then his scheme would work just the
same way when "volatile" is replaced by blob. that's at least how
i understand it.

> The point is that are you going to take multithreading advice from
> someone who admittedly cannot eradicate known deadlocks from his
> code? But good points for the honesty, clearly.


concurrency is an area where i trust no one, including myself.



> >if so, how? i cannot find any indication of deadlock-inducing behavior of
> >"volatile" in kernighan & ritchie.
>

> This book says very little about volatile and contains no discussion of
> threads; this is squarely beyond the scope of K&R.

well, it should be part of the semantics of the language and hence covered.

dale

Jan 14, 2001, 10:59:01 PM
David Schwartz wrote:

> Code shouldn't _have_ race conditions.

Well, that's not entirely correct. If you have a number of
threads writing logging data to a file, which is protected
by a mutex, then the order in which they write -is- subject
to race conditions. This may or may not matter however.


Dale

Michiel Salters

Jan 15, 2001, 8:45:20 AM
Joerg Faschingbauer wrote:

> >>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

> David> Joerg Faschingbauer wrote:
>
> >> Now that's the whole point: the compiler has to take care that the
> >> code it generates spills registers before calling a function.

> David> This is really not so. It's entirely possible that the
> David> compiler might have some way of assuring that the particular
> David> value cached in the register isn't used by the called function,
> David> and hence it can keep it in a register.

> Of course you may once target a system where everything is tightly
> coupled, and where the compiler you use to compile your module knows
> about the very internals of the runtime - register allocations for
> example. Then it could keep variables in registers even across runtime
> function calls.

> Even though such a thing is possible, it is quite unlikely - consider
> the management effort of the people making (and upgrading!) such a
> system.

I don't think things are that hard - especially in C++, which already
has name mangling. For each translation unit, it is possible to determine
which functions use which registers and what functions are called outside
that translation unit.
Now encode in the name mangling of a function which registers are used,
except of course those functions imported from another translation unit.
Just add to the name something like Reg=EAX_EBX_ECX. The full set of
registers used by a function is the set of registers it uses, plus
the registers used by the functions it calls. This will even work for
mutually recursive functions across translation units.

With this information the linker can, for each function call, determine
which registers need to be saved. And that is closely related to the
linker's main task: creating correct function calls across translation
units.

--
Michiel Salters
Michiel...@cmg.nl
sal...@lucent.com

James Kanze

Jan 15, 2001, 8:45:01 AM
Martin Berger wrote:

> Dave Butenhof wrote:

> > Don't ever use the C/C++ language volatile in threaded code. It'll
> > kill your performance, and the language definition has nothing to
> > do with what you want when writing threaded code that shares
> > data. If some OS implementation tells you that you need to use it
> > anyway on their system, (in the words of a child safety group),
> > "run, yell, and tell". That's just stupid, and they shouldn't be
> > allowed to get away with it.

This is correct up to a point. The problem is that the C++ language
has no other way of signaling that a variable may be accessed by
several threads (and thus ensuring e.g. that it is really written
before the lock is released). The problem *isn't* with the OS; it is
with code movement within the optimizer of the compiler. And while I
agree with the sentiment: volatile isn't the solution, I don't know
how many compilers offer another one. (Of course, some compilers
don't optimize enough for there to be a problem:-).)

> well, in the c/c++ users journal, Andrei Alexandrescu recommends
> using "volatile" to help avoiding race conditions. can the experts
> please slug it out? (note the cross posting)

The crux of Andrei's suggestions really just exploits the compiler
type-checking with regards to volatile, and not the actual semantics
of volatile. If I've understood the suggestion correctly, it would
even be possible to implement it without ever accessing the individual
class members as if they were volatile (although in his examples, I
think he is also counting on volatile to inhibit code movement).

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

Andrei Alexandrescu

Jan 15, 2001, 8:52:18 AM
"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
news:93sptr$o5s$1...@flotsam.uits.indiana.edu...

> He gives an example, which will work in practice, but if he had two
> shared variables, would fail. Code like this, for example, would
> be incorrect on an MP with a relaxed memory model. The write to flag_
> could occur before the write to data_, despite the order in which the
> assignments are written.
>
> class Gadget {
> public:
> void Wait() {
> while (!flag_) {
> Sleep(1000); // sleeps for 1000 milliseconds
> }
> do_some_work(data_);
> }
> void Wakeup() {
> data_ = ...;
> flag_ = true;
> }
> ...
> private:
> volatile bool flag_;
> volatile int data_;
> };

Your statement is true. However, that is only an _introduction_ to the
meaning of volatile in multithreaded code. I guess I should have thought of
a more elaborate example that would work on any machine. Anyway, the focus
of the article is different. It's about using the type system to detect race
conditions.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:56 AM
"James Dennett" <jden...@acm.org> wrote in message
news:3A611BF5...@acm.org...

> True, and Andrei's claim (which seems reasonable to me, though I've not
> verified it in depth) is that his techniques, if used consistently,
> will detect all race conditions *at compile time*.

By the way, I maintain that.

Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:51:41 AM
"Martin Berger" <martinb@--remove--me--dcs.qmw.ac.uk> wrote in message
news:3A620AB8.835DD808@--remove--me--dcs.qmw.ac.uk...

> Kaz Kylheku wrote:
> > The point is that it doesn't *solve* these problems, not that it causes
> > them. It's not enough to ensure that load and store instructions are
> > *issued* in some order by the processor, but also that they complete in
> > some order (or at least partial order) that is seen by all other
> > processors. At best, volatile defeats access optimizations at the
> > compiler level; in order to synchronize memory you need to do it at the
> > hardware level as well, which is often done with a special ``memory
> > barrier'' instruction.
> >
> > In other words, volatile is not enough to eliminate race conditions,
> > at least not on all platforms.
>
> either you or i don't quite understand the point of the article.

Or maybe me :o).

> the
> semantics of "volatile" is irrelevant for his stuff to work. all that
> matters is how c++ typechecks classes and methods annotated with
> "volatile", together with the usual rules for overloading and casting
> away volatile.
>
> if we'd change c++ to include a modifier "blob" and add the ability
> to cast away blobness and make "blob" behave like "volatile" w.r.t.
> typechecking, overloading ... then his scheme would work just the
> same way when "volatile" is replaced by blob. that's at least how
> i understand it.

This is exactly what the point of the article was.

> > The point is that are you going to take multithreading advice from
> > someone who admittedly cannot eradicate known deadlocks from his
> > code? But good points for the honesty, clearly.

To Mr. Kylheku: There is a misunderstanding here, and a rather gross one. I
wonder what the text was that made you believe I *couldn't* eradicate known
deadlocks. I simply said that all threading-related runtime errors of our
program were only deadlocks and not race conditions, which precisely proves
the point that the article tried to make. Of course we fixed the deadlocks.
The point is that the compiler fixed the race conditions.

It's clear that you have a great deal of experience in multithreading code
on many platforms, and I would be glad to expand my knowledge in the area. If
you would be willing to discuss in more civil terms, I would be glad to.
Also, if you would like to expand the discussion *beyond* the Gadget example
in the opening section of the article, and point out possible reasoning errors
that I might have made, that would help the C++ community define the
"volatile correctness" term with precision. For now, I maintain the
conjectures I made.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:00 AM
"David Schwartz" <dav...@webmaster.com> wrote in message
news:3A5FCDC3...@webmaster.com...

> What race conditions? If you have a race condition, you need to _FIX_
> it. Something that might help "avoid" it isn't good enough. If I told my
> customers I added some code that helped avoid race conditions, they'd
> shoot me. Code shouldn't _have_ race conditions.

There is a misunderstanding here - actually, quite a few in this and the
following posts.

Of course code must not _have_ race conditions. So that's why you must
eliminate them, which is what I meant by "avoid". Maybe I didn't use a word
that's strong enough.


Andrei

Andrei Alexandrescu

Jan 15, 2001, 8:52:38 AM
"Martin Berger" <martin...@orange.net> wrote in message
news:93qjt6$mm2$1...@lure.pipex.net...
>
> Kaz Kylheku <k...@ashi.footprints.net>
[snip]

> > The C/C++ Users Journal is a comedy of imbeciles.

Yay, the original message was moderated out.

> > The article you are talking about completely ignores issues of memory
> > coherency on multiprocessor systems. It is very geared toward Windows;
> > the author seems to have little experience with multithreaded
> > programming, and especially cross-platform multithreading.

I'm afraid Mr. Kylheku completely ignores the gist of the article. I know
only Windows, POSIX and ACE threads, but that's beside the point - what the
article tries to say is different. The article uses the volatile modifier as
a device for helping the type system detect race conditions at compile time.

> > Luckily, he
> > provides a disclaimer by openly admitting that some code that he wrote
> > suffers from occasional deadlocks.

This is a misunderstanding. What I said is that the technique described
can't help with deadlocks. At the end of the article I mentioned some
concrete experience with the technique. Indeed there were deadlocks - _only_
deadlocks - in the multithreaded code, simply because all race conditions
were weeded out by the compiler.

> are you suggesting that these deadlocks are a consequence of using
> "volatile"? if so, how? i cannot find any indication of deadlock-inducing
> behavior of "volatile" in kernighan & ritchie.

I guess this is yet another misunderstanding :o).


Andrei

James Kanze

Jan 15, 2001, 9:35:31 AM
James Moe wrote:

> Martin Berger wrote:

> > well, in the c/c++ users journal, Andrei Alexandrescu recommends
> > using "volatile" to help avoiding race conditions. can the experts
> > please slug it out? (note the cross posting)

> Use mutexes or semaphores to control access to common data
> areas. It is what they are for. "volatile" is meant for things like
> hardware access where a device register can change at any time.

You still need some way of preventing the optimizer from deferring
writes until after the lock has been released. Ideally, the compiler
will understand the locking system (mutex, or whatever), and generate
the necessary write guards itself. Off hand, I don't know of any
compiler which meets this ideal.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:36:16 AM
Joerg Faschingbauer wrote:

> int i;
>
> [...]
>
> Looking at it more paranoidly, one might argue that an optimizing
> compiler will probably want to keep i in a register for some reason,
> and that it might want to keep it in that register until the function
> returns.

I've used more than one compiler that does this. In fact, most do, at
least with optimization turned on.

> If that was the case the usage of the mutex would be completely
> pointless. At the time the function returns (a long time after the
> mutex was unlocked) the value of i is written back to memory. It
> then overwrites the changes that another thread may have made to the
> value in the meantime.

You guessed it.

> This is where volatile comes in. Common understanding is that volatile
> disables any optimization on a variable, so if you declare i volatile,
> the compiler won't keep it in a register, and all is well (if that were
> the definition of volatile - my understanding is that volatile is just
> a hint to the compiler, so strictly speaking nothing is well). Except
> that this performs badly - imagine you didn't have a simple increment
> in the critical section, but lots more use of i instead.

Volatile is more than just a hint, but it does have a lot of
implementation defined aspects. The *intent* (according to the C
standard, to which the C++ standard refers) is roughly what you
describe.

> Now what does a compiler do? It compiles modules independently. In
> doing so it performs optimizations (fortunately). Inside the module
> that is being compiled the compiler is free to assign variables to
> registers as it likes, and every function has a certain register set
> that it uses.

There is no requirement that a compiler compile modules independently;
at least one major compiler has a final, post-link optimization phase
in which the optimizer looks beyond the module limits.

> The compiler also generates code for calls to functions in other
> modules that it has no idea of. Having no idea of a module, among
> other things means that the compiler does not know the register set
> that particular function uses.

Which registers a function can use without restoring is usually defined by
the calling conventions. The compiler not only can know it, it must
know it.

> Especially, the compiler cannot know for sure that the callee's
> register set doesn't overlap with the caller's - in which case the
> caller would see useless garbage in the registers on return of the
> callee.

This depends entirely on the compiler. And the hardware -- on a
Sparc, there are four banks of registers, two of which are
systematically saved and restored by the hardware. So each function
basically has 16 registers in which it can do anything it wishes.

> Now that's the whole point: the compiler has to take care that the
> code it generates spills registers before calling a function.

Not necessarily.

> So, provided that unlock() is a function and not a macro, there is
> no need to declare i volatile.

Not necessarily. It all depends on the compiler.

If the variable is global, and the compiler cannot analyse the unlock
function, it will have to assume that unlock may access the variable,
and so must ensure that the value is up to date. In practice, this IS
generally sufficient -- at some level, unlock resolves to a system
call, and the compiler certainly has no access to the source code of
the system call. So either 1) the compiler makes no assumption about
the system call, must assume that it might access the variable, and so
ensures the correct value, or 2) the compiler knows about system
requests, and which ones can access global variables. In the latter
case, of course, the compiler *should* also know that it needs a write
barrier after unlock. But unless this is actually documented in the
compiler documentation, I'd be leery about counting on it.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:34:46 AM
Martin Berger wrote:

> interestingly, andrei's suggestions do not depend at all on the intended
> semantics of "volatile", only on how the type system checks it and handles
> const_cast and method invocation in this case. (see
>
> http://www.cuj.com/experts/1902/alexandr.html
>
> for details)

That's not totally true, at least not in his examples. I think he
also counts on volatile to some degree to inhibit code movement;
i.e. to prevent the compiler from moving some of the writes to after
the lock has been freed.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:36:58 AM
Joerg Faschingbauer wrote:

> >>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

> David> Joerg Faschingbauer wrote:

> >> Now that's the whole point: the compiler has to take care that
> >> the code it generates spills registers before calling a function.

> David> This is really not so. It's entirely possible that the
> David> compiler might have some way of assuring that the particular
> David> value cached in the register isn't used by the called function,
> David> and hence it can keep it in a register.

> Of course you may once target a system where everything is tightly
> coupled, and where the compiler you use to compile your module knows
> about the very internals of the runtime - register allocations for
> example. Then it could keep variables in registers even across
> runtime function calls.

The compiler always knows about the internals of the runtime register
allocations, since it is the compiler which defines them (at least
partially).

> Even though such a thing is possible, it is quite unlikely -
> consider the management effort of the people making (and upgrading!)
> such a system. And even if people dared doing such a beast, this
> wouldn't be POSIX - at least not with the functions that involve
> locking and such. (There was a discussion here recently where Dave
> Butenhof made this plausible - and I believe him :-}.)

I'm not sure I understand your point. It sounds like you are saying
that it is possible for the compiler not to know which registers it
can use, which is manifestly ridiculous.

> (Of course there are compilers that do interprocedural and
> intermodular (what a word!) optimization, involving such things as
> not spilling registers before calling an external function. But
> usually you have to compile the calling module and the callee module
> in one swoop then - you pass more than one C file on the command
> line or some such. But it is not common for you to compile your
> module together with the mutex locking function modules of the C
> runtime.)

Usually (well, in the one case I actually know of:-)), the compiler
generates extra information in the object file, which is used by the
linker.

About all you can hope for is that a compiler this intelligent also
knows about threads, and can recognize a mutex request when it sees
one.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:40:18 AM
Joerg Faschingbauer wrote:

[...]


> Yes, I believe this (exporting the address of a variable) is called
> taking an alias in compilerology. The consequence of this is that it
> inhibits holding it in a register.

Correct. The entire issue is called the aliasing problem, and it
makes good optimization extremely difficult. Note well: extremely
difficult, not impossible. In recent years, a few compilers have
gotten good enough to track uses of a variable through aliases and
across module boundaries, and can even keep aliased variables in a
register when that will improve performance.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:39:16 AM
Kaz Kylheku wrote:

> On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
> <jfa...@jfasch.faschingbauer.com> wrote:
> >So, provided that unlock() is a function and not a macro, there is no
> >need to declare i volatile.

> Even if a compiler implements sophisticated global optimizations
> that cross module boundaries, the compiler can still be aware of
> synchronization functions and do the right thing around calls to
> those functions.

It can be. It should be. Is it? Do today's compilers actually do the
right thing, or are we just lucking out because most of them don't
optimize very aggressively anyway?

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

James Kanze

Jan 15, 2001, 9:46:20 AM
Martin Berger wrote:

> Kaz Kylheku <k...@ashi.footprints.net>

> > >well, in the c/c++ users journal, Andrei Alexandrescu recommends
> > >using "volatile" to help avoiding race conditions. can the
> > >experts please slug it out?

> > The C/C++ Users Journal is a comedy of imbeciles.

And the moderators let this through?

> > The article you are talking about completely ignores issues of
> > memory coherency on multiprocessor systems. It is very geared
> > toward Windows; the author seems to have little experience with
> > multithreaded programming, and especially cross-platform
> > multithreading.

> would you care to elaborate how "volatile" causes problems with
> memory consistency on multiprocessors?

Volatile doesn't cause problems of memory consistency. It's not
guaranteed to solve them, either.

Andrei's article didn't address the problem. Not because Andrei
didn't know the solution. (He may, or he may not. I don't know.)
But because that wasn't the subject of the article.

It might be worth pointing out the exact subject, since the poster you
are responding to obviously missed the point. Andrei basically
"overloads" the keyword volatile in a way that allows the compiler to
verify whether we will use locked access or not when accessing an
object. It offers an additional tool to simplify the writing (and the
verification) of multi-threaded code.

The article does NOT address the question of when locks are needed and
when they aren't. The article doesn't address the question of what is
actually needed when locks are needed, e.g. to ensure memory
coherency. These are other issues, and would require another article
(or maybe even an entire book). About the only real criticism I would
make about the article is that it isn't clear enough that he is
glossing over major issues, because they aren't relevant to that
particular article.

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627

Martin Berger

Jan 15, 2001, 11:06:17 AM
Andrei Alexandrescu wrote:

> > if we'd change c++ to include a modifier "blob" and add the ability
> > to cast away blobness and make "blob" behave like "volatile" w.r.t.
> > typechecking, overloading ... then his scheme would work just the
> > same way when "volatile" is replaced by blob. that's at least how
> > i understand it.
>
> This is exactly what the point of the article was.

this makes me think that c++ *should* be expanded to include something
like "blob" as a modifier. or maybe even user defined modifiers.

the problem with modifiers like "shared" and the like is that compilers
cannot effectively guarantee the absence of race conditions, as would be
suggested by using a name like "shared". with "blob" on the other
hand, all the compiler guarantees is that "blobness" is preserved
which is basically an easy typechecking problem and it is up to the
programmer to use this feature in whatever way she thinks appropriate
(eg for the prevention of race conditions). i also think that user defined
modifier semantics has uses beyond preventing race conditions. how about it?

martin

Martin Berger

Jan 15, 2001, 11:05:59 AM
Andrei Alexandrescu wrote:

> To Mr. Kylheku: [...]


> Also, if you would like to expand the discussion *beyond* the Gadget example
> in the opening section of the article, and point possible reasoning errors
> that I might have done, that would help the C++ community define the
> "volatile correctness" term with precision. For now, I maintain the
> conjectures I made.


that would be a worthwhile contribution.

martin

Charles Bryant

Jan 15, 2001, 11:10:21 AM
In article <93t924$2vi$1...@nnrp1.deja.com>, Dylan Nicholson <dn...@my-deja.com> wrote:
>In article <slrn963vb...@ashi.FootPrints.net>,
> k...@ashi.footprints.net wrote:
>>
>> The point is that are you going to take multithreading advice from
>> someone who admittedly cannot eradicate known deadlocks from his
>> code? But good points for the honesty, clearly.
>>
>Well I consider myself pretty well experienced in at least Win32
>threads, and I'm working on a project now using POSIX threads (and a
>POSIX wrapper for Win32 threads). I thought I had a perfectly sound
>design that used only ONE mutex object, only ever used a stack-based
>locker/unlocker to ensure it was never left locked, and yet I still got
>deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
>an unowned critical section causes a deadlock in Win32 (this I consider
>a bug, considering how trivial it is to test one member of the critical
>section to avoid it)

You have a fundamental misunderstanding of the nature of programming.
Programming does not require speculation about how something might be
implemented and certainly does not involve writing one's own code
such that it depends on that speculation. Programming involves
determining the guaranteed and documented behaviour of the components
that will be needed and then relying solely on that documented
behaviour.

Calling LeaveCriticalSection() without entering the critical section would
be just as much a bug even if causing it to fail required a huge
amount of very slow code in the library which implements
LeaveCriticalSection. The *only* thing relevant to whether it's a bug
or not is whether the documentation permits it or not.

--
Eppur si muove

Balog Pal

Jan 15, 2001, 11:14:56 AM
"Dylan Nicholson" <dn...@my-deja.com> wrote

> deadlocks! The reason was simple, a) Calling LeaveCriticalSection on
> an unowned critical section causes a deadlock in Win32 (this I consider
> a bug,

I consider doing illegal stuff a programming error. For critical sections
I'd go somewhat further, as you should wrap those into classes anyway, and
use lock guards like CSingleLock. Then doing something odd is pretty hard.
But if you manage it, it must be a logic error in the program. And a sign to
look out for other errors too.

> considering how trivial it is to test one member of the critical
> section to avoid it),

That is IMHO irrelevant.

> and b) I didn't realise that by default POSIX
> mutexes only allowed one lock per thread (i.e. they were non-
> recursive).

Yep, they are. But you can implement your own recursive mutex (working like
the Win32 critical section).
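A rough sketch of what such a wrapper can look like on top of plain POSIX
calls (the names are mine and error checking is omitted; a real one should
also refuse an Unlock() from a thread that is not the owner):

#include <pthread.h>

class RecursiveMutex {
public:
    RecursiveMutex() : count_(0) {
        pthread_mutex_init(&lock_, 0);
        pthread_cond_init(&free_, 0);
    }
    void Lock() {
        pthread_mutex_lock(&lock_);
        if (count_ > 0 && pthread_equal(owner_, pthread_self())) {
            ++count_;                      // same thread again: bump the depth
        } else {
            while (count_ > 0)             // wait until nobody owns it
                pthread_cond_wait(&free_, &lock_);
            owner_ = pthread_self();
            count_ = 1;
        }
        pthread_mutex_unlock(&lock_);
    }
    void Unlock() {
        pthread_mutex_lock(&lock_);
        if (--count_ == 0)
            pthread_cond_signal(&free_);   // wake one waiter
        pthread_mutex_unlock(&lock_);
    }
private:
    pthread_mutex_t lock_;                 // protects owner_ and count_
    pthread_cond_t  free_;                 // signalled when count_ drops to 0
    pthread_t       owner_;                // meaningful only while count_ > 0
    unsigned        count_;                // recursion depth
};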

> To me these are quirks of the thread library, not design
> faults in my code,

Your way of looking at it is somewhat nonstandard. ;-)

> so they don't necessarily indicate a lack of multi-
> threaded knowledge.

Maybe they indicate ignorance. A system API works as it is described, not
along your thoughts or your expectations. The user code is the thing you
must write to use the API as it is, not the other way around. (You could
claim innocence if the docs were in error, say if they specified a POSIX
mutex as recursive and it turned out to be the fast kind. But that is not
the case.)

Paul

John Mullins

Jan 15, 2001, 11:15:49 AM

"James Kanze" <James...@dresdner-bank.com> wrote in message
news:3A62CF96...@dresdner-bank.com...

> That's not totally true, at least not in his examples. I think he
> also counts on volatile to some degree to inhibit code movement;
> i.e. to prevent the compiler from moving some of the writes to after
> the lock has been freed.

But his examples also rely on undefined behaviour so he can't really
count on anything.

JM

Kenneth Chiu

Jan 15, 2001, 11:16:27 AM
In article <3A62CF03...@dresdner-bank.com>, James Kanze <James...@dresdner-bank.com> wrote:
>Martin Berger wrote:
>
>> Dave Butenhof wrote:
>
>> > Don't ever use the C/C++ language volatile in threaded code. It'll
>> > kill your performance, and the language definition has nothing to
>> > do with what you want when writing threaded code that shares
>> > data. If some OS implementation tells you that you need to use it
>> > anyway on their system, (in the words of a child safety group),
>> > "run, yell, and tell". That's just stupid, and they shouldn't be
>> > allowed to get away with it.
>
>This is correct up to a point. The problem is that the C++ language
>has no other way of signaling that a variable may be accessed by
>several threads (and thus ensuring e.g. that it is really written
>before the lock is released). The problem *isn't* with the OS; it is
>with code movement within the optimizer of the compiler.

At this point it isn't really a C++ language issue anymore. However,
if a vendor claims that their compiler is compatible with POSIX
threads, then it is up to them to ensure that memory is written before
the unlock.

Kenneth Chiu

Jan 15, 2001, 12:21:02 PM
In article <93u77g$bqakt$3...@ID-14036.news.dfncis.de>, Andrei Alexandrescu <andre...@hotmail.com> wrote:
>"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
>news:93sptr$o5s$1...@flotsam.uits.indiana.edu...
>> He gives an example, which will work in practice, but if he had two
>> shared variables, would fail. Code like this, for example, would
>> be incorrect on an MP with a relaxed memory model. The write to flag_
>> could occur before the write to data_, despite the order in which the
>> assignments are written.
>>
>> ...

>
>Your statement is true. However, that is only an _introduction_ to the
>meaning of volatile in multithreaded code. I guess I should have thought of
>a more elaborate example that would work on any machine. Anyway, the focus
>of the article is different. It's about using the type system to detect race
>conditions.

Yes, I have no quibble with the rest of the article, and in fact thought
it was an interesting idea.

However, I find some of the statements in the introduction to be overly
general.

Basically, without volatile, either writing multithreaded programs
becomes impossible, or the compiler wastes vast optimization
opportunities.

This may be true for some thread standards, but if the vendor claims
that they support POSIX threads with their C++ compiler, then shared
variables should not be declared volatile when using POSIX threads.

Andrei Alexandrescu

Jan 15, 2001, 1:20:38 PM
"John Mullins" <John.M...@crossprod.co.uk> wrote in message
news:93v2gj$4fs$1...@newsreaderg1.core.theplanet.net...

>
> "James Kanze" <James...@dresdner-bank.com> wrote in message
> news:3A62CF96...@dresdner-bank.com...
>
> > That's not totally true, at least not in his examples. I think he
> > also counts on volatile to some degree to inhibit code movement;
> > i.e. to prevent the compiler from moving some of the writes to after
> > the lock has been freed.
>
> But his examples also rely on undefined behaviour so he can't really
> count on anything.

Are you referring to const_cast? Strictly speaking, indeed. But then, all MT
programming in C/C++ has undefined behavior.

Andrei

Andrei Alexandrescu

Jan 15, 2001, 2:31:08 PM
"James Kanze" <James...@dresdner-bank.com> wrote in message
news:3A62CF03...@dresdner-bank.com...

> The crux of Andrei's suggestions really just exploits the compiler
> type-checking with regards to volatile, and not the actual semantics
> of volatile. If I've understood the suggestion correctly, it would
> even be possible to implement it without ever accessing the individual
> class members as if they were volatile (although in his examples, I
> think he is also counting on volatile to inhibit code movement).

Thanks James for all your considerations before and after the article
appeared.

There is a point about the use of volatile proposed by the article. If you
write volatile correct code as prescribed by the article, you _never_
*never* NEVER use volatile variables. You _always_ *always* ALWAYS lock a
synchronization object, cast the volatile away, operate on the so-obtained
non-volatile alias, let the alias go, and unlock the synchronization object,
in this order.

Maybe I should have made it clearer that in volatile-correct code, you never
operate on volatile data - you always cast volatile away and more
specifically, you cast it away when it is *semantically correct* to do so
because you locked the afferent synchronization object.

I would be glad if someone explained to me in what situations a compiler can
rearrange instructions to the extent that it would invalidate the idiom that
the article proposes. OTOH, such compilers invalidate a number of idioms
anyway, such as the Double-Checked Locking pattern, used by Doug Schmidt in
ACE.
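For readers who have not met it, the pattern is usually sketched roughly like
this (Singleton and pInstance are illustrative names, not code taken from ACE;
the scoped Guard is the stack-based locker sketched earlier in the thread).
The worry with aggressive reordering is that the store to pInstance can become
visible before the Singleton's construction has finished, so a second thread
can see a non-null pointer to a half-constructed object:

#include <pthread.h>

class Singleton {
public:
    static Singleton* Instance() {
        if (pInstance == 0) {                  // first check, without the lock
            Guard guard(mutex_);               // scoped lock
            if (pInstance == 0)                // second check, under the lock
                pInstance = new Singleton;     // this store may be reordered
                                               // ahead of the construction
        }
        return pInstance;
    }
private:
    Singleton() {}
    static Singleton*      pInstance;
    static pthread_mutex_t mutex_;
};

Singleton*      Singleton::pInstance = 0;
pthread_mutex_t Singleton::mutex_ = PTHREAD_MUTEX_INITIALIZER;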


Andrei

Andrei Alexandrescu

Jan 15, 2001, 2:37:05 PM
"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
news:93v8d2$3aa$1...@flotsam.uits.indiana.edu...

> However, I find some of the statements in the introduction to be overly
> general.
>
> Basically, without volatile, either writing multithreaded programs
> becomes impossible, or the compiler wastes vast optimization
> opportunities.
>
> This may be true for some thread standards, but if the vendor claims
> that they support POSIX threads with their C++ compiler, then shared
> variables should not be declared volatile when using POSIX threads.

I understand your point and agree with it. That statement of mine was a
mistake.

Andrei

Kaz Kylheku

Jan 15, 2001, 2:39:50 PM
On 14 Jan 2001 22:59:01 -0500, dale <da...@cs.rmit.edu.au> wrote:
>David Schwartz wrote:
>
>> Code shouldn't _have_ race conditions.
>
>Well, that's not entirely correct. If you have a number of
>threads writing logging data to a file, which is protected
>by a mutex, then the order in which they write -is- subject
>to race conditions. This may or may not matter however.

If it doesn't matter, it's hardly a race condition! A race condition
occurs when the program fails to compute one of the possible correct
results due to a fluctuation in the execution order.

Konrad Schwarz

Jan 15, 2001, 4:15:11 PM
James Kanze wrote:
>
> Martin Berger wrote:
>
> > Dave Butenhof wrote:
>
> > > Don't ever use the C/C++ language volatile in threaded code. It'll
> > > kill your performance, and the language definition has nothing to
> > > do with what you want when writing threaded code that shares
> > > data. If some OS implementation tells you that you need to use it
> > > anyway on their system, (in the words of a child safety group),
> > > "run, yell, and tell". That's just stupid, and they shouldn't be
> > > allowed to get away with it.
>
> This is correct up to a point. The problem is that the C++ language
> has no other way of signaling that a variable may be accessed by
> several threads (and thus ensuring e.g. that it is really written
> before the lock is released). The problem *isn't* with the OS; it is
> with code movement within the optimizer of the compiler. And while I
> agree with the sentiment: volatile isn't the solution, I don't know
> how many compilers offer another one. (Of course, some compilers
> don't optimize enough for there to be a problem:-).)

So the optimization of keeping variables in registers across
function calls is illegal in general (and thus must not be performed),
if the compiler cannot prove that the code will not be linked into a
multi-threaded program or it cannot prove that those variables will
never be shared.

However, the C language has a way of signaling that local variables
cannot be accessed by other threads, namely by placing them in
the register storage class. I don't know about C++; if I remember
correctly, C++ degrades register to a mere "efficiency hint".

Konrad Schwarz

Jan 15, 2001, 4:16:41 PM

James Kanze wrote:
>
> Kaz Kylheku wrote:
>
> > On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
> > <jfa...@jfasch.faschingbauer.com> wrote:
> > >So, provided that unlock() is a function and not a macro, there is no
> > >need to declare i volatile.
>
> > Even if a compiler implements sophisticated global optimizations
> > that cross module boundaries, the compiler can still be aware of
> > synchronization functions and do the right thing around calls to
> > those functions.
>
> It can be. It should be. Is it? Do today's compilers actually do the
> right thing, or are we just lucking out because most of them don't
> optimize very aggressively anyway?
>

If the compiler supports multi-threading (at least POSIX
multi-threading),
then it *must*, since POSIX does not require shared variables to
be volatile qualified. If the compiler decides to
keep values in registers across function calls, it must be able to prove
that
* either these variables are never shared by another thread
* or the functions in question never perform inter-thread operations

Kaz Kylheku

unread,
Jan 15, 2001, 4:17:44 PM1/15/01
to
On 15 Jan 2001 08:52:18 -0500, Andrei Alexandrescu

<andre...@hotmail.com> wrote:
>"Kenneth Chiu" <ch...@cs.indiana.edu> wrote in message
>news:93sptr$o5s$1...@flotsam.uits.indiana.edu...
>> He gives an example, which will work in practice, but if he had two
>> shared variables, would fail. Code like this, for example, would
>> be incorrect on an MP with a relaxed memory model. The write to flag_
>> could occur before the write to data_, despite the order in which the
>> assignments are written.
>>
>> class Gadget {
>> public:
>>     void Wait() {
>>         while (!flag_) {
>>             Sleep(1000); // sleeps for 1000 milliseconds
>>         }
>>         do_some_work(data_);
>>     }
>>     void Wakeup() {
>>         data_ = ...;
>>         flag_ = true;
>>     }
>>     ...
>> private:
>>     volatile bool flag_;
>>     volatile int data_;
>> };
>
>Your statement is true. However, that is only an _introduction_ to the
>meaning of volatile in multithreaded code. I guess I should have thought of
>a more elaborate example that would work on any machine.

You cannot come up with such an example without resorting to
platform- and compiler-specific techniques, such as inline assembly language
to insert memory barrier instructions.

In the above example, if one thread writes to data_ and then sets flag_
there is absolutely no assurance that another thread running on another
processor will see these updates in the same order. It is possible for
flag_ to appear to flip true, but data_ to not have been updated yet!

Moreover, there is no assurance that data_ is updated atomically, so
that a processor can either see its old value or its new value, never
any half-baked value in between.

Resolving these issues can't be done in standard C++, so there is no
single example that fits all C++ platforms. This makes sense, since
threads are not currently part of the C++ language. (What I don't
understand is why the moderator of comp.lang.c++.moderated is even
allowing this discussion, which clearly belongs in
comp.programming.threads only).
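
Under a threads standard like POSIX, on the other hand, the fix is
routine and needs no volatile at all: protect both members with a
mutex (and, realistically, a condition variable instead of the polling
loop). A rough sketch only; initialization and error checking are
omitted, the Wakeup parameter is invented so the fragment compiles,
and do_some_work is the function from the example above:

#include <pthread.h>

void do_some_work(int);   // as in the example above

class Gadget {
public:
    void Wait() {
        pthread_mutex_lock(&mutex_);
        while (!flag_)                        // predicate re-checked under the lock
            pthread_cond_wait(&cond_, &mutex_);
        int d = data_;                        // read while the lock is still held
        pthread_mutex_unlock(&mutex_);
        do_some_work(d);
    }
    void Wakeup(int value) {
        pthread_mutex_lock(&mutex_);
        data_ = value;
        flag_ = true;
        pthread_mutex_unlock(&mutex_);
        pthread_cond_signal(&cond_);          // wake the waiter after releasing
    }
private:
    pthread_mutex_t mutex_;   // assume the constructor initializes these
    pthread_cond_t  cond_;    // and sets flag_ to false
    bool flag_;               // note: no volatile anywhere
    int  data_;
};

The mutex and condition variable calls are what provide the visibility
and ordering guarantees; the compiler and the threads library, between
them, are required to make that work.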

>Anyway, the focus
>of the article is different. It's about using the type system to detect race
>conditions.

This thread was started in comp.programming.threads by Lie-Quan Lee
<ll...@lsc.nd.edu> who was specifically interested in knowing whether
rules similar to the POSIX memory visibility rules apply to other
multithreading platforms.

-> One question is, whether those memory visibility rules are applicable
-> for other thread system such as Solaris UI threads, or win32 threads,
-> or JAVA threads ...? If yes, we can follow the same spirit. Otherwise,
-> it will be a big difference. (For example, all shared variables might
-> have to be difined as volatile even with mutex protection.)

Dave Butenhof then replied:

-> Don't ever use the C/C++ language volatile in threaded code. It'll
-> kill your performance, and the language definition has nothing to do
-> with what you want when writing threaded code that shares data. If
-> some OS implementation tells you that you need to use it anyway on
-> their system, (in the words of a child safety group), "run, yell, and
-> tell". That's just stupid, and they shouldn't be allowed to get away
-> with it.

To which Martin Berger replied (and added, for some strange reason,
comp.lang.c++.moderated to the Newsgroups: header). This is the first
time the CUJ article was mentioned, clearly in the context of a
comp.programming.threads debate about memory visibility rules,
not in the context of a debate about C++ or qualifier-correctness:

-> well, in the c/c++ users journal, Andrei Alexandrescu recommends using
-> "volatile" to help avoiding race conditions. can the experts please
-> slug it out? (note the cross posting)

So it appears that the article does create some confusion, at least in
the minds of some readers, between volatile used as a request for
special access semantics and volatile used as a constraint-checking
access control for class member function calls.

Incidentally, I believe that the second property can be exploited
without dragging in the semantics of volatile. Simply do something like
this:

#ifdef RACE_CHECK
#define VOLATILE volatile
#else
#define VOLATILE
#endif

When producing production object code, do not define RACE_CHECK; define
it only when you want to create extra semantic checks for the compiler
to diagnose. Making up a name other than ``VOLATILE'' might be useful
to clarify that what is being done has nothing to do with defeating
optimization.
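
For instance (everything below is made up, just to show the effect):

// assumes the VOLATILE macro defined above
class Buffer {
public:
    void Push(int) {}          // ordinary, non-volatile member function
};

VOLATILE Buffer shared_buf;    // an object shared between threads

void producer()
{
    shared_buf.Push(1);        // fine in production builds; with
                               // -DRACE_CHECK this is a compile-time error
                               // (non-volatile member function called on a
                               // volatile object), flagging the unlocked
                               // access
}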

Joerg Faschingbauer

unread,
Jan 15, 2001, 4:19:30 PM1/15/01
to
Duh! What compiler and what language are you talking about?

>>>>> "James" == James Kanze <James...@dresdner-bank.com> writes:

James> Joerg Faschingbauer wrote:
>> >>>>> "David" == David Schwartz <dav...@webmaster.com> writes:

David> Joerg Faschingbauer wrote:

>> >> Now that's the whole point: the compiler has to take care that
>> >> the code it generates spills registers before calling a function.

David> This is really not so. It's entirely possible that the
David> compiler might have some way of assuring that the particular
David> value cached in the register isn't used by the called function,
David> and hence it can keep it in a register.

>> Of course you may once target a system where everything is tightly
>> coupled, and where the compiler you use to compile your module knows
>> about the very internals of the runtime - register allocations for
>> example. Then it could keep variables in registers even across
>> runtime function calls.

James> The compiler always knows about the internals of the runtime register
James> allocations, since it is the compiler which defines them (at least
James> partially).

>> Even though such a thing is possible, it is quite unlikely -
>> consider the management effort of the people making (and upgrading!)
>> such a system. And even if people dared doing such a beast, this
>> wouldn't be POSIX - at least not with the functions that involve
>> locking and such. (There was a discussion here recently where Dave
>> Butenhof made this plausible - and I believe him :-}.)

James> I'm not sure I understand your point. It sounds like you are saying
James> that it is possible for the compiler not to know which registers it
James> can use, which is manifestly ridiculous.

>> (Of course there are compilers that do interprocedural and
>> intermodular (what a word!) optimization, involving such things as
>> not spilling registers before calling an external function. But
>> usually you have to compile the calling module and the callee module
>> in one swoop then - you pass more than one C file on the command
>> line or some such. But it is not common for you to compile your
>> module together with the mutex locking function modules of the C
>> runtime.)

James> Usually (well, in the one case I actually know of:-)), the compiler
James> generates extra information in the object file, which is used by the
James> linker.

James> About all you can hope for is that a compiler this intelligent also
James> knows about threads, and can recognize a mutex request when it sees
James> one.

David Schwartz

unread,
Jan 15, 2001, 4:21:17 PM1/15/01
to

dale wrote:
>
> David Schwartz wrote:
>
> > Code shouldn't _have_ race conditions.
>
> Well, that's not entirely correct. If you have a number of
> threads writing logging data to a file, which is protected
> by a mutex, then the order in which they write -is- subject
> to race conditions. This may or may not matter however.

If all of the possible outputs are valid, it's not a race condition.
The definition of a "race condition" is a programming construct where
the resultant output can be valid or invalid based upon the vagaries of
system timing.

DS

Tom Payne

unread,
Jan 15, 2001, 4:21:36 PM1/15/01
to
In comp.lang.c++.moderated James Dennett <jden...@acm.org> wrote:
[...]
: Andrei's claim (which seems reasonable to me, though I've not
: verified it in depth) is that his techniques, if used consistently,
: will detect all race conditions *at compile time*.

His technique seems a good way to guarantee atomicity of certain
operations, but AFAIK it doesn't detect or prevent all situations where
the outcome of multiple operations on a thread-shared object depends
on how those threads are scheduled.

class Int {
    int i;
public:
    Int() : i(0) {}
    void dbl()  { i = 2*i; }   // "double" is a keyword, so the doubling
    void incr() { i = i+1; }   // operation is spelled dbl here
};

Apply Andrei's technique to Int and then create a static Int k and two
threads:
- thread1 increments k and then exits
- thread2 doubles k and then exits.
Clearly, the final value of k.i is going to depend on the scheduling of
these two threads.
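
Spelled out with pthreads (a rough sketch, names invented; the
volatile/LockingPtr machinery is left out because it doesn't change
the outcome):

#include <pthread.h>

static Int k;   // the Int above

extern "C" void *increment_thread(void *) { k.incr(); return 0; }
extern "C" void *doubling_thread(void *)  { k.dbl();  return 0; }

int main()
{
    pthread_t a, b;
    pthread_create(&a, 0, increment_thread, 0);
    pthread_create(&b, 0, doubling_thread, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    // k.i ends up 2 if the increment ran first, 1 if the doubling did:
    // each member call may be perfectly atomic, yet the final result
    // still depends on the schedule.
    return 0;
}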

AFAIK, detecting race conditions is equivalent to the halting problem.

Tom Payne

David Schwartz

unread,
Jan 15, 2001, 4:22:12 PM1/15/01
to

James Kanze wrote:

> You still need some way of preventing the optimizer from deferring
> writes until after the lock has been released. Ideally, the compiler
> will understand the locking system (mutex, or whatever), and generate
> the necessary write guards itself. Off hand, I don't know of any
> compiler which meets this ideal.

Give an example of what you think the problem is. The typical solution
is to give the compiler no information at all about the locking system.
Since the compiler then must assume the locking system could do
anything, it can't optimize anything across it.

It's not clear to me what you mean by "deferring writes". This could
either refer to variables being cached in registers and not written back,
or it could refer to a hardware write cache not being flushed.
Fortunately, neither is a problem. Variables can't be cached in
registers because the compiler doesn't know what the lock/unlock
functions do, and so must assume they might access those variables from
their memory locations. Hardware write caches aren't a problem, because
the lock/unlock functions contain the appropriate memory barrier. The
compiler doesn't know this, but the compiler has nothing to do with such
hardware write reordering and so doesn't need to.
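
To make that concrete, a sketch (lock/unlock stand in for whatever the
real primitives are; they are defined in some other translation unit):

extern void lock(void);      // the compiler sees only these declarations,
extern void unlock(void);    // not the definitions

int shared;                  // external linkage, so for all the compiler
                             // knows, lock()/unlock() may read or write it

void bump()
{
    lock();
    shared = shared + 1;     // must be loaded from memory after lock()
    unlock();                // and stored back to memory before unlock()
}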

DS

Kaz Kylheku

unread,
Jan 15, 2001, 4:27:31 PM1/15/01
to
On 15 Jan 2001 13:20:38 -0500, Andrei Alexandrescu

<andre...@hotmail.com> wrote:
>"John Mullins" <John.M...@crossprod.co.uk> wrote in message
>> > the lock has been freed.
>>
>> But his examples also rely on undefined behaviour so he can't really
>> count on anything.
>
>Are you referring to const_cast? Strictly speaking, indeed. But then, all MT
>programming in C/C++ has undefined behavior.

However, some MT programming has another standard to serve as a safety
net. For example, correct POSIX MT programming is well-defined within
the realm of POSIX threads, even though it's not well-defined C++.
From a C++ language point of view, the behavior is undefined; however,
the program correctly uses a documented extension.

When you say undefined behavior, there is some implicit interface
standard that is intended, be it ANSI/ISO C++, POSIX or what have you.
In comp.programming.threads, undefined has a necessarily weaker
meaning; obviously some multithreaded programs are deemed to be well
defined with respect to some interface.

It's not clear what class of undefined behavior John was referring to
here.

Ron Natalie

unread,
Jan 15, 2001, 5:21:35 PM1/15/01
to

> However, the C language has a way of signaling that local variables
> cannot be accessed by other threads, namely by placing them in
> the register storage class.

Huh? How is that? The C language doesn't contain the word thread
anywhere. Register is auto + a hint to keep in a register.

> I don't know about C++; if I remember
> correctly, C++ degrades register to a mere "efficiency hint".

The ONLY difference between C and C++ is that C++ allows you to
take the address of something with a register storage class (noting
that doing so may force it out of a register), while C prohibits
the & operator on register-declared objects even if they weren't
actually put in a register by the compiler.
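
A tiny sketch of the difference:

void f()
{
    register int r = 0;
    int *p = &r;     // C: constraint violation, the compiler must complain;
                     // C++: legal, but r will likely lose its register home
    *p = 1;
}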

Kaz Kylheku

unread,
Jan 15, 2001, 6:33:49 PM1/15/01
to
On 15 Jan 2001 09:36:16 -0500, James Kanze
<James...@dresdner-bank.com> wrote:
>> You use a mutex to protect data against concurrent access.
>
>> int i;
>
>> void f(void) {
>>     lock(mutex);
>>     i++; // or something
>>     unlock(mutex);
>>     // some lengthy epilogue goes here
>> }
>
>> Looking at it more paranoidly, on might argue that an optimizing
>> compiler will probably want to keep i in a register for some reason,
>> and that it might want to keep it in that register until the
>> function returns.
>
>I've used more than one compiler that does this. In fact, most do, at
>least with optimization turned on.

This optimization is only permitted if the compiler ``knows'' that i is
not modified by these functions, and, in the case of POSIX, if the
compiler also knows that these functions don't call library functions
that have memory synchronizing properties.

Unless the compiler has very sophisticated global optimizations that
can look into the retained images of other translation units of the
program, this means that if lock() and unlock() are, or contain,
calls to other units, then i cannot be cached.

You will probably find with most compilers that the most aggressive
caching optimizations are applied to auto variables whose address is
never taken. These cannot possibly be accessed or modified by another
thread or signal handler or what have you, so it is generally safe to
cache them in registers, or even optimize them to registers entirely.
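
For example (a rough sketch; the mutex is assumed to be defined and
initialized elsewhere):

#include <pthread.h>

extern pthread_mutex_t m;
int shared_total;                 // potentially shared between threads

void sum(const int *a, int n)
{
    int t = 0;                    // auto, address never taken: safe to keep
    for (int i = 0; i < n; i++)   // t and i in registers (or optimize them
        t += a[i];                // away entirely), threads or no threads

    pthread_mutex_lock(&m);
    shared_total += t;            // shared_total, by contrast, has to be
    pthread_mutex_unlock(&m);     // loaded and stored around the opaque
}                                 // library calls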

Kaz Kylheku

unread,
Jan 15, 2001, 6:33:30 PM1/15/01
to
On 15 Jan 2001 08:45:01 -0500, James Kanze
<James...@dresdner-bank.com> wrote:

>> Dave Butenhof wrote:
>
>> > Don't ever use the C/C++ language volatile in threaded code. It'll
>> > kill your performance, and the language definition has nothing to
>> > do with what you want when writing threaded code that shares
>> > data. If some OS implementation tells you that you need to use it
>> > anyway on their system, (in the words of a child safety group),
>> > "run, yell, and tell". That's just stupid, and they shouldn't be
>> > allowed to get away with it.
>
>This is correct up to a point. The problem is that the C++ language
>has no other way of signaling that a variable may be accessed by
>several threads (and thus ensuring e.g. that it is really written
>before the lock is released). The problem *isn't* with the OS; it is
>with code movement within the optimizer of the compiler.

The problem is with the specification which governs the implementation
of the compiler and the operating system.

> And while I
>agree with the sentiment: volatile isn't the solution, I don't know
>how many compilers offer another one.

All POSIX implementations must honor the rules that the synchronization
functions like pthread_mutex_lock and so forth have memory
synchronizing properties. The combined implementation of language,
library and operating system must ensure that data is made consistent
across multiple processors when these functions are used. It is simply
a requirement. So a POSIX threaded application never needs to do
anything special such as using volatile; the implementation must do
whatever is needed, including having the compiler specially recognize
some functions, if that's what it takes!

Without such a statement of requirement, you cannot infer anything
about the behavior. At best you can look at what the compiler does now
and hope that it will do similar things in the future. This is the
case, e.g., with Visual C++ for Microsoft Windows. It so happens that
if you call an external function like EnterCriticalSection, then the
Microsoft compiler emits code that does not cache any non-local data.

Tom Payne

unread,
Jan 15, 2001, 6:34:15 PM1/15/01
to
In comp.lang.c++.moderated Kenneth Chiu <ch...@cs.indiana.edu> wrote:
: In article <3A62CF03...@dresdner-bank.com>,

That's a very good and important point. Volatility is neither
necessary nor sufficient for the synchronization that is needed in
multi-threading. Volatile objects get synchronized at every sequence
point, which is unnecessarily often, yet that only forces values out of
registers into memory; there is no requirement to synchronize
processor-local caches.

Tom Payne

Brian McNamara!

unread,
Jan 15, 2001, 6:35:50 PM1/15/01
to
Martin Berger <martinb@--remove--me--dcs.qmw.ac.uk> once said:
>this makes me think that c++ *should* be expanded to include something
>like "blob" as a modifier. or maybe even user defined modifiers.
...

>(eg for the prevention of race conditions). i also think that user defined
>modifier semantics has uses beyond preventing race conditions. how about it?

The way I see it, Andrei's approach uses the compiler as a model
checker. Volatile/non-volatile comprises a two-state model which
distinguishes whether objects are locked or unlocked (or whatever), and
the typechecker/type system captures the model and thus forces the
compiler to verify it.

I agree that user-defined modifiers could have uses beyond preventing
race conditions, but I think that's just an abuse of the type system.
(Which is not to say I'm not a proponent of abusing the type system; I
do it all the time. But if we are speaking of extensions, we might as
well do them right, rather than continue using less-than-ideal language
constructs as a means to achieve our ends.)

I think there could be exciting things done if there were a kind of
"property system" in addition to a type system, and the compiler served
as a property verifier. Then users could define a lattice of properties
which apply to certain types of objects, as well as property-transitions
that happen when mutating operations are applied to those objects, and
the compiler could verify that user-specified properties hold for
particular objects at particular points in the code. In other words, I
mean a kind of symbolic-model-checker used for proofs-of-correctness
whose equations are solved by the dataflow analysis engine that's
already in the compiler.

As I see it, you can't do this well now with templates in C++, because
an object's type is fixed during its (static) lifetime (that is, its
extent in the code); object properties should be able to change as the
object is mutated, in general. (Andrei uses const_cast to effect the
property-transition, and LockingPtr as a new object (and data type) to
access the property-transformed object; this is sufficient for the
two-state model needed to catch race conditions.)

I don't know that I explained that well. :)
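
In code, my own rough reconstruction of the device looks something like
this (not quoted from Andrei's article; the names are invented):

#include <pthread.h>

class Mutex {                          // trivial wrapper, just for the sketch
public:
    Mutex()       { pthread_mutex_init(&m_, 0); }
    void Lock()   { pthread_mutex_lock(&m_); }
    void Unlock() { pthread_mutex_unlock(&m_); }
private:
    pthread_mutex_t m_;
};

class Widget { public: void Frob() {} };   // an ordinary, non-volatile class

// The shared instance is declared volatile, so calling Frob() on it
// directly is a type error; the only sanctioned path is a LockingPtr,
// which takes the lock and const_casts the volatile away for its own
// lifetime.  (Copying of LockingPtr is ignored here for brevity.)
class LockingPtr {
public:
    LockingPtr(volatile Widget &obj, Mutex &mtx)
        : obj_(const_cast<Widget *>(&obj)), mtx_(&mtx) { mtx_->Lock(); }
    ~LockingPtr() { mtx_->Unlock(); }
    Widget *operator->() const { return obj_; }
private:
    Widget *obj_;
    Mutex  *mtx_;
};

volatile Widget shared_widget;
Mutex widget_mutex;

void user()
{
    // shared_widget.Frob();                          // compile-time error
    LockingPtr(shared_widget, widget_mutex)->Frob();  // OK: locked access
}

The temporary holds the mutex for exactly the duration of the call, and
the const_cast is what lets the locked code use the ordinary,
non-volatile interface: the state change (unlocked to locked) shows up
as a type change (volatile to non-volatile).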

--
Brian M. McNamara lor...@acm.org : I am a parsing fool!
** Reduce - Reuse - Recycle ** : (Where's my medication? ;) )

Kaz Kylheku

unread,
Jan 15, 2001, 9:49:10 PM1/15/01
to
On 15 Jan 2001 16:15:11 -0500, Konrad Schwarz

<konradDO...@mchpDOTsiemens.de> wrote:
>So the optimization of keeping variables in registers across
>function calls is illegal in general (and thus must not be performed),
>unless the compiler can prove that the code will not be linked into a
>multi-threaded program, or that those variables will never be shared.

Basically that is what it boils down to. Proving otherwise in the
general case involves knowing what happens in all the translation units
that are called, and such optimizations therefore must be delayed
somehow until the program is linked.

>However, the C language has a way of signaling that local variables
>cannot be accessed by other threads, namely by placing them in
>the register storage class. I don't know about C++; if I remember
>correctly, C++ degrades register to a mere "efficiency hint".

The auto storage class suffices. The register specifier simply means that
the object (which has automatic storage class---there is no register
storage class, just like there is no typedef storage class!) cannot
have its address taken; it becomes a constraint violation to try to do
so. It's not difficult to verify that an object's address is never
taken, whether or not it is declared register.

Kaz Kylheku

unread,
Jan 15, 2001, 10:35:05 PM1/15/01
to
On 15 Jan 2001 09:39:16 -0500, James Kanze

<James...@dresdner-bank.com> wrote:
>Kaz Kylheku wrote:
>
>> On 13 Jan 2001 23:01:04 -0500, Joerg Faschingbauer
>> <jfa...@jfasch.faschingbauer.com> wrote:
>> >So, provided that unlock() is a function and not a macro, there is no
>> >need to declare i volatile.
>
>> Even if a compiler implements sophisticated global optimizations
>> that cross module boundaries, the compiler can still be aware of
>> synchronization functions and do the right thing around calls to
>> those functions.
>
>It can be. It should be. Is it? Do todays compilers actually do the
>right thing, or are we just lucking out because most of them don't
>optimize very aggressively anyway?

It seems that we are lucking out, but not really. The way compilers
typically work is enough to ensure that volatile is not needed in order
for the right load and store instructions to be issued in the right
order. The rest of the job is done by the synchronization library
implementors, who must insert the appropriate memory barrier
instructions or what have you, into the implementation of these
functions, so that the hardware doesn't make a dog's breakfast out of
the memory access requests issued by each processor.

These library implementors tend to have a clue, and tend to have
influence with the compiler writers. For example, if the GNU compiler
people implemented some sophisticated optimizations not knowing that
these break MT programs, the GNU libc people would take note and work
out some solution---perhaps a special function attribute would be
developed, so that in the library header file, one could write:

int pthread_mutex_lock(pthread_mutex_t *) __attribute__ ((sync));

or some such thing meaning, spill and reload when calling this
function.

The point is that clueful implementors are aware of the issues and are
looking out for you; it's not just some accident that things work. :)

Ron Hunsinger

unread,
Jan 16, 2001, 5:28:45 AM1/16/01