
keyword volatile required?


Pierre Vigneras

Dec 6, 2001, 7:33:29 AM
Hi,

I have a problem with the memory model used by computers and the use of
threads. Here it is:

What happens with the following code?


int x = 5;
int y = 0;

pthread_mutex_lock(&lock);

y = x + 1;

pthread_mutex_unlock(&lock);


What happens if two threads are running?
I would like the result (after both threads have run this code) to be:

y = 7.

But, for example, on a multiprocessor, both threads may have a copy (in
registers, or in a processor's cache) of the x variable, which is 5. Then they
will try to acquire the mutex. Only one acquires it, so y becomes 6. Then the other
thread acquires the mutex, and since x is still seen as 5, y = 6 again!

Am I right?

Should I declare x and y as volatile, or does pthread_mutex_[lock|unlock]
maintain memory coherency (reading/flushing from registers/cache to memory)?

--
Pierre Vignéras
http://dept-info.labri.fr/~vigneras/

Équipe systèmes et objets distribués
http://jccf.labri.fr/jodo/

LaBRI
http://dept-info.labri.fr/

Kaz Kylheku

Dec 6, 2001, 11:49:45 AM
In article <y7u1yi8...@jago.jodo.labri.u-bordeaux.fr>, Pierre Vigneras wrote:
>Hi,
>
>I have a problem with the memory model used by computers and the use of
>threads. Here it is:

The POSIX standard does not require programs to use the volatile keyword
to attribute lvalues that are used to access shared objects. Calls to
synchronization functions are enough.

>What happens with the following code?
>
>
>int x = 5;
>int y = 0;
>
>pthread_mutex_lock(&lock);
>
> y = x + 1;
>
>pthread_mutex_unlock(&lock);
>
>
>What happens if two threads are running?
>I would like the result (after both threads have run this code) to be:
>
> y = 7.
>
>But, for example, on a multiprocessor, both threads may have a copy (in
>registers, or in a processor's cache) of the x variable, which is 5, then they

Since these are local variables, all threads have their own instance
of them. Threads do not share local auto variables.

Agathocles, Tyrant of Syracuse

Dec 6, 2001, 10:47:29 AM
Post the real code you have.

On 06 Dec 2001 13:33:29 +0100, vign...@labri.fr (Pierre Vigneras) wrote:
> Hi,
>
> I have a problem with the memory model used by computers and the use of
> threads. Here it is:
>
> What happens with the following code?
>
>
> int x = 5;
> int y = 0;
>
> pthread_mutex_lock(&lock);
>
> y = x + 1;
>
> pthread_mutex_unlock(&lock);
>
>
> What happens if two threads are running?

Nothing, it gets executed two times, that's all.

> I would like the result (after both threads have run this code) to be:
>
> y = 7.

When both threads are done, you lose your local vars. Also, *which* y would you like
to watch? Every thread has its own y (assuming the vars are local, which is not
necessarily the case, but how do we know -- which is why you need to post the real
code.)

> But, for example, on a multiprocessor, both threads may have a copy (in
> registers, or in a processor's cache) of the x variable, which is 5,

You're making it way too complicated. You don't need to worry about registers or cache.
Threads have their own stacks, so if your vars are local, they won't be shared.

> will try to acquire the mutex. Only one acquires it, so y becomes 6. Then the other
> thread acquires the mutex, and since x is still seen as 5, y = 6 again!
>
> Am I right?

Most likely not, but hard to say w/o the real code <g>.

> Should I declare x and y as volatile,

No.

> or does pthread_mutex_[lock|unlock]
> maintain memory coherency (reading/flushing from registers/cache to memory) ?

Forget registers and caches. You can't see them on the C level. Pretend they don't
exist.

David Schwartz

Dec 6, 2001, 4:53:33 PM
Pierre Vigneras wrote:

> pthread_mutex_lock(&lock);
>
> y = x + 1;
>
> pthread_mutex_unlock(&lock);

> But, for example, on a multiprocessor, both threads may have a copy (in
> registers, or in a processor's cache) of the x variable, which is 5, then they
> will try to acquire the mutex. Only one acquires it, so y becomes 6. Then the other
> thread acquires the mutex, and since x is still seen as 5, y = 6 again!
>
> Am I right?

No. The compiler has no way of knowing that the 'pthread_mutex_lock'
function won't alter the value of 'x', so it can't keep it in a register
across that function call. This has nothing to do with threads; it is
a basic sanity issue involving register caching.

DS

Pierre Vigneras

Dec 7, 2001, 4:46:41 AM
>>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com> writes:

Agathocles> Post the real code you have.
OK, I missed something...

Agathocles> vign...@labri.fr (Pierre Vigneras) wrote:
>> Hi,
>>
>> I have a problem with the memory model used by computers and the use of
>> threads. Here it is:
>>
>> What happens with the following code?
>>
>>
>> int x = 5; int y = 0;
>>
>> pthread_mutex_lock(&lock);
>>
>> y = x + 1;
>>
>> pthread_mutex_unlock(&lock);
>>
>>
>> What happens if two threads are running?

Agathocles> Nothing, it gets executed two times, that's all.

Right!

>> I would like the result (after both threads have run this code) to be:
>>
>> y = 7.

Agathocles> When both threads are done, you lose your local vars. Also,
Agathocles> *which* y would you like to watch? Every thread has its own y
Agathocles> (assuming the vars are local, which is not necessarily the
Agathocles> case, but how do we know -- which is why you need to post the
Agathocles> real code.)

I think using a mutex to protect a local variable is nonsense (correct me if I'm
wrong), since local vars are not shared. So I assumed that in my example, it was
evident that x and y are shared variables. But this wasn't as clear as I
thought, was it?

>> But, for example, on a multiprocessor, both threads may have a copy (in
>> registers, or in a processor's cache) of the x variable, which is 5,

Agathocles> You're making it way to complicated. You don't need to worry
Agathocles> about registers or cache. Thread have own stacks, so if your
Agathocles> vars are local, they won't be shared.

But now, what happens if x and y are shared variables? Perhaps global
variables are not loaded into registers?

>> will try to acquire the mutex. Only one acquires it, so y becomes 6. Then
>> the other thread acquires the mutex, and since x is still seen as 5,
>> y = 6 again!
>>
>> Am I right?

Agathocles> Most likely not, but hard to say w/o the real code <g>.

>> Should i declare x and y as volatile,

Agathocles> No.

>> or does pthread_mutex_[lock|unlock] maintain memory coherency
>> (reading/flushing from registers/cache to memory) ?

Agathocles> Forget registers and caches. You can't see them on the C
Agathocles> level. Pretend they don't exist.

Thanks, and sorry if i wasn't clear.

Anand

Dec 7, 2001, 7:28:47 AM
vign...@labri.fr (Pierre Vigneras) wrote in message news:<y7u1yi8...@jago.jodo.labri.u-bordeaux.fr>...

> Should I declare x and y as volatile, or does pthread_mutex_[lock|unlock]
> maintain memory coherency (reading/flushing from registers/cache to memory) ?
>

pthread_mutex_unlock() will take care of the memory coherency. There
will be a memory barrier at every unlock operation. That guarantees
that all the writes are completed, and it does whatever cache
invalidations are necessary.
So even if you have a copy in the cache it is not a problem. And it can't
be in a register, because variables are not kept in registers across
function calls.

-Anand

Pierre Vigneras

Dec 7, 2001, 9:58:34 AM
>>>>> "Anand" == Anand <thr_...@yahoo.com> writes:

Anand> pthread_mutex_unlock() will take care of the memory coherency. There
Anand> will be a memory barrier at every unlock operation. That guarantees
Anand> that all the writes are completed, and it does whatever cache
Anand> invalidations are necessary. So even if you have a copy in the cache
Anand> it is not a problem. And it can't be in a register, because variables
Anand> are not kept in registers across function calls.

Why is there a memory barrier on every unlock operation and not on the lock
operation? I guess the reason is linked to your last sentence: "variables are
not kept in registers across function calls"? Is this specified somewhere, or
shall we hope compilers never cache variables across function calls?

Kaz Kylheku

Dec 7, 2001, 1:29:49 PM
In article <y7usnan...@jago.jodo.labri.u-bordeaux.fr>, Pierre Vigneras wrote:
>>>>>> "Anand" == Anand <thr_...@yahoo.com> writes:
>
> Anand> pthread_mutex_unlock() will take care of the memory coherency. There
> Anand> will be a memory barrier at every unlock operation. That guarenties
> Anand> that all the writes are completed and which does cache invalidations
> Anand> whatever necessary. So even you have a copy in the cache it is not
> Anand> a problem. and it can't be in a register, beacuse varibles are not
> Anand> cached across functions calls.
>
>Why is there a memory barrier on every unlock operation and not on the lock
>operation?

The POSIX standard doesn't say this. It just says that the various thread
synchronization functions also synchronize memory. It essentially doesn't
say *how* it's done, only that the programmer doesn't have to be concerned
about it.

Of course, depending on the nature of the hardware, you may need a memory
barrier on entry to a critical section and on exit. Or possibly even two
different kinds of memory barriers.

David Schwartz

Dec 7, 2001, 3:36:40 PM
Pierre Vigneras wrote:

> Why is there a memory barrier on every unlock operation and not on the lock
> operation? I guess the reason is linked to your last sentence: "variables are
> not kept in registers across function calls"? Is this specified somewhere, or
> shall we hope compilers never cache variables across function calls?

Why should you care how your compiler meets the requirements so long as
it meets them?

DS

Anand

Dec 8, 2001, 12:59:36 AM
vign...@labri.fr (Pierre Vigneras) wrote in message news:<y7usnan...@jago.jodo.labri.u-bordeaux.fr>...

> Why is there a memory barrier on every unlock operation and not on the lock
> operation? I guess the reason is linked to your last sentence: "variables are
> not kept in registers across function calls"? Is this specified somewhere, or
> shall we hope compilers never cache variables across function calls?

I mean variables are not stored in registers across function calls;
the compiler has nothing to do with the cache.

A memory barrier is used at every unlock operation because present-day
hardware supports release consistency or stronger. If any hardware
has some other memory consistency model, then it might have to use a
memory barrier at lock as well.

-Anand

Agathocles, Tyrant of Syracuse

Dec 7, 2001, 9:22:58 PM
On 07 Dec 2001 10:46:41 +0100, vign...@labri.fr (Pierre Vigneras) wrote:
> >>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com> writes:
>
> >> I would like the result (after both threads have run this code) to be:
> >>
> >> y = 7.
> Agathocles> When both threads are done, you lose your local vars. Also,
> Agathocles> *which* y would you like to watch? Every thread has its own y
> Agathocles> (assuming the vars are local, which is not necessarily the
> Agathocles> case, but how do we know -- which is why you need to post the
> Agathocles> real code.)
>
> I think using a mutex to protect a local variable is nonsense (correct me if I'm
> wrong)
For all practical purposes you're right <g>.

> since local vars are not shared. So I assumed that in my example, it was
> evident that x and y are shared variables. But this wasn't as clear as I
> thought, was it?

It wasn't. At least to me. Now, if your vars are shared, then how will you get y == 7?

int x = 5;
int y = 0;

// First pass
pthread_mutex_lock(&lock);
y = x + 1; // y = 5 + 1 = 6
pthread_mutex_unlock(&lock);

// Second pass
pthread_mutex_lock(&lock);
y = x + 1; // y = 5 + 1 = 6, again
pthread_mutex_unlock(&lock);

????

> >> But, for example, on a multiprocessor, both threads may have a copy (in
> >> registers, or in a processor's cache) of the x variable, which is 5,
> Agathocles> You're making it way to complicated. You don't need to worry
> Agathocles> about registers or cache. Thread have own stacks, so if your
> Agathocles> vars are local, they won't be shared.
>
> But now, what happens if x and y are shared variables? Perhaps global
> variables are not loaded into registers?

You don't need to worry about registers. You're not writing in assembly, so you can
assume that your compiler will load and reload the registers when necessary. If you
have several processors, you should assume that the motherboard circuitry will keep
the processors' caches consistent. The problem is with your code, not the hardware.
Carefully trace what your version does, understand it, and compare it with what you want.
Then change your code to do what you want.

Agathocles, Tyrant of Syracuse

Dec 7, 2001, 9:26:40 PM
On 07 Dec 2001 15:58:34 +0100, vign...@labri.fr (Pierre Vigneras) wrote:
> >>>>> "Anand" == Anand <thr_...@yahoo.com> writes:
>
> Anand> pthread_mutex_unlock() will take care of the memory coherency. There
> Anand> will be a memory barrier at every unlock operation. That guarantees
> Anand> that all the writes are completed, and it does whatever cache
> Anand> invalidations are necessary. So even if you have a copy in the cache
> Anand> it is not a problem. And it can't be in a register, because variables
> Anand> are not kept in registers across function calls.
>
> Why is there a memory barrier on every unlock operation and not on the lock
> operation?
You keep banging on the wrong door. Again, the problem is with your code, not
the hardware or compiler. Your code doesn't do what you want because you've written it
so, not because something mysterious happens below the surface.

Randy Crawford

Dec 9, 2001, 4:08:04 AM
Kaz Kylheku wrote:
> In article <y7usnan...@jago.jodo.labri.u-bordeaux.fr>, Pierre
> Vigneras wrote:
[...]

> >Why is there a memory barrier on every unlock operation and not on lock
> >operation ?
>
> The POSIX standard doesn't say this. It just says that the various thread
> synchronization functions also synchronize memory. It essentially doesn't
> say *how* it's done, only that the programmer doesn't have to be concerned
> about it.

That's an interesting constraint. I wonder how it's enforced? Since
compilers are oblivious to pthread semantics (all the semantics lie
in the pthread library), and mutex library calls are oblivious to
the state of surrounding variables, I suspect no pthread implementation
is actually POSIX compliant in preserving the semantics you describe.
If you compile with a high level of optimization, global variables are
likely to remain in registers across mutexes, and semantics will be
sensitive to the programmer adding "volatile" to variable declarations.

In OpenMP there's a compiler directive called omp_flush(variable_list)
that's provided for just such contingencies. It's interesting to consider
the implications. After all, OpenMP compilers *must* be aware of parallel
semantics (unlike pthread compilers), and yet they appear to be non-compliant
with these POSIX rules. It seems to me that if the "more clueful" OpenMP
compilers are not POSIX compliant, then pthread compilers are unlikely
to be compliant either.

Is anybody aware of where in the GCC compiler these semantics are enforced?

Randy

--
Randy Crawford http://www.engin.umich.edu/labs/cpc
crw...@umich.edu http://www-personal.engin.umich.edu/~crwfrd

Ivan Skytte Jørgensen

Dec 9, 2001, 5:09:13 AM
Randy Crawford wrote:
>
> Kaz Kylheku wrote:
> > In article <y7usnan...@jago.jodo.labri.u-bordeaux.fr>, Pierre
> > Vigneras wrote:
> [...]
> > >Why is there a memory barrier on every unlock operation and not on lock
> > >operation ?
> >
> > The POSIX standard doesn't say this. It just says that the various thread
> > synchronization functions also synchronize memory. It essentially doesn't
> > say *how* it's done, only that the programmer doesn't have to be concerned
> > about it.
>
...

> If you compile with a high level of optimization, global variables are
> likely to remain in registers across mutexes, and semantics will be
> sensitive to the programmer adding "volatile" to variable declarations.

No.

pthread_mutex_lock()/pthread_mutex_unlock() are external functions, and
the compiler does not know what they do. So the compiler cannot assume
they do not modify the global variables. Therefore the global variables
cannot be held in registers across pthread_mutex_[un]lock() calls. The
same applies to static and local variables which have their address
taken.


Regards,
Ivan

Kaz Kylheku

Dec 9, 2001, 3:15:12 PM
In article <3C1329F4...@umich.edu>, Randy Crawford wrote:
>Kaz Kylheku wrote:
>> In article <y7usnan...@jago.jodo.labri.u-bordeaux.fr>, Pierre
>> Vigneras wrote:
>[...]
>> >Why is there a memory barrier on every unlock operation and not on lock
>> >operation ?
>>
>> The POSIX standard doesn't say this. It just says that the various thread
>> synchronization functions also synchronize memory. It essentially doesn't
>> say *how* it's done, only that the programmer doesn't have to be concerned
>> about it.
>
>That's an interesting constraint. I wonder how it's enforced? Since
>compilers are oblivious to pthread semantics (all the semantics lie

A compiler is part of the language implementation. A compiler writer
cannot ignore the requirements set out by POSIX. The language and library
are inseparable; POSIX is a set of extensions to the ANSI C language.

>in the pthread library), and mutex library calls are oblivious to
>the state of surrounding variables, I suspect no pthread implementation
>is actually POSIX compliant in preserving the semantics you describe.
>If you compile with a high level of optimization, global variables are
>likely to remain in registers across mutexes, and semantics will be
>sensitive to the programmer adding "volatile" to variable declarations.

This divides into two cases.

1. The compiler writer knows the semantics of pthread_mutex_lock.

In this case, although it is clear that the function itself does not
modify the global variable, the semantics come from POSIX.
A thread switch could take place in pthread_mutex_lock, and the
new thread can do unknown things. So the variable cannot be cached
across the call.

2. The compiler writer knows nothing about pthread_mutex_lock.

In this case it's just another function, in another translation unit.
There is no telling what that function can do. So the global variable
cannot be cached.

One concern is static globals whose address is never taken. If the
compiler writer is ignorant of threads, it would appear that the
compiler could assume that a static variable whose address
is never taken cannot possibly be accessed by other translation units.
However, other translation units can make calls to external functions
within this translation unit, so this assumption cannot be made.
Through these calls, they can indirectly modify that variable.
If you are agnostic about pthread_mutex_lock and other functions, you
cannot trust them not to call back arbitrarily.

>In OpenMP there's a compiler directive called omp_flush(variable_list)
>that's provided for just such contingencies.

Isn't it the case that OpenMP is a set of language extensions that are
orthogonal to POSIX and ANSI C? So these documents don't apply to it.

David Schwartz

Dec 9, 2001, 3:19:36 PM
Kaz Kylheku wrote:

> One concern are static globals whose address is never taken. If the
> compiler writer is ignorant of threads, it would appear that the
> compiler could make the assumption that the static variable whose address
> is never taken cannot possibly be accessed by other translation units.
> However, other translation units can make calls to external functions
> within this translation unit, so this assumption cannot be made.
> Through these calls, they can indirectly modify that variable.
> If you are agnostic about pthread_mutex_lock and other functions, you
> cannot trust them not to call back arbitrarily.

In fact, for all the compiler knows, pthread_mutex_lock could call back
into the same function that's currently running. Even if it's static,
another function in the same translation unit could take its address and
squirrel it away somewhere. In practice, the optimizations he's worried
about are impossible.

DS

Randy Crawford

Dec 10, 2001, 3:19:43 AM
Kaz Kylheku wrote:
>
> In article <3C1329F4...@umich.edu>, Randy Crawford wrote:
> >Kaz Kylheku wrote:
> >> In article <y7usnan...@jago.jodo.labri.u-bordeaux.fr>, Pierre
> >> Vigneras wrote:
> >[...]
> >> >Why is there a memory barrier on every unlock operation and not on lock
> >> >operation ?
[...]

> >
> >That's an interesting constraint. I wonder how it's enforced? Since
> >compilers are oblivious to pthread semantics (all the semantics lie
>
> A compiler is part of the language implementation. A compiler writer
> cannot ignore the requirements set out by POSIX. The language and library
> are inseparable; POSIX is a set of extensions to the ANSI C language.

You're assuming that POSIX compliance is desirable among vendors. I believe
POSIX is of sufficiently little interest in the marketplace now that most
vendors (as well as the FSF and Linux) no longer especially care about it.

Given that POSIX is really of interest only to the Unix world, and Unix is
rapidly worrying less about POSIX/XOPEN/etc. and more about Linux, forces
other than POSIX are driving the semantics of compilers more than POSIX is.

What's more, with the wide variation in the way that vendors have implemented
pthreads (Sun LWPs, SGI SPROCs, Linux forks, etc.), POSIX compliance at a level
of subtlety like the one under review is probably being deliberately ignored.
Since generic thread semantics are generally pretty simple, and I would bet
that little data protection is supported within most compilers (and probably
within most OSes), most vendors have probably decided to let the systems'
hardware deal with data consistency semantics, POSIX be damned.

I don't know this for a fact, but I'd be willing to put money on it.

>
> >in the pthread library), and mutex library calls are oblivious to
> >the state of surrounding variables, [...]
>
> This divides into two cases.
>
> 1. The compiler writer knows the semantics of pthread_mutex_lock.
>
> In this case, although it is clear that the function itself does not
> modify the global variable, the semantics come from POSIX.
> A thread switch could take place in pthread_mutex_lock, and the
> new thread can do unknown things. So the variable cannot be cached
> across the call.

This is almost certainly not applicable. No compiler of my acquaintance
is aware of the semantics of individual library calls, especially of
subtle threading side effects. After all, thread-safe libraries are
still not yet universal.

>
> 2. The compiler writer knows nothing about pthread_mutex_lock.
>
> In this case it's just another function, in another translation unit.
> There is no telling what that function can do. So the global variable
> cannot be cached.

This example misses my point. Look at the syntax of pthread_mutex_lock().
No variables are specified in the function call. Therefore, no variable
is observed by the compiler as potentially modified by the function, and
all global variables are still vulnerable to the effect in question.

>
> One concern are static globals whose address is never taken. If the
> compiler writer is ignorant of threads, it would appear that the the
> compiler could make the assumption that the static variable whose address
> is never taken cannot possibly be accessed by other translation units.
> However, other translation units can make calls to external functions
> within this translation unit, so this assumption cannot be made.
> Through these calls, they can indirectly modify that variable.
> If you are agnostic about pthread_mutex_lock and other functions, you
> cannot trust them not to call back arbitrarily.

I'm not sure we're talking the same language. The only case in which a
compiler will not optimize around a variable is when the variable is
explicitly passed (by reference) to an external function. If no such
function passes that variable, and by reference, then the compiler is
free to optimize that variable into oblivion. This applies to all
global variables, when using pthreads or not. All data remain vulnerable,
regardless of mutex locks.

If you surround all assignments to a variable with mutexes, that only
means that no other pthread will change the variable's value when one
pthread has set the mutex. It says nothing about the compiler's obligation
of flushing the variable to memory before or after calling a given mutex
(unless the variable is visible to the compiler as an argument to the
mutex function call, which it is not).

What's more, since many modern compilers don't really use a stack, they're
free to leave the variable in registers, so long as the register is made
available to the called function, and the variable's dependency reflects
its potential modification within the function. That's why I'm fairly
certain that "volatile" is still necessary, even within code like:

mutex on
y = x + 1;
mutex off

>
> >In OpenMP there's a compiler directive called omp_flush(variable_list)
> >that's provided for just such contingencies.
>
> Isn't it the case that OpenMP is a set of language extensions that are
> orthogonal to POSIX and ANSI C? So these documents don't apply to it.

To ANSI C, yes. To the compiler, no. OpenMP allows that the
compiler may *not* be aware of threads, and it also allows that the
compiler *may* be aware of threads. Therefore, it provides directives
that *must* be used if the compiler is oblivious (like omp_flush()),
and directives that may be used if the compiler *is* aware of threads
(like the omp reduce directive, which hints of an upcoming reduction
operation, warning that the variable had better not be held in a register,
since another thread may be using it concurrently).

I believe most OpenMP implementations are built atop pthreads.

Steve Watt

Dec 10, 2001, 2:21:48 PM
In article <3C14701F...@umich.edu>, Randy Crawford <crw...@umich.edu> wrote:
>Kaz Kylheku wrote:
>> In article <3C1329F4...@umich.edu>, Randy Crawford wrote:
>> >That's an interesting constraint. I wonder how it's enforced? Since
>> >compilers are oblivious to pthread semantics (all the semantics lie

>> A compiler is part of the language implementation. A compiler writer
>> cannot ignore the requirements set out by POSIX. The language and library
>> are inseparable; POSIX is a set of extensions to the ANSI C language.

>You're assuming that POSIX compliance is desirable among vendors. I believe
>POSIX is of sufficiently little interest in the marketplace now that most
>vendors (as well as the FSF and Linux) no longer especially care about it.

Check again how Linux came about. It started when a grad student was
reading the POSIX standard. Those standards may appear to be somewhat
irrelevant in the academic world that you're coming from, but they're
very widely applied in the industry, especially in the embedded space.
I will agree that there seem to be those in the Linux camp who are
trending towards "we're the standard, let the rest of the world follow",
but that seems to be hubris more than fact.

>Given that POSIX is really of interest only to the Unix world, and Unix is
>rapidly worrying less about POSIX/XOPEN/etc. and more about Linux, forces
>other than POSIX are driving the semantics of compilers more than POSIX is.

The UNIX world has never really cared about POSIX. The embedded world,
and those who supply the US DoD, *do* care, and will continue to care for
the foreseeable future.

>What's more, with the wide variation in the way that vendors have implemented
>pthreads (Sun LWPs, SGI SPROCs, Linux forks, etc.), POSIX compliance at a level
>of subtlety like the one under review is probably being deliberately ignored.
>Since generic thread semantics are generally pretty simple, and I would bet
>that little data protection is supported within most compilers (and probably
>with most O/Ses), most vendors have probably decided to let the systems'
>hardware deal with data consistency semantics, POSIX be damned.

Actually, you'll discover that these details are definitely not being
ignored by the implementors. I was certainly well aware of them when I
was building a pthreads implementation for a real-time OS, and I was in
communication with other implementors from other OS companies that also
paid close attention to these things.

>I don't know this for a fact, but I'd be willing to put money on it.

Can I collect now? I wasn't aware of this group being occupied by trolls
before, but this sure feels like one. I'll continue to treat it as a
serious post, though.

>> >in the pthread library), and mutex library calls are oblivious to
>> >the state of surrounding variables, [...]

>> This divides into two cases.
>>
>> 1. The compiler writer knows the semantics of pthread_mutex_lock.
>>
>> In this case, although it is clear that the function itself does not
>> modify the global variable, the semantics come from POSIX.
>> A thread switch could take place in pthread_mutex_lock, and the
>> new thread can do unknown things. So the variable cannot be cached
>> across the call.

>This is almost certainly not applicable. No compiler of my acquaintance
>is aware of the semantics of individual library calls, especially of
>subtle threading side-effects. After all, thread-safe libraries are
>still not yet universal.

gcc is most certainly aware of the semantics of individual library
calls. See __builtin_*. Things I can think of that the * expands
to off the top of my head: abs, alloca, memcpy, memcmp, memset,
strcmp, strcpy, strlen, and a bunch of math calls.

And there are compilers out there that are aware of the semantics
of a mutex lock (not just pthread_mutex_lock). The external memory
semantics of any lock are necessarily the same as the pthread ones
because nothing else makes sense from an ANSI C standpoint.

In our implementation, we had a mostly-inlined version of the mutex
lock and unlock calls, and had to do some special coaxing (that
gcc fortunately provided hooks for) to force the update of externally
visible variables on lock and unlock in the code paths that didn't
cause a system call or external function call.

>> 2. The compiler writer knows nothing about pthread_mutex_lock.
>>
>> In this case it's just another function, in another translation unit.
>> There is no telling what that function can do. So the global variable
>> cannot be cached.

>This example misses my point. Look at the syntax of pthread_mutex_lock().
>No variables are specified in the function call. Therefore, no variable
>is observed by the compiler as potentially modified by the function, and
>all global variables are still vulnerable to the effect in question.

If the compiler is unaware of pthread_mutex_lock(), it has no idea
what the function might do, including calling back into another
function in the invoking module, or even recursing to the invoking
function. The memory updates simply *MUST* happen across a function
call if the compiler has no a priori knowledge of that function's
complete execution path.

>> One concern are static globals whose address is never taken. If the
compiler writer is ignorant of threads, it would appear that the
>> compiler could make the assumption that the static variable whose address
>> is never taken cannot possibly be accessed by other translation units.
>> However, other translation units can make calls to external functions
>> within this translation unit, so this assumption cannot be made.
>> Through these calls, they can indirectly modify that variable.
>> If you are agnostic about pthread_mutex_lock and other functions, you
>> cannot trust them not to call back arbitrarily.
>
>I'm not sure we're talking the same language. The only case in which a
>compiler will not optimize around a variable is when the variable is
>explicitly passed (by reference) to an external function. If no such
>function passes that variable, and by reference, then the compiler is
>free to optimize that variable into oblivion. This applies to all
>global variables, when using pthreads or not. All data remain vulnerable,
>regardless of mutex locks.

False. Static variables are also a hazard on recursion, especially
mutual recursion between functions in different translation units.
If a translation unit contains a static variable that is never
used as an lvalue, it could be optimized away. In any other condition,
the state of that variable must be updated before a function call
to another translation unit. See below for a concrete example.

Threads, in this particular case, have no effect on whether the
compiler can do that optimization.

>If you surround all assignments to a variable with mutexes, that only
>means that no other pthread will change the variable's value when one
>pthread has set the mutex. It says nothing about the compiler's obligation
>of flushing the variable to memory before or after calling a given mutex
>(unless the variable is visible to the compiler as an argument to the
>mutex function call, which it is not).

See above. If the compiler has knowledge, it will flush. If it does
not, it must still flush, or it is broken in the presence of certain
types of *SINGLE THREADED* programs.

>What's more, since many modern compilers don't really use a stack, they're
>free to leave the variable in registers, so long as the register is made
>available to the called function, and the variable's dependency reflects
>its potential modification within the function. That's why I'm fairly
>certain that "volatile" is still necessary, even within code like:
>
>mutex on
> y = x + 1;
>mutex off

What is the storage class of y here? Something global, I guess, since
using mutices with auto variables is useless.

I'll put in a concrete example here, of the case where the compiler
has no knowledge of mutex_lock (add pthread_ if you wish, it doesn't
matter).

/* Translation unit 1 */
int
my_function(int parameter) {
    static unsigned int datum = 3141;
    static mutex_t mut = SOME_MUTEX_INITIALIZER;
    int copy_datum;

    mutex_lock(&mut);
    datum *= datum; datum -= 104; datum ^= parameter; /* or whatever */
    copy_datum = datum;
    mutex_unlock(&mut);

    return copy_datum;
}

So far, so good, right? That looks like nice, protected random number
generation code. Now, in another translation unit, we have:

void /* ok, it should return something, but the above code ignored it */
mutex_lock(mutex_t *m) {
    int saw_m = m->beenhere;
    m->beenhere = 0;

    if (saw_m)
        m->number = my_function(m->number);

    m->beenhere = 1;
}

Perfectly valid code. It happens to be badly named, since it has nothing
to do with locking a mutex. If the compiler decides it can keep the
value of datum in a call-saved register across the call to that function
named mutex_lock, then this code will not work the way ANSI C says it
must.

[ OpenMP discussion elided; I have no experience with it. ]

If you feel that volatile is somehow still necessary, feel free to
use it wherever you want. As long as I don't have to deal with the
code or the performance problems that result :). However, you will
find that these problems have been carefully considered by threads
implementations, even if the compilers don't always think about them,
and those implementations will handle variables in persistent storage
correctly across mutex operations.
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.8" / 37N 20' 14.9"
Internet: steve @ Watt.COM Whois: SW32
Free time? There's no such thing. It just comes in varying prices...

Kaz Kylheku

unread,
Dec 10, 2001, 5:35:17 PM12/10/01
to
In article <3C14701F...@umich.edu>, Randy Crawford wrote:
>What's more, with the wide variation in the way that vendors have implemented
>pthreads (Sun LWPs, SGI SPROCs, Linux forks, etc) POSIX compilance at a level
>of subtlety like the one under review is probably being deliberately ignored.

I can say on behalf of the Linux implementors, at least, that this
isn't the case. With regard to the specific matter at hand, there is
no intention to force the user to worry about memory visibility when
using the threading library synchronization functions.

Deviations from POSIX are only due to bugs and kernel-imposed architectural
constraints, not because the glibc people don't care about POSIX.
These architectural constraints are difficult to work out because they
involve cooperation from the kernel developers, who have a whole other
set of concerns, like ensuring that there is a solid operating system
with decent SMP scalability, VM performance, networking finesse and all
that.

In some cases, performance has even been sacrificed to comply with POSIX.
There is a discussion in libc-alpha going on now about the putc() function
being slow. This is in large part due to the somewhat braindamaged POSIX
requirement that putc() must lock and unlock the stream. But 99% of the
C programmers out there don't know about putc_unlocked(), nor should they.

Ulrich Drepper and other libc developers care about POSIX. They
keep up with what is going on in POSIX standardization and adopt
some bleeding edge features. Why do you think we have new things like
pthread_mutex_timedlock in glibc 2.2? Drepper attends the Austin group
meetings, representing Red Hat. So he's not only a participant in the
standardization process, but also a principal library developer.
Why would you invest your time in the process, but then go home and not
care about implementing it? ;)

>Since generic thread semantics are generally pretty simple, and I would bet
>that little data protection is supported within most compilers (and probably
>with most O/Ses), most vendors have probably decided to let the systems'
>hardware deal with data consistency semantics, POSIX be damned.

A vendor should have some kind of test suite consisting of standard POSIX
threads code, and should be running that suite on all the supported
hardware. If the synchronization functions don't take care of memory
visibility, these tests will fail.

So what you are saying is that some vendors know that standard code
will fail, but they don't bother telling people that they need to insert
their own memory barriers, volatiles or whatever else.

>I'm not sure we're talking the same language. The only case in which a
>compiler will not optimize around a variable is when the variable is
>explicitly passed (by reference) to an external function.

You clearly haven't thought about this in sufficient depth, or read
the existing responses in the thread with sufficient care. The called
function cannot be trusted not to generate a call back into the calling
module. Suppose you have this:

static int global;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void function(void)
{
    global++;
}

void other_function(void)
{
    pthread_mutex_lock(&mutex);
    /* do something with global */
    pthread_mutex_unlock(&mutex);
}

If the compiler knows nothing about the semantics of pthread_mutex_lock,
then in particular it cannot assume that pthread_mutex_lock() won't call
into function() or other_function(). Note that the address of global
is not taken anywhere.

I hope you aren't working on any compilers I care about. :)

Randy Crawford

unread,
Dec 11, 2001, 1:24:04 AM12/11/01
to
To Kaz Kylhku and Steve Watt:

I want to thank you both for well-written and thoughtful responses.
I should admit that even before your replies, I had conceded that
"volatile" was unnecessary in the given circumstance, and that any
function call was sufficient to flush global variables. You wuz
right and I wuz wrong.

In addition, Kaz's perspective on the role of POSIX in Linux was
especially enlightening. I was aware of the US government's interest
in standards compliance, including POSIX, even as it applies to non-
Unix O/Ses (which always seemed a bit daft to me). However, I didn't
realize that POSIX still held such sway (aside from its role in
pthreads, which is eponymous and thus innate).

I'm still not entirely convinced that vendors sweat the details in
preserving POSIX compliance in all their languages, especially when
the language contains extensions which place it outside the purview
of POSIX (as in HPC and parallelism, e.g. OpenMP, autoparallelizing).
But if the parallel extensions are implemented atop a compliant
infrastructure (such as SPROCs, LWPs, clone(), etc) I suppose the
extensions are likely to be POSIX compliant (or can be *made* POSIX
compliant).

Thanks for an enlightening exchange.

Randy


Steve Watt wrote:
[...]
Kaz Kylheku wrote:

> I hope you aren't working on any compilers I care about. :)

Not yet, anyway. :-)

Kaz Kylheku

unread,
Dec 11, 2001, 2:02:13 AM12/11/01
to
In article <3C15A684...@umich.edu>, Randy Crawford wrote:
>I'm still not entirely convinced that vendors sweat the details in
>preserving POSIX compliance in all their languages, especially when

There is no talk of POSIX compliance in languages that don't have
standard POSIX bindings. There are, for instance, POSIX bindings for Ada,
so you can take an implementation of these, and try to determine whether
it conforms to the spec. I don't know the list of languages for which
there are POSIX bindings.

>the language contains extensions which place it outside the purview
>of POSIX (as in HPC and parallelism, e.g. OpenMP, autoparallelizing).

Right, so only the programs which don't use these extensions can
be considered POSIX programs. Programs that rely on the extensions
are, well, programs of the dialect in which they are written.

>But if the parallel extensions are implemented atop a compliant
>infrastructure (such as SPROCs, LWPs, clone(), etc) I suppose the
>extensions are likely to be POSIX compliant (or can be *made* POSIX
>compliant).

The extensions can't be said to conform to POSIX; that's what makes them
extensions. If an implementation can support these extensions without
rejecting correct POSIX programs, then they are conforming extensions;
programs which don't use them are not affected by them. That's the
extent to which an extension can be said to be conforming---only that
its presence doesn't interfere with support for standard features.
If an extension invalidates or changes the meaning of POSIX programs,
then it is nonconforming. A way must be provided to turn it off,
or else the language implementation then has no POSIX conforming mode.

Now the behavior of extensions themselves, that conforms only to the whims
of their respective authors. If the extensions of some parallelizing
compiler require users to explicitly denote that cached copies of
variables must be flushed at certain points, that has no bearing on
POSIX conformance.

Pierre Vigneras

unread,
Dec 11, 2001, 4:21:30 AM12/11/01
to
>>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com> writes:

Agathocles> On 07 Dec 2001 10:46:41 +0100, vign...@labri.fr (Pierre


Agathocles> Vigneras) wrote:
>> >>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com>
>> >>>>> writes:
>>
>> >> I would like the result (when the two threads has runned this code) :
>> >>
>> >> y = 7.

Agathocles> [...] Now, if your vars are shared, then how will you get y ==
Agathocles> 7? int x = 5; int y = 0;

Agathocles> // First pass
Agathocles> pthread_mutex_lock(&lock);
Agathocles> y = x + 1; // Y = 5 + 1; = 6
Agathocles> pthread_mutex_unlock(&lock);

Agathocles> // Second pass
Agathocles> pthread_mutex_lock(&lock);
Agathocles> y = x + 1; // Y = 5 + 1; = 6
Agathocles> pthread_mutex_unlock(&lock);

Agathocles> ????

Hum, the big joke! My example was written too rapidly and I got confused by
that. I should have written:

int x = 5; // Global variables
int y = 0;

pthread_mutex_lock(&lock);
y=++x;
pthread_mutex_unlock(&lock);

I think the "natural" value of y when two threads have run this portion of
code is 'y == 7'. But I was wondering what guarantees 'y != 6', since the
'volatile' keyword is not used.

Answers to my post gave me the reason, as far as I understand them:

Anand> variables are not stored in registers across function calls.

Now, more questions arise :

- when must the 'volatile' keyword be used in multithreaded programming?

- (related to Java threads) does the JVM implement the Java 'volatile'
keyword, and how?

Thanks for all your past and future thoughtful responses.

David Butenhof

unread,
Dec 11, 2001, 7:41:02 AM12/11/01
to
Pierre Vigneras wrote:

> Hum, the big joke ! My example was written too rapidly and i'm confused
> about that. I should have writen :
>
> int x = 5; // Global variables
> int y = 0;
>
> pthread_mutex_lock(&lock);
> y=++x;
> pthread_mutex_unlock(&lock);
>
> I think, the "natural" value of y when two threads has runned this portion
> of code is 'y == 7'. But i was wondering why 'y != 6' since the 'volatile'
> keyword is not used.

Because both threads use a mutex, they are guaranteed mutually exclusive
access and a consistent view of memory. The first will see x == 5 and leave
y == 6; the second will see x == 6 and leave y == 7.

> Answers to my post give me the reason as far as i understand them :
>
> Anand> variables are not stored in registers across function calls.
>
> Now, more questions arise :
>
> - when the 'volatile' keyword must be used in multithreaded
> programming?

Never in PORTABLE threaded programs. The semantics of the C and C++
"volatile" keyword are too loose, and insufficient, to have any particular
value with threads. You don't need it if you're using portable
synchronization (like a POSIX mutex or semaphore) because the semantics of
the synchronization object provide the consistency you need between threads.

The only use for "volatile" is in certain non-portable "optimizations" to
synchronize at (possibly) lower cost in certain specialized circumstances.
That depends on knowing and understanding the specific semantics of
"volatile" under your particular compiler, and what other machine-specific
steps you might need to take. (For example, using "memory barrier" builtins
or assembly code.)

In general, you're best sticking with POSIX synchronization, in which case
you've got no use at all for "volatile". That is, unless you have some
existing use for the feature having nothing to do with threads, such as to
allow access to a variable after longjmp(), or in an asynchronous signal
handler, or when accessing hardware device registers.

> - (related to Java thread) does JVM implements the Java 'volatile'
> keyword and how ?

Which JVM? Which definition of "volatile"? In general, the JVM just needs
to manage its value caching and use of memory barriers to meet the defined
semantics. There, wasn't that easy? ;-)

/------------------[ David.B...@compaq.com ]------------------\
| Compaq Computer Corporation POSIX Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----[ http://home.earthlink.net/~anneart/family/dave.html ]-----/

Konrad Schwarz

unread,
Dec 11, 2001, 10:52:37 AM12/11/01
to
David Butenhof schrieb:

> > - when the 'volatile' keyword must be used in multithreaded
> > programming?
>
> Never in PORTABLE threaded programs. The semantics of the C and C++
> "volatile" keyword are too loose, and insufficient, to have any particular
> value with threads. You don't need it if you're using portable
> synchronization (like a POSIX mutex or semaphore) because the semantics of
> the synchronization object provide the consistency you need between threads.

Where exactly does POSIX specify this? Or is it by omission?

Brooks Johnson

unread,
Dec 11, 2001, 1:58:16 PM12/11/01
to
Konrad Schwarz <konradDO...@mchpDOTsiemens.de> wrote in message news:<3C162BC5...@mchpDOTsiemens.de>...

It is the C language standard that specifies the requirements of the
"volatile" keyword. IIRC, POSIX does not appear to "add anything" to
the requirements.

Refer to Kaz Kylheku's most informative newsgroup post with
Message-ID: <slrn96bsf...@ashi.FootPrints.net>

I cut and paste a small number of lines from his post (that are very
relevant to your question):
-----------------------------------------------------
There are only two places in the standard C language where the use of
volatile is required of the programmer:

--- When an auto object is accessed after a longjmp, and that object
has been modified since the context had been saved with setjmp, the
object must be declared volatile.

--- A signal handler may store a value to a static object of type
volatile
sig_atomic_t.
-----------------------------------------------------

Dave Butenhof notes pretty much the same points in his post that you
responded to.

Which brings up a question. How does the C compiler ensure that access
to this "static object of type volatile sig_atomic_t" is
reentrant-safe? The signal handler may be trying to store a value in
it at the same time the non-signal handling portion of the program may
be trying to access it - right? Is this handled by an ISO C-compliant
compiler on all platforms, or would this be a case where a programmer
would need to apply what Butenhof writes: "That depends on knowing and


understanding the specific semantics of 'volatile' under your
particular compiler, and what other machine-specific steps you might
need to take. (For example, using 'memory barrier' builtins or

assembly code.)" Wow - all this for a program that is single-threaded
but has reentrancy.

--Brooks

Kaz Kylheku

unread,
Dec 11, 2001, 3:19:04 PM12/11/01
to
In article <186775ea.01121...@posting.google.com>, Brooks

Johnson wrote:
>Which brings up a question. How does the C compiler ensure that access
>to this "static object of type volatile sig_atomic_t" is
>reentrant-safe?

However the implementors want. For example, they can generate code
which disables interrupts around all accesses to the object, if that's
what makes sense on the given platform. Or rely on the hardware to make
accesses atomic to that datatype.

Agathocles, Tyrant of Syracuse

unread,
Dec 11, 2001, 10:24:52 AM12/11/01
to
On 11 Dec 2001 10:21:30 +0100, vign...@labri.fr (Pierre Vigneras) wrote:
> Now, more questions arise :
>
> - when the 'volatile' keyword must be used in multithreaded programming ?
"Volatile" has nothing to do with multi-threaded programming. It's used to mark
variables that are used in such a context that, if unmarked, could be optimized out of
existence by the compiler. But, imagine you have a var that is actually modified by
something from outside of your program, maybe in Bios something, or whatever. Imagine
that your code only reads this var (but it is written to from the outside), so the
compiler will say, well, it's only ever read, so why the hell have this variable? I'll
get rid of it to optimize the code. But you don't want that, coz it's actually written
to, though not by your code, and so your compiler doesn't know the whole picture. So you
say it's "volatile" and then the compiler knows to leave it alone. It has
nothing to do with MT.

Good luck.

Alexander Terekhov

unread,
Dec 12, 2001, 4:11:30 AM12/12/01
to

David Butenhof wrote:
[...]

> > - (related to Java thread) does JVM implements the Java 'volatile'
> > keyword and how ?
>
> Which JVM? Which definition of "volatile"?

http://jcp.org/jsr/detail/133.jsp
http://www.cs.umd.edu/~pugh/java/memoryModel/semantics.pdf

Informal:

"When a thread T1 reads a volatile field v that was
previously written by a thread T2, all actions that
were visible to T2 at the time T2 wrote to v be-
come visible to T1. This is a strengthening of volatile
over the existing semantics. The existing semantics
make it very difficult to use volatile fields to com-
municate between threads, because you cannot use a
signal received via a read of a volatile field to guar-
antee that writes to non-volatile fields are visible.
With this change, many broken synchronization id-
ioms (e.g., double-checked locking [Pug00a]) can be
fixed by declaring a single field volatile."

regards,
alexander.

Pierre Vigneras

unread,
Dec 12, 2001, 4:20:19 AM12/12/01
to
>>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com> writes:

Agathocles> On 11 Dec 2001 10:21:30 +0100, vign...@labri.fr (Pierre


Agathocles> Vigneras) wrote:
>> Now, more questions arise :
>>
>> - when the 'volatile' keyword must be used in multithreaded programming
>> ?

Agathocles> "Volatile" got nothing to do with multi-threaded programming.

I disagree with this assertion: in Java, the keyword 'volatile' is used
exclusively in multithreaded programming, and its semantics are similar to the
ANSI C ones (as far as I understand them; correct me - and excuse my
ignorance - if I'm wrong):

in C, it prevents the compiler from optimizing accesses to a variable declared
'volatile' (essentially preventing the compiler from caching the variable's
contents in registers).

in Java, it means that all accesses to 'volatile' variables must be done from
the global memory instead of the thread's local memory.


Agathocles> It's used to mark variables that are used in such a context
Agathocles> that, if unmarked, could be optimized out of existence by the
Agathocles> compiler. But, imagine you have a var that is actually modified
Agathocles> by something from outside of your program, maybe in Bios
Agathocles> something, or whatever. Imagine that your code only reads this
Agathocles> var (but it is written to from the outside), so the compiler
Agathocles> will say, well, it's only ever read, so why the hell have this
Agathocles> variable? I'll get rid of it to optimize the code. But you
Agathocles> don't want that, coz it's actually written to, though not by
Agathocles> your code, and so your compiler doesn't know the whole
Agathocles> picture. So you say it's "volatile" and then the compiler knows
Agathocles> to leave it alone. Got nothing to do with MT.

Can't we consider another thread in the process writing a global variable as
"something from outside of your program" modifying the variable ?

Consider the code :

int stop = 0; // Global var

// global variable readers thread code
while (true) {

    pthread_mutex_lock(&mutex);

    if (stop) { // Means this reader thread must stop its execution !
        pthread_mutex_unlock(&mutex);
        return;
    }

    pthread_mutex_unlock(&mutex);

    do_reader_work();
}

// Only one global variable writer !!

while (true) {

    do_writer_work();

    if (stop_everything()) {
        pthread_mutex_lock(&mutex);
        stop = 1;
        pthread_mutex_unlock(&mutex);
        return;
    }
}


Can't the above code be replaced by the following one?


volatile int stop = 0; // VOLATILE global var !

// readers thread code

while (true) {

    // This mutex is unnecessary since stop is VOLATILE !!
    // pthread_mutex_lock(&mutex);

    if (stop) { // Means this reader thread must stop its execution !
        // pthread_mutex_unlock(&mutex);
        return;
    }

    // pthread_mutex_unlock(&mutex);

    computeSomething();
}

// Only one writer !!

while (true) {

    do_writer_work();

    if (stop_everything()) {
        // stop is volatile, don't need a critical section !
        // pthread_mutex_lock(&mutex);
        stop = 1;
        // pthread_mutex_unlock(&mutex);
        return;
    }
}


In this situation, I think forcing all readers to take a lock just to see the
value of the global variable 'stop', which is updated atomically by only one
writer (supposing 'int' reads and writes are atomic), is inefficient compared
to the "volatile solution", where readers and the writer access the global
variable at maximum hardware speed (maybe some memory architectures allow
multiple simultaneous reads of the same value?).

So, I want to know whether the 'volatile' keyword can be used in POSIX
multithreaded programs to avoid the overhead of mutex locks in some
situations, as in Java multithreaded programming? Or whether, on the contrary,
there is absolutely and definitely no use for the 'volatile' keyword in POSIX
multithreaded programming?

Konrad Schwarz

unread,
Dec 12, 2001, 7:53:59 AM12/12/01
to
Brooks Johnson schrieb:

>
> Konrad Schwarz <konradDO...@mchpDOTsiemens.de> wrote in message news:<3C162BC5...@mchpDOTsiemens.de>...
> > David Butenhof schrieb:
> > > > - when the 'volatile' keyword must be used in multithreaded
> > > > programming?
> > >
> > Where exactly does POSIX specify this? Or is it by omission?
>
> It is the C language standard that specifies the requirements of the
> "volatile" keyword. IIRC, POSIX does not appear to "add anything" to
> the requirements.
[...]
This doesn't answer my question exactly. I'd like to have a statement
to the effect of either:

see page so and so of thus and thus

or

The POSIX standard never mentions volatile in connection with
shared variables, thus, it is not required.


My question is a language lawyer type question.

David Butenhof

unread,
Dec 12, 2001, 8:38:56 AM12/12/01
to
Pierre Vigneras wrote:

> So, i want to know if the 'volatile' keyword can be used in POSIX
> multithreaded programs to prevent the overhead of mutex locks in some
> situations as in Java multithreaded programming ? Or, if on the opposite,
> their is absolutely and definitely no use of the 'volatile' keyword in
> POSIX multithreaded programming ?

I'm tempted to ignore this question simply because the fact that you've
asked it implies you've been ignoring EVERYTHING written in this thread
(and many previous threads in this newsgroup) by anyone who knows what
they're talking about.

But, OK, one more time.

You may freely use volatile in POSIX multithreaded programs FOR THE
PURPOSES SPECIFIED IN THE ANSI C LANGUAGE STANDARD. Which have nothing to
do with threads.

You cannot, ever, in a "POSIX multithreaded program" use volatile as a
substitute for explicit synchronization. Note that the phrase "POSIX
multithreaded program" here can usefully be interpreted only as referring
to STRICTLY CONFORMING POSIX code guaranteed to be fully portable to all
conforming implementations of the POSIX standard. (Any other definition
requires reference to a specific combination of hardware, OS, and
compiler.) There may be some implementations where the compiler and/or
hardware present a specific EXTENSION of "volatile" that can, with
appropriate caution, be used in such a manner -- but such use is not
portable (or strictly speaking "correct") under either ANSI C or POSIX
rules.

You may of course freely use the volatile keyword in any context allowed by
ANSI C, as long as you ALSO apply the POSIX memory visibility rules and
synchronization to any shared use of the data. Which is to say, you can
slow down your code as much as you want by adding volatile, but you can't
depend on it doing anything else.

Java presents a perhaps subtly but substantially different definition of
"volatile" that IS relevant to threads. The original definition tried to
make it useful, but didn't quite get it right. A revision is under way to
fix the definition so that it can actually be used portably for the uses
commonly inferred from (but not supported by) the original definition. This
has nothing to do with C or POSIX threads, because Java is and depends on
neither of those standards.

Joe Seigh

unread,
Dec 12, 2001, 6:11:34 AM12/12/01
to

Pierre Vigneras wrote:
...


> So, i want to know if the 'volatile' keyword can be used in POSIX multithreaded
> programs to prevent the overhead of mutex locks in some situations as in Java
> multithreaded programming ? Or if, on the contrary, there is absolutely and
> definitely no use for the 'volatile' keyword in POSIX multithreaded programming
> ?
>

Volatile cannot be used in Java to avoid the need for locks. References to Java
volatile variables are only ordered with respect to other volatile variables,
and are not ordered with respect to non-volatile variables. At some point you are
going to have to take a lock to sync the memory visibility of your non-volatile
variables. Unless of course you declare all of your variables as volatile.

It's conjectured that volatile in Java was meant for doing i/o in embedded
systems. It's not useful for normal programming. If you are using volatile
and think it is necessary then it's highly likely that you are doing something
wrong.

Joe Seigh

Kaz Kylheku

Dec 12, 2001, 1:24:47 PM
In article <y7ulmg8...@jago.jodo.labri.u-bordeaux.fr>, Pierre

Vigneras wrote:
>So, i want to know if the 'volatile' keyword can be used in POSIX multithreaded
>programs to prevent the overhead of mutex locks in some situations as in Java
>multithreaded programming ? Or if, on the contrary, there is absolutely and
>definitely no use for the 'volatile' keyword in POSIX multithreaded programming
>?

In Java, you can assume that your code is running on a SPARC computer. That's
because Java is the invention of Sun Microsystems, and its virtual machine
model is closely modeled after their proprietary hardware.

The portability of Java rests in dressing up each target host to look
like the Java platform. It's not based on the idea of source-code
level portability. Java isn't really a language, but a platform-language
combination.

If the Java spec does in fact say that you can access objects without
synchronization, then the implementations have to ensure that.

POSIX is based on the C language, which unlike Java, is
platform-independent. There is no specific implied memory model,
and this specification weakness propagates into the library extensions
provided by POSIX. C and POSIX recognize hardware diversity to a far
greater extent.

Drazen Kacar

Dec 12, 2001, 1:48:44 PM
Kaz Kylheku wrote:

> In Java, you can assume that your code is running on a SPARC computer. That's
> because Java is the invention of Sun microsystems, and its virtual machine
> model is basically modeled closely after their proprietary hardware.

Can you give a few examples for that?

--
.-. .-. Unlike good wine, bullshit doesn't improve with age.
(_ \ / _) -- John McLean
| da...@willfork.com
|

Agathocles, Tyrant of Syracuse

Dec 12, 2001, 3:22:46 PM
On 12 Dec 2001 10:20:19 +0100, vign...@labri.fr (Pierre Vigneras) wrote:
> >>>>> "Agathocles" == Agathocles, Tyrant of Syracuse <nog...@all.com> writes:
>
> Agathocles> On 11 Dec 2001 10:21:30 +0100, vign...@labri.fr (Pierre
> Agathocles> Vigneras) wrote:
> >> Now, more questions arise :
> >>
> >> - when the 'volatile' keyword must be used in multithreaded programming
> >> ?
> Agathocles> "Volatile" got nothing to do with multi-threaded programming.
>
> i disagree with this assertion : in Java, the keyword 'volatile' is used exclusively
> in multithreaded programming, and its semantics are similar to the ANSI C ones (as
> far as i understand the semantics, correct me - and excuse my ignorance - if i'm
> wrong) :
I am only slightly familiar with Java and have never used it for anything serious, so
perhaps you're right and it's different there. I'm using C++ (and though I'm not
positive, I suspect it's the same in C). To tell you the truth, I've never used this
keyword. It's more for embedded systems, where you pick some fixed location that you
*know* will be modified by some driver or whatever, and you read that thing, but you
don't want the compiler to take that var and replace it with a constant just because
it's never changed by *your* code. I can't think of a reason why this keyword couldn't
be used in multi-threaded programs, but the semantics remain the same. I suspect you
don't really need this keyword and are mistakenly ascribing to it some unrelated
functionality that you need to solve your problem.

> in C, it prevents the compiler from optimizing access to the variable declared
> 'volatile' (essentially preventing the compiler from caching the variable's
> contents in registers).

That would have to be a famed "implementation detail" <g>. We don't worry about things
like that; all we care about is that the variable won't disappear, and that when we read
it, we get the real up-to-date value, not some accidental junk the compiler decided would
be just as good as the real var <g>.

> in Java, it means that all access to 'volatile' variables must be done from the
> global memory instead of the local thread memory.
>
>
> Agathocles> It's used to mark variables that are used in such a context
> Agathocles> that, if unmarked, could be optimized out of existence by the
> Agathocles> compiler. But, imagine you have a var that is actually modified
> Agathocles> by something from outside of your program, maybe in Bios
> Agathocles> something, or whatever. Imagine that your code only reads this
> Agathocles> var (but it is written to from the outside), so the compiler
> Agathocles> will say, well, it's only ever read, so why the hell have this
> Agathocles> variable? I'll get rid of it to optimize the code. But you
> Agathocles> don't want that, coz it's actually written to, though not by
> Agathocles> your code, and so your compiler doesn't know the whole
> Agathocles> picture. So you say it's "volatile" and then the compiler knows
> Agathocles> to leave it alone. Got nothing to do with MT.
>
> Can't we consider another thread in the process writing a global variable as
> "something from outside of your program" modifying the variable ?

No, because that's your code as well. The compiler doesn't care about threads one bit.
It'll simply compile all your functions (threaded or not) and give you an image, which
the OS will then run in threads according to your logic. Now, if a var at some known
address gets modified by a hardware driver (not part of your code), the compiler has no
way of figuring that out, so you need to give it a hint, which is "volatile".

> Consider the code :
That'll work. In fact, with that code (only one writer, and the reader(s) checking the
shared flag only once per pass through the repeated chunk of code) I don't think you need
any locking at all. The worst that can happen is that your lock-free reader gets the
terminate signal one scan later than it would with locks, which is probably acceptable
functionally. But, to return to your question: no, I don't think you can use "volatile"
to synchronize access to anything. That's just not what it does. I don't know about Java,
though (but I suspect the semantics of "volatile" are the same there too).

Brooks Johnson

Dec 13, 2001, 1:26:58 PM
Konrad Schwarz <konradDO...@mchpDOTsiemens.de> wrote in message news:<3C175367...@mchpDOTsiemens.de>...

O.K.
The latest POSIX standard, IEEE Std 1003.1-2001, was approved on
Dec. 6, 2001.

While the standard is not freely available, the final publicly
available draft (Draft 7) is available (if one registers) at
//www.opengroup.org.

I encourage you to access the final draft at the aforementioned URL.

So, let's look for occurrences of the word "volatile" when it is used
as a qualifier for a type in the C programming language.

In the section on the asynchronous i/o control block "struct aiocb",
one of the mandatory fields is "volatile void * aio_buf". Well, this
is specific to the domain of the aio facilities.

The next two (also the last) areas of reference to "volatile" merely
elaborate on what Butenhof and Kylheku wrote in their respective
posts:

i) longjmp()
pg. 1195

ii) asynchronous interrupts
In the specification for <signal.h>, the Base definitions volume
states that sig_atomic_t is defined as a typedef:

"sig_atomic_t: Possibly volatile qualified integer type of an object
that can be accessed as an atomic entity, even in the presence of
asynchronous interrupts"

Nothing in the above states anything about threads. By the way,
asynchronous signals are asynchronous interrupts.

On pg. 1840 and 1857, the sigaction() and signal() specs. require
defined behaviour for a signal handler "assigning a value to a static
storage duration variable of type volatile sig_atomic_t."

--Brooks
