volatile problem

j0mbolar

Sep 22, 2004, 8:58:04 AM

suppose I have

struct info {
char name[100];
double ssn;
long cc;
};

volatile struct info buyer;
struct info bought;

...

bought = buyer;

how can I work around this bug while having
volatile qualification but ensuring data integrity?

James Kuyper

Sep 22, 2004, 10:53:07 AM

What exactly is the "bug"?

j0mbolar

Sep 23, 2004, 12:59:17 AM

James Kuyper <kuy...@saicmodis.com> wrote in message news:<415191D3...@saicmodis.com>...

think about the value of the object being changed
unknowingly to the compiler while a copy is being made.

how do you ensure atomicity?

Brian Inglis

Sep 23, 2004, 1:47:45 AM

On 22 Sep 2004 21:59:17 -0700 in comp.std.c, j0mb...@engineer.com
(j0mbolar) wrote:

^^^^^^^^ program

>
>how do you ensure atomicity?

The only guarantee of atomicity in the standard is for access
(including modification) to an object defined as type volatile
sig_atomic_t. For larger objects, you can use that type to build
locks, semaphores, mutexes, or can use some other signalling or mutual
exclusion method which is hardware or OS dependent.
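A minimal sketch of the one guarantee described above, assuming a hosted implementation with <signal.h>. The names got_signal, on_signal and demo_flag are made up for illustration. The handler does nothing except store to a volatile sig_atomic_t object, which is the only kind of store the standard promises is safe from a handler.

```c
#include <assert.h>
#include <signal.h>

/* The flag type the standard guarantees can be accessed atomically
   with respect to signal delivery. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: nothing but a store to the flag. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Install the handler, send SIGINT to ourselves, restore the old handler.
   Returns 1 if the handler ran, 0 otherwise. */
static int demo_flag(void)
{
    void (*prev)(int) = signal(SIGINT, on_signal);
    assert(prev != SIG_ERR);

    got_signal = 0;
    raise(SIGINT);          /* raise() does not return until the handler has */
    signal(SIGINT, prev);

    return got_signal == 1;
}
```

Calling demo_flag() from main is enough to exercise it. Anything larger than the flag (such as the struct in the original question) still needs a lock or some platform-specific mechanism built around it.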

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Alexander Terekhov

Sep 23, 2004, 5:36:38 AM

Brian Inglis wrote:
[...]

> The only guarantee of atomicity in the standard is for access
> (including modification) to an object defined as type volatile
> sig_atomic_t.

It has really nothing to do with threads. Implementation can achieve
it by masking signals to ensure uninterruptible "atomic" access, for
example. Under POSIX/SUS, if you share modifiable sig_atomic_t stuff
across threads (whether it's static volatile or not doesn't really
matter at all), you must synchronize access... and you can't do that
in the async signal context unless you mask suspect signal(s) for the
duration of async-signal-UNsafe calls in all threads interruptible by
suspect signal(s). Alternatively, you might want to use SIGEV_THREAD
signal delivery mechanism that turns async signals into threads. You
won't have any problems with respect to async-signal-safety (pretty
busted concept given that neither errno nor thread-cancel state/mode
manipulation is async-signal-safe according to standard).
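A rough sketch of the "mask the suspect signal for the duration" idea, assuming a POSIX system and a single-threaded process (a threaded program would use pthread_sigmask, the per-thread counterpart). SIGUSR1 and the names with_usr1_blocked, critical_section and install_usr1_handler are illustrative choices, not taken from the post.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t usr1_seen = 0;
static int seen_inside = -1;    /* what the critical section observed */

static void on_usr1(int signo)
{
    (void)signo;
    usr1_seen = 1;
}

/* Install on_usr1 for SIGUSR1. Returns 0 on success. */
static int install_usr1_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Run critical() with SIGUSR1 blocked, then restore the previous mask.
   Returns 0 on success, -1 on failure. */
static int with_usr1_blocked(void (*critical)(void))
{
    sigset_t block, saved;

    assert(critical != NULL);
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    critical();     /* async-signal-unsafe work would go here */

    /* A pending SIGUSR1 is delivered before this call returns. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Example critical section: the signal raised here stays pending. */
static void critical_section(void)
{
    raise(SIGUSR1);
    seen_inside = usr1_seen;
}
```

After install_usr1_handler() and with_usr1_blocked(critical_section), seen_inside should still be 0 while usr1_seen is 1: the handler ran only once the mask was restored.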

regards,
alexander.

Douglas A. Gwyn

Sep 23, 2004, 5:41:35 AM

All that volatile qualification does is force the compiler
to emit code that directly accesses the so-qualified
object at its declared storage location rather than
sometimes using a copy of its contents in a cache at some
other location (e.g. in a fast register). Thus, in the
above code snippet, you are assured that "buyer" is read
from its storage location during evaluation of the r.h.s.
of that assignment. However, "bought" is not necessarily
written at its storage location during that assignment.

There is no issue of lack of "data integrity" within a
single-threaded program. The compiler is obliged to treat
all references as if they occur in strict accordance
with a virtual machine that operates according to a simple
model. So wherever the value of "bought" is cached, any
reference to "bought" will be to that cached value.

Data integrity issues (apart from correctness of the
program) arise only when there is access to some *shared*
resource by concurrent processes, e.g. multiple threads.
If you find yourself having to deal with such a situation,
carefully identify what data objects must be shared,
declare *all* of those objects with volatile qualification,
and surround *every* access to any of those objects with
some kind of exclusive-access interlock. (Typically, a
"mutex" coordinated with the thread scheduler.) You can
"batch" accesses to related objects under the span of a
single instance of an interlock. There are two main
things to watch out for: (1) don't lock out another
process for longer than necessary, and (2) during a lock,
don't do anything that might cause the thread to block on
some other lock (which could lead to "deadlock").
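The recipe above can be sketched as follows for the struct from the original question, assuming POSIX threads are available. The volatile qualifier is kept only because this post recommends it; whether it is needed alongside a mutex is disputed elsewhere in the thread. The names buyer_lock, buyer_set and buyer_snapshot are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct info {
    char   name[100];
    double ssn;
    long   cc;
};

/* Shared object, volatile-qualified as the post recommends. */
static volatile struct info buyer;

/* One lock guards every access to buyer. */
static pthread_mutex_t buyer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Replace the whole record while holding the lock. */
static void buyer_set(const struct info *src)
{
    assert(src != NULL);
    pthread_mutex_lock(&buyer_lock);
    buyer = *src;
    pthread_mutex_unlock(&buyer_lock);
}

/* Copy the whole record out while writers are excluded. */
static struct info buyer_snapshot(void)
{
    struct info copy;

    pthread_mutex_lock(&buyer_lock);
    copy = buyer;       /* all fields are "batched" under one lock */
    pthread_mutex_unlock(&buyer_lock);
    return copy;
}
```

With this shape, "bought = buyer;" becomes "bought = buyer_snapshot();", and the lock is held only for the duration of the copy, in line with point (1) above.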

Alexander Terekhov

Sep 23, 2004, 5:55:05 AM

"Douglas A. Gwyn" wrote:
[...]

> Data integrity issues (apart from correctness of the
> program) arise only when there is access to some *shared*
> resource by concurrent processes, e.g. multiple threads.
> If you find yourself having to deal with such a situation,
> carefully identify what data objects must be shared,
> declare *all* of those objects with volatile qualification,

Never do that. C/C++ volatile is brain-dead (and has nothing
to do with threads). And Java's (including MS.Net clone) isn't
much better (though it is relevant to threads).

regards,
alexander.

James Kuyper

Sep 23, 2004, 9:00:05 AM

In principle, qualifying an object's declaration with 'volatile' is
sufficient in itself to guarantee that the object's value is unreliable.

However, in practice a volatile qualification is applied to an object
only if there's a specific reason why the object may be expected to be
externally modified. The appropriate method for safely using an object
like 'buyer' depends upon what that reason is. There's no portable way
to do it.

Douglas A. Gwyn

Sep 23, 2004, 3:26:51 PM

Alexander Terekhov wrote:

> "Douglas A. Gwyn" wrote:
>>Data integrity issues (apart from correctness of the
>>program) arise only when there is access to some *shared*
>>resource by concurrent processes, e.g. multiple threads.
>>If you find yourself having to deal with such a situation,
>>carefully identify what data objects must be shared,
>>declare *all* of those objects with volatile qualification,
> Never do that. C/C++ volatile is brain-dead (and has nothing
> to do with threads).

Wrong.

Douglas A. Gwyn

Sep 23, 2004, 3:33:29 PM

James Kuyper wrote:
> In principle, qualifying an object's declaration with 'volatile' is
> sufficient in itself to guarantee that the object's value is unreliable.

No! Volatile qualification merely disables caching of
the object's value in the generated code.

Perhaps you were misled by the requirement that access
of an object *that is modified by external influences*
(such as a directly addressable I/O register, or a
separate thread) needs to be accessed via a volatile-
qualified type in order to ensure that the actual
value at its location is used rather than a cached
value.

Or perhaps you understand this but just chose
unfortunate wording to express it.

Eric Sosman

Sep 23, 2004, 6:39:26 PM

Douglas A. Gwyn wrote:

"Brain-dead" is too strong, to be sure. However, Alexander
is correct when he says that `volatile' has nothing to do with
multi-threading. A `volatile' qualification is neither sufficient
nor necessary as a means of ensuring integrity of shared data.
The topic arises on comp.programming.threads about as often as
`void main()' gets re-hashed on the C newsgroups; you may wish
to browse the archives for an interminable overview of the issue.

--
Eric....@sun.com

Jack Klein

Sep 24, 2004, 12:33:03 AM

On Thu, 23 Sep 2004 11:36:38 +0200, Alexander Terekhov
<tere...@web.de> wrote in comp.std.c:

You are so very, very wrong, because even though you say it has
nothing really to do with threads, the issues you talk about only
apply to threads, or at least areas of memory that are volatile
because they might be modified by other code under the control of a
common operating system.

In many cases a volatile object is actually a memory-mapped hardware
device, and there is absolutely nothing you can do in software to
"ensure" atomic access if the device is capable of changing its
internal registers asynchronously to your access.

In other cases the memory might be accessed by a hardware device such
as a DMA controller. If it has higher priority in hardware than does
the processor executing your code, it could modify the memory in the
middle of your access.

So there is absolutely no guarantee that any software can provide that
a volatile object wider than a single bus access can be read
atomically.
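A sketch of one common workaround for a narrow special case, not a counterexample to the point above: a free-running 64-bit counter exposed as two 32-bit registers, where the high half changes only when the low half wraps. The read is still not atomic; the code merely detects a torn read and retries. The register layout and the name read_counter64 are assumptions made for illustration, and the trick does not generalize to registers that change arbitrarily.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit counter split across two 32-bit registers.
   Reads high, low, high again and retries if the high half moved,
   since the low half may have wrapped between the first two reads. */
static uint64_t read_counter64(const volatile uint32_t *hi_reg,
                               const volatile uint32_t *lo_reg)
{
    uint32_t hi, lo, hi_again;

    assert(hi_reg != NULL && lo_reg != NULL);
    do {
        hi       = *hi_reg;
        lo       = *lo_reg;
        hi_again = *hi_reg;
    } while (hi != hi_again);

    return ((uint64_t)hi << 32) | lo;
}
```

On real hardware the two pointers would be fixed addresses taken from the device's data sheet; in a test they can simply point at ordinary variables.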

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Alexander Terekhov

Sep 24, 2004, 5:04:19 AM

Jack Klein wrote: ...

http://groups.google.com/groups?selm=3F054950.F505F41B%40web.de
http://groups.google.com/groups?selm=414E9E40.A66D4F48%40web.de

regards,
alexander.

Alexander Terekhov

Sep 24, 2004, 5:30:14 AM

Eric Sosman wrote:
[...]

> "Brain-dead" is too strong, to be sure.

Once you contemplate atomic<> and something ala <iohw.h>/<hardware>
TR stuff (both with msync's portable barriers and memory isolation
guarantees in the language), real exceptions, and threads to process
async. signals in the unrestricted context, you'll see that volatiles
are totally brain-dead and shall better be deprecated ASAP (together
with setjmp/longjmp and sig_atomic_t silliness).

regards,
alexander.

Dan Pop

Sep 24, 2004, 7:03:34 AM

In <4153509E...@sun.com> Eric Sosman <eric....@sun.com> writes:

>Douglas A. Gwyn wrote:
>
>> Alexander Terekhov wrote:
>>
>>> "Douglas A. Gwyn" wrote:
>>>
>>>> Data integrity issues (apart from correctness of the
>>>> program) arise only when there is access to some *shared*
>>>> resource by concurrent processes, e.g. multiple threads.
>>>> If you find yourself having to deal with such a situation,
>>>> carefully identify what data objects must be shared,
>>>> declare *all* of those objects with volatile qualification,
>>>
>>> Never do that. C/C++ volatile is brain-dead (and has nothing
>>> to do with threads).
>>
>> Wrong.
>
> "Brain-dead" is too strong, to be sure. However, Alexander
>is correct when he says that `volatile' has nothing to do with
>multi-threading. A `volatile' qualification is neither sufficient
>nor necessary as a means of ensuring integrity of shared data.

It's certainly not sufficient, but it *is* necessary, to alert the
compiler that the value of the shared object might change behind its
back.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de
Currently looking for a job in the European Union

Wojtek Lerch

Sep 24, 2004, 9:41:28 AM

Since it's not sufficient, whether it's necessary or not depends on what
other mechanism you use that is sufficient. If, for instance, you use
POSIX mutexes to synchronise access, you don't need "volatile".

James Kuyper

Sep 24, 2004, 9:04:36 AM

Douglas A. Gwyn wrote:
> James Kuyper wrote:
>
>> In principle, qualifying an object's declaration with 'volatile' is
>> sufficient in itself to guarantee that the object's value is unreliable.
>
> No! Volatile qualification merely disables caching of
> the object's value in the generated code.

Volatile qualification alone is sufficient, according to 6.7.3p6, with no
additional requirements. Therefore, a conforming implementation of C
could put all objects defined as volatile, in one or more memory
segments that are mapped to devices or shared with other programs.

I'm well aware of the fact that this is not the way any real compiler
does it; they would drive away customers if they did. It is not the
intent of the clause. That's why I said "in principle". But that is what
it says.

> Perhaps you were misled by the requirement that access
> of an object *that is modified by external influences*
> (such as a directly addressable I/O register, or a
> separate thread) needs to be accessed via a volatile-
> qualified type in order to ensure that the actual
> value at its location is used rather than a cached
> value.

It would have been nice if the standard had chosen to describe it in
that fashion, or something equivalent. For instance, if 6.7.3p6 had
started out with the words "An object shall be declared volatile if it
may be modified ...". However, it doesn't, so the unpredictable
modification becomes a permitted consequence of declaring something
volatile, rather than being a reason why it should be declared volatile.

> Or perhaps you understand this but just chose
> unfortunate wording to express it.

I chose correct wording to express an unfortunate implication of the way
the standard describes things.

Dan Pop

Sep 24, 2004, 10:04:15 AM

Why? How are the POSIX mutexes telling to the C compiler that the value
of the shared object might change behind its back?

Alexander Terekhov

Sep 24, 2004, 10:27:18 AM

Dan Pop wrote:
[...]

> How are the POSIX mutexes telling to the C compiler that the value
> of the shared object might change behind its back?

They tell nothing to the C compiler. It's what makes the compiler
POSIX, not C. XBD 4.10.

regards,
alexander.

Wojtek Lerch

Sep 24, 2004, 12:09:39 PM

Dan Pop wrote:
> In <2rimg9F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>Since it's not sufficient, whether it's necessary or not depends on what
>>other mechanism you use that is sufficient. If, for instance, you use
>>POSIX mutexes to synchronise access, you don't need "volatile".

> Why?

Because POSIX says so. Do *you* really use "volatile" with POSIX mutexes?

> How are the POSIX mutexes telling to the C compiler that the value
> of the shared object might change behind its back?

When your thread owns the mutex, the compiler is free to assume that the
shared object does not change behind its back. It's the programmer's
responsibility to make sure that other threads don't break that
assumption by modifying the object without locking the mutex.

When your thread doesn't own the mutex, it's your responsibility to make
sure that it's not accessing the object. If it's not accessing the
object, the compiler has no reason to care about whether its value
changes behind its back or not.

The locking and unlocking is done by calling library functions. As long
as the compiler doesn't know if those functions access the shared
object, it must assume that they may, and therefore any cached value of
the shared object becomes invalid when you call the function that locks
the mutex, and any value you assign to the object must be actually
stored in the object before you call the function that unlocks the
mutex. In short, it just works!
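A small sketch of the pattern described above, assuming POSIX threads: the shared counter is deliberately not volatile, and every access happens between calls to the lock and unlock functions. The names shared_count, add_many and run_two_adders are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static long shared_count = 0;   /* deliberately NOT volatile */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: add 1 to the shared counter *arg times, under the lock. */
static void *add_many(void *arg)
{
    long n = *(const long *)arg;

    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&count_lock);
        shared_count++;
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Start two adder threads, wait for both, and return the final total. */
static long run_two_adders(long n)
{
    pthread_t a, b;
    long total;
    int rc;

    shared_count = 0;
    rc = pthread_create(&a, NULL, add_many, &n);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_many, &n);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);

    pthread_mutex_lock(&count_lock);
    total = shared_count;
    pthread_mutex_unlock(&count_lock);
    return total;
}
```

On a POSIX-conforming implementation run_two_adders(100000) should return 200000. The example illustrates the argument; it does not settle the question of what ISO C alone guarantees.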

Alexander Terekhov

Sep 24, 2004, 12:43:07 PM

Wojtek Lerch wrote:
[...]

> The locking and unlocking is done by calling library functions. As long
> as the compiler doesn't know if those functions access the shared
> object, it must assume that they may, and therefore any cached value of
> the shared object becomes invalid when you call the function that locks
> the mutex, and any value you assign to the object must be actually
> stored in the object before you call the function that unlocks the
> mutex. In short, it just works!

(quoting XBD 4.10 rationale)

Conforming applications may only use the functions listed to
synchronize threads of control with respect to memory access.
There are many other candidates for functions that might also be
used. Examples are: signal sending and reception, or pipe writing
and reading. In general, any function that allows one thread of
control to wait for an action caused by another thread of control
is a candidate. IEEE Std 1003.1-2001 does not require these
additional functions to synchronize memory access since this
would imply the following:

All these functions would have to be recognized by advanced
compilation systems so that memory operations and calls to
these functions are not reordered by optimization.

All these functions would potentially have to have memory
synchronization instructions added, depending on the particular
machine.

The additional functions complicate the model of how memory
is synchronized and make automatic data race detection
techniques impractical.

regards,
alexander.

Wojtek Lerch

Sep 24, 2004, 1:02:08 PM

Alexander Terekhov wrote:
> (quoting XBD 4.10 rationale)

If you were trying to make a point, I have to admit I completely missed it.

Alexander Terekhov

Sep 24, 2004, 1:17:43 PM

Your (rather popular) "just works" explanation is misleading.

regards,
alexander.

Wojtek Lerch

Sep 24, 2004, 1:28:14 PM

Alexander Terekhov wrote:
> Your (rather popular) "just works" explanation is misleading.

I guess it might not apply to some strange hardware, or some strange
compilers; but it does apply to a lot of typical hardware and a lot of
popular compilers, does it not?

Tzvetan Mikov

Sep 24, 2004, 7:46:08 PM

Wojtek Lerch <Wojt...@yahoo.ca> wrote in message news:<2rj3peF...@uni-berlin.de>...

>
> I guess it might not apply to some strange hardware, or some strange
> compilers; but it does apply to a lot of typical hardware and a lot of
> popular compilers, does it not?

My guess is it applies to all _useful_ compilers :-) This issue is
broader than Posix, since Posix is not the only multithreading API,
and in most cases the multithreading API in question itself is
implemented in C...

regards,
Tzvetan

David Hopwood

Sep 24, 2004, 8:44:02 PM

Dan Pop wrote:
> In <2rimg9F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>Dan Pop wrote:
>>>In <4153509E...@sun.com> Eric Sosman <eric....@sun.com> writes:
>>>
>>>> "Brain-dead" is too strong, to be sure. However, Alexander
>>>>is correct when he says that `volatile' has nothing to do with
>>>>multi-threading. A `volatile' qualification is neither sufficient
>>>>nor necessary as a means of ensuring integrity of shared data.
>>>
>>>It's certainly not sufficient, but it *is* necessary, to alert the
>>>compiler that the value of the shared object might change behind its
>>>back.
>>
>>Since it's not sufficient, whether it's necessary or not depends on what
>>other mechanism you use that is sufficient. If, for instance, you use
>>POSIX mutexes to synchronise access, you don't need "volatile".
>
> Why? How are the POSIX mutexes telling to the C compiler that the value
> of the shared object might change behind its back?

Implementation detail. It just has to work. Use of volatile in the mutex
implementation is neither necessary nor sufficient for it to work. In fact,
the implementation has to depend on a knowledge of both the hardware's memory
access model, and the code generation of the system's POSIX-conforming C
compiler.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Douglas A. Gwyn

Sep 25, 2004, 1:18:53 AM

Eric Sosman wrote:
> "Brain-dead" is too strong, to be sure. However, Alexander
> is correct when he says that `volatile' has nothing to do with
> multi-threading. A `volatile' qualification is neither sufficient
> nor necessary as a means of ensuring integrity of shared data.
> The topic arises on comp.programming.threads about as often as
> `void main()' gets re-hashed on the C newsgroups; you may wish
> to browse the archives for an interminable overview of the issue.

And you may wish to read what I said upthread.

Douglas A. Gwyn

Sep 25, 2004, 1:23:54 AM

Jack Klein wrote:
> So there is absolutely no guarantee that any software can provide that
> a volatile object wider than a single bus access can be read
> atomically.

One also needs mutexes or other primitives for protecting
conditional critical regions. Generally speaking, more
than a single object needs to be updated "atomically"
within a c.c.r. For example, a balanced tree or a
linked list structure may need to be updated, and in many
cases it simply isn't feasible to do that reliably when
multiple concurrent threads are accessing the data
structure with at most local interlocking.

Douglas A. Gwyn

Sep 25, 2004, 1:25:11 AM

Wojtek Lerch wrote:
> ... If, for instance, you use

> POSIX mutexes to synchronise access, you don't need "volatile".

Yes, you do.

Douglas A. Gwyn

Sep 25, 2004, 1:26:15 AM

Alexander Terekhov wrote:
> They tell nothing to the C compiler. It's what makes the compiler
> POSIX, not C. XBD 4.10.

POSIX is not a compiler specification.

Douglas A. Gwyn

Sep 25, 2004, 1:42:56 AM

James Kuyper wrote:
> ... It is not the intent of the clause. That's why I said

> "in principle". But that is what it says.

No, in fact what you claimed doesn't even make sense
in the context of pointer-to-volatile. The operative
requirements are: (1) reference to a volatile-qualified
object shall be done strictly according to the abstract
machine; (2) modifications to the value of the object
are flushed to the object storage by the next sequence
point. It is also stated that the implementation does
not know what other factors might also be modifying an
object declared with volatile qualification, which is
in fact why it is necessary for the programmer to be
able to instruct the compiler to follow the above rules
when accessing such an object.

It should be quite evident that without the compiler
following such procedures for an object, the value of
the object cannot be properly synchronized across
multiple threads. At least, not without time-slicing
the thread executions and at every thread switch
flushing all registers to associated storage (which is
not generally feasible), or else forcing the compiler
to treat every object in accordance with the above
rules (which would force a terrible performance hit).
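To illustrate rule (1) above, here is a sketch of a polling loop over a status word, assuming the word may be changed by something outside the program (a device or another agent). Because the pointer is to a volatile-qualified type, each iteration performs a fresh read instead of reusing an earlier value. The name wait_for_bits and the bit layout are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Poll *status until any bit in mask is set, giving up after max_polls
   reads. Returns true if a masked bit was seen, false on timeout. */
static bool wait_for_bits(const volatile uint32_t *status, uint32_t mask,
                          unsigned max_polls)
{
    assert(status != NULL);
    for (unsigned i = 0; i < max_polls; i++) {
        if (*status & mask)     /* volatile access: re-read every time */
            return true;
    }
    return false;
}
```

This shows only that each access is actually performed; it says nothing about atomicity of wider objects or about ordering relative to other memory, which is what the rest of the thread argues over.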

David Hopwood

Sep 25, 2004, 3:21:17 AM

Yes it is.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Alexander Terekhov

Sep 25, 2004, 8:42:13 AM

http://groups.google.com/groups?selm=40505ed6%40usenet01.boi.hp.com
(Subject: Re: Harbison says "volatile" necessary for MT programming!)

regards,
alexander.

James Kuyper

Sep 25, 2004, 11:34:57 PM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<H6udnVlrdPH...@comcast.com>...

> James Kuyper wrote:
>
>> ... It is not the intent of the clause. That's why I said "in
>> principle". But that is what it says.
>
> No, in fact what you claimed doesn't even make sense

That's what I'm pointing out: 6.7.3p6 doesn't make sense.

> in the context of pointer-to-volatile. The operative
> requirements are: (1) reference to a volatile-qualified
> object shall be done strictly according to the abstract
> machine; (2) modifications to the value of the object
> are flushed to the object storage by the next sequence
> point. It is also stated that the implementation does
> not know what other factors might also be modifying an
> object declared with volatile qualification, which is
> in fact why it is necessary for the programmer to be
> able to instruct the compiler to follow the above rules
> when accessing such an object.

It would be nice if the standard said that, but it doesn't. It says
nothing about volatile being necessary if the object is subject to
external modification. It says that volatile qualification is
sufficient to allow external modification.

Douglas A. Gwyn

Sep 26, 2004, 2:22:45 PM

James Kuyper wrote:
> It would be nice if the standard said that, but it doesn't.

Sure it does. If you're having trouble reading it,
submit a DR and we'll explain it to you officially.

Alexander Terekhov

Sep 27, 2004, 5:42:01 AM

Save your time. Volatiles are brain-dead. What's needed is atomic<T>
with explicit store/load/read-modify-write calls together with msync::
stuff to label atomic<T> accesses with unidirectional reordering
constraints if/when they're needed (constraining both "compiler" and
"hardware") plus injection of bidirectional fences (e.g. slfence).

See http://groups.google.com/groups?selm=414E9E40.A66D4F48%40web.de

Uhmm,

< Forward Inline >

-------- Original Message --------
Message-ID: <4157CFA9...@web.de>
Newsgroups: comp.lang.c++.moderated
Subject: Re: Possible solution to the DCL problem (Scott Meyers, Andrei Alexandrescu)
References: ... <4fb4137d.04092...@posting.google.com>

johnchx wrote:
[...]
> You may be thinking of the Intel-style fence instructions (LFENCE,
> SFENCE, MFENCE), which *do* impose this kind of ordering.

I doubt that it's really needed, but OK... sfence/lfence has been
added. ;-)

msync::none // nothing (e.g. for refcount<T, basic>::increment)
msync::fence // classic fence (acq+rel -- see below)
msync::acq // classic acquire (hlb+hsb -- see below)
msync::ddacq // acquire with data dependency
msync::ccacq // acquire with control dependency
msync::hlb // hoist-load barrier -- acquire not affecting stores
msync::ddhlb // ...
msync::cchlb // ...
msync::hsb // hoist-store barrier -- acquire not affecting loads
msync::ddhsb // ...
msync::cchsb // ...
msync::rel // classic release (slb+ssb -- see below)
msync::slb // sink-load barrier -- release not affecting stores
msync::ssb // sink-store barrier -- release not affecting loads
msync::slfence // store-load fence (ssb+hlb -- see above)
msync::sfence // store-fence (ssb+hsb -- see above)
msync::lfence // load-fence (slb+hlb -- see above)

Note that unidirectional stuff is only useful in conjunction with
some atomic<> access to "label" it. I mean:

atomic<int> X;

/* ... */
int x = X.load(msync::acq);

/* ... */
X.store(x, msync::rel);

Compare it to use of bidirectional fences... something like

atomic<int> X;
atomic<int> Y;

/* ... */
X.store(0, msync::rel);
barrier(msync::slfence);
int y = Y.load(msync::acq);

Some context can be found here:

http://groups.google.com/groups?selm=40CD5709.D487D33A%40web.de
(Subject: Re: What does Memory Barriers mean ??)
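For comparison, C11 later standardized <stdatomic.h>, which offers labelled loads and stores in a similar spirit. The sketch below treats msync::acq and msync::rel as roughly memory_order_acquire and memory_order_release, and a seq_cst fence as the nearest stand-in for a bidirectional barrier; that mapping is an assumption, not something stated in the post. The names payload, ready, publish, try_consume and full_fence are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;         /* plain data, published through the flag */
static atomic_bool ready;   /* static storage: starts out false */

/* Writer: store the data, then release-store the flag so the data
   write cannot be reordered after it. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: acquire-load the flag; if it is set, the payload written
   before the matching release store is visible. */
static bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}

/* Standalone bidirectional fence, not tied to any particular object. */
static void full_fence(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}
```

One thread would call publish() and another would poll try_consume(); a single-threaded call sequence is enough to check the basic behaviour but not the ordering guarantees.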

regards,
alexander.

Dan Pop

Sep 27, 2004, 9:38:44 AM

In <2riv63F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:

>Dan Pop wrote:
>> In <2rimg9F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>>Since it's not sufficient, whether it's necessary or not depends on what
>>>other mechanism you use that is sufficient. If, for instance, you use
>>>POSIX mutexes to synchronise access, you don't need "volatile".
>
>> Why?
>
>Because POSIX says so. Do *you* really use "volatile" with POSIX mutexes?

Not with the mutexes themselves, with the shared objects protected by
them.

>> How are the POSIX mutexes telling to the C compiler that the value
>> of the shared object might change behind its back?
>
>When your thread owns the mutex, the compiler is free to assume that the
>shared object does not change behind its back.

How does the compiler know which shared object is protected by which
mutex? The relationship is not exposed to the compiler in any way...

Alexander Terekhov

Sep 27, 2004, 10:09:45 AM

Dan Pop wrote:
[...]

> >Because POSIX says so. Do *you* really use "volatile" with POSIX mutexes?
>
> Not with the mutexes themselves, with the shared objects protected by
> them.

CLM.

>
> >> How are the POSIX mutexes telling to the C compiler that the value
> >> of the shared object might change behind its back?
> >
> >When your thread owns the mutex, the compiler is free to assume that the
> >shared object does not change behind its back.
>
> How does the compiler know which shared object is protected by which
> mutex? The relationship is not exposed to the compiler in any way...

POSIX mutexes (and alike stuff) are used to pass around the entire
view of shared space from releaser to acquirer. The so-called entry
release consistency (neither POSIX nor Java are fit for it, so to
say) is more restrictive in this respect.

regards,
alexander.

Richard Bos

Sep 27, 2004, 10:59:46 AM

Alexander Terekhov <tere...@web.de> wrote:

> "Douglas A. Gwyn" wrote:
> >
> > James Kuyper wrote:
> > > It would be nice if the standard said that, but it doesn't.
> >
> > Sure it does. If you're having trouble reading it,
> > submit a DR and we'll explain it to you officially
>
> Save your time. Volatiles are brain-dead. What's needed is atomic<T>
> with explicit store/load/read-modify-write calls together with msync::

What is most definitely _not_ needed in C is all kinds of C++-style
butchered OO programming syntax shoehorned into the language.

Richard

Alexander Terekhov

Sep 27, 2004, 11:15:21 AM

Richard Bos wrote:
[...]

> What is most definitely _not_ needed in C is all kinds of C++-style
> butchered OO programming syntax shoehorned into the language.

Get a clue. GP, not OO.

regards,
alexander.

Wojtek Lerch

Sep 27, 2004, 12:15:38 PM

Dan Pop wrote:
> In <2riv63F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>Dan Pop wrote:
>>>How are the POSIX mutexes telling to the C compiler that the value
>>>of the shared object might change behind its back?
>>
>>When your thread owns the mutex, the compiler is free to assume that the
>>shared object does not change behind its back.
>
> How does the compiler know which shared object is protected by which
> mutex? The relationship is not exposed to the compiler in any way...

It doesn't need to. A compiler is *always* free to assume that a
non-volatile object doesn't change behind its back. The compiler
doesn't need to know which mutex protects the object. The compiler
doesn't need to know anything about mutexes.

As far as the compiler is concerned, you're calling some function that
may access the object, then you're doing some stuff that doesn't access
the object, and then you're calling some other function that may access
the object again. The compiler doesn't have to understand that the
first of those two functions unlocks a mutex and the second one locks
it. But it works, because the compiler has no reason to do anything
that depends on the assumption that if the object is read or modified
during that time, it happens inside those two function calls rather than
between them.

Yes, it is theoretically possible for a conforming compiler to make such
an assumption even though there's no reason to. A trivial example would
be a compiler that occasionally generates code that reads the value of
a randomly chosen static object, waits a little, reads it again, and
then aborts the program if the value turns out to have changed. But in
practice, it just works.

Wojtek Lerch

Sep 27, 2004, 1:30:05 PM

Tzvetan Mikov wrote:
> Wojtek Lerch <Wojt...@yahoo.ca> wrote in message news:<2rj3peF...@uni-berlin.de>...
>>I guess it might not apply to some strange hardware, or some strange
>>compilers; but it does apply to a lot of typical hardware and a lot of
>>popular compilers, does it not?
>
> My guess is it applies to all _useful_ compilers :-)

Imagine an implementation of C where optimisation and code generation
are done by the linker rather than the compiler proper, in order to
allow more global optimisations. Most library functions are implemented
in C and are subject to optimisation by the linker; a small number of
library functions are implemented as machine code and are opaque to the
optimiser.

For any global object, the optimiser checks if the object's address can
possibly be known to any of the opaque functions. If not, none of the
opaque functions can possibly access the object, and the linker can
perform optimizations based on that. As a result, object accesses and
function calls can often be reordered. This makes it impossible to
implement a mutex API solely as library functions.

I don't know of any such implementation; but if it existed, I don't
think it would necessarily be useless.

Dan Pop

Sep 27, 2004, 1:44:24 PM

In <2rqslbF...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:

>Dan Pop wrote:
>> In <2riv63F...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>>Dan Pop wrote:
>>>>How are the POSIX mutexes telling to the C compiler that the value
>>>>of the shared object might change behind its back?
>>>
>>>When your thread owns the mutex, the compiler is free to assume that the
>>>shared object does not change behind its back.
>>
>> How does the compiler know which shared object is protected by which
>> mutex? The relationship is not exposed to the compiler in any way...
>
>It doesn't need to. A compiler is *always* free to assume that a
>non-volatile object doesn't change behind its back. The compiler
>doesn't need to know which mutex protects the object. The compiler
>doesn't need to know anything about mutexes.
>
>As far as the compiler is concerned, you're calling some function that
>may access the object, then you're doing some stuff that doesn't access
>the object, and then you're calling some other function that may access
>the object again.

If the object doesn't have external linkage and its address is never
taken, the compiler can safely assume that the mutex-related functions
cannot access the shared object.

Wojtek Lerch

unread,

Sep 27, 2004, 2:35:08 PM9/27/04

to

Presumably, there is at least one extern function in the translation
unit that either accesses the object directly or calls some function
that accesses the object. Also, if there is another thread that can
access the object, the address of the function that that thread runs has
been given to a thread-creation function, and may be known to the
mutex-related functions. Sure, it may in some cases be possible for a
compiler to prove that none of those functions can be called by the
mutex-related function call without invoking undefined behaviour, but do
you know of any compilers that actually bother to try?...

Tzvetan Mikov

unread,

Sep 27, 2004, 9:14:19 PM9/27/04

to

Wojtek Lerch <Wojt...@yahoo.ca> wrote in message news:<2rr10uF...@uni-berlin.de>...

> Imagine an implementation of C where optimisation and code generation
> are done by the linker rather than the compiler proper, in order to
> allow more global optimisations.

At least VisualC and IntelC already do that (I am sure there are
others, but I haven't personally used them).

>
> For any global object, the optimiser checks if the object's address can
> possibly be known to any of the opaque functions. If not, none of the
> opaque functions can possibly access the object, and the linker can
> perform optimizations based on that. As a result, object accesses and
> function calls can often be reordered. This makes it impossible to
> implement a mutex API solely as library functions.
>
> I don't know of any such implementation; but if it existed, I don't
> think it would necessarily be useless.

But this is precisely my point: if a mutex cannot reasonably be
implemented as a library function, then such a compiler wouldn't be
very useful, would it ? Useful, of course, not in the absolute sense,
but in the sense that C (although not necessarily Standard C) is
usually used to implement most of the OS and libraries, not the other
way around.

regards,
Tzvetan

Douglas A. Gwyn

unread,

Sep 28, 2004, 5:41:11 AM9/28/04

to

Richard Bos wrote:
> What is most definitely _not_ needed in C is all kinds of C++-style
> butchered OO programming syntax shoehorned into the language.

Worse than that, the apparent semantics aren't
implementable with reasonable performance on many
platforms.

Douglas A. Gwyn

unread,

Sep 28, 2004, 5:42:55 AM9/28/04

to

Alexander Terekhov wrote:
> Save your time. Volatiles are brain-dead. What's needed is atomic<T>
> with explicit store/load/read-modify-write calls together with msync::
> stuff to label atomic<T> accesses with unidirectional reordering
> constraints if/when they're needed (constraining both "compiler" and
> "hardware") plus injection of bidirectional fences (e.g. slfence).

If volatile is "brain dead" then I suppose we'd have
to call the garbage you're advocating "retarded".

Alexander Terekhov

unread,

Sep 28, 2004, 6:00:07 AM9/28/04

to

Whatever. I, for one, am mildly amused at your profound ignorance of
MT memory models.

regards,
alexander.

Alexander Terekhov

unread,

Sep 28, 2004, 6:05:12 AM9/28/04

to

"Douglas A. Gwyn" wrote:

[... atomic<>/msync ...]

> apparent semantics aren't
> implementable with reasonable performance on many
> platforms.

Nonsense.

regards,
alexander.

Douglas A. Gwyn

unread,

Sep 28, 2004, 6:15:27 AM9/28/04

to

Alexander Terekhov wrote:
> Whatever. I, for one, am mildly amused at your profound ignorance of
> MT memory models.

From what information do you draw that conclusion?
I haven't been talking about "MT memory models".
I *have* been talking about the semantics and use
of volatile qualification in Standard C, to which
your sole contribution has been to keep repeating
that "volatile is brain-dead". Very sophisticated.

Dan Pop

unread,

Sep 28, 2004, 7:14:42 AM9/28/04

to

>mutex-related function call without invoking undefined behaviour, but do
                                                                   ^^^^^^
>you know of any compilers that actually bother to try?...
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Since when is such a question relevant in comp.std.c?

As far as the C standard is concerned, a shared object may be changed
behind the compiler's back, therefore it must be declared as volatile.

The alternative would be to add a new property to functions and declare
the mutex-related functions with this property, whose semantics would be:
this function can change any object of the program.

Tom Payne

unread,

Sep 28, 2004, 9:20:32 AM9/28/04

to

Hmmmmm. I think that somewhere in the Standard nonvolatile objects
are required to retain their values until the program modifies them.
So, if a nonvolatile object undergoes a spontaneous modification, the
implementation is out of conformance.

Tom Payne

Wojtek Lerch

unread,

Sep 28, 2004, 10:10:22 AM9/28/04

to

Dan Pop wrote:
> In <2rr4qtF...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>Dan Pop wrote:
>>>If the object doesn't have external linkage and its address is never
>>>taken, the compiler can safely assume that the mutex-related functions
>>>cannot access the shared object.
>>
>>Presumably, there is at least one extern function in the translation
>>unit that either accesses the object directly or calls some function
>>that accesses the object. Also, if there is another thread that can
>>access the object, the address of the function that that thread runs has
>>been given to a thread-creation function, and may be known to the
>>mutex-related functions. Sure, it may in some cases be possible for a
>>compiler to prove that none of those functions can be called by the
>>mutex-related function call without invoking undefined behaviour, but do
>
> ^^^^^^
>
>>you know of any compilers that actually bother to try?...
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Since when is such a question relevant in comp.std.c?

I'm not sure about comp.std.c, but it's relevant in a discussion about
POSIX mutexes. By demonstrating that protecting non-volatile objects
with mutexes "just works" with most compilers, it explains why POSIX
*requires* it to work. Even if it doesn't work with some existing
compilers, those compilers are simply unsuitable to implement the POSIX
semantics of threads and mutexes.

> As far as the C standard is concerned, a shared object may be changed
> behind the compiler's back, therefore it must be declared as volatile.

No; the C standard doesn't say it *must* be declared as volatile, only
that you get undefined behaviour if it's not. That's an important
difference. POSIX threads is an extension to standard C that defines
the behaviour, provided you follow the POSIX rules of how to protect
shared objects with mutexes.

> The alternative would be to add a new property to functions and declare
> the mutex-related functions with this property, whose semantics would be:
> this function can change any object of the program.

That's a possible way to change a compiler to make it suitable for
implementing the POSIX semantics, yes.

Wojtek Lerch

unread,

Sep 28, 2004, 11:26:06 AM9/28/04

to

Tzvetan Mikov wrote:
> But this is precisely my point: if a mutex cannot reasonably be
> implemented as a library function, then such a compiler wouldn't be
> very useful, would it ? Useful, of course, not in the absolute sense,
> but in the sense that C (although not necessarily Standard C) is
> usually used to implement most of the OS and libraries, not the other
> way around.

OK, if you define "useful" as "suitable for all the jobs a compiler can
possibly be used for", then I suppose I don't disagree. But in a more
conventional sense of "useful", this kind of a compiler could still be
quite useful in some environments. For instance, some people might want
to use it as a second compiler for an operating system that doesn't
support threads anyway -- perhaps because the compiler that came with
the OS is not as good at optimisation, or maybe is not quite conforming,
or lacks some useful extensions that the other compiler has, or whatever.

Keith Thompson

unread,

Sep 28, 2004, 1:27:21 PM9/28/04

to

Since I'm perfectly willing to admit my own profound ignorance in this
area, I'll ask: what is "MT"?

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Fred J. Tydeman

unread,

Sep 28, 2004, 7:33:11 PM9/28/04

to

Keith Thompson wrote:
>
> Since I'm perfectly willing to admit my own profound ignorance in this
> area, I'll ask: what is "MT"?

I assume Multi-Tasking
---
Fred J. Tydeman Tydeman Consulting
tyd...@tybor.com Programming, testing, numerics
+1 (775) 287-5904 Vice-chair of J11 (ANSI "C")
Sample C99+FPCE tests: ftp://jump.net/pub/tybor/
Savers sleep well, investors eat well, spenders work forever.

Brian Inglis

unread,

Sep 28, 2004, 8:23:08 PM9/28/04

to

On Tue, 28 Sep 2004 17:27:21 GMT in comp.std.c, Keith Thompson
<ks...@mib.org> wrote:

>Alexander Terekhov <tere...@web.de> writes:
>> "Douglas A. Gwyn" wrote:
>>> Alexander Terekhov wrote:
>>> > Save your time. Volatiles are brain-dead. What's needed is atomic<T>
>>> > with explicit store/load/read-modify-write calls together with msync::
>>> > stuff to label atomic<T> accesses with unidirectional reordering
>>> > constraints if/when they're needed (constraining both "compiler" and
>>> > "hardware") plus injection of bidirectional fences (e.g. slfence).
>>>
>>> If volatile is "brain dead" then I suppose we'd have
>>> to call the garbage you're advocating "retarded".
>>
>> Whatever. I, for one, am mildly amused at your profound ignorance of
>> MT memory models.
>
>Since I'm perfectly willing to admit my own profound ignorance in this
>area, I'll ask: what is "MT"?

I suspect it means Multi-Threaded, but the issue is actually MP
(Multi-Processor).

--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada

Brian....@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply

Douglas A. Gwyn

unread,

Sep 29, 2004, 5:33:56 AM9/29/04

to

Wojtek Lerch wrote:
> I'm not sure about comp.std.c, but it's relevant in a discussion about
> POSIX mutexes. By demonstrating that protecting non-volatile objects
> with mutexes "just works" with most compilers, it explains why POSIX
> *requires* it to work. Even if it doesn't work with some existing
> compilers, those compilers are simply unsuitable to implement the POSIX
> semantics of threads and mutexes.

It is absurd for POSIX to require that a compiler truly
support such a thoughtless approach to shared storage;
without the volatile qualification the compiler would
have to produce horrendously inefficient code on many
platforms, in effect treating every data reference as
volatile (thus invalidating any caching of values in
registers). I hope that you are mistaken about POSIX
having made an explicit decision to require compilers
to act that way; however, judging from the bizarre
notions expressed by several in this thread, if they
have influence in such matters then matters may well
be as bad as you imply.

I recall urging WG14 some time ago to tackle the
threads issue head-on, but the majority view seemed
to be to leave that up to POSIX. Perhaps that was a
big mistake.

Alexander Terekhov

unread,

Sep 29, 2004, 6:09:13 AM9/29/04

to

Brian Inglis wrote:
[...]

> I suspect it means Multi-Threaded, but the issue is actually MP
> (Multi-Processor).

Not really.

http://groups.google.com/groups?selm=4152B42E.2B271094%40web.d

regards,
alexander.

Alexander Terekhov

unread,

Sep 29, 2004, 6:16:38 AM9/29/04

to

"Douglas A. Gwyn" wrote:

[... profound ignorance of MT memory models ...]

http://groups.google.com/groups?selm=40E57DD8.1C8E2FC2%40web.de
http://groups.google.com/groups?selm=40F4F750.FE8BCD9B%40web.de
http://groups.google.com/groups?selm=40F647F8.F59FE644%40web.de

might help.

regards,
alexander.

Alexander Terekhov

unread,

Sep 29, 2004, 6:19:21 AM9/29/04

to

Alexander Terekhov wrote:
>
> Brian Inglis wrote:
> [...]
> > I suspect it means Multi-Threaded,

Yep. Mr. Terekhov's model aside for a moment. ;-)

> but the issue is actually MP
> > (Multi-Processor).
>
> Not really.
>
> http://groups.google.com/groups?selm=4152B42E.2B271094%40web.d

Err.

http://groups.google.com/groups?selm=4152B42E.2B271094%40web.de

regards,
alexander.

Francis Glassborow

unread,

Sep 29, 2004, 6:56:42 AM9/29/04

to

In article <hZednQ11jPr...@comcast.com>, Douglas A. Gwyn
<DAG...@null.net> writes

>I recall urging WG14 some time ago to tackle the
>threads issue head-on, but the majority view seemed
>to be to leave that up to POSIX. Perhaps that was a
>big mistake.

Perhaps it wasn't originally but continuing to ignore the issue might
be. Increasingly even desk-top machines have multiple processors and
multi-cored single CPUs. That means that multiple threads can actually
be executed in parallel. I think that requires core language support for
efficiency and portability.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects
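
For illustration only: ISO C did later gain core language support of roughly the kind discussed here, in C11's <stdatomic.h>, with explicit load/store calls that carry ordering constraints. The following is a minimal sketch assuming a C11 compiler; the names payload, ready, publish and try_consume are hypothetical and not taken from any post in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Ordinary, non-volatile data handed from one thread to another. */
static int payload;

/* Flag guarding the payload; static storage, so it starts out false. */
static atomic_bool ready;

/* Producer side: write the data, then publish it with release ordering,
   so the payload write cannot be reordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: the acquire load pairs with the release store above,
   so once the flag is seen as true the payload write is visible too. */
bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

In this sketch only the flag is atomic; the payload stays a plain int, and the ordering arguments on the flag accesses constrain both the compiler and the hardware.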

Alexander Terekhov

unread,

Sep 29, 2004, 7:49:28 AM9/29/04

to

Francis Glassborow wrote:
[...]

> Increasingly even desk-top machines have multiple processors and

Ha! Even set-top boxes...

http://www.gamepro.com/microsoft/xbox/hardware/news/35216.shtml

Sorta 6 way!

regards,
alexander.

Dan Pop

unread,

Sep 29, 2004, 8:22:05 AM9/29/04

to

In <2rt9mdF...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:

>Dan Pop wrote:
>> In <2rr4qtF...@uni-berlin.de> Wojtek Lerch <Wojt...@yahoo.ca> writes:
>>>Dan Pop wrote:
>>>>If the object doesn't have external linkage and its address is never
>>>>taken, the compiler can safely assume that the mutex-related functions
>>>>cannot access the shared object.
>>>
>>>Presumably, there is at least one extern function in the translation
>>>unit that either accesses the object directly or calls some function
>>>that accesses the object. Also, if there is another thread that can
>>>access the object, the address of the function that that thread runs has
>>>been given to a thread-creation function, and may be known to the
>>>mutex-related functions. Sure, it may in some cases be possible for a
>>>compiler to prove that none of those functions can be called by the
>>>mutex-related function call without invoking undefined behaviour, but do
>>
>> ^^^^^^
>>
>>>you know of any compilers that actually bother to try?...
>>
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Since when is such a question relevant in comp.std.c?
>
>I'm not sure about comp.std.c, but it's relevant in a discussion about
>POSIX mutexes. By demonstrating that protecting non-volatile objects
>with mutexes "just works" with most compilers, it explains why POSIX
>*requires* it to work.

Where does POSIX *require* it to work with non-volatile objects?

Alexander Terekhov

unread,

Sep 29, 2004, 9:28:33 AM9/29/04

to

Dan Pop wrote:
[...]
> Where does POSIX *requires* it to work with non-[... "Those
> That Shall Not Be Named" ...] objects?

POSIX (XBD 4.10) outlaws unsynchronized {read-}write access to
"memory locations" (whatever that is) and says that certain
functions "synchronize thread execution and also synchronize
memory with respect to other threads." POSIX rationale sorta
clarifies it a bit: "Formal definitions of the memory model
were rejected as unreadable by the vast majority of
programmers. In addition, most of the formal work in the
literature has concentrated on the memory as provided by the
hardware as opposed to the application programmer through the
compiler and runtime system. It was believed that a simple
statement intuitive to most programmers would be most
effective."

regards,
alexander.

Wojtek Lerch

unread,

Sep 29, 2004, 10:26:28 AM9/29/04

to

"Dan Pop" <Dan...@cern.ch> wrote in message
news:cje9dd$b00$1...@sunnews.cern.ch...

> Where does POSIX *require* it to work with non-volatile objects?

Um... I have to admit that I have trouble finding anything clearer than
this
(http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10):

| 4.10 Memory Synchronization
|
| Applications shall ensure that access to any memory location by more
| than one thread of control (threads or processes) is restricted such
| that no thread of control can read or modify a memory location while
| another thread of control may be modifying it. Such access is
| restricted using functions that synchronize thread execution and
| also synchronize memory with respect to other threads. The following
| functions synchronize memory with respect to other threads:
|
| <a list of functions including pthread_mutex_lock and
| pthread_mutex_unlock>

Here's how Dave Butenhof explained it in the comp.programming.threads FAQ
(http://www.lambdacs.com/cpt/FAQ.html#Q56):

| Q56: Why don't I need to declare shared variables VOLATILE?
|
|
| > I'm concerned, however, about cases where both the compiler and the
| > threads library fulfill their respective specifications. A conforming
| > C compiler can globally allocate some shared (nonvolatile) variable to
| > a register that gets saved and restored as the CPU gets passed from
| > thread to thread. Each thread will have its own private value for
| > this shared variable, which is not what we want from a shared
| > variable.
|
| In some sense this is true, if the compiler knows enough about the
| respective scopes of the variable and the pthread_cond_wait (or
| pthread_mutex_lock) functions. In practice, most compilers will not try
| to keep register copies of global data across a call to an external
| function, because it's too hard to know whether the routine might
| somehow have access to the address of the data.
|
| So yes, it's true that a compiler that conforms strictly (but very
| aggressively) to ANSI C might not work with multiple threads without
| volatile. But someone had better fix it. Because any SYSTEM (that is,
| pragmatically, a combination of kernel, libraries, and C compiler) that
| does not provide the POSIX memory coherency guarantees does not CONFORM
| to the POSIX standard. Period. The system CANNOT require you to use
| volatile on shared variables for correct behavior, because POSIX
| requires only that the POSIX synchronization functions are necessary.
|
| So if your program breaks because you didn't use volatile, that's a BUG.
| It may not be a bug in C, or a bug in the threads library, or a bug in
| the kernel. But it's a SYSTEM bug, and one or more of those components
| will have to work to fix it.
|
| You don't want to use volatile, because, on any system where it makes
| any difference, it will be vastly more expensive than a proper
| nonvolatile variable. (ANSI C requires "sequence points" for volatile
| variables at each expression, whereas POSIX requires them only at
| synchronization operations -- a compute-intensive threaded application
| will see substantially more memory activity using volatile, and, after
| all, it's the memory activity that really slows you down.)

Wojtek Lerch

unread,

Sep 29, 2004, 10:53:26 AM9/29/04

to

Douglas A. Gwyn wrote:
> Wojtek Lerch wrote:
>
>> I'm not sure about comp.std.c, but it's relevant in a discussion about
>> POSIX mutexes. By demonstrating that protecting non-volatile objects
>> with mutexes "just works" with most compilers, it explains why POSIX
>> *requires* it to work. Even if it doesn't work with some existing
>> compilers, those compilers are simply unsuitable to implement the
>> POSIX semantics of threads and mutexes.
>
>
> It is absurd for POSIX to require that a compiler truly
> support such a thoughtless approach to shared storage;
> without the volatile qualification the compiler would
> have to produce horrendously inefficient code on many
> platforms, in effect treating every data reference as
> volatile (thus invalidating any caching of values in

Why? All that POSIX requires is that for a small set of library
functions, any cached value must be flushed to memory before calling the
function, and is invalidated by the call. In most cases, the compiler
must do that anyway, because it can't prove that the function doesn't
access the object. How does that make the code less efficient compared
to declaring all the shared objects as volatile and flushing them to
memory at every sequence point?
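
The usage pattern being argued about can be sketched as follows. This is only an illustration, assuming a POSIX threads implementation; the names counter, counter_lock, worker, run_two_workers and ITERATIONS are hypothetical. The shared object has internal linkage and is deliberately not volatile, and every access to it sits between a lock and an unlock call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Shared between threads, internal linkage, NOT volatile-qualified. */
static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: counter is only touched while the mutex is held. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;  /* may be kept in a register within the critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts two workers, waits for both, and returns the final count.
   pthread_create and pthread_join are also on the POSIX list of
   functions that synchronize memory. */
long run_two_workers(void)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(a, NULL);
    assert(rc == 0);
    rc = pthread_join(b, NULL);
    assert(rc == 0);
    return counter;
}
```

On a system that provides the POSIX guarantees, run_two_workers() is expected to return 2 * ITERATIONS. Whether ISO C alone promises that is exactly the point in dispute.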

Dan Pop

unread,

Sep 29, 2004, 11:04:07 AM9/29/04

to

In <2rvv0mF...@uni-berlin.de> "Wojtek Lerch" <Wojt...@yahoo.ca> writes:

>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:cje9dd$b00$1...@sunnews.cern.ch...
>> Where does POSIX *require* it to work with non-volatile objects?
>

>Um... I have to admit that I have trouble finding anything clearer than
>this
>(http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10):
>
>| 4.10 Memory Synchronization
>|
>| Applications shall ensure that access to any memory location by more
>| than one thread of control (threads or processes) is restricted such
>| that no thread of control can read or modify a memory location while
>| another thread of control may be modifying it. Such access is
>| restricted using functions that synchronize thread execution and
>| also synchronize memory with respect to other threads. The following
>| functions synchronize memory with respect to other threads:
>|
>| <a list of functions including pthread_mutex_lock and
>| pthread_mutex_unlock>

I have a hard time seeing how a function call can force the compiler to
unconditionally dump *all* the data cached in registers (or, at least,
data that was modified since being last loaded from memory), but it
is obvious that this is what this paragraph requires.

I guess most real compilers do that anyway, whenever *any* function is
called, so implementing POSIX threads on them is no big deal. But very
smart compilers might need to be invoked in POSIX mode for this to work.

Wojtek Lerch

unread,

Sep 29, 2004, 11:29:35 AM9/29/04

to

Dan Pop wrote:
> I have a hard time seeing how a function call can force the compiler to
> unconditionally dump *all* the data cached in registers (or, at least,
> data that was modified since being last loaded from memory), but it
> is obvious that this is what this paragraph requires.
>
> I guess most real compilers do that anyway, whenever *any* function is
> called, so implementing POSIX threads on them is no big deal. But very
> smart compilers might need to be invoked in POSIX mode for this to work.

Or they could have a #pragma or some other extension allowing the POSIX
header to tell the compiler that the functions from the list have to be
treated specially.
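
One existing extension in this spirit is the GCC/Clang "memory clobber" idiom, sketched below. It is not ISO C and not something POSIX specifies; the macro name COMPILER_BARRIER and the function increment_across_barrier are hypothetical. It constrains only the compiler, not the hardware, so it is an illustration of the mechanism rather than a substitute for a mutex.

```c
#include <assert.h>
#include <stddef.h>

/* GCC/Clang extension (an assumption of this sketch): an empty asm
   statement that claims to read and write arbitrary memory.  The
   compiler has to complete pending stores before it and reload any
   values it was caching in registers after it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Increments *shared, then re-reads it across the barrier.  Without the
   barrier the compiler could return the value it already holds in a
   register; with it, the final read has to come from memory. */
int increment_across_barrier(int *shared)
{
    assert(shared != NULL);
    int before = *shared;
    *shared = before + 1;
    COMPILER_BARRIER();
    return *shared;
}
```

A header that declares lock and unlock functions as inline code could place such a barrier inside them, which would give the "treated specially" effect for those functions without qualifying the shared data as volatile.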

Tzvetan Mikov

unread,

Sep 29, 2004, 6:32:42 PM9/29/04

to

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<hZednQ11jPr...@comcast.com>...

> Wojtek Lerch wrote:
> > I'm not sure about comp.std.c, but it's relevant in a discussion about
> > POSIX mutexes. By demonstrating that protecting non-volatile objects
> > with mutexes "just works" with most compilers, it explains why POSIX
> > *requires* it to work. Even if it doesn't work with some existing
> > compilers, those compilers are simply unsuitable to implement the POSIX
> > semantics of threads and mutexes.
>
> It is absurd for POSIX to require that a compiler truly
> support such a thoughtless approach to shared storage;
> without the volatile qualification the compiler would
> have to produce horrendously inefficient code on many
> platforms, in effect treating every data reference as
> volatile (thus invalidating any caching of values in
> registers). I hope that you are mistaken about POSIX
> having made an explicit decision to require compilers
> to act that way; however, judging from the bizarre
> notions expressed by several in this thread, if they
> have influence in such matters then matters may well
> be as bad as you imply.

Are you seriously advocating that _all_ data shared between threads
should be declared volatile ? You must realize that this is extremely
inconvenient and practically impossible.
The performance hit of such a decision would be much greater than the
purely hypothetical problem in a hypothetical super-optimizing
compiler that you are describing. Further, I see no problem in
requiring even such a super-optimizing compiler to flush the
cached values of global variables and clear its aliasing cache before
invoking certain functions.

regards,
Tzvetan

Douglas A. Gwyn

unread,

Sep 30, 2004, 3:16:07 AM9/30/04

to

Wojtek Lerch wrote:
> | 4.10 Memory Synchronization
> |
> | Applications shall ensure that access to any memory location by more
> | than one thread of control (threads or processes) is restricted such
> | that no thread of control can read or modify a memory location while
> | another thread of control may be modifying it. Such access is
> | restricted using functions that synchronize thread execution and
> | also synchronize memory with respect to other threads. The following
> | functions synchronize memory with respect to other threads:
> |
> | <a list of functions including pthread_mutex_lock and
> | pthread_mutex_unlock>

Indeed, and note that the burden is placed on the
*application* to synchronize access to shared memory
locations by using such interlocking functions, whose
very names derive from "mutual exclusion". There is
no requirement that such functions effect register-
cache flushing (which isn't even feasible). Thus,
in addition to serializing access through use of
mutexes, the application *also* needs to declare the
shared data (or at least all accesses to it) with
volatile qualification to ensure that cached values
will be flushed before crossing the mutex boundary.

> Here's how Dave Butenhof explained it in the comp.programming.threads FAQ
> (http://www.lambdacs.com/cpt/FAQ.html#Q56):
> | Q56: Why don't I need to declare shared variables VOLATILE?
> | > I'm concerned, however, about cases where both the compiler and the
> | > threads library fulfill their respective specifications. A conforming
> | > C compiler can globally allocate some shared (nonvolatile) variable to
> | > a register that gets saved and restored as the CPU gets passed from
> | > thread to thread. Each thread will have its own private value for
> | > this shared variable, which is not what we want from a shared
> | > variable.
> | In some sense this is true, if the compiler knows enough about the
> | respective scopes of the variable and the pthread_cond_wait (or
> | pthread_mutex_lock) functions. In practice, most compilers will not try
> | to keep register copies of global data across a call to an external
> | function, because it's too hard to know whether the routine might
> | somehow have access to the address of the data.

Butenhof and that FAQ have gotten this sadly wrong.
While most C implementations necessarily flush cached
"global" variables (those with external linkage) upon
calling a separately compiled function, there is no
need for them to flush cached "local" variables, some
of which can easily be shared among threads. POSIX
does not require it, it is a disservice to compiler
users when the data is not in fact shared, and it is
unreasonable to think that compilers are going to
obey this mistaken model. Contrary to the FAQ,
applications *must* use volatile qualification to
ensure that their shared data is accurately shared;
they might get away with not doing so when the shared
data has external linkage, but even in such a case it
is advisable to explicitly use volatile qualification.

> | You don't want to use volatile, because, on any system where it makes
> | any difference, it will be vastly more expensive than a proper
> | nonvolatile variable. (ANSI C requires "sequence points" for volatile
> | variables at each expression, whereas POSIX requires them only at
> | synchronization operations -- a compute-intensive threaded application
> | will see substantially more memory activity using volatile, and, after
> | all, it's the memory activity that really slows you down.)

That's wrong. POSIX doesn't require them at all;
that's a misunderstanding of the spec 4.10.

It is certainly true that while one thread has
exclusive access to some set of shared variables,
it doesn't really need all accesses synchronized
with the actual storage at each sequence point,
just at the boundaries of the conditional critical
region. There are ways to access objects via types
that include volatile qualification even though the
actual storage being referenced is not declared
with volatile qualification, that will have the
desired effect. The FAQ would provide a useful
service if it were to show an example of how to do
this efficiently the right way, rather than
encouraging incorrect programming.
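
A minimal sketch of the technique described above, accessing an object through a volatile-qualified type only at the boundaries of the critical region, might look like this. The helper names load_via_volatile, store_via_volatile and add_to_shared are hypothetical, the locking itself is omitted, and how strictly an implementation honours volatile semantics for an object that was not itself defined volatile is an assumption to check against the compiler in use.

```c
#include <assert.h>
#include <stddef.h>

/* Read *p through a volatile-qualified lvalue, so this one access is
   performed on the actual storage rather than on a cached copy. */
static int load_via_volatile(int *p)
{
    assert(p != NULL);
    return *(volatile int *)p;
}

/* Write *p through a volatile-qualified lvalue, pushing the value out
   to the actual storage at this point. */
static void store_via_volatile(int *p, int v)
{
    assert(p != NULL);
    *(volatile int *)p = v;
}

/* The shared object itself is not declared volatile. */
static int shared_total;

/* Intended to be called with the protecting mutex held (not shown).
   One volatile read on entry, any amount of work on a plain local
   that may live in a register, one volatile write on exit. */
int add_to_shared(int delta)
{
    int local = load_via_volatile(&shared_total);
    local += delta;
    store_via_volatile(&shared_total, local);
    return local;
}
```

The work in the middle pays none of the cost of volatile accesses; only the two boundary accesses are forced to go to memory.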

Douglas A. Gwyn

unread,

Sep 30, 2004, 3:22:31 AM9/30/04

to

Dan Pop wrote:
> I guess most real compilers do that anyway, whenever *any* function is

> called, ...

I haven't taken a survey, but "callee saves" implementations
are likely to have a set of registers reserved for local use
in each frame, with cached values preserved across function
calls. Motorola M*CORE architecture and calling convention
is interesting in having half the registers "caller saves"
and half "callee saves".

Objects with *external linkage* will have cached values
flushed before an externally provided function is called,
because that is necessary for proper C semantics in the
case that the called function accesses the object values.
(However, if the compiler can "see" what the function can
access, it need only flush what might be accessed.)

Alexander Terekhov

unread,

Sep 30, 2004, 6:51:10 AM9/30/04

to

"Douglas A. Gwyn" wrote:
[...]

> obey this mistaken model. Contrary to the FAQ,
> applications *must* use volatile qualification to
> ensure that their shared data is accurately shared;
> they might get away with not doing so when the shared
> data has external linkage, but even in such a case it
> is advisable to explicitly use volatile qualification.

One must be facing serious denial or a sad mental
condition to follow that advice.

regards,
alexander.

Tzvetan Mikov

unread,

Sep 30, 2004, 1:03:49 PM9/30/04

to

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<b7ednRZwVNQ...@comcast.com>...

> Butenhof and that FAQ have gotten this sadly wrong.
> While most C implementations necessarily flush cached
> "global" variables (those with external linkage) upon
> calling a separately compiled function, there is no
> need for them to flush cached "local" variables, some
> of which can easily be shared among threads. POSIX
> does not require it, it is a disservice to compiler
> users when the data is not in fact shared, and it is
> unreasonable to think that compilers are going to
> obey this mistaken model. Contrary to the FAQ,
> applications *must* use volatile qualification to
> ensure that their shared data is accurately shared;
> they might get away with not doing so when the shared
> data has external linkage, but even in such a case it
> is advisable to explicitly use volatile qualification.

Isn't the purpose of the Standard to standardize existing practices ?
In that case, requiring volatile for shared data is out of the
question. 99% (if not 100%) of users and compilers do not require it.
It will never happen. What is the point of arguing theory, when all
that can happen is add a new burden on the programmer ?

That is not to say I completely disagree with your point. I just think
it isn't practical.

regards,
Tzvetan

Keith Thompson

Sep 30, 2004, 3:13:00 PM

Perhaps you could make your point without flinging personal insults.
If you have a technical point to make, make it. (And please keep in
mind that a lot of us here aren't intimately familiar with this area.)

Douglas A. Gwyn

Sep 30, 2004, 4:03:37 PM

Tzvetan Mikov wrote:
> Isn't the purpose of the Standard to standardize existing practices ?

There would be no need for a standard if there wasn't a problem
with existing practice. In the case of programming languages
such as C that intentionally permit nonportable code, requiring
all implementations to support a specific instance of what is
clearly a nonportable, even ill-advised, programming practice
would perform a disservice for all other applications.

> In that case, requiring volatile for shared data is out of the
> question. 99% (if not 100%) of users and compilers do not require it.
> It will never happen. What is the point of arguing theory, when all
> that can happen is add a new burden on the programmer ?

It *isn't just theory*, it's *necessary* for correct use of
local shared objects (in C) in a threaded environment. The
arguments you have heard about it not being necessary are
simply *wrong*. While nearly all implementations happen to
do the "right thing" when the shared data has external
linkage *and* actual function calls are used for mutex
interlocks, if either the data does not have external
linkage *or* mutexes are implemented with in-line code,
several compilers I know of will *not* do what is considered
the "right thing". Yet what they do is perfectly reasonable
and in fact very desirable in the big picture. The burden
of correct use of the language falls on the application
programmer, not on the language implementation.

Tom Payne

Oct 1, 2004, 12:16:53 AM

Dan Pop <Dan...@cern.ch> wrote:
[...]

> I have a hard time seeing how a function call can force the compiler to
> unconditionally dump *all* the data cached in registers (or, at least,
> data that was modified since being last loaded from memory), but it
> is obvious that this is what this paragraph requires.
>
> I guess most real compilers do that anyway, whenever *any* function is
> called, so implementing POSIX threads on them is no big deal. But very
> smart compilers might need to be invoked in POSIX mode for this to work.

POSIX requires a super sequence point (i.e., one that applies even to
nonvolatile variables) only at calls to coordination primitives.

Tom Payne

Tzvetan Mikov

Oct 1, 2004, 12:29:51 AM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<415C6699...@null.net>...

> It *isn't just theory*, it's *necessary* for correct use of
> local shared objects (in C) in a threaded environment. The
> arguments you have heard about it not being necessary are
> simply *wrong*. While nearly all implementations happen to
> do the "right thing" when the shared data has external
> linkage *and* actual function calls are used for mutex
> interlocks, if either the data does not have external
> linkage *or* mutexes are implemented with in-line code,
> several compilers I know of will *not* do what is considered
> the "right thing".

As others have said, if the compiler does such extensive
optimizations, there should be a way to turn them off selectively when
implementing synchronization calls. After such a call the compiler
must assume that all globals have unknown values and that pointers can
alias anything.

Even if you are sharing local data, you must obtain its address and
either pass it to a synchronization routine, or store it in a global
variable. Either way, the compiler must assume that it changes across
"synchronized calls".
Can you give an example where this will not work ?

> Yet what they do is perfectly reasonable
> and in fact very desirable in the big picture.

I do see your point that in some cases it might be desirable to have
the behavior you describe. For example if there are several global
variables, but not all of them are actually shared between threads,
you might want to allow the compiler to optimize the non-volatile
ones.
But we must also consider the other side: declaring a variable
volatile is actually more than what is necessary for correct thread
usage. Example:

volatile int b;
lock();
b = 10;
++b;++b;++b;++b;++b;
unlock();

If b weren't volatile, it would have been perfectly legal to optimize
the code between the lock/unlock, even in a multithreaded environment.
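
A minimal sketch of that claim, assuming POSIX threads and an implementation that honors the memory synchronization rule quoted elsewhere in this thread. The names counter, counter_lock, worker and run_two_workers are hypothetical. The shared int is deliberately not volatile, and every access sits between a lock and the matching unlock:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int counter;   /* shared, deliberately NOT volatile */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        pthread_mutex_lock(&counter_lock);
        /* Inside the locked region the compiler is free to fold these
           five increments into a single "counter += 5". */
        ++counter; ++counter; ++counter; ++counter; ++counter;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final count. */
int run_two_workers(void)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;   /* safe to read: both workers have been joined */
}
```

Calling run_two_workers() should yield 2 * 100000 * 5 increments in total. Whether the missing volatile is legitimate is exactly what this thread disputes, so treat the sketch as an illustration of one side's position rather than a ruling.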

> The burden
> of correct use of the language falls on the application
> programmer, not on the language implementation.

So, if a large dynamic structure must be shared between threads, all
pointers to the structure and all data members within the structure
should be declared volatile, correct ? The implications of this are
vast. For example, C++ STL could never be used in a multithreaded
environment since the containers do not use volatile (and you can
guess what the effect of having volatile containers would be :-)

In my experience (with different OSes and compilers), volatile is
almost always useless and sometimes even harmful, considering its
unclear semantics.
Its effects are better achieved with a function call (albeit a
non-standard one).

For example, Win32's InterlockedIncrement() guarantees both atomicity
of execution in respect to invocations of other Interlocked...()
functions, and guarantees that the caller will not see a cached value.
This is well defined, precise and very useful.

regards,
Tzvetan

David Hopwood

Oct 1, 2004, 12:42:15 AM

Douglas A. Gwyn wrote:
> Wojtek Lerch wrote:
>
>> | 4.10 Memory Synchronization
>> |
>> | Applications shall ensure that access to any memory location by more
>> | than one thread of control (threads or processes) is restricted such
>> | that no thread of control can read or modify a memory location while
>> | another thread of control may be modifying it. Such access is
>> | restricted using functions that synchronize thread execution and
>> | also synchronize memory with respect to other threads. The following
>> | functions synchronize memory with respect to other threads:
>> |
>> | <a list of functions including pthread_mutex_lock and
>> | pthread_mutex_unlock>
>
> Indeed, and note that the burden is placed on the
> *application* to synchronize access to shared memory
> location by using such interlocking functions, whose
> very names derive from "mutual exclusion". There is
> no requirement that such functions effect register-
> cache flushing (which isn't even feasible).

What's that supposed to mean? There is no such thing as "register-cache
flushing" defined by the POSIX standard, and so of course there is no
requirement about it. It would be as inappropriate to have such a
requirement as it would be for C99 to talk about specific calling
conventions.

POSIX does require this to the extent that is needed to implement 4.10.
Optimising compilers need to do escape analysis on each variable. This
analysis will detect whether each variable could possibly be shared
between threads. All variables that might be shared must be flushed
whenever a call to one of the functions listed in 4.10 possibly occurs.
This is absolutely clear and I don't see how there can really be any
dispute about it.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

David Hopwood

Oct 1, 2004, 12:46:18 AM

Douglas A. Gwyn wrote:
> Tzvetan Mikov wrote:
>
>>Isn't the purpose of the Standard to standardize existing practices ?
>
> There would be no need for a standard if there wasn't a problem
> with existing practice. In the case of programming languages
> such as C that intentionally permit nonportable code, requiring
> all implementations to support a specific instance of what is
> clearly a nonportable, even ill-advised, programming practice
> would perform a disservice for all other applications.
>
>>In that case, requiring volatile for shared data is out of the
>>question. 99% (if not 100%) of users and compilers do not require it.
>>It will never happen. What is the point of arguing theory, when all
>>that can happen is add a new burden on the programmer ?
>
> It *isn't just theory*, it's *necessary* for correct use of
> local shared objects (in C) in a threaded environment. The
> arguments you have heard about it not being necessary are
> simply *wrong*. While nearly all implementations happen to
> do the "right thing" when the shared data has external
> linkage *and* actual function calls are used for mutex
> interlocks, if either the data does not have external
> linkage *or* mutexes are implemented with in-line code,
> several compilers I know of will *not* do what is considered
> the "right thing".

If these are C compilers mandated by POSIX, then those systems are not
POSIX-conforming. It's as simple as that.

--
David Hopwood <david.nospam.

Alexander Terekhov

Oct 1, 2004, 4:37:35 AM

Tom Payne wrote:
[...]

> POSIX requires a super sequence point (i.e., one that applies even to
> nonvolatile variables) only at calls to coordination primitives.

POSIX allows certain reordering across "super sequence points." Acquire
operations (mutex lock, sema_wait, etc.) prevent subsequent (in program
order) memory accesses (to shared data) from moving "up in time" to
before the acquire operation; IOW, they prevent hoisting above the acquire
operation. Release operations (mutex_unlock, sema_post, etc.) prevent
prior (in program order) memory accesses from moving "down in time" to
after the release operation; they prevent sinking below the release
operation. I mean that, for example,

x = 1;
lock(mutex);

can be transformed to

lock(mutex);
x = 1;

and

unlock(mutex);
y = 1;

to

y = 1;
unlock(mutex);

A conforming application won't notice because it must follow the memory
synchronization rules (XBD 4.10, leaving the undefined term "memory
location" aside for a moment).
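
The same one-way barriers were later given standard names in C11's <stdatomic.h>, which postdates this thread. The following is only a sketch under that assumption, with hypothetical names (payload, ready, producer, consume). It shows the direction each barrier does block: the payload store may not sink below the release store, and the payload load may not be hoisted above the acquire load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* ordinary data, published through `ready` */
static atomic_int ready;   /* flag; statically zero-initialized */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release: the store to payload above cannot sink below this store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Starts the producer, waits for the flag, and returns the payload seen. */
int consume(void)
{
    pthread_t t;
    int rc, seen;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);
    rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);

    /* Acquire: the load of payload below cannot be hoisted above this load. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;   /* busy-wait; acceptable in a demo, wasteful in real code */
    seen = payload;

    pthread_join(t, NULL);
    return seen;
}
```

Accesses on the "open" side of each barrier (before the acquire, after the release) remain free to move into the protected region, which is the transformation shown in the two snippets above.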

http://groups.google.de/groups?selm=3F74383F.129FD5EA%40web.de
(see P.S. section, Tom ;-) )

"ddslb_t" was a thinko, BTW.

As for ddhlb, see

http://www.cs.wisc.edu/~cain/pubs/micro01_correct_vp.pdf

regards,
alexander.

Alexander Terekhov

Oct 1, 2004, 4:45:32 AM

Tzvetan Mikov wrote:
[...]

> For example, Win32's InterlockedIncrement() guarantees both atomicity
> of execution in respect to invocations of other Interlocked...()
> functions, and guarantees that the caller will not see a cached value.
> This is well defined, precise and very useful.

Win32's Interlocked stuff is brain-dead. InterlockedIncrement()
is fully fenced and I have yet to see a use case that would need
a full fence for it. Thread-safe reference counting GC (providing
"basic" thread-safety), can benefit a lot from "naked" increments
(and from "conditional" acq/rel for decrements).
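
One common reading of that remark, sketched with C11 atomics (which postdate this thread): the increment carries no ordering at all, every decrement is a release, and only the decrement that reaches zero adds an acquire. The type and function names are hypothetical, and this may not be exactly the scheme the poster had in mind.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct object {            /* hypothetical reference-counted object */
    atomic_uint refs;
};

void object_init(struct object *o)
{
    atomic_init(&o->refs, 1);   /* the creator holds the first reference */
}

/* "Naked" increment: the caller already holds a reference, so no
   ordering with other memory accesses is required. */
void object_ref(struct object *o)
{
    atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Returns true when the caller dropped the last reference and may
   destroy the object. */
bool object_unref(struct object *o)
{
    unsigned old = atomic_fetch_sub_explicit(&o->refs, 1,
                                             memory_order_release);
    assert(old != 0);           /* catch an unbalanced unref */
    if (old == 1) {
        /* Acquire only on the final decrement, so the destroying thread
           sees every write made by earlier owners. */
        atomic_thread_fence(memory_order_acquire);
        return true;
    }
    return false;
}
```

A fully fenced increment, as described for InterlockedIncrement() above, would also be correct here; the sketch only shows where the weaker orderings are enough.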

regards,
alexander.

Alexander Terekhov

Oct 1, 2004, 5:12:10 AM

"Douglas A. Gwyn" wrote:
[...]

> very desirable in the big picture.

Standardized atomic<> with msync::stuff is very desirable in the big
picture.

regards,
alexander.

Alexander Terekhov

Oct 1, 2004, 6:48:33 AM

David Hopwood wrote:
[...]

> POSIX does require this to the extent that is needed to implement 4.10.
> Optimising compilers need to do escape analysis on each variable. This
> analysis will detect whether each variable could possibly be shared
> between threads. All variables that might be shared must be flushed
> whenever a call to one of the functions listed in 4.10 possibly occurs.

Well, that's not quite correct (I mean more relaxed release/acquire
sink/hoist semantics). Apart from that, 4.10 erroneously lists
pthread_cond_signal() and pthread_cond_broadcast().

cond_wait: unlock(mutex); sleep(random()); lock(mutex);

cond_signal: nop

cond_broadcast: nop

is conforming (apart from realtime scheduling, of course).
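
If such a degenerate implementation is conforming, a waiter can never treat a return from pthread_cond_wait() as proof that anything was signalled; it has to re-check its own predicate under the mutex. A minimal sketch of that usual pattern, with hypothetical names (m, cv, done, setter, wait_for_done):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static bool done;   /* the predicate; only touched while holding m */

static void *setter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    done = true;
    pthread_mutex_unlock(&m);
    pthread_cond_signal(&cv);   /* merely a hint; the waiter trusts `done` */
    return NULL;
}

/* Starts the setter, waits for the predicate, and returns its value. */
bool wait_for_done(void)
{
    pthread_t t;
    int rc;
    bool seen;

    pthread_mutex_lock(&m);
    done = false;
    pthread_mutex_unlock(&m);

    rc = pthread_create(&t, NULL, setter, NULL);
    assert(rc == 0);

    pthread_mutex_lock(&m);
    while (!done)                     /* re-check after every wakeup */
        pthread_cond_wait(&cv, &m);
    seen = done;
    pthread_mutex_unlock(&m);

    pthread_join(t, NULL);
    return seen;
}
```

The loop stays correct even if the signal is a no-op and the wait is just unlock, sleep, lock, because the data handoff goes through the mutex-protected predicate rather than through the condition variable.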

regards,
alexander.

Douglas A. Gwyn

Oct 1, 2004, 12:46:36 PM

Jonathan Adams wrote:
> Forcing the developer to mark everything as volatile to protect against
> a rampant optimizer is not a way to get fast or correct output, ...

That's not what I said.

> and does not fit into the Posix (or Windows, or Solaris Kernel, or Linux Kernel,
> etc.) model of doing multithreaded programming.

Let's see that model. The spec for the mutex functions
doesn't imply what you seem to think it does.

Douglas A. Gwyn

Oct 1, 2004, 12:59:18 PM

Tzvetan Mikov wrote:
> As others have said, if the compiler does such extensive
> optimizations, ...

But they're not necessarily "optimizations"; as I noted,
their an essential aspect of the M*CORE ABI, and that
processor has a very nice ISA. What you guys seem to be
advocating is that normal good code generation is simply
not allowed in the presence of threads, and that position
is undesirable for all other application contexts.

> Even if you are sharing local data, you must obtain its address and
> either pass it to a synchronization routine, or store it in a global
> variable.

No. Mutex locks are not the shared data itself, but
just used to protect such objects.

> volatile int b;
> lock();
> b = 10;
> ++b;++b;++b;++b;++b;
> unlock();

Of course that code is absurd in the first place,
but even so, try the following general approach:
volatile int b;
lock();
int b_ = 10; // more likely would load initial b value
++b_;++b_;++b_;++b_;++b_; // probably optimized
b = b_; // flushed to storage
unlock();
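
Fleshed out into a compilable function, the approach might look like the sketch below. It assumes POSIX mutexes stand in for lock()/unlock(); b_lock and bump_five are hypothetical names, and the initial value is loaded from b rather than set to 10, as the comment in the fragment suggests.

```c
#include <assert.h>
#include <pthread.h>

static volatile int b;   /* shared object, volatile-qualified */
static pthread_mutex_t b_lock = PTHREAD_MUTEX_INITIALIZER;

/* Adds 5 to the shared value and returns the new value. */
int bump_five(void)
{
    int rc, result;

    rc = pthread_mutex_lock(&b_lock);
    assert(rc == 0);

    int b_ = b;                      /* exactly one volatile read */
    ++b_; ++b_; ++b_; ++b_; ++b_;    /* plain local; may be folded to += 5 */
    b = b_;                          /* exactly one volatile write */
    result = b_;

    pthread_mutex_unlock(&b_lock);
    return result;
}
```

The volatile object is touched only at the edges of the locked region; everything in between works on an ordinary local that the compiler can optimize freely.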

> ... For example, C++ STL could never be used in a multithreaded
> environment since the containers do not use volatile (and you can
> guess what the effect of having volatile containers would be :-)

I'm not trying to address deficiencies in C++ or its STL.
Perhaps that language needs additional help.

> In my experience (with different OSes and compilers), volatile is
> almost always useless and sometimes even harmful, considering its
> unclear semantics.

The semantics of volatile are quite clear, although
specific details (width of access, etc.) are defined
by the implementation.

> Its effects are better achieved with a function call (albeit a
> non-standard one).

Wrong!

Douglas A. Gwyn

Oct 1, 2004, 1:00:29 PM

Alexander Terekhov wrote:
> Win32's Interlocked stuff is brain-dead.

Apparently everything the rest of the world does
qualifies as "brain-dead" to Mr. Terekhov. It must
be wonderful to be so much smarter than everybody
else.

Douglas A. Gwyn

Oct 1, 2004, 1:08:13 PM

David Hopwood wrote:
> What's that supposed to mean? There is no such thing as "register-cache
> flushing" defined by the POSIX standard, and so of course there is no
> requirement about it.

If you don't understand what I meant, then no wonder
you don't see the nature of the problem. Compiler
implementors (should) know what I was referring to.

> POSIX does require this to the extent that is needed to implement 4.10.

How so? 4.10 as cited so far doesn't address that at all.

> This is absolutely clear and I don't see how there can really be any
> dispute about it.

To the contrary, it is clear to me that 4.10 was
telling the *application programmer* about the
necessity to protect access to shared variables
with mutex locks. That is necessary but by no
means sufficient to ensure safe thread programming.

A significant reason to doubt the contrary
interpretation is that POSIX is not a PL spec;
it is an API spec that layers facilities on top of,
not instead of, those specified by PL specs.
If it were truly trying to do otherwise, that would
be an issue of standards miscoordination needing to
be resolved at the ISO JTC1 level.

Douglas A. Gwyn

Oct 1, 2004, 1:17:01 PM

Douglas A. Gwyn wrote:
> their

they're

I don't know how that typo occurred..

Keith Thompson

Oct 1, 2004, 3:58:08 PM

It is. 8-)}

kar...@acm.org

Oct 1, 2004, 4:03:45 PM

Douglas A. Gwyn wrote:
> Tzvetan Mikov wrote:
> > As others have said, if the compiler does such extensive
> > optimizations, ...
>
> But they're not necessarily "optimizations"; as I noted,
> their an essential aspect of the M*CORE ABI, and that
> processor has a very nice ISA. What you guys seem to be
> advocating is that normal good code generation is simply
> not allowed in the presence of threads, and that position
> is undesirable for all other application contexts.
>
> > Even if you are sharing local data, you must obtain its address and
> > either pass it to a synchronization routine, or store it in a global
> > variable.
>
> No. Mutex locks are not the shared data itself, but
> just used to protect such objects.

He's talking about sharing local data: taking its address to pass to a
function etc, in order to invalidate the compiler's cached data values.
karl m

Douglas A. Gwyn

Oct 1, 2004, 4:31:01 PM

Richard Kettlewell wrote:
> What does the requirement that the listed functions "synchronize
> memory with respect to other threads" mean, in your interpretation,
> then?

In general, in order for an application to safely share data
among threads, access to that data must be synchronized w.r.t.
thread scheduling. The standard approach these days is to
use mutex objects as "locks" in order to ensure that only one
thread at a time is granted access to the shared data; the
span between return from a mutex lock up to the matching unlock
has historically been called a "conditional critical region".
(Concurrent programming has been studied long before the
recent emergence of p-threads.) Just what data is associated
with a given mutex lock is up to the application. Mutexes
themselves have nothing to do with ensuring that any value
cache for a shared object is flushed from a fast register to
the actual allocated storage, which is the only reliable place
another thread will be able to access the value. In C,
volatile qualification is the only available mechanism for
ensuring that flushing occurs; in particular, it is *not*
implied by function-call semantics (except in limited contexts
as I noted earlier). It has also been noted that a simple-
minded use of volatile can incur a performance penalty
within the conditional critical region, but I gave an example
of how that can easily be avoided by a cognizant programmer.

Generally speaking, the attitude "it just has to work, the
programmer shouldn't have to think about these things"
encourages bad programming practice, leading to products that
have race conditions (or if not immediately, will have them
when ported to other perfectly reasonable platforms).

Douglas A. Gwyn

Oct 1, 2004, 4:34:18 PM

David Hopwood wrote:
> ... There is no such thing as "register-cache
> flushing" defined by the POSIX standard, ...

In my previous response, I forgot to mention that not
mentioning the matter is directly related to why the
POSIX mutex functions do not suffice in themselves to
ensure that shared data works right across threads.

Tzvetan Mikov

Oct 1, 2004, 6:47:51 PM

Alexander Terekhov <tere...@web.de> wrote in message news:<415D192C...@web.de>...

> Win32's Interlocked stuff is brain-dead. InterlockedIncrement()
> is fully fenced and I have yet to see a use case that would need
> a full fence for it. Thread-safe reference counting GC (providing
> "basic" thread-safety), can benefit a lot from "naked" increments
> (and from "conditional" acq/rel for decrements).

The function has defined and clear semantics - perhaps more than
what is absolutely needed in some cases, but it does the job, unlike
"volatile". BTW, they recently added more functions with explicit
acquire/release semantics - that should make you happier ... ;-)

regards,
Tzvetan

Tzvetan Mikov

Oct 1, 2004, 8:57:13 PM

"Douglas A. Gwyn" <DAG...@null.net> wrote in message news:<fcidnfHNtIx...@comcast.com>...

> Tzvetan Mikov wrote:
> > As others have said, if the compiler does such extensive
> > optimizations, ...
>
> But they're not necessarily "optimizations"; as I noted,
> their an essential aspect of the M*CORE ABI, and that
> processor has a very nice ISA. What you guys seem to be
> advocating is that normal good code generation is simply
> not allowed in the presence of threads, and that position
> is undesirable for all other application contexts.

Can you post some details about the M*CORE ABI and the ISA ? I am not
familiar with them. What makes them different ?

I do see your point that you don't want to lose optimization for all
your data if you are sharing only some of it, but I think that we have
to choose the lesser of the two evils, considering that:
- All mainstream compilers do not optimize across library calls
(threading or not), so synchronization doesn't lower optimization per
se.
- The vast majority of MT programs already written do not use volatile
- There are (many) cases when using "volatile" across the board will
have much worse impact on optimization.

In fact I believe the choice has already been made. Whatever we decide
among ourselves in comp.std.c doesn't affect the state of affairs. I
suspect that even if the next C Standard declared "volatile" required
for MT, people wouldn't start using it. (Sadly, if C0X is anything
like C99, it might become even less relevant ... :-( )

> Of course that code is absurd in the first place,
> but even so, try the following general approach:
> volatile int b;
> lock();
> int b_ = 10; // more likely would load initial b value
> ++b_;++b_;++b_;++b_;++b_; // probably optimized
> b = b_; // flushed to storage
> unlock();

The code isn't that absurd, IMHO. It illustrates the point that
sometimes you might need to do more with the shared data than just to
get or set it at clearly defined spots. Micro optimization like what
you are suggesting quickly becomes impractical.

> I'm not trying to address deficiencies in C++ or its STL.
> Perhaps that language needs additional help.

Isn't it a bit dangerous to treat C separately from C++ ? I dare say
that if something so fundamental is a bad idea for C++, it probably
shouldn't be in C either.
Even if we stayed purely in the C domain, any collection library would
need to have two variants - without and with "volatile".

Perhaps we are talking about different class of problems. I often have
to deal with applications with a GUI and computational thread, sharing
lots of complex data. I assure you, using "volatile" would be a
nightmare.

> > In my experience (with different OSes and compilers), volatile is
> > almost always useless and sometimes even harmful, considering its
> > unclear semantics.
>
> The semantics of volatile are quite clear, although
> specific details (width of access, etc.) are defined
> by the implementation.

It is unusable for portable programming. These important questions are
not answered:
- Is the access atomic ?
- Is "++var" atomic ?
- Is there a memory barrier and what kind ? (Not on any compiler I
know)

> > Its effects are better achieved with a function call (albeit a
> > non-standard one).
>
> Wrong!

I don't see why. I would rather have standard intrinsic functions like
volatile_get() and volatile_set() instead of the "volatile" qualifier.
The benefits are numerous:
- The intent of the action is obvious in the code, at the place where
it happens. One doesn't have to chase declarations to check which
variables happen to be declared volatile.
- The functions can be used selectively only when it really is
necessary.
- It is possible to come up with different variants of these functions:
with fencing, acquire/release semantics - whatever is available on the
platform. That could help performance _a lot_.
- It preserves compatibility with existing practices.
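
A rough sketch of what such functions could look like in plain C. The names volatile_get_int, volatile_set_int and roundtrip are hypothetical; no such standard functions exist. The sketch also rests on an assumption: that the compiler treats an access through a volatile-qualified lvalue as a volatile access even when the underlying object was not declared volatile. Common compilers behave that way, but the guarantee is weaker than for a volatile declaration.

```c
#include <assert.h>

/* Forces a real load from *p at the point of the call. */
static inline int volatile_get_int(const int *p)
{
    assert(p);
    return *(const volatile int *)p;
}

/* Forces a real store to *p at the point of the call. */
static inline void volatile_set_int(int *p, int v)
{
    assert(p);
    *(volatile int *)p = v;
}

/* Example use on an ordinary, non-volatile object. */
int roundtrip(int v)
{
    int cell = 0;
    volatile_set_int(&cell, v);      /* this one store is forced */
    return volatile_get_int(&cell);  /* this one load is forced */
}
```

Like volatile itself, these helpers say nothing about atomicity or memory barriers; variants with fencing or acquire/release semantics would need platform-specific support.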

But that really is a different discussion. In short, my biggest
objection to your position is that it isn't practical.

regards,
Tzvetan

David Hopwood

Oct 1, 2004, 10:34:45 PM

RTFM.

# For performance-critical applications, you should consider using
# InterlockedIncrementAcquire or InterlockedIncrementRelease.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

David Hopwood

Oct 2, 2004, 3:30:25 AM

Alexander Terekhov wrote:
> David Hopwood wrote:
> [...]
>
>>POSIX does require this to the extent that is needed to implement 4.10.
>>Optimising compilers need to do escape analysis on each variable. This
>>analysis will detect whether each variable could possibly be shared
>>between threads. All variables that might be shared must be flushed
>>whenever a call to one of the functions listed in 4.10 possibly occurs.
>
> Well, that's not quite correct (I mean more relaxed release/acquire
> sink/hoist semantics).

Yes, you're right. I was too vague; "must be flushed" isn't correct.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

David Hopwood

Oct 2, 2004, 3:37:04 AM

Douglas A. Gwyn wrote:
> David Hopwood wrote:
>
>> What's that supposed to mean? There is no such thing as "register-cache
>> flushing" defined by the POSIX standard, and so of course there is no
>> requirement about it.
>
> If you don't understand what I meant, then no wonder
> you don't see the nature of the problem.

The problem is that you were talking about specific implementation techniques
that make no sense at the level of abstraction that the standard deals with.

>> POSIX does require this to the extent that is needed to implement 4.10.
>
> How so? 4.10 as cited so far doesn't address that at all.

I can't help it if you can't read a plainly stated requirement.

>> This is absolutely clear and I don't see how there can really be any
>> dispute about it.
>
> To the contrary, it is clear to me that 4.10 was
> telling the *application programmer* about the
> necessity to protect access to shared variables
> with mutex locks. That is necessary but by no
> means sufficient to ensure safe thread programming.

If 4.10 doesn't specifically say that the requirement constrains only
applications (as it doesn't), then it constrains both applications and
POSIX systems.

> A significant reason to doubt the contrary
> interpretation is that POSIX is not a PL spec;
> it is an API spec that layers facilities on top of,
> not instead of, those specified by PL specs.

POSIX certainly is partly a programming language specification. It's not
possible to specify some of the things it specifies purely at the API level.

--
David Hopwood <david.nosp...@blueyonder.co.uk>

Douglas A. Gwyn

Oct 2, 2004, 5:29:41 AM

Tzvetan Mikov wrote:
> - All mainstream compilers do not optimize across library calls

Wrong.

> - The vast majority of MT programs already written do not use volatile

If they're sharing only extern objects, they can generally get
away with it, since the mutex functions are separately compiled.
If they're sharing local objects, they already have race
conditions, to which the programmer is oblivious (and would
remain so if he heeded the bad advice he has been given).

> - There are (many) cases when using "volatile" across the board will
> have much worse impact on optimization.

Volatile qualification should be used only where necessary,
for that very reason. Shared data at synchronization
boundaries is a case where it is necessary.

> In fact I believe the choice has already been made. Whatever we decide
> among ourselves in comp.std.c doesn't affect the state of affairs. I
> suspect that even if the next C Standard declared "volatile" required
> for MT, people wouldn't start using it.

> ... Micro optimization like what
> you are suggesting quickly becomes impractical.

I wasn't suggesting "micro optimization".
I was showing in a simple way how you can avoid the
performance hit that would be caused by *over*use of
volatile qualification in this context.

> Isn't it a bit dangerous to treat C separately from C++ ? I dare say
> that if something so fundamental is a bad idea for C++, it probably
> shouldn't be in C either.

Volatile qualification is important for C systems
programming. That C++ may have some problems of its
own with threads is hardly a surprise.

> Perhaps we are talking about different class of problems. I often have
> to deal with applications with a GUI and computational thread, sharing
> lots of complex data. I assure you, using "volatile" would be a
> nightmare.

I doubt you have made a serious effort to do it right.
Since concurrency correctness requires a study of what
data needs to be shared and where the c.c.r.s are, the
information necessary to properly synch the data is at
hand already.

> It is unusable for portable programming. These important questions are
> not answered:
> - Is the access atomic ?
> - Is "++var" atomic ?

Those are irrelevant when mutexes have been used to protect
the c.c.r.s. They would be relevant only if you were trying
to avoid using synchronization gating (other than what the
hardware logic happens to provide). It is actually *very*
hard to correctly code useful concurrent algorithms taking
such an approach. And if you *do* try to take such a path,
volatile qualification is going to be *crucial* to getting
it to work right.

> - Is there a memory barrier and what kind ? (Not on any compiler I
> know)

I would need to know precisely what you meant to respond
to that point.