# volatile -- what does it mean in relation to member functions?


### Bertin Colpron

Sep 21, 2002, 5:52:14 PM
Hi,

> /users/apavloff/test/test.cpp: In function `int main()':
> /users/apavloff/test/test.cpp:22: passing `volatile Foo' as `this'
> argument of

'volatile' has an effect similar to 'const' for constructed types. Through a
'volatile' object, you can only access 'volatile' members (except for native
member variables, see below).

> If volatile is similiar to const, why
>
> 1) Does it not complain about the access of v2 in Bar2()

Because the volatile qualifier doesn't apply the same way to native types and
to constructed types. On native types (such as 'int'), it simply tells the
compiler not to optimize away accesses to the variable. On constructed
types, it acts much like 'const': only 'volatile' members can be accessed.

> 2) Is there not a "volatile_cast"

const_cast can be used to get rid of the volatile qualifier as well.

I suggest the following article:
http://www.cuj.com/experts/1902/alexandr.htm?topic=experts

--Bertin

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

### Ali Cehreli

Sep 22, 2002, 10:09:41 AM
Alex Pavloff <BLAHap...@BLAHeason.com> wrote in message news:<jshnou0ekadu41c37...@4ax.com>...
> I've been doing some multithreaded programming recently and have been
> using the volatile keyword. I recently came across something I don't
> quite understand.
>
> (compiling with gcc 3.2)
>
> class Foo
> {
> public:
> void Bar()
> {
> v1 = 1;
> v2 = 2;
> }

Bar will be called for non-const and non-volatile Foo objects:

Foo foo;
foo.Bar();

> void Bar2() volatile
> {
> v1 = 1;
> v2 = 2;
> }

Bar2 will be called for non-const volatile Foo objects:

Foo volatile foo;
foo.Bar2();

> int v1;
> volatile int v2;

v2 is a member that "may change its value in ways not specified by the
language so that aggressive optimizations [by the compiler] must be
avoided." [from TC++PL by Bjarne Stroustrup.]

> };
>
> int main()
> {
> volatile Foo f;
> f.Bar();
> f.Bar2();

> }
>
> /users/apavloff/test/test.cpp: In function `int main()':
> /users/apavloff/test/test.cpp:22: passing `volatile Foo' as `this'
> argument of

That's correct. Foo::Bar can be called on non-const and non-volatile
objects.

> Obviously, volatile is a qualifier that is being discarded.

Kind of... My non-native English reads the message as saying that passing
'volatile Foo' as 'this' is an error, not that the qualifier is silently
discarded.

> If Bar and Bar2 were non-trivial functions, exactly what is the
> difference between the two? They're both going to be called, right?

>
> If volatile is similar to const, why

They are similar because they are both qualifiers.

> 1) Does it not complain about the access of v2 in Bar2()

Bar2 is non-const and v2 is non-const. No problem...

> 2) Is there not a "volatile_cast"

const_cast works for both qualifiers.

const and volatile qualifiers (cv-qualifiers) are used to pick which
member function to call for a given object. The following class has
four Bar functions that are dispatched according to cv-qualifiers:

#include <iostream>

void report(char const * message)
{
    std::cout << message << '\n';
}

class Foo
{
public:

    Foo() {}

    void Bar()
    {
        report("Bar");
        v1 = 1;
        v2 = 2;
    }

    void Bar() const
    {
        report("Bar const");

        // Cannot change members in 'const' function
        // v1 = 1;
        // v2 = 2;
    }

    void Bar() volatile
    {
        report("Bar volatile");
        v1 = 1;
        v2 = 2;
    }

    void Bar() const volatile
    {
        report("Bar const volatile");

        // Cannot change members in 'const' function
        // v1 = 1;
        // v2 = 2;
    }

    int v1;
    volatile int v2;
};

int main()
{
    Foo f;
    const Foo fc;
    volatile Foo fv;
    const volatile Foo fcv;

    f  .Bar();
    fc .Bar();
    fv .Bar();
    fcv.Bar();
}

Ali

### Thomas Mang

Sep 22, 2002, 10:18:19 AM
Alex Pavloff <BLAHap...@BLAHeason.com> wrote in message news:<jshnou0ekadu41c37...@4ax.com>...
> I've been doing some multithreaded programming recently and have been
> using the volatile keyword. I recently came across something I don't
> quite understand.
>
> (compiling with gcc 3.2)
>
> class Foo
> {
> public:
> void Bar()
> {
> v1 = 1;
> v2 = 2;
> }
> void Bar2() volatile
> {
> v1 = 1;
> v2 = 2;
> }
>
> int v1;
> volatile int v2;
> };
>
> int main()
> {
> volatile Foo f;
> f.Bar();
> f.Bar2();
> }
>
> /users/apavloff/test/test.cpp: In function `int main()':
> /users/apavloff/test/test.cpp:22: passing `volatile Foo' as `this'
> argument of
>
> Obviously, volatile is a qualifier that is being discarded.
>
> If Bar and Bar2 were non-trivial functions, exactly what is the
> difference between the two? They're both going to be called, right?
>
> If volatile is similar to const, why
>
> 1) Does it not complain about the access of v2 in Bar2()
> 2) Is there not a "volatile_cast"

volatile is only to a certain degree similar to const. Just as you can
call only const member functions on const objects, you can only call
volatile member functions on volatile objects.
That's why the call to Foo::Bar() is an error.

In Bar2(), there is no problem with writing to v2. Here, 'volatile'
doesn't mean anything like 'const'.
volatile is a hint to the compiler not to optimize aggressively, because
the value might be changed outside the compiler's view. It doesn't mean
the value cannot be changed at all.

"volatile_cast" is an interesting idea. I guess it is not provided because
it would be VERY dangerous (probably more dangerous than const_cast).
With a volatile_cast, you'd give the compiler permission to do
anything it likes with the code, as long as the results and
side effects remain the same. But you _cannot_ control what the
compiler does. And guess what happens when one thread relies on a
volatile object while the other is aggressively optimized, and both
access the same object at the same time. At best, a runtime crash.

cheers,

Thomas

### Alexander Terekhov

Sep 23, 2002, 11:15:23 AM

Alex Pavloff wrote:
>
> I've been doing some multithreaded programming recently and have been
> using the volatile keyword. ....

Well, "The intent of the "volatile" attribute is to change the code
generated by the compiler on references to memory tagged with that
attribute." More stuff on this can be found on google groups: ;-) ;-)

---
Searched Groups for Andrei group:comp.lang.c++ author:tere...@web.de.
Results 1 - 2 of about 6. Search took 0.09 seconds.

Sorted by relevance Sort by date

Re: thread safety of static initialization
... by Terekhov Attila, I don't give talks (I'm not in this business);
Andrei does. The "newsgroup posts that Terekhov cited" were written
by a) Aleksey Gurtovoy ... comp.lang.c++ - 11 Sep 2002 by Alexander
Terekhov - View Thread (43 articles)

Re: stl deques and "volatile"
Gerhard Prilmeier wrote: [... Opposed to that, Andrei Alexandrescu
programs: http://www.cuj.com ... comp.lang.c++ - 23 Aug 2002 by
Alexander Terekhov - View Thread (9 articles)"
---

regards,
alexander.

### Ron

Sep 24, 2002, 6:53:31 AM

This is a bit of a tangent, but I believe that article's premise to be
incorrect. It says, in part,

"Inside a critical section defined by a mutex, only one thread has
access. Consequently, inside a critical section, the executing code
has single-threaded semantics. The controlled variable is not volatile
anymore -- you can remove the volatile qualifier."

While it's true that only one thread at a time can access a
mutex-protected variable, nothing guarantees that one thread's changes
to the variable will become visible to the other thread...nothing,
that is, except a "volatile" declaration. Let's have an example, using
a nonvolatile int i and a mutex m. Imagine that i starts at 0:

-------------------------------------------------------------------------

Thread 1:                             Thread 2:

m .lock ();

// i is not volatile, so this may
// update a register instead of
// the memory location containing
// i:
++i;

m .unlock ();
                                      m .lock ();

                                      // Since i might not have been
                                      // updated by thread 1, this may
                                      // print "0", even though i is
                                      // mutex-protected.

                                      printf ("i = %d\n", i);

                                      m .unlock ();

I don't believe there's anything in the spec that requires a function
to write all its non-const variables to memory before calling another
function [1] (s.1.9(6) is as close as the spec appears to get). Thus,
nothing appears to guarantee that i will be updated before m .unlock
() returns in thread 1, or, more importantly, before m .lock ()
returns in thread 2.

-- Ron

[1] VC++ 5.x appears always to write a function's non-const variables
to memory before calling any function whose body is unavailable to the
compiler (e.g. externals, virtuals, via function pointers, etc.).

### Alexander Terekhov

Sep 24, 2002, 5:44:54 PM

Ron wrote:
>
> > I suggest the following article:
> > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
>
> This is a bit of a tangent, but I believe that article's premise to be
> incorrect.

That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.

> It says, in part,
>
> "Inside a critical section defined by a mutex, only one thread has
> access. Consequently, inside a critical section, the executing code
> has single-threaded semantics. The controlled variable is not volatile
> anymore -- you can remove the volatile qualifier."
>
> While it's true that only one thread at a time can access a
> mutex-protected variable, nothing guarantees that one thread's changes
> to the variable will become visible to the other thread...nothing,
> that is, except a "volatile" declaration.

That's incorrect.

> Let's have an example, using
> a nonvolatile int i and a mutex m. Imagine that i starts at 0:
>
> -------------------------------------------------------------------------
>
> m .lock (); ...
>
> // i is not volatile, so this may
> // update a register instead of
> // the memory location containing
> // i:
> ++i;
>
> m .unlock (); ...
>
> ... m .lock ();
>
> // Since i might not have been
> // updated by thread 1, this may
> // print "0", even though i is
> // mutex-protected.
>
> ... printf ("i = %d\n", i);
>
> ... m .unlock ();
>
> I don't believe there's anything in the spec that requires a function
> to write to memory all its non-const variables before calling another
> function [1] (s.1.9(6)) is as close as the spec appears to get). Thus,
> nothing appears to guarantee that i will be updated before m. unlock
> () returns in thread 1, or, more importantly, before m .lock ()

< somewhere, sometime, usenet, Mr.B >

"....
> - when the 'volatile' keyword must be used in

Never in PORTABLE threaded programs. The semantics of the C and
C++ "volatile" keyword are too loose, and insufficient, to have
any particular value with threads. You don't need it if you're
using portable synchronization (like a POSIX mutex or semaphore)
because the semantics of the synchronization object provide the
visibility guarantees you need.

The only use for "volatile" is in certain non-portable
"optimizations" to synchronize at (possibly) lower cost in
certain specialized circumstances. That depends on knowing and
understanding the specific semantics of "volatile" under your
particular compiler, and what other machine-specific steps you
might need to take. (For example, using "memory barrier"
builtins or assembly code.)

In general, you're best sticking with POSIX synchronization, in
which case you've got no use at all for "volatile". That is,
unless you have some existing use for the feature having nothing
to do with threads, such as in conjunction with longjmp(), or in
an asynchronous signal handler, or when
accessing hardware device registers.
...."

regards,
alexander.

### Sergey P. Derevyago

Sep 24, 2002, 6:16:40 PM
Alexander Terekhov wrote:
> Well, "The intent of the "volatile" attribute is to change the code
> generated by the compiler on references to memory tagged with that
> attribute." More stuff on this can be found on google groups: ;-) ;-)
In particular, "Rationale for American National Standard for Information
Systems - Programming Language - C" http://www.lysator.liu.se/c/rat/title.html
has the following explanations:

3.5.3 Type qualifiers

The Committee has added to C two type qualifiers: const and volatile.
Individually and in combination they specify the assumptions a compiler can
and must make when accessing an object through an lvalue.

The syntax and semantics of const were adapted from C++; the concept itself
has appeared in other languages. volatile is an invention of the Committee;
it follows the syntactic model of const.

Type qualifiers were introduced in part to provide greater control over
optimization. Several important optimization techniques are based on the
principle of "cacheing": under certain circumstances the compiler can
remember the last value accessed (read or written) from a location, and use
this retained value the next time that location is read. (The memory, or
"cache", is typically a hardware register.) If this memory is a machine
register, for instance, the code can be smaller and faster using the register
rather than accessing external memory.

The basic qualifiers can be characterized by the restrictions they impose on
access and cacheing:

const
No writes through this lvalue. In the absence of this qualifier,
writes may occur through this lvalue.
volatile
No cacheing through this lvalue: each operation in the abstract
semantics must be performed. (That is, no cacheing assumptions
may be made, since the location is not guaranteed to contain any
previous value.) In the absence of this qualifier, the contents
of the designated location may be assumed to be unchanged (except
for possible aliasing.)

A translator design with no cacheing optimizations can effectively ignore the
type qualifiers, except insofar as they affect assignment compatibility.

It would have been possible, of course, to specify a nonconst keyword instead
of const, or nonvolatile instead of volatile. The senses of these concepts in
the Standard were chosen to assure that the default, unqualified, case was the
most common, and that it corresponded most clearly to traditional practice in
the use of lvalue expressions.

Four combinations of the two qualifiers are possible; each defines a useful set
of lvalue properties. The next several paragraphs describe typical uses of
these qualifiers.

The translator may assume, for an unqualified lvalue, that it may read or
write the referenced object, that the value of this object cannot be changed
except by explicitly programmed actions in the current thread of control, but
that other lvalue expressions could reference the same object.

const is specified in such a way that an implementation is at liberty to put
const objects in read-only storage, and is encouraged to diagnose obvious
attempts to modify them, but is not required to track down all the subtle ways
that such checking can be subverted. If a function parameter is declared
const, then the referenced object is not changed (through that lvalue) in the
body of the function --- the parameter is read-only.

A static volatile object is an appropriate model for a memory-mapped I/O
register. Implementors of C translators should take into account relevant
hardware details on the target systems when implementing accesses to volatile
objects. For instance, the hardware logic of a system may require that a
two-byte memory-mapped register not be accessed with byte operations; a
compiler for such a system would have to assure that no such instructions were
generated, even if the source code only accesses one byte of the register.
Whether read-modify-write instructions can be used on such device registers
must also be considered. Whatever decisions are adopted on such issues must be
documented, as volatile access is implementation-defined. A volatile object
is an appropriate model for a variable shared among multiple processes.

A static const volatile object appropriately models a memory-mapped input
port, such as a real-time clock. Similarly, a const volatile object models a
variable which can be altered by another process but not by this one.

Although the type qualifiers are formally treated as defining new types they
actually serve as modifiers of declarators. Thus the declarations

const struct s {int a,b;} x;
struct s y;

declare x as a const object, but not y. The const property can be associated
with the aggregate type by means of a type definition:

typedef const struct s {int a,b;} stype;
stype x;
stype y;

In these declarations the const property is associated with the declarator
stype, so x and y are both const objects.

The Committee considered making const and volatile storage classes, but this
would have ruled out any number of desirable constructs, such as const members
of structures and variable pointers to const types.

A cast of a value to a qualified type has no effect; the qualification
(volatile, say) can have no effect on the access since it has occurred prior
to the cast. If it is necessary to access a non-volatile object using volatile
semantics, the technique is to cast the address of the object to the
appropriate pointer-to-qualified type, then dereference that pointer.

2Alexander: I hope you didn't miss the mention of multiple processes. ;-) ;-)
--
With all respect, Sergey. http://cpp3.virtualave.net/
mailto : ders at skeptik.net

### Ron

Sep 25, 2002, 6:13:58 PM
Alexander Terekhov <tere...@web.de> wrote in message news:<3D906223...@web.de>...

> Ron wrote:
> >
> > > I suggest the following article:
> > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
> >
> > This is a bit of a tangent, but I believe that article's premise to be
> > incorrect.
>
> That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.
>
> > It says, in part,
> >
> > "Inside a critical section defined by a mutex, only one thread has
> > access. Consequently, inside a critical section, the executing code
> > has single-threaded semantics. The controlled variable is not volatile
> > anymore -- you can remove the volatile qualifier."
> >
> > While it's true that only one thread at a time can access a
> > mutex-protected variable, nothing guarantees that one thread's changes
> > to the variable will become visible to the other thread...nothing,
> > that is, except a "volatile" declaration.
>
> That's incorrect.

What's incorrect? The statement that "volatile" guarantees interthread
visibility? Yes, strictly speaking, that statement is incorrect
because the spec doesn't speak in terms of threads, only in terms of
"observable behavior" (s.1.9). However, s.1.9(6) says that

------
The observable behavior of the abstract machine is its sequence of
reads and writes to volatile data and calls to library I/O functions.
------

This raises the question: to whom is this behavior "observable"? A
user or another program can observe the results of "calls to library
I/O functions", but who can observe "reads and writes to volatile
data"? Only the abstract machine hardware or some other program
running in another stream of execution. We would, I think, be
justified in calling that stream of execution a "thread". Thus,
s.1.9(6) seems to indicate that the use of volatile guarantees
interthread visibility.

> < somewhere, sometime, usenet, Mr.B >

Huh?

> "....
> > - when the 'volatile' keyword must be used in
>

> The semantics of the C and C++ "volatile" keyword are too loose, and
> insufficient, to have any particular value with threads.

I'm not sure this is true. See above re: s.1.9(6).

> In general, you're best sticking with POSIX synchronization,

Yes, as a general rule.

> in which case you've got no use at all for "volatile". That is,
> unless you have some existing use for the feature having nothing
> longjmp(), or in an asynchronous signal handler, or when
> accessing hardware device registers.
> ...."

From the point of view of the spec., hardware device registers are
just one possible "observer" of a program's "observable behavior".
Nothing in the spec. distinguishes their observations of a program's
"observable behavior" from the observations of independent threads of
execution. And, of course, hardware registers are, in fact, observed
by a "thread" of execution within the hardware.

-- Ron

### James Kanze

Sep 26, 2002, 10:23:44 AM
rcr...@ictv.com (Ron) wrote in message

> Alexander Terekhov <tere...@web.de> wrote in message
> news:<3D906223...@web.de>...
> > Ron wrote:

> > > > I suggest the following article:
> > > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts

> > > This is a bit of a tangent, but I believe that article's premise
> > > to be incorrect.

> > That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.

The premise is poorly stated, but the basic idea is interesting, and
certainly correct. The use of volatile tends to confuse people, because
Andrei is only using it for its interaction with the type system, and
not in any way for any particular semantics it happens to have. (You'll
notice that he uses locks to enforce the actual shared semantics.)

> > > It says, in part,

> > > "Inside a critical section defined by a mutex, only one thread has
> > > access. Consequently, inside a critical section, the executing
> > > code has single-threaded semantics. The controlled variable is not
> > > volatile anymore -- you can remove the volatile qualifier."

> > > While it's true that only one thread at a time can access a
> > > mutex-protected variable, nothing guarantees that one thread's
> > > changes to the variable will become visible to the other
> > > thread...nothing, that is, except a "volatile" declaration.

> > That's incorrect.

> What's incorrect? The statement that "volatile" guarantees interthread
> visibility?

No. That statement is incorrect, but that statement isn't present.

What is present is that only one thread at a time can access a
mutex-protected variable. And both freeing and acquiring a mutex (at
least in Posix systems) guarantee memory barriers: when process A frees
the mutex, it is guaranteed that all previous changes are globally
visible, and when process B acquires the mutex, it is guaranteed that it
will see all globally visible changes which precede the acquisition.
(Note that both are necessary. Ensuring that the changes are visible
globally doesn't ensure that other processes will use the global state,
rather than their own.)

> Yes, strictly speaking, that statement is incorrect because the spec
> doesn't speak in terms of threads, only in terms of "observable
> behavior" (s.1.9). However, s.1.9(6) says that

> ------
> The observable behavior of the abstract machine is its sequence of
> reads and writes to volatile data and calls to library I/O functions.
> ------

> This raises the question: to whom is this behavior "observable"?

The standard doesn't say:-). The standard says that the read and writes
*must* take place, but it doesn't define what is meant by read or write
in this context. Is a write to the cache memory sufficient? Or must
the process "write-through" to global memory?

If the implementation guarantees that all volatile accesses are direct
to global (shared) memory, through the cache, then volatile might be of
some use with threads. I don't know of any implementations which make
this guarantee, however.

> A user or another program can observe the results of "calls to library
> I/O functions", but who can observe "reads and writes to volatile
> data"?

It is the act of reading or writing that is observable, not the data.

A typical example might be a write only memory mapped IO port. No one
can ever "observe" the value written after the fact, but the write has
effects; the write itself is observable.

Do compilers actually generate extra code to ensure memory barriers
around volatile accesses? I don't think so, but I'll admit that I
haven't verified.

> Only the abstract machine hardware or some other program running in
> another stream of execution. We would, I think, be justified in
> calling that stream of execution a "thread". Thus, s.1.9(6) seems to
> indicate that the use of volatile guarantees interthread visibility.

> > < somewhere, sometime, usenet, Mr.B >

> Huh?

> > "....
> > > - when the 'volatile' keyword must be used in

> > The semantics of the C and C++ "volatile" keyword are too loose,
> > and insufficient, to have any particular value with threads.

> I'm not sure this is true. See above re: s.1.9(6).

> > In general, you're best sticking with POSIX synchronization,

> Yes, as a general rule.

> > in which case you've got no use at all for "volatile". That is,
> > unless you have some existing use for the feature having nothing to
> > do with threads, such as in conjunction with longjmp(), or in an
> > asynchronous signal handler, or when accessing
> > hardware device registers. ...."

> From the point of view of the spec., hardware device registers are
> just one possible "observer" of a program's "observable behavior".

The standard doesn't talk of "observers". It defines two things
(accessing volatile variables, and calls to library IO functions) as
observable behavior. And it doesn't define what it means by "access".

This looseness is intentional. The goal is to leave it up to the
implementation to define something reasonable in its context. Support
for memory mapped IO was one of the intentions, but of course, is not
mentioned in the standard. Support for threading was not one of the
initial intentions (since threads didn't exist, or were extremely rare
at the time), but one could argue that it is compatible with the
original intentions (which also included things like "while ( !
interruptArrived )"). The argument would be stronger if someone could
actually show something useful that could be done with volatile in a
threaded program.
Posix defines the relationship between threads, including the
signification of threads in a C program. It places no special
requirements on volatile with regards to threads.

> Nothing in the spec. distinguishes their observations of a program's
> "observable behavior" from the observations of independent threads of
> execution. And, of course, hardware registers are, in fact, observed
> by a "thread" of execution within the hardware.

And, of course, "writing" to a hardware register doesn't necessarily
place the value written anywhere. There is no requirement that the
value written be visible by anyone. Just a requirement that the write
has occurred, for some implementation defined meaning of write.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

### Alexander Terekhov

Sep 26, 2002, 4:27:12 PM

"Sergey P. Derevyago" wrote:

[...quotes from C Rationale about volatiles/"multiple processes"...]

> 2Alexander: I hope you didn't miss the mention of multiple
> processes. ;-) ;-)

The C Rationale you've quoted is wrong. It doesn't reflect the state
of affairs on MODERN hardware/systems. I've mentioned this "defect"
several times in the past and on several netnews groups. The last time,
my article with more direct pointers and quotes was rejected: "[look
like a flame or is off-topic]". (It's now available on a request basis
only ;-) ;-) )

regards,
alexander.

### Alexander Terekhov

Sep 27, 2002, 8:06:04 AM

James Kanze wrote:
>
> rcr...@ictv.com (Ron) wrote in message
> > Alexander Terekhov <tere...@web.de> wrote in message
> > news:<3D906223...@web.de>...
> > > Ron wrote:
>
> > > > > I suggest the following article:
> > > > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
>
> > > > This is a bit of a tangent, but I believe that article's premise
> > > > to be incorrect.
>
> > > That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.
>
> The premise is poorly stated, but the basic idea is interesting, and
> certainly correct. The use of volatile tends to confuse people, because
> Andrei is only using it for its interaction with the type system, and
> not in any way for any particular semantics it happens to have. (You'll
> notice that he uses locks to enforce the actual shared semantics.)

< mostly copy&paste from my other c.l.c++ posting >

Well, the "general" problem of casting away volatility(*) aside,

"....
Although both C and C++ Standards are conspicuously silent when it
comes to threads, they do make a little concession to multithreading,
in the form of the volatile keyword. "

That's wrong.

"Just like its better-known counterpart const, volatile is a type
modifier. It's intended to be used in conjunction with variables that
are accessed and modified in different threads. "

That's wrong.

"Basically, without volatile, either writing multithreaded programs
becomes impossible, or the compiler wastes vast optimization
opportunities. "

That's wrong.

"An explanation is in order.

<snip>

So all you have to do to make Gadget's Wait/Wakeup combo work is to
qualify flag_ appropriately:

class Gadget
{
public:
    ... as above ...
private:
    volatile bool flag_;
}; "

That's wrong.

"Most explanations of the rationale and usage of volatile stop here
and advise you to volatile-qualify the primitive types that you use
in multiple threads. "

Well, that's also kinda wrong.

regards,
alexander.

(*) http://www.tru64unix.compaq.com/docs/base_doc/DOCUMENTATION/V51_HTML/ARH9RBTE/DOCU0008.HTM#gran_avoid
(see "3.7.5.2 Maintaining the Composite Data Object's Layout",
"-strong-volatile", for example)

### Ron

Sep 28, 2002, 5:22:29 AM
ka...@gabi-soft.de (James Kanze) wrote in message

> > > > > I suggest the following article:
> > > > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
>
> > > > This is a bit of a tangent, but I believe that article's premise
> > > > to be incorrect.
>
> > > That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.
>
> The premise is poorly stated, but the basic idea is interesting, and
> certainly correct. The use of volatile tends to confuse people, because
> Andrei is only using it for its interaction with the type system, and
> not in any way for any particular semantics it happens to have.

The article says

------
Outside a critical section, any thread might interrupt any other at
any time; there is no control, so consequently variables accessible
from multiple threads are volatile. This is in keeping with the
original intent of volatile -- that of preventing the compiler
from unwittingly caching values used by multiple threads at once.

Inside a critical section defined by a mutex, only one thread has
access. Consequently, inside a critical section, the executing code
has single-threaded semantics. The controlled variable is not volatile
anymore -- you can remove the volatile qualifier.

------

The second paragraph implies that mutex protection, by itself, ensures
that changes to a protected variable are visible to all the threads
using it. This is true only for Posix (and Posix-like) mutexing
frameworks.

> > Yes, strictly speaking, that statement is incorrect because the spec
> > doesn't speak in terms of threads, only in terms of "observable
> > behavior" (s.1.9). However, s.1.9(6) says that
>
> > ------
> > The observable behavior of the abstract machine is its sequence of
> > reads and writes to volatile data and calls to library I/O functions.
> > ------
>
> > This raises the question: to whom is this behavior "observable"?
>
> The standard doesn't say:-). The standard says that the read and writes
> *must* take place, but it doesn't define what is meant by read or write
> in this context. Is a write to the cache memory sufficient? Or must
> the process "write-through" to global memory?
>
> If the implementation guarantees that all volatile accesses are direct
> to global (shared) memory, through the cache, then volatile might be of
> some use with threads. I don't know of any implementations which make
> this guarantee, however.

The spec does not, of course, mention caches. All it says is that
reads and writes to volatile variables are part of the abstract
machine's "observable behavior". As for real implementations, at least
the ones I have examined write a volatile value through to the memory
location to which the variable is assigned. Cache consistency is not
an issue for uniprocessor-based implementations [1], and
multiprocessor-based implementations generally maintain cache
consistency in hardware.

> > A user or another program can observe the results of "calls to library
> > I/O functions", but who can observe "reads and writes to volatile
> > data"?
>
> It is the act of reading or writing that is observable, not the data.
>
> A typical example might be a write only memory mapped IO port. No one
> can ever "observe" the value written after the fact, but the write has
> effects; the write itself is observable.

I think you're splitting this hair too finely. The value written to a
memory-mapped I/O port is observable to the hardware being addressed;
if it were not, the hardware would be unable to tell the difference
between a write setting the "go" bit and a write clearing it.

> Do compilers actually generate extra code to ensure memory barriers
> around volatile accesses? I don't think so, but I'll admit that I
> haven't verified.

VC++ does. I have not checked other compilers, but I assume that they
do so as well. Compilers that do not do so may be used to write device
drivers only with special RTL support.

> The standard doesn't talk of "observers". It defines two things
> (accessing volatile variables, and calls to library IO functions) as
> observable behavior. And it doesn't define what it means by "access".
>
> This looseness is intentional. The goal is to leave it up to the
> implementation to define something reasonable in its context. Support
> for memory mapped IO was one of the intentions, but of course, is not
> mentioned in the standard. Support for threading was not one of the
> initial intentions (since threads didn't exist, or were extremely rare
> at the time).

Really? Even in those ancient days, the concept of a main thread and
an interrupt thread was well-known. There's almost no conceptual or
actual difference between the memory visibility and synchronization
requirements for a single-user system with interrupt-driven I/O and
those for a multiuser (or multithreaded) system. In the former, the
device interrupt essentially creates a minimal-context thread that
does things that affect the "main" thread. In the latter, the
scheduler creates and manages the threads, usually using a clock
interrupt to do so.

> but one could argue that it is compatible with the
> original intentions (which also included things like "while ( !
> interruptArrived )"). The argument would be stronger if someone could
> actually show something useful that could be done with volatile
> supposing it had these semantics.

I believe I've shown that. 'Volatile' (or something, like Posix synch
objects, that ensures interthread visibility of changes to variables)
is necessary, but not sufficient, for multithreaded programming. The
other necessary item is, of course, a set of atomic, interlocked
synchronization primitives. Posix synch objects perform both of these
functions, but other synch object implementations provide only synch
primitives, and assume that interthread visibility is handled by
'volatile'.

> And, of course, "writing" to a hardware register doesn't necessarily
> place the value written anywhere.

Uh, yes it does. In most implementations, it places the value on the
processor's data bus, and thence (possibly through expansion busses)
finally onto the hardware's internal bus.

> There is no requirement that the value written be visible by anyone. Just a
> requirement that the write has occurred, for some implementation defined
> meaning of write.

As I noted above, this isn't quite correct. The hardware must be able
to observe the value written. If it can't, it will be unable to
distinguish between a "launch missile" command and a "disable missile"
command.

-- Ron

[1] Unless they're really weird.

### Shannon Barber

Sep 28, 2002, 10:50:06 AM9/28/02
to
rcr...@ictv.com (Ron) wrote in message news:<c668c1f8.02092...@posting.google.com>...
> > I suggest the following article:
> > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
>
> This is a bit of a tangent, but I believe that article's premise to be
> incorrect. It says, in part,
>
> "Inside a critical section defined by a mutex, only one thread has
> access. Consequently, inside a critical section, the executing code
> has single-threaded semantics. The controlled variable is not volatile
> anymore -- you can remove the volatile qualifier."
>
> While it's true that only one thread at a time can access a
> mutex-protected variable, nothing guarantees that one thread's changes
> to the variable will become visible to the other thread...nothing,
> that is, except a "volatile" declaration.

Several things have to make this guarantee; otherwise it would be
impossible to write code. At the very least, the write-back has to
happen just prior to when the function call returns. IANASL, so I
cannot tell you everything that is supposed to make this guarantee,
but by induction, some things must.

Second, you have conflated the two uses of volatile. Your
criticism is directed towards the use of volatile on native types;
Andrei's article is about using volatile on structures and classes. The
semantics are slightly different.
Eventually the technique Andrei describes will boil down to twiddling
native types -- and, as you point out, it will be important that those
types remain volatile to guarantee correct behavior. You're only
supposed to const_cast away the volatile-ness of constructed types
after acquiring their mutex. Hopefully, no one has a mutex for each
native variable ;)

### Ron

Sep 28, 2002, 9:15:29 PM9/28/02
to
> > > I suggest the following article:

> > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts
> >
> > This is a bit of a tangent, but I believe that article's premise to be
> > incorrect. It says, in part,
> >
> > "Inside a critical section defined by a mutex, only one thread has
> > access. Consequently, inside a critical section, the executing code
> > has single-threaded semantics. The controlled variable is not volatile
> > anymore &#8212; you can remove the volatile qualifier."
> >
> > While it's true that only one thread at a time can access a
> > mutex-protected variable, nothing guarantees that one thread's changes
> > to the variable will become visible to the other thread...nothing,
> > that is, except a "volatile" declaration.
>
> Several things have to make this guarantee, otherwise it'd be
> impossible to write code.

Please explain why the absence of this guarantee makes it impossible
to write code. I noted that it would be impossible if neither 'volatile'
nor something else (like the RTL in the case of Posix synch primitives)
guaranteed interthread visibility of changes.

> You're only
> supposed to const_cast away the volatile-ness of constructed types
> after acquiring their mutex. Hopefully, no one has a mutex for each
> native variable ;)

Again, acquiring a mutex does not, of itself, guarantee interthread
visibility, unless your mutexing implementation says so, or unless
your compiler does something else -- like implicitly treating all
variables as volatile or flushing all variables to memory on function
calls -- to guarantee it. Some compilers (like VC++) appear to do this
last thing: VC++ flushes all variables to memory prior to calling any
function whose body it cannot access, even if the variables in
question are not volatile. However, the spec certainly does not
require this behavior.

-- Ron

### Johan Johansson

Sep 29, 2002, 1:22:19 AM9/29/02
to
I think this entire quote is in error. AFAIK, volatile does not
necessarily have any effect whatsoever at run time. POSIX
synchronization cannot have any effect at compile time, unless you're
using a very tightly coupled pthreads implementation and compiler.

My point is that declaring a variable volatile makes the compiler
generate code that actually writes to memory, rather than leaving the
value held in a register. What POSIX synchronization does is make those
writes visible to other threads.

You can synchronize all you want; it won't help if the compiler has not
generated a write to memory.

j

Alexander Terekhov wrote:

> > - when the 'volatile' keyword must be used in
>
> Never in PORTABLE threaded programs. The semantics of the C and
> C++ "volatile" keyword are too loose, and insufficient, to have
> any particular value with threads. You don't need it if you're
> using portable synchronization (like a POSIX mutex or semaphore)
> because the semantics of the synchronization object provide the
> consistency you need between threads.
>
> The only use for "volatile" is in certain non-portable
> "optimizations" to synchronize at (possibly) lower cost in
> certain specialized circumstances. That depends on knowing and
> understanding the specific semantics of "volatile" under your
> particular compiler, and what other machine-specific steps you
> might need to take. (For example, using "memory barrier"
> builtins or assembly code.)
>
> In general, you're best sticking with POSIX synchronization, in
> which case you've got no use at all for "volatile". That is,
> unless you have some existing use for the feature having nothing to
> do with threads, such as after longjmp(), or in an asynchronous
> signal handler, or when accessing hardware device registers.

### Alexander Terekhov

Sep 30, 2002, 5:32:56 PM9/30/02
to

Ron wrote:
[...]

> > This looseness is intentional. The goal is to leave it up to the
> > implementation to define something reasonable in its context. Support
> > for memory mapped IO was one of the intentions, but of course, is not
> > mentionned in the standard. Support for threading was not one of the
> > initial intentions (since threads didn't exist, or were extremely rare
> > at the time)
>
> Really? Even in those ancient days, the concept of a main thread and
> an interrupt thread was well-known. There's almost no conceptual or
> actual difference between the memory visibility and synchronization
> requirements for a single-user system with interrupt-driven I/O and
> those for a multiuser (or multithreaded) system.

That's nothing but a sort of "wishful thinking", I'm afraid.

< Subject: Re: "memory location", comp.std.c, abridged >

well, i was under impression that "sig_atomic_t" alone
does not guarantee thread (or even signal) safety..
only the combination of _static_storage_duration_,
_volatile_ and _sig_atomic_t makes it safe.. and only
for signal handlers.. i could imagine an impl. which
would just disable signal delivery while accessing
"static volatile sig_atomic_t" variable (allocated
in some special storage region - for static volatiles
sig_atomic_t's only) or would do something else which
would NOT work with respect to threads.

or am i missing something?
---

> : What is needed is something similar to the Java memory model requirement
> : that values cannot "come out of thin air"

> (i.e. roughly speaking, a value
> : read from any variable must have been previously written to that variable,
> : with some additional ordering constraints). This has little or nothing to do
> : with the semantics of sig_atomic_t (or volatile), which the C99 Standard
> : only defines for single-threaded programs.
>
> Moreover, the standard only guarantees atomicity of writes by signal
> handlers to data

static data

> of type sig_atomic_t, and only when the object is
> also declared to be volatile. Objects of type sig_atomic_t are not
> guaranteed to be atomic in any other context.

AFAICS, it's even worse than that... in a multithreaded
application that happens to use asynchronous signals [vs.
sigwait and/or SIGEV_THREAD delivery] with static volatile
sig_atomic_t vars, you'd have to ensure that such signals
could only be "delivered" to the ONE SINGLE thread that accesses
the corresponding volatile sig_atomic_t variable(s). You just can't
have such signal(s) delivered to any other thread.

regards,
alexander.

### Alexander Terekhov

Sep 30, 2002, 5:33:24 PM9/30/02
to

Johan Johansson wrote:
[ top posting repaired ]

< somewhere, sometime, usenet, Mr. B.,... uhmm, well, okay:
From: Dave Butenhof (bute...@zko.dec.com)
Subject: Re: C++, volatile member functions, and threads
Date: 1997/07/03 >

---
The use of "volatile" is not sufficient to ensure proper memory
visibility or synchronization between threads. The use of a mutex is
sufficient, and, except by resorting to various non-portable machine
code alternatives, (or more subtle implications of the POSIX memory
rules that are much more difficult to apply generally, as explained in
my previous post), a mutex is NECESSARY.

Therefore, as Bryan explained, the use of volatile accomplishes nothing
but to prevent the compiler from making useful and desirable
optimizations, providing no help whatsoever in making code "thread
safe". You're welcome, of course, to declare anything you want as
"volatile" -- it's a legal ANSI C storage attribute, after all. Just
don't expect it to solve any thread synchronization problems for you.

Because of this flaw in reasoning, Eric's EXAMPLE of his CONCEPT was
neither correct nor an optimization.

I'd like to stop beating this to death. It's not fair to Eric, who
merely had the misfortune to be someone (like probably 95% of everyone
else) who didn't understand the intricacies of SMP memory systems and
thread synchronization. He proposed a shortcut, he was corrected, and I
suspect he (and certainly I) would like to move on to other matters and
stop dragging this (and him) through the dust. Please?
---

regards,
alexander.

### Shannon Barber

Sep 30, 2002, 6:40:20 PM9/30/02
to
rcr...@ictv.com (Ron) wrote

> > Several things have to make this guarantee, otherwise it'd be
> > impossible to write code.
>
> Please explain why the absence of this guarantee makes it impossible
> to write code. I noted that it would be impossible if neither 'volatile'
> nor something else (like the RTL in the case of Posix synch primitives)
> guaranteed interthread visibility of changes.

Function calls must meet the contract of their declaration.
Suppose:
void __stdcall some_func(Data* data)
{
    // mess with data; the optimizer is free to keep values in registers
    ...
    // before returning, the optimizer must write the results back into
    // *data; they cannot stay in the registers
    return;
}

If it didn't, you couldn't take the function's address and call it at
will. DLLs/shared objects wouldn't be feasible.

Similarly, when exiting a method call, it must update the instance
data. The next method that is invoked expects the data to be
accessible through this->, not sitting in a register. Even if the
call is inlined, it still needs to store the data in the instance,
because the next method invoked will expect the data there. Results
returned by one function and fed as an argument into the next could
stay in a register, but if a method mutates its object, it must commit
the mutation.

Now, if the method releases the mutex prior to committing these
changes we have a problem. Is this the case you are concerned about?
This could only happen if the compiler shuffled cached data from one
register to another in-between inlined function/method calls (during
which time a mutex would be released and re-acquired). Such a compiler
is unsound at best.

I suppose a pathological compiler could generate (potentially) unique
code for every invocation of each method, so that they know what data
is cached in the registers. In this case Andrei's method would fail
to function as desired.

### Johan Johansson

Sep 30, 2002, 8:18:45 PM9/30/02
to
Alexander Terekhov wrote:

>>> In general, you're best sticking with POSIX synchronization, in
>>> which case you've got no use at all for "volatile". That is,
>>> unless you have some existing use for the feature having nothing to
>>> do with threads, such as after longjmp(), or in an asynchronous
>>> signal handler, or when accessing hardware device registers.

[...]

>>My point is that declaring a variable volatile makes the compiler
>>generate code that actually writes to memory, rather than leaving the
>>value held in a register. What POSIX synchronization does is make
>>those writes visible to other threads.
>>
>>You can synchronize all you want; it won't help if the compiler has
>>not generated a write to memory.

[...]

> The use of "volatile" is not sufficient to ensure proper memory
> visibility or synchronization between threads. The use of a mutex is
> sufficient, and, except by resorting to various non-portable machine
> code alternatives, (or more subtle implications of the POSIX memory
> rules that are much more difficult to apply generally, as explained in
> my previous post), a mutex is NECESSARY.

I have no idea how your response relates to my post. I didn't ask who
posted the quoted statements on volatile. I stated that a mutex is not
sufficient. I never claimed that volatile would be sufficient. Perhaps
you meant to respond to some other post?

> I'd like to stop beating this to death. It's not fair to Eric, who
> merely had the misfortune to be someone (like probably 95% of everyone
> else) who didn't understand the intricacies of SMP memory systems and
> thread synchronization. He proposed a shortcut, he was corrected, and I
> suspect he (and certainly I) would like to move on to other matters and
> stop dragging this (and him) through the dust. Please?

I don't know who Eric is, although I can guess that he started the
interest in beating anything. I would however appreciate a comment on my
statement that volatile is necessary since it ensures that the compiler
generates code that actually writes to memory by disabling certain
optimizations. Without volatile your compiler would need to know that
you are using mutexes, which in general is infeasible.

j

### Alexander Terekhov

Oct 1, 2002, 9:26:01 AM10/1/02
to

Johan Johansson wrote:
[...]

> I don't know who Eric is, although I can guess that he started the
> interest in beating anything.

Eric is the {probably first} one who started the following c.p.t.
thread (back in 1997):

---
From: Eric M. Hopper (hop...@omnifarious.mn.org)
Subject: C++, volatile member functions, and threads

After realizing that you could make member functions volatile as
well as const, it occurred to me that 'thread-safe' member functions
ought to be marked as volatile, and that objects shared between several
threads ought to be declared as volatile.

Perhaps the ability to declare member functions volatile is
limited to IBM's C++/Set for OS/2, or g++ 2.7.2.1, but I would guess not.

My reasoning on this is that if you have a member function that
can be called by multiple threads, it should treat the member variables
as volatile (because multiple threads might be accessing them) until it
obtains a mutex or lock of some sort on the member variables.

Declaring objects as volatile that are accessed by multiple
threads is a simple extension of declaring variables of basic types to
be volatile when they are accessed by multiple threads.

This doesn't seem to be current practice. Is there a good
reason for this that I'm missing, or is it simply an oversight?

Have fun (if at all possible),
--
This space for rent. Prospective tenants must be short, bright, and
witty. No pets allowed. If you wish to live here, or know someone who
does, you may contact me (Eric Hopper) via E-Mail at
hop...@omnifarious.mn.org
-- Eric Hopper, owner and caretaker.
---

[...]

> I would however appreciate a comment on my statement that volatile
> is necessary since it ensures that the compiler generates code that
> actually writes to memory by disabling certain optimizations.

The C/C++ "volatile" specifier (an "implementation-defined" thing,
to some extent, BTW) is NOT necessary to do anything useful in
the context of sharing memory between threads/processes.

> Without volatile your compiler would need to know that
> you are using mutexes, which in general is infeasible.

What makes you think so?

regards,
alexander.

### Alexander Terekhov

Oct 1, 2002, 9:28:06 AM10/1/02
to

Shannon Barber wrote:
[...]
> In this case Andrei's method...

Heck, well, just for your (&& Andrei's) information:

----

From: Eric M. Hopper (hop...@omnifarious.mn.org)
Subject: C++, volatile member functions, and threads

Date: 1997/06/28

After realizing that you could make member functions volatile as
well as const, it occurred to me that 'thread-safe' member functions
ought to be marked as volatile, and that objects shared between
several threads ought to be declared as volatile.

....
<another posting from Eric>
....
> Declaring your variables volatile will have no useful effect, and will
> simply cause your code to run a *lot* slower when you turn on optimization.

*nod* That's why you should use const_cast, and call
non-volatile versions after you've obtained a mutex.
....
<yet another posting from Eric>
....
I also still maintain that volatile might still be useful simply
as a tag stating whether or not you intended to have multiple threads
access an object. Having volatile versions of functions simply
acquire the mutex, and call a non-volatile version, would be potentially
useful, and wouldn't incur a performance hit.
....
----

Sounds familiar, no? ;-)

regards,
alexander.

--
"Ignorance of prior art doesn't mean there wasn't prior art.
It just means that you get entertaining usenet postings and
not so entertaining..." CUJ articles.

### James Kanze

Oct 1, 2002, 12:02:26 PM10/1/02
to
Alexander Terekhov <tere...@web.de> wrote in message
news:<3D934ECB...@web.de>...
> James Kanze wrote:

> > rcr...@ictv.com (Ron) wrote in message
> > > Alexander Terekhov <tere...@web.de> wrote in message
> > > news:<3D906223...@web.de>...
> > > > Ron wrote:

> > > > > > I suggest the following article:
> > > > > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts

> > > > > This is a bit of a tangent, but I believe that article's
> > > > > premise to be incorrect.

> > > > That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.

> > The premise is poorly stated, but the basic idea is interesting,
> > and certainly correct. The use of volatile tends to confuse
> > people, because Andrei is only using it for its interaction with
> > the type system, and not in any way for any particular semantics it
> > happens to have. (You'll notice that he uses locks to enforce the
> > actual shared semantics.)

> Well, the "general" problem of casting away volatility(*) aside,

Either the article was not the one I thought it was, or you have just
quoted lead-in material. Andrei has published a way of using volatile
to ensure thread safety. What is certain is 1) before writing the
article, Andrei didn't really understand the semantics of volatile in
threads (or perhaps generally), and 2) his technique didn't depend on
these semantics, so his misunderstanding didn't affect the usefulness of
the technique. I would encourage looking beyond some rather naïve
statements Andrei made concerning volatile, and studying the actual
mechanism which he proposed.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

### James Kanze

Oct 1, 2002, 12:03:48 PM10/1/02
to
> ka...@gabi-soft.de (James Kanze) wrote in message
> > > > > > I suggest the following article:
> > > > > > http://www.cuj.com/experts/1902/alexandr.htm?topic=experts

> > > > > This is a bit of a tangent, but I believe that article's
> > > > > premise to be incorrect.

> > > > That's correct. ;-) Andrei[/CUJ] should really pull it off, IMO.

> > The premise is poorly stated, but the basic idea is interesting,
> > and certainly correct. The use of volatile tends to confuse
> > people, because Andrei is only using it for its interaction with
> > the type system, and not in any way for any particular semantics it
> > happens to have.

> The article says

> ------
> Outside a critical section, any thread might interrupt any other at
> any time; there is no control, so consequently variables accessible
> from multiple threads are volatile. This is in keeping with the
> original intent of volatile -- that of preventing the compiler
> from unwittingly caching values used by multiple threads at once.

The comments concerning volatile here are incorrect. But the whole
purpose of the article is to create a compiler enforced mechanism which
ensures that all accesses are protected by a mutex.

> Inside a critical section defined by a mutex, only one thread has
> access. Consequently, inside a critical section, the executing code
> has single-threaded semantics. The controlled variable is not volatile
> anymore -- you can remove the volatile qualifier.
> ------

> The second paragraph implies that mutex protection, by itself, ensures
> that changes to a protected variable are visible to all the threads
> using it. This is true only for Posix (and Posix-like) mutexing
> frameworks.

Is it? What does the standard for other threading systems say?

> > > Yes, strictly speaking, that statement is incorrect because the
> > > spec doesn't speak in terms of threads, only in terms of
> > > "observable behavior" (s.1.9). However, s.1.9(6) says that

> > > ------
> > > The observable behavior of the abstract machine is its sequence of
> > > reads and writes to volatile data and calls to library I/O functions.
> > > ------

> > > This raises the question: to whom is this behavior "observable"?

> > The standard doesn't say:-). The standard says that the reads and
> > writes *must* take place, but it doesn't define what is meant by
> > read or write in this context. Is a write to the cache memory
> > sufficient? Or must the process "write-through" to global memory?

> > If the implementation guarantees that all volatile accesses are
> > direct to global (shared) memory, through the cache, then volatile
> > might be of some use with threads. I don't know of any
> > implementations which make this guarantee, however.

> The spec does not, of course, mention caches. All it says is that
> reads and writes to volatile variables are part of the abstract
> machine's "observable behavior".

Right. The variable isn't observable; it is the action of accessing it
which is observable.

> As for real implementations, at least VC++ implements every access to
> a volatile variable as an access to the memory location to which the
> variable is assigned.

Which pretty much corresponds to what was intended. What happens
afterwards is hardware dependent.

> Cache consistency is not an issue for uniprocessor-based
> implementations [1], and multiprocessor-based implementations
> generally maintain cache consistency in hardware.

Really? Can you point out to me where this cache consistency is
guaranteed? (I seem to have seen explicit statements to the effect that
it *isn't* guaranteed on a Compaq Alpha, and that it won't be guaranteed
in future Intel processors.)

> > > A user or another program can observe the results of "calls to
> > > library I/O functions", but who can observe "reads and writes to
> > > volatile data"?

> > It is the act of reading or writing that is observable, not the
> > data.

> > A typical example might be a write only memory mapped IO port. No
> > one can ever "observe" the value written after the fact, but the
> > write has effects; the write itself is observable.

> I think you're splitting this hair too finely. The value written to a
> memory-mapped I/O port is observable to the hardware being addressed;
> if it were not, the hardware would be unable to tell the difference
> between a write setting the "go" bit and a write clearing it.

It depends on the hardware port. In at least some cases, the only
significant information to the hardware is that the write took place.

But that really wasn't the point of my argument. The point was that
what is observable was the write (including the value written), not some
external more or less persistent state. From a hardware point of view,
the observable behavior is a specific type of cycle on the CPU's memory
bus.

> > Do compilers actually generate extra code to ensure memory barriers
> > around volatile accesses? I don't think so, but I'll admit that I
> > haven't verified.

> VC++ does.

Does it? Does it generate a lock prefix for every instruction which
accesses the variable? (In current Intel hardware, the lock prefix
generates the memory barriers that ensure cache consistency. Intel has
announced, however, that this will not necessarily be guaranteed in
future processors.)

> I have not checked other compilers, but I assume that they do so as
> well.

None of the compilers (Sun CC or g++) that I have access to on my Sparc
do.

> Compilers that do not do so may be used to write device drivers only
> with special RTL support.

Nonsense. If the hardware uses memory mapped IO, then it will also
ensure that the IO addresses aren't managed by the cache. So there is
no need for memory barriers or such. The address refers to one and only
one location, unlike normal memory addresses. (Obviously, the IO
addresses won't be paged in virtual memory, either.)

> > The standard doesn't talk of "observers". It defines two things
> > (accessing volatile variables, and calls to library IO functions)
> > as observable behavior. And it doesn't define what it means by
> > "access".

> > This looseness is intentional. The goal is to leave it up to the
> > implementation to define something reasonable in its context.
> > Support for memory mapped IO was one of the intentions, but of
> > course, is not mentioned in the standard. Support for threading
> > was not one of the initial intentions (since threads didn't exist,
> > or were extremely rare at the time).

> Really? Even in those ancient days, the concept of a main thread and
> an interrupt thread was well-known.

True. And volatile was sometimes used for communication between the
two.

Without virtual memory, memory caches, etc., there are no problems.
Such things aren't used for memory mapped IO, and in simpler days, they
didn't exist at all.

> There's almost no conceptual or actual difference between the memory
> visibility and synchronization requirements for a single-user system
> with interrupt-driven I/O and those for a multiuser (or multithreaded)
> system.

There are significant differences in the systems where I have actually
programmed at this level. Interrupt driven IO is normally bound to a
single processor (although it needn't be), and doesn't go through the
normal memory mapping mechanisms (although a cache may be used in
certain cases). The system will explicitly use memory barriers
wherever necessary.

And volatile doesn't play a role in this type of communication today.

> In the former, the device interrupt essentially creates a
> minimal-context thread that does things that affect the "main"
> thread. In the latter, the scheduler creates and manages the threads,
> usually using a clock interrupt to do so.

There are two levels of complexity:

- In simple systems, the "main" thread may actually spin waiting for
the interrupt. In such a system, communication between the
interrupt and the main thread may very well be via a volatile
variable, as you explain.

- In larger systems, including anything on which Unix or Windows is
likely to run, the interrupt routine will generally use some
variation of the same routines the system uses for communicating
between different threads internally. (These are system level
routines, not generally accessible to normal processes.) These
routines will do whatever is necessary for the communications to
work: almost certainly, volatile plays no role here, because locking
is needed at a higher level of granularity.

> > but one could argue that it is compatible with the original
> > intentions (which also included things like "while ( !
> > interruptArrived )"). The argument would be stronger if someone
> > could actually show something useful that could be done with
> > volatile supposing it had these semantics.

> I believe I've shown that. 'Volatile' (or something, like Posix synch
> objects, that ensures interthread visibility of changes to variables)
> is necessary, but not sufficient, for multithreaded programming.

Which is it? Posix condition variables (which is what I suppose you
mean by synch objects) have little or nothing in common with volatile.

> The other necessary item is, of course, a set of atomic, interlocked
> synchronization primitives. Posix synch objects perform both of these
> functions, but other synch object implementations provide only synch
> primitives, and assume that interthread visibility is handled by
> 'volatile'.

A system could define volatile in this manner. It wasn't the intent of
volatile, however; it isn't the way Posix defines it; and it sure offers
a very low level interface -- you basically have to implement the Posix
condition variables over it.

If a system did want to define volatile in this manner, it would have to
generate code which ensured the memory barriers (and possibly masked the
interrupts) around each access.

> > And, of course, "writing" to a hardware register doesn't
> > necessarily place the value written anywhere.

> Uh, yes it does. In most implementations, it places the value on the
> processor's data bus, and thence (possibly through expansion busses)
> finally onto the hardware's internal bus.

It doesn't permanently place the data anywhere. I agree that the data
must pass over the processor's data bus, for the period of one memory
access cycle.

> > There is no requirement that the value written be visible by
> > anyone. Just a requirement that the write has occured, for some
> > implementation defined meaning of write.

> As I noted above, this isn't quite correct.

It's what the standard says.

> The hardware must be able to observe the value written.

Indirectly, you can argue this, since the action of writing does imply
pushing the data out of the CPU. It's certainly not specified directly,
and I don't think it buys us anything anyway.

> If it can't, it will be unable to distinguish between a "launch
> missile" command and a "disable missile" command.

Sure it can. Those are two different IO ports:-).

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

### Johan Johansson

Oct 1, 2002, 12:06:46 PM10/1/02
to
Alexander Terekhov wrote:
[...]

> >>I would however appreciate a comment on my statement that volatile
>>is necessary since it ensures that the compiler generates code that
>>actually writes to memory by disabling certain optimizations.
>
>
> C/C++ "volatile" specifier (an "implementation-defined" thing,
> to some extent, BTW) is NOT necessary to do anything useful in
> the context of sharing memory between threads/processes.
>
>
>>Without volatile your compiler would need to know that
>>you are using mutexes, which in general is infeasible.
>
>
> What makes you think so?

As I wrote in my two earlier posts volatile forces the compiler to
generate reads/writes to memory. Without volatile, the compiler might
decide that it already knows the value that it will be using a few lines
down anyway and keep it in a register instead of writing it back to
memory and then reading it back from memory. This is a portable
behaviour of volatile.

Will a mutex force the compiler to generate memory reads/writes? It's
only even possible if it is aware that you *are* using a mutex. This is
certainly not a portable behaviour if it exists at all.
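Concretely, the behaviour I mean can be sketched like this (all names are illustrative; note that this sketch shows only the re-read guarantee, and says nothing about atomicity or cache coherency):

```cpp
#include <cassert>

// Without 'volatile' the compiler may keep 'ready' in a register and
// never re-read memory; with 'volatile' every evaluation of 'ready'
// is a genuine memory access.
volatile int ready = 0;

// Stands in for the interrupt handler or second thread that sets the
// flag; a volatile write the compiler must actually emit.
void signal_ready() { ready = 1; }

// The polling side: each call re-reads 'ready' from memory. A real
// program would spin on this in a loop.
int check_ready() { return ready != 0; }
```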

You can synchronize all you want if what you're sharing was never
written to/read from memory anyway.

Perhaps the source of confusion is that I am talking about basic types
and you are talking about volatile methods? FWIW I think volatile
methods are an interesting idea but I don't expect it to have any
non-type related effects. It's up to the programmer to perform locking
as needed.

j

### Ron

Oct 1, 2002, 3:09:15 PM10/1/02
to
shannon...@myrealbox.com (Shannon Barber) wrote in message news:<de001473.02092...@posting.google.com>...

> rcr...@ictv.com (Ron) wrote
> > > Several things have to make this guarantee, otherwise it'd be
> > > impossible to write code.
> >
> > Please explain why the absence of this guarantee makes it impossible
> > to write code. I noted that it would be impossible if neither
> > 'volatile' nor something else (like the RTL in the case of Posix
> > synch primitives) guaranteed interthread visibility.
>
> Function calls must meet the contract of their declaration.
> Suppose:
> void __stdcall some_func(Data* data)
> {
>     // mess with data, optimizer free to keep stuff in registers
>     return;
>     // optimizer must put the information into *data, it cannot
>     // stay in the registers
> }

I believe that the spec implicitly requires some_func () to flush its
writes to the memory containing 'data' if the name 'some_func ()' is
visible outside the module in which it's defined. If, on the other
hand, some_func () is purely local, I'm not sure that this is true.
Example:

void confusticate (void)
{
int *pMim = reinterpret_cast <int *> (0x09809800); // some valid shared memory address
mutex .lock ();
*pMim = 17;

struct
{
void bebother (int *pDwarf)
{
*pDwarf += 22;
}
} inner;

inner .bebother (pMim);
printf ("%d\n", *pMim);
mutex .unlock ();
}

AFAIK, nothing in the spec requires bebother () to use the memory
pointed to by pDwarf. In fact, nothing says that the compiler must
generate a pointer for pDwarf at all, as long as the code it generates
produces results identical to those that would have been produced had
it treated pDwarf as a pointer. In fact, in this example, the compiler
would, AFAIK, be free to elide not only pDwarf, but pMim, the call to
bebother, struct inner, and the calls to the mutex methods, replacing
the entire body of confusticate () with the equivalent of

printf ("39\n");

Why can the compiler do this? Because pMim isn't declared 'volatile'.
The lack of 'volatile' means that the compiler is free to conclude
that the only "observable behavior" (s.1.9) of the sample program is
the outputting of the string "39\n" to stdout (s.1.9(6)). With
'volatile', it cannot assume this, but must perform the reads and
writes to pMim, since those operations would, then, be part of the
program's "observable behavior".

> Similarly, when exiting a method call, it must update the instance
> data. The next method that is invoked expects the data to be
> accessible through this->, not sitting in a register. Even if the
> call is inlined, it still needs to store the data in the instance -
> becasue the next method invoked will expect the data theere. Results
> returned by one function and fed as an argument into the next could
> stay in a register, but if a method mutates its object, it must commit
> the mutation.

Not quite. If the mutator functions are all purely local, the instance
data could reside in registers, since the resulting code obeys the "as
if" rule.

> Now, if the method releases the mutex prior to committing these
> changes we have a problem. Is this the case you are concerned about?

I'm not concerned about the case where the programmer fails to use
the mutex. I'm concerned about the case where the programmer has
assigned the data, and then releases the mutex. If
nothing requires the compiler to commit the data before the mutex is
released, a race condition arises. And, AFAIK, if the data isn't
volatile, and doesn't fall into some other special category (like we
examined above), nothing _does_ require the compiler to commit it.
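To make the point concrete, here is a sketch (opaque_unlock is a hypothetical stand-in for something like pthread_mutex_unlock; it is defined locally only so that the fragment compiles):

```cpp
#include <cassert>

// The committed-before-unlock question: if the compiler cannot see the
// callee's definition, it must assume the call may read 'shared', so
// the store has to be committed to memory before the call. If the call
// is visible (or inlined), nothing in the language forces the commit.
int shared = 0;

// Defined here only so the sketch links; in a real program this would
// live in the threading library, opaque to the optimizer.
extern "C" void opaque_unlock() {}

void publish(int value) {
    shared = value;   // must be in memory before the opaque call...
    opaque_unlock();  // ...since the callee might read 'shared'
}
```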

> I suppose a pathological compiler could generate (potentially) unique
> code for every invocation of each method, so that they know what data
> is cached in the registers. In this case Andrei's method would fail
> to function as desired.

A highly optimizing compiler might well do this.

-- Ron

### Alexander Terekhov

Oct 1, 2002, 4:03:22 PM10/1/02
to

James Kanze wrote:
[...]

> > Well, the "general" problem of casting away volatility(*) aside,
>
> Either the article was not the one I thought it was, or you have just
> quoted lead in material. Andrei has published a way of using volatile

Nah, to ensure *undefined behavior* [7.1.5.1/7], AFAICS (please correct
me if I'm wrong).

regards,
alexander.

### Johan Johansson

Oct 1, 2002, 7:34:44 PM10/1/02
to
Alexander Terekhov wrote:

>>>Well, the "general" problem of casting away volatility(*) aside,
>>
>>Either the article was not the one I thought it was, or you have just
>>quoted lead in material. Andrei has published a way of using volatile
>
>
> Nah, to ensure *undefined behavior* [7.1.5.1/7], AFAICS (please
correct
> me if I'm wrong).

Unless I'm misreading those paragraphs that would mean that you could
not define any object to be volatile. You'd have to work around it by
using volatile pointers/references to objects being defined as
non-volatile. That of course makes the idea less attractive.

Rather interesting that this would be the only portable effect of making

non-fundamental type objects volatile. Surely this cannot be by design?

j

### Ron

Oct 2, 2002, 3:37:22 PM10/2/02
to
> The comments concerning volatile here are incorrect. But the whole
> purpose of the article is to create a compiler enforced mechanism which
> ensures that all accesses are protected by a mutex.

OK; then we agree.

> > As for real implementations, at least VC++ implements every access to
> > a volatile variable as an access to the memory location to which the
> > variable is assigned.
>
> Which pretty much corresponds to what was intended. What happens
> afterwards is hardware dependent.

Quite right.

> > Cache consistency is not an issue for uniprocessor-based
> > implementations [1], and multiprocessor-based implementations
> > generally maintain cache consistency in hardware.
>
> Really? Can you point out to me where this cache consistency is
> guaranteed. (I seem to have seen explicit statements to the effect that
> it *isn't* guaranteed on a Compaq Alpha, and that it won't be guaranteed
> in future Intel processors.)

Cache consistency depends upon a whole host of
implementation-dependent factors. Some processors allow you to apply
cacheability attributes to various areas of memory, for example.
Usually the compiler wouldn't know about these attributes, but would
generate code on the assumption that the hardware would maintain cache
consistency, and it'd be the programmer's responsibility to insure
that the relevant data was placed in a memory block with the
appropriate attributes. Some compilers may know about these
attributes, and allow you to specify them via pragmas or the like.

This issue is only peripherally related to volatile, however.

Agreed. I never implied that there was necessarily any persistent
state.

> > > Do compilers actually generate extra code to ensure memory barriers
> > > around volatile accesses? I don't think so, but I'll admit that I
> > > haven't verified.
>
> > VC++ does.
>
> Does it? Does it generate a lock prefix for every instruction which
> accesses the variable? (In current Intel hardware, the lock prefix
> generates the memory barriers that ensure cache consistency. Intel has
> announced, however, that this will not necessarily be guaranteed in
> future processors.)

Excuse me, I misunderstood what you meant by "memory barriers", taking
it to mean code that would (assuming simple hardware-maintained cache
consistency logic), insure that volatile variable are read/written
from/to memory, rather than remaining in registers.

> > There's almost no conceptual or actual difference between the memory
> > visibility and synchronization requirements for a single-user system
> > with interrupt-driven I/O and those for a multiuser (or multitasking)
> > system.
>
> There are significant differences in the systems where I have actually
> programmed at this level. Interrupt driven IO is normally bound to a
> single processor (although it needn't be), and doesn't go through the
> normal memory mapping mechanisms (although a cache may be used in
> certain cases). The system will explicitly use memory barriers
> wherever necessary.
>
> And volatile doesn't play a role in this type of communication today.

Of course it (or some other similar implementation-defined mechanism)
does. Otherwise nothing guarantees that the code

void SetGoBit (void)
{
CSR *pCSR = reinterpret_cast <CSR *> (0x800012f0); // some valid register address
somemutex .lock ();
CSR reg = *pCSR;
reg .go = 1;
*pCSR = reg;
somemutex .unlock ();
}

will function correctly. AFAIK, if we don't declare pCSR 'volatile',
this function's observable behavior is nil, so the compiler could
elide the entire thing per s.1.9 [1]. Of course, a "nice" compiler
might take the pointer assignment as a sign that something fishy was
up, and treat *pCSR as 'volatile', but nothing in the spec says it
must do so.

And, of course, you could call OS primitives to read and write the
CSR, and those might be implemented in a manner (can you say "assembly
code") that does not use 'volatile', but that's not the point.
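For comparison, a volatile-qualified version of the same operation might look like this (the raw-word model of the register and the GO bit position are assumptions for illustration only):

```cpp
#include <cstdint>
#include <cassert>

// With a pointer to volatile, the read and the write below are
// "observable behavior" (s.1.9) and cannot be elided.
const std::uint32_t GO_BIT = 1u << 0;

void SetGoBit(volatile std::uint32_t *pCSR) {
    std::uint32_t reg = *pCSR;  // volatile read: must be performed
    reg |= GO_BIT;
    *pCSR = reg;                // volatile write: must be performed
}
```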

> > In the former, the device interrupt essentially creates a
> > minimal-context thread that does things that affect the "main"
> > thread. In the latter, the scheduler creates and manages the threads,
> > usually using a clock interrupt to do so.
>
> There are two levels of complexity:
>
> - In simple systems, the "main" thread may actually spin waiting for
> the interrupt. In such a system, communication between the
> interrupt and the main thread may very well be via a volatile
> variable, as you explain.
>
> - In larger systems, including anything on which Unix or Windows is
> likely to run, the interrupt routine will generally use some
> variation of the same routines the system uses for communicating
> between different threads internally. (These are system level
> routines, not generally accessible to normal processes.) These
> routines will do whatever is necessary for the communications to
> work: almost certainly, volatile plays no role here, because locking
> is needed at a higher level of granularity.

Sure. The OS may provide communication mechanisms, like pipes, that
allow the transfer of data between threads without requiring the
programmer to take special precautions. The OS may also provide, for
example, shared memory sections which the programmer may access
directly via C++ pointers. If the programmer does so, it's her job to
insure that the compiler does not internally cache data read or
written to those sections beyond the region bounded by the appropriate
synchronization primitives. Thus the compiler may cache data between
the acquisition of a mutex and the release thereof, but not before or
after those points.
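That bounded-caching rule can be sketched with the Posix primitives themselves (a minimal sketch; compile with -pthread on most systems):

```cpp
#include <cassert>
#include <pthread.h>

// Inside the critical section the compiler may keep 'counter' in a
// register; at pthread_mutex_unlock -- an external call it cannot see
// into -- the value must be committed to memory, and Posix guarantees
// that the unlock synchronizes memory with other threads.
int counter = 0;  // shared data; no volatile needed under this scheme
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void increment() {
    pthread_mutex_lock(&m);
    counter += 1;              // may be cached between lock and unlock
    pthread_mutex_unlock(&m);  // committed and synchronized here
}
```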

> > > And, of course, "writing" to a hardware register doesn't
> > > necessarily place the value written anywhere.
>
> > Uh, yes it does. In most implementations, it places the value on the
> > processor's data bus, and thence (possibly through expansion busses)
> > finally onto the hardware's internal bus.
>
> It doesn't permanently place the data anywhere. I agree that the data
> must pass over the processor's data bus, for the period of one memory
> access cycle.

Good!

> > > There is no requirement that the value written be visible by
> > > anyone. Just a requirement that the write has occured, for some
> > > implementation defined meaning of write.
>
> > As I noted above, this isn't quite correct.
>
> It's what the standard says.

No, I'm afraid it doesn't say that. It does, however, leave the term
"observable" undefined, perhaps foreseeing that people like ourselves
would need such an issue to debate ;-) But seriously, when s.1.9(6)
says "The observable behavior of the abstract machine is its sequence
of reads and writes to volatile data and calls to library I/O
functions", it must also mean that the values read and written are
part of the observable behavior. Otherwise, it would be appropriate
for the program

int main (int argc, char *argv [])
{
    printf ("Yes\n");
    return 1;
}

to print the string "NO!" to stdout, since the sequence of "calls to
library I/O functions" is the same whether it prints "Yes", "NO!", or
"I've fallen and I can't get up!".

> > The hardware must be able to observe the value written.
>
> Indirectly, you can argue this, since the action of writing does imply
> pushing the data out of the CPU. It's certainly not specified directly,
> and I don't think it buys us anything anyway.
>
> > If it can't, it will be unable to distinguish between a "launch
> > missile" command and a "disable missile" command.
>
> Sure it can. Those are two different IO ports:-).

Perhaps on your company's missile control systems ;-)

-- Ron

[1] Except, perhaps, the mutex calls. S.1.9 n.6.

### James Kanze

Oct 2, 2002, 3:40:08 PM10/2/02
to
Alexander Terekhov <tere...@web.de> wrote in message
news:<3D99EF5F...@web.de>...

> James Kanze wrote:
> [...]
> > > Well, the "general" problem of casting away volatility(*) aside,

> > Either the article was not the one I thought it was, or you have
> > just quoted lead in material. Andrei has published a way of using
> > volatile to ensure thread safety.

> Nah, to ensure *undefined behavior* [7.1.5.1/7], AFAICS (please
> correct me if I'm wrong).

Not if I've understood it correctly. If I remember correctly, the
actual object being referenced wasn't declared volatile; the volatile was
only present in the wrapper.

Even if this is not the case, he is using an "undefined behavior" which
in fact works everywhere. Sort of like assigning the value returned by
getc to a char.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

### James Kanze

Oct 2, 2002, 3:40:50 PM10/2/02
to
Johan Johansson <joh...@ipunplugged.com> wrote in message
news:<V3im9.1435$hT3....@nntpserver.swip.net>...

> Alexander Terekhov wrote:
> [...]

> >>I would however appreciate a comment on my statement that volatile
> >>is necessary since it ensures that the compiler generates code that
> >>actually writes to memory by disabling certain optimizations.

> > C/C++ "volatile" specifier (an "implementation-defined" thing, to
> > some extent, BTW) is NOT necessary to do anything useful in the
> > context of sharing memory between threads/processes.

> >>Without volatile your compiler would need to know that you are using
> >>mutexes, which in general is infeasible.

> > What makes you think so?

> As I wrote in my two earlier posts volatile forces the compiler to
> generate reads/writes to memory.

What memory? The local cache? Global (shared) memory? The image in
virtual memory on the disk?

The standard doesn't force reads and writes to memory. It says that the
reads and writes are to be considered "observable behavior". About the
most I can say about that is that it implies a read or a write cycle on
the CPU bus. Which, of course, doesn't guarantee anything unless the
hardware behind it does.

> Without volatile, the compiler might decide that it already knows the
> value that it will be using a few lines down anyway and keep it in a
> register instead of writing it back to memory and then reading it back
> from memory. This is a portable behaviour of volatile.

As we've been trying to explain, guaranteeing that the compiler will not
use a value which it explicitly cached in a register simply doesn't buy
you anything.

> Will a mutex force the compiler to generate memory reads/writes?

There's no such thing as a mutex in C++, so it obviously depends on the
system definition of mutex. In Posix (IEEE Std 1003.1, Base
Definitions, General Concepts, Memory Synchronization): "The following
functions synchronize memory with respect to other threads: [...]".
Both pthread_mutex_lock and pthread_mutex_unlock are in the list.

> It's only even possible if it is aware that you *are* using a mutex.

If I call pthread_mutex_lock, I guess that the compiler can suppose that
I am using a mutex.

Posix makes certain requirements. (I suppose that Windows threads offer
similar guarantees, and make similar requirements.) If my program
conforms to those requirements, and the system claims Posix compliance,
then it is the compiler's or the system's problem to make my program
work. It's none of my business how they do it.

With regards to code motion of the compilers, there are two relatively
simple solutions:

 - The compiler knows about the system calls, and knows that it cannot
   move reads or writes around across them, or

 - The compiler doesn't know about them, and treats them just as any
   other external function call. In this case, of course, it had
   better ensure that the necessary reads and writes have taken place,
   since it cannot assume that the called code doesn't make use of or
   modify the variables in question. (Any object accessible from
   another thread would also be accessible from an external function
   with unknown semantics.)

Most compilers currently use the second strategy, at least partially
because they have to implement it anyway -- I can call functions written
in assembler from C++, and there is no way that the C++ compiler can
know their semantics, so all that is needed is that the C++ compiler
treat pthread_mutex_lock et al. as if they were unknown functions
written in assembler (which is often the case in fact anyway).

Of course, all of this only ensures that the data is read from/written
to the local processor specific cache. The system level implementation
of the mutex functions must contain the necessary additional
instructions to ensure cache consistency.

> This is certainly not a portable behaviour if it exists at all.

Nothing involving threads is portable in so far as C++ is concerned.

If I use the Posix functions correctly, I should be portable to any
machine implementing Posix. In practice, I'll certainly have a lot less
problems with threading and Posix than I will with the C++ language
itself.

> You can synchronize all you want if what you're sharing was never
> written to/read from memory anyway.

> Perhaps the source of confusion is that I am talking about basic types
> and you are talking about volatile methods?

I don't think so. I'm talking about accessing objects in memory, and
what the standards (both C++ and Posix) guarantee.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

### Alexander Terekhov

Oct 3, 2002, 10:44:41 AM10/3/02
to
James Kanze wrote:
>
> Alexander Terekhov <tere...@web.de> wrote in message
> news:<3D99EF5F...@web.de>...
> > James Kanze wrote:
> > [...]
> > > > Well, the "general" problem of casting away volatility(*) aside,
> > > > follow the link: http://www.cuj.com/experts/1902/alexandr.htm
>
> > > Either the article was not the one I thought it was, or you have
> > > just quoted lead in material. Andrei has published a way of using
> > > volatile to ensure thread safety.
>
> > Nah, to ensure *undefined behavior* [7.1.5.1/7], AFAICS (please
> > correct me if I'm wrong).
>
> Not if I've understood it correctly. If I remember correctly, the
> actual object being referenced wasn't declared volatile; the volatile
> was only present in the wrapper.

http://www.cuj.com/experts/1902/alexandr.htm?topic=experts

<quote>

Summary

When writing multithreaded programs, you can use volatile to your
advantage. You must stick to the following rules:

Define all shared objects as volatile.

</quote>

> Even if this is not the case, he is using an "undefined behavior"
> which in fact works everywhere.

Well, nope. Consider that ["implementation-defined"]

struct independently_shared_stuff_bundle
{
  volatile char a,b; // both a and b can be accessed by different
                     // threads
};

is NOT the same as:

struct independently_shared_stuff_bundle
{
  char a,b; // both a and b can be accessed by different
            // threads (actually, >>might NOT work<<)
};

on at least one platform I know of -- Tru64 Unix.

regards,
alexander.

### Johan Johansson

Oct 3, 2002, 10:49:16 AM10/3/02
to
James Kanze wrote:

>>As I wrote in my two earlier posts volatile forces the compiler to
>>generate reads/writes to memory.
>
> What memory? The local cache? Global (shared) memory? The image in
> virtual memory on the disk?

The local cache.

> The standard doesn't force reads and writes to memory. It says that the
> reads and writes are to be considered "observable behavior". About the
> most I can say about that is that it implies a read or a write cycle on
> the CPU bus. Which, of course, doesn't guarantee anything unless the
> hardware behind it does.

Of course. I have never claimed otherwise.

>>Without volatile, the compiler might decide that it already knows the
>>value that it will be using a few lines down anyway and keep it in a
>>register instead of writing it back to memory and then reading it back
>>from memory. This is a portable behaviour of volatile.
>
> As we've been trying to explain, guaranteeing that the compiler will not
> use a value which it explicitly cached in a register simply doesn't buy
> you anything.

As *I* have been trying to explain it's a prerequisite, so it *does* buy
you something.

>>It's only even possible if it is aware that you *are* using a mutex.
>
> If I call pthread_mutex_lock, I guess that the compiler can suppose that
> I am using a mutex.

So a compiler only supports particular mutex implementations? Using
another library is not an option?

> Posix makes certain requirements. (I suppose that Windows threads offer
> similar guarantees, and make similar requirements.) If my program
> conforms to those requirements, and the system claims Posix compliance,
> then it is the compiler's or the system's problem to make my program
> work. It's none of my business how they do it.

As far as I know posix threads (the full posix jungle is beyond my
knowledge) define their behaviour in terms of reads and writes. Then
again I should point out that I haven't read the actual standard, just
articles and books about it such as Butenhof's "Programming with POSIX
threads".

> With regards to code motion of the compilers, there are two relatively
> simple solutions:
>
> - The compiler knows about the system calls, and knows that it cannot
>   move reads or writes around across them, or

I would be upset if I couldn't use an otherwise perfectly correct
threading library just because the compiler wasn't tuned for it.

> - The compiler doesn't know about them, and treats them just as any
>   other external function call. In this case, of course, it had
>   better ensure that the necessary reads and writes have taken place,
>   since it cannot assume that the called code doesn't make use of or
>   modify the variables in question. (Any object accessible from
>   another thread would also be accessible from an external function
>   with unknown semantics.)

Let's hope they are not inline functions then.

> Of course, all of this only ensures that the data is read from/written
> to the local processor specific cache.

Yes. I have never made any statements to the contrary.

> The system level implementation
> of the mutex functions must contain the necessary additional
> instructions to ensure cache consistency.

Absolutely. But the reads/writes to the cache must be there in the
first place.

>>This is certainly not a portable behaviour if it exists at all.
>
> Nothing involving threads is portable in so far as C++ is concerned.
>
> If I use the Posix functions correctly, I should be portable to any
> machine implementing Posix. In practice, I'll certainly have a lot less
> problems with threading and Posix than I will with the C++ language
> itself.

I am perfectly aware that C++ doesn't mention threads. Nevertheless the
standard does define the behaviour of volatile for fundamental types.
This doesn't change when you are running several threads.

>>Perhaps the source of confusion is that I am talking about basic types
>>and you are talking about volatile methods?
>
> I don't think so. I'm talking about accessing objects in memory, and
> what the standards (both C++ and Posix) guarantee.

If you by Posix mean posix threads I don't think it guarantees that
reads or writes that never took place are visible across threads. I
also doubt that it places requirements on the compiler. If I can find
the standard at a reasonable price I'll certainly check.

j

### James Kanze

Oct 4, 2002, 9:47:09 AM10/4/02
to
Johan Johansson <joh...@ipunplugged.com> wrote in message
news:<lsJm9.1512$hT3....@nntpserver.swip.net>...
> James Kanze wrote:

> > As we've been trying to explain, guaranteeing that the compiler
> > will not use a value which it explicitly cached in a register
> > simply doesn't buy you anything.

> As *I* have been trying to explain it's a prerequisite, so it *does*
> buy you something.

On a nice day, if the compiler vendor is co-operating.

In fact, it buys you exactly what the compiler vendor guarantees, and
not anything more. In practice, at least in a Posix environment, Posix
*requires* the locks. With or without volatile. And if you use the
locks, you don't need volatile. So with regards to threading, volatile
buys you nothing.

> >>It's only even possible if it is aware that you *are* using a mutex.

> > If I call pthread_mutex_lock, I guess that the compiler can suppose
> > that I am using a mutex.

> So a compiler only supports particular mutex implementations? Using
> another library is not an option?

If the compiler is for Solaris, it probably supports both the older
Solaris threads and Posix threads.

> > Posix makes certain requirements. (I suppose that Windows threads
> > offer similar guarantees, and make similar requirements.) If my
> > program conforms to those requirements, and the system claims Posix
> > compliance, then it is the compiler's or the system's problem to
> > make my program work. It's none of my business how they do it.

> As far as I know posix threads (the full posix jungle is beyond my
> knowledge) define their behaviour in terms of reads and writes. Then
> again I should point out that I haven't read the actual standard, just
> articles and books about it such as Butenhof's "Programming with POSIX
> threads".

The actual standard says that "The following functions synchronize
memory with respect to other threads [...]". I presume that means that
all threads see the same data. Which is probably more than is really
meant: my interpretation is that the functions guarantee that the data
in the global memory shared by all threads is identical to what the
calling thread sees locally.

> > With regards to code motion of the compilers, there are two
> > relatively simple solutions:

> > - The compiler knows about the system calls, and knows that it
> > cannot move reads or writes around across them, or

> I would be upset if I couldn't use an otherwise perfectly correct
> threading library just because the compiler wasn't tuned for it.

I don't know of any way you can easily add to the system calls an OS
supports. If you do add to them (or change existing ones), I wouldn't
expect any support from the compiler.

> > - The compiler doesn't know about them, and treats them just as
> > any other external function call. In this case, of course, it
> > had better ensure that the necessary reads and writes have
> > taken place, since it cannot assume that the called code
> > doesn't make use of or modify the variables in question. (Any
> > object accessible from another thread would also be accessible
> > from an external function with unknown semantics.)

> Let's hope they are not inline functions then.

What is not an inline function? The threading primitives can't be,
since they cannot be written in C++.

> > Of course, all of this only ensures that the data is read
> > from/written to the local processor specific cache.

> Yes. I have never made any statements to the contrary.

> > The system level implementation of the mutex functions must contain
> > the necessary additional instructions to ensure cache consistency.

> Absolutely. But the reads/writes to the cache must be there in the
> first place.

Agreed. That's the compiler's problem. As I have pointed out above, it
isn't a problem in practice.

> >>This is certainly not a portable behaviour if it exists at all.

> > Nothing involving threads is portable in so far as C++ is concerned.

> > If I use the Posix functions correctly, I should be portable to any
> > machine implementing Posix. In practice, I'll certainly have a lot
> > less problems with threading and Posix than I will with the C++
> > language itself.

> I am perfectly aware that C++ doesn't mention threads. Nevertheless
> the standard does define the behaviour of volatile for fundamental
> types.

No it doesn't. All it says is that accesses to volatiles are
"observable behavior". If nothing else, you lack atomicity.
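The atomicity point deserves emphasis: even on a volatile int, ++ compiles to a separate load, add, and store, so two threads can lose updates. What restores atomicity is a lock, and once the lock is there, volatile buys nothing. A minimal pthreads sketch (my own, assuming a POSIX system; compile with -pthread):

```cpp
#include <pthread.h>

// Note: no volatile anywhere. The mutex alone makes the
// read-modify-write indivisible, and lock/unlock are among the
// POSIX functions that synchronize memory.
long counter = 0;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

void* add_many(void*) {
    for (int i = 0; i < 100000; ++i) {
        pthread_mutex_lock(&counter_lock);
        ++counter;                          // protected increment
        pthread_mutex_unlock(&counter_lock);
    }
    return 0;
}

long run_two_threads() {
    counter = 0;
    pthread_t a, b;
    pthread_create(&a, 0, add_many, 0);
    pthread_create(&b, 0, add_many, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    return counter;  // 200000 with the lock; typically less without it
}
```

Drop the lock/unlock pair and the total will usually come out short of 200000, with or without volatile.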

> This doesn't change when you are running several threads.

But since different threads don't see the same copy of memory, what does
that get us?

> >>Perhaps the source of confusion is that I am talking about basic
> >>types and you are talking about volatile methods?

> > I don't think so. I'm talking about accessing objects in memory,
> > and what the standards (both C++ and Posix) guarantee.

> If you by Posix mean posix threads I don't think it guarantees that
> reads or writes that never took place are visible across threads.

What reads or writes that never took place?

The C++ standard doesn't guarantee anything with regards to threads.
Posix requires that the reads and writes (of the abstract machine) be
visible to other threads at specific points, as a result of specific
machine instructions. A compiler for a Posix machine will conform to
both standards. In the case of threading, of course, the only standard
which interests you is Posix.

> I also doubt that it places requirements on the compiler.

It certainly places requirements on the C compiler. A lot of them.
Some are restrictions (a char must be exactly 8 bits, etc.). Some are
actually in contradiction with the C standard; all C compilers I've seen
on Posix machines have switches to choose whether they are C standard
compliant, or Posix compliant.

Posix defines its system interface in terms of C functions. It very
definitely states that calling certain functions (in C code) synchronizes
the memory.

Posix says nothing about C++. A C++ compiler can do what it wants, and
not affect Posix compliance. Still, I can't imagine a C++ compiler not
extending the guarantees of C to cover its operations. (It does mean,
however, that a fair number of C++ compilers do not generate thread safe
code, particularly when it comes to initializing local statics. This
means that any time you write a multithreaded application in C++, you
have to find out exactly what the compiler does or does not guarantee.)

> If I can find the standard at a reasonable price I'll certainly check.

--
James Kanze mailto:jka...@caicheuvreux.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

### Johan Johansson

Oct 4, 2002, 9:01:26 PM10/4/02
to
James Kanze wrote:

>> > As we've been trying to explain, guaranteeing that the compiler
>> > will not use a value which it explicitly cached in a register
>> > simply doesn't buy you anything.
>
>
>>As *I* have been trying to explain it's a prerequisite, so it *does*
>
>
> On a nice day, if the compiler vendor is co-operating.
>
> In fact, it buys you exactly what the compiler vendor guarantees, and
> not anything more. In practice, at least in a Posix environment, Posix
> *requires* the locks.

As I have written numerous times, I have never claimed that volatile is
a substitute for locking. Please stop implying I have.

> With or without volatile. And if you use the
> locks, you don't need volatile. So with regards to threading, volatile

Your evidence in this matter is circumstantial at best. I'd prefer a
statement in a standard that puts a requirement on the compiler to not
cache values in registers across memory barriers.

>>So a compiler only supports particular mutex implementations? Using
>>another library is not an option?
>
> If the compiler is for Solaris, it probably supports both the older

Using a posix thread implementation in the form of a custom library with
an older compiler would then have been impossible, and someone trying to
roll his own for educational purposes using OS calls and/or assembly
snippets would stand no chance at all of succeeding?

>> > Posix makes certain requirements. (I suppose that Windows threads
>> > offer similar guarantees, and make similar requirements.) If my
>> > program conforms to those requirements, and the system claims Posix
>> > compliance, then it is the compiler's or the system's problem to
>> > make my program work. It's none of my business how they do it.
>
>
>>As far as I know posix threads (the full posix jungle is beyond my
>>knowledge) define their behaviour in terms of reads and writes. Then
>>again I should point out that I haven't read the actual standard, just
>>articles and books about it such as Butenhof's "Programming with POSIX
>>Threads".
>
>
> The actual standard says that "The following functions synchronize
> memory with respect to other threads [...]". I presume that means that
> all threads see the same data. Which is probably more than is really
> meant: my interpretation is that the functions guarantee that the data
in the global memory shared by all threads is identical to what the

Where is the requirement on the compiler?

>> > With regards to code motion of the compilers, there are two
>> > relatively simple solutions:
>
>
>> > - The compiler knows about the system calls, and knows that it
>> > cannot move reads or writes around across them, or
>
>
>>I would be upset if I couldn't use an otherwise perfectly correct
>>threading library just because the compiler wasn't tuned for it.
>
>
> I don't know of any way you can easily add to the system calls an OS
> supports. If you do add to them (or change existing ones), I wouldn't
> expect any support from the compiler.

I am hesitant to believe that a compiler needs to know names of os
and/or library calls. I can't of course dismiss the possibility
entirely. Things are commonly not nearly as good as I'd expect.

>> > - The compiler doesn't know about them, and treats them just as
>> > any other external function call. In this case, of course, it
>> > had better ensure that the necessary reads and writes have
>> > taken place, since it cannot assume that the called code
>> > doesn't make use of or modify the variables in question. (Any
>> > object accessible from another thread would also be accessible
>> > from an external function with unknown semantics.)
>
>
>>Let's hope they are not inline functions then.
>
>
> What is not an inline function? The threading primitives can't be,
> since they cannot be written in C++.

Many popular compilers support inline assembler.

>> > The system level implementation of the mutex functions must contain
>> > the necessary additional instructions to ensure cache consistency.
>
>
>>Absolutely. But the reads/writes to the cache must be there in the
>>first place.
>
>
> Agreed. That's the compiler's problem. As I have pointed out above, it
> isn't a problem in practice.

As you have *claimed* above. My claim is that it's pure luck and because
of how a particular threading implementation (as opposed to a
specification of one, e.g. the POSIX standard) looks. Until I see a
guarantee that the compiler won't cache values across memory barriers I
will maintain this will break sooner or later if a compiler optimizes
aggressively enough.

>>I am perfectly aware that C++ doesn't mention threads. Nevertheless
>>the standard does define the behaviour of volatile for fundamental
>>types.
>
>
> No it doesn't. All it says is that accesses to volatiles are
> "observable behavior".

So in your opinion, it could wave a checkered flag for a write to a
volatile and spit out a pizza for a read from one?

> If nothing else, you lack atomicity.

>>This doesn't change when you are running several threads.
>
>
> But since different threads don't see the same copy of memory, what does
> that get us?

The *chance* to use e.g. a POSIX mutex to make them see the same copy of
memory.

>> > I don't think so. I'm talking about accessing objects in memory,
>> > and what the standards (both C++ and Posix) guarantee.
>
>
>>If you by Posix mean posix threads I don't think it guarantees that
>
>
> What reads or writes that never took place?

If you disagree that access to a volatile fundamental type will ensure

> The C++ standard doesn't guarantee anything with regards to threads.

Yes. Both our posts have stated this multiple times already.

> Posix requires that the reads and writes (of the abstract machine) be
> visible to other threads at specific points,

> as a result of specific
> machine instructions. A compiler for a Posix machine will conform to
> both standards. In the case of threading, of course, the only standard
> which interests you is Posix.

I've used posix threads on machines that were not fully Posix compliant.
It comes as a library, not as a compiler.

>>I also doubt that it places requirements on the compiler.
>
>
> It certainly places requirements on the C compiler. A lot of them.
> Some are restrictions (a char must be exactly 8 bits, etc.). Some are
> actually in contradiction with the C standard; all C compilers I've seen
> on Posix machines have switches to choose whether they are C standard
> compliant, or Posix compliant.

I'll read up before I respond to that.

> Posix defines its system interface in terms of C functions. It very
> definitely states that calling certain functions (in C code) synchronizes
> the memory.

Yes, of course. We are not debating that. I'm not anyway.

> Posix says nothing about C++. A C++ compiler can do what it wants, and
> not affect Posix compliance. Still, I can't imagine a C++ compiler not
> extending the guarantees of C to cover its operations. (It does mean,
> however, that a fair number of C++ compilers do not generate thread safe
> code, particularly when it comes to initializing local statics. This
> means that any time you write a multithreaded application in C++, you
> have to find out exactly what the compiler does or does not guarantee.)

Been there done that. That static local initialization doesn't work in a
lot of compilers is very unfortunate. Since the C++ standard clearly
says that it should be initialized on the first call there's really no
excuse. It doesn't say under which circumstances so this means this
should hold under *any* circumstances and if your compiler even has a

>>If I can find the standard at a reasonable price I'll certainly check.
>
>
> Try http://www.UNIX-systems.org/single_unix_specification/.

Thanks!

j

### Alexander Terekhov

Oct 5, 2002, 12:35:04 PM10/5/02
to

Johan Johansson wrote:
[...]

> > Agreed. That's the compiler's problem. As I have pointed out above, it
> > isn't a problem in practice.
>
> As you have *claimed* above. My claim is that it's pure luck and because
> of how a particular threading implementation (as opposed to a
> specification of one, e.g. the POSIX standard) looks. Until I see a
> guarantee that the compiler won't cache values across memory barriers I
> will maintain this will break sooner or later if a compiler optimizes
> aggressively enough.

http://www.opengroup.org/onlinepubs/007904975/xrat/xbd_chap04.html#tag_01_04_10
("Memory Synchronization", Rationale)

"....
In summary, a portable multi-threaded program, or a multi-process
program that shares writable memory between processes, has to use
the synchronization primitives to synchronize data access. It
cannot rely on modifications to memory being observed by other
threads in the order written in the application or even on
modification of a single variable being seen atomically.

Conforming applications may only use the functions listed to
synchronize threads of control with respect to memory access.
There are many other candidates for functions that might also be
used. Examples are: signal sending and reception, or pipe writing
and reading. In general, any function that allows one thread of
control to wait for an action caused by another thread of control
is a candidate. IEEE Std 1003.1-2001 does not require these
additional functions to synchronize memory access since this
would imply the following:

All these functions would have to be recognized by advanced
compilation systems so that memory operations and calls to
these functions are not reordered by optimization.

All these functions would potentially have to have memory
synchronization instructions added, depending on the particular
machine.

The additional functions complicate the model of how memory
is synchronized and make automatic data race detection
techniques impractical.

Formal definitions of the memory model were rejected as unreadable
by the vast majority of programmers. In addition, most of the formal
work in the literature has concentrated on the memory as provided by
the hardware as opposed to the application programmer through the
compiler and runtime system. It was believed that a simple statement
intuitive to most programmers would be most effective. ...."

regards,
alexander.

P.S. I really like the last statement above... ;-) ;-)

### t...@cs.ucr.edu

Oct 7, 2002, 5:56:53 AM10/7/02
to
Johan Johansson <joh...@ipunplugged.com> wrote:
+ Alexander Terekhov wrote:
+ [...]

+>>I would however appreciate a comment on my statement that volatile
+>>is necessary since it ensures that the compiler generates code that
+>>actually writes to memory by disabling certain optimizations.
+>
+>
+> C/C++ "volatile" specifier (an "implementation-defined" thing,
+> to some extent, BTW) is NOT necessary to do anything useful in
+> the context of sharing memory between threads/processes.
+>
+>
+>>Without volatile your compiler would need to know that
+>>you are using mutexes, which in general is infeasible.
+>
+>
+> What makes you think so?

+ As I wrote in my two earlier posts volatile forces the compiler to
+ generate reads/writes to memory. Without volatile, the compiler might
+ decide that it already knows the value that it will be using a few lines
+ down anyway and keep it in a register instead of writing it back to
+ memory and then reading it back from memory. This is a portable
+ behaviour of volatile.

At times the value associated with an object's state will differ from
the value last assigned to the object by a given stream of
computational activity. On the one hand, the object might be
*dirty* --- the object might have been assigned a new value that is not
yet reflected in the object's state. On the other hand, the object
might be *stale* (even if its current value resulted from an
lvalue-to-rvalue conversion) --- for example, the object's state might
have been externally modified via a debugger. So, at certain points
in a computational stream, objects must be *reconciled*, i.e.:

* Before that point but after the last preceding update of the object's
value by the stream, the object's state must be updated from the
object's value.

* After that point but before the next use of the object's value,
unless there is an intervening update of the object's value, the
object's value must be updated from the object's state.

Regardless of when an object's state is fetched, that state may
instantly be modified, say, by a debugger or a signal handler. Since
fetched values are never guaranteed to be absolutely fresh, how fresh
is fresh enough? Ultimately, in the execution of a program there need
to be *sequence points*, at which each object gets reconciled and
after which values fetched from memory are assumed to be fresh enough
until the next sequence point. In C/C++ there is a sequence point at
each occurrence of a semicolon, a function call, or an operator having
a defined order of evaluation.
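The "externally modified" case has one portable instance in the standards: an object of type volatile sig_atomic_t written by a signal handler. A small sketch (my own illustration; raise() runs the handler synchronously, so the effect is deterministic):

```cpp
#include <csignal>

// The handler modifies the flag outside the normal flow of control,
// so the flag must be reconciled (re-fetched) after raise() returns.
volatile sig_atomic_t flag = 0;

extern "C" void on_signal(int) { flag = 1; }

int observe_signal() {
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);  // synchronous: handler runs before this returns
    return flag;         // must be read from the object, not assumed 0
}
```

Without the volatile qualifier the compiler would be entitled to assume flag is still 0 here, since no visible code path changed it.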

Although every object must be reconciled at each sequence point, many
of those loads and stores can be optimized away under the as-if rule
and default assumptions that, except where otherwise specified:

* each object is *non-observable*, i.e., the object's state has no
effect on other streams of computational activity, in which case the
store operation associated with reconciling this object can usually
be optimized away.

* each object is *non-volatile*, i.e., the object's current value is
the value last assigned to it by the current stream, in which case the
fetch operation associated with reconciling this object can usually be
optimized away.

For behavior to be well defined, these default assumptions must be
suspended for objects that violate them, e.g., objects that are
allocated to input or output registers or that are to be modified or
observed via a debugger or via non-coordinated streams of
computational activity. In C/C++, this is accomplished by giving the
object a volatile-qualified type. (Note that objects are given a
volatile-qualified type because they are volatile or observable, not
the other way around.) Alternatively, the object could be accessed
only via special assembly-language functions that deal directly with
the object's state. Under current practice, but without guarantee by
the standards, the functions

int get( int* x ) { return *x; }
int put( int* x, int y ) { return *x = y; }

will work so long as they are not inline and are compiled separately
from the rest of the code.

+ Will a mutex force the compiler to generate memory reads/writes? It's
+ only even possible if it is aware that you *are* using a mutex. This is
+ certainly not a portable behaviour if it exists at all.

The specification for mutex operations should include the suspension
of the default assumptions mentioned above for all variables, i.e.,
mutex operations should be treated as super sequence points where all
variables are treated as though they had volatile-qualified type.
That is not sufficient, however. Mutex operations must also involve
barriers past which instructions may not be hoisted or sunk by
hardware or software. We can't have accesses to shared objects
leaking out the tops or bottoms of critical sections.
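The "super sequence point" behaviour can be sketched as follows (my own example, assuming pthreads). pthread_create and pthread_join are on POSIX's list of memory-synchronizing functions, so a plain, non-volatile variable written in the worker is guaranteed visible after the join:

```cpp
#include <pthread.h>

int shared_result = 0;  // deliberately not volatile

void* worker(void*) {
    shared_result = 42;  // plain store in the created thread
    return 0;
}

int collect() {
    pthread_t t;
    pthread_create(&t, 0, worker, 0);
    pthread_join(t, 0);    // synchronizes memory: no caching past it
    return shared_result;  // guaranteed to see the worker's write
}
```

The join acts exactly as described: the compiler may not keep shared_result cached in a register across it, and the implementation must issue whatever hardware barriers the platform needs.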

Tom Payne

### Johan Johansson

Oct 7, 2002, 6:19:03 PM10/7/02
to
Alexander Terekhov wrote:

[...]

>>Until I see a
>>guarantee that the compiler won't cache values across memory barriers I
>>will maintain this will break sooner or later if a compiler optimizes
>>aggressively enough.

[...]

> Conforming applications may only use the functions listed to
> synchronize threads of control with respect to memory access.
> There are many other candidates for functions that might also be
> used. Examples are: signal sending and reception, or pipe writing
> and reading. In general, any function that allows one thread of
> control to wait for an action caused by another thread of control
> is a candidate. IEEE Std 1003.1-2001 does not require these
> additional functions to synchronize memory access since this
> would imply the following:
>
> All these functions would have to be recognized by advanced
> compilation systems so that memory operations and calls to
> these functions are not reordered by optimization.

Should this be interpreted as saying that "advanced compilation systems"
*are* required to not reorder across calls to the functions that are
required to synchronize threads of control? I suppose that this in turn
could mean that you can't optimize them away. I have spent a limited
time looking at the link that James Kanze provided but I have not been
able to locate anything that exactly addresses the issue. I can't
exclude at this time that it's there though.

> Formal definitions of the memory model were rejected as unreadable
> by the vast majority of programmers. In addition, most of the formal
> work in the literature has concentrated on the memory as provided by
> the hardware as opposed to the application programmer through the
> compiler and runtime system. It was believed that a simple statement
> intuitive to most programmers would be most effective. ...."
>
> regards,
> alexander.
>
> P.S. I really like the last statement above... ;-) ;-)

I on the other hand would have preferred a statement that placed a clear
requirement on my compiler. That said I don't think a formal memory
model definition is called for since the requirement could clearly be
written in a couple of sentences.

j

### Johan Johansson

Oct 7, 2002, 6:22:40 PM10/7/02
to
t...@cs.ucr.edu wrote:
[...]

> + Will a mutex force the compiler to generate memory reads/writes? It's
> + only even possible if it is aware that you *are* using a mutex. This is
> + certainly not a portable behaviour if it exists at all.
>
> The specification for mutex operations should include the suspension
> of the default assumptions mentioned above for all variables, i.e.,
> mutex operations should be treated as super sequence points where all
> variables are treated as though they had volatile-qualified type.

I think we all in this subthread agree that this is the optimal
scenario. The question is merely whether this is indeed the case.

> That is not sufficient, however. Mutex operations must also involve
> barriers past which instructions may not be hoisted or sunk by
> hardware or software. We can't have accesses to shared objects
> leaking out the tops or bottoms of critical sections.

Right. I'm fairly certain that the POSIX threads specification
guarantees that reads and writes that have been issued are performed
before a new read or write can be issued, so the hardware reordering
should be covered.

Whether the software reordering is covered or not is another matter. I
suspect that this is left to "common practice", i.e. blind luck, such as
not reordering across function calls.

j

### Alexander Terekhov

Oct 8, 2002, 9:59:03 AM10/8/02
to

Johan Johansson wrote:
[...]
> > Conforming applications may only use the functions listed to
> > synchronize threads of control with respect to memory access.
> > There are many other candidates for functions that might also be
> > used. Examples are: signal sending and reception, or pipe writing
> > and reading. In general, any function that allows one thread of
> > control to wait for an action caused by another thread of control
> > is a candidate. IEEE Std 1003.1-2001 does not require these
> > additional functions to synchronize memory access since this
> > would imply the following:
> >
> > All these functions would have to be recognized by advanced
> > compilation systems so that memory operations and calls to
> > these functions are not reordered by optimization.
>
> Should this be interpreted as saying that "advanced compilation
> systems" *are* required to not reorder across calls to the functions
> that are required to synchronize threads of control?

Yep.

> I suppose that this in turn could mean that you can't optimize them
> away.

Nope (see "9 Optimizations" in the paper below, for example).

> I have spent a limited time looking at the link that James Kanze
> provided but I have not been able to locate anything that exactly
> addresses the issue. I can't exclude at this time that it's there
> though.

I don't think that memory model requirements/semantics are clearly
spelled out in the POSIX standard(*). That would require something
along the lines (Java's revised >>volatiles<< aside ;-) ) of:

http://www.cs.umd.edu/~pugh/java/memoryModel/semantics.pdf

Note that the POSIX "memory model" is, unfortunately, kinda totally
broken ("underspecified" if you like) with respect to the C/C++
language(s) and adjacent/compound objects (i.e. memory granules):
< check out this entire thread >

Subject: "memory location"
Newsgroups: comp.std.c
Date: 2002-07-18 09:08:00 PST

regards,
alexander.

(*) You might also want to try this:

Registration and free membership to get access:

http://www.opengroup.org/austin

### t...@cs.ucr.edu

Oct 8, 2002, 4:13:41 PM10/8/02
to
Johan Johansson <joh...@ipunplugged.com> wrote:
+ t...@cs.ucr.edu wrote:
+ [...]

+> + Will a mutex force the compiler to generate memory reads/writes? It's
+> + only even possible if it is aware that you *are* using a mutex. This is
+> + certainly not a portable behaviour if it exists at all.
+>
+> The specification for mutex operations should include the suspension
+> of the default assumptions mentioned above for all variables, i.e.,
+> mutex operations should be treated as super sequence points where all
+> variables are treated as though they had volatile-qualified type.

+ I think we all in this subthread agree that this is the optimal
+ scenario.

It is not only "optimal"; it's also necessary for correct behavior.

+ The question is merely whether this is indeed the case.

It is the case but perhaps, in some implementations of threads, as a
side-effect of separate compilation rather than precise specification.

+> That is not sufficient, however. Mutex operations must also involve
+> barriers past which instructions may not be hoisted or sunk by
+> hardware or software. We can't have accesses to shared objects
+> leaking out the tops or bottoms of critical sections.

+ Right. I'm fairly certain that the POSIX threads specification
+ guarantees that reads and writes that have been issued are performed
+ before a new read or write can be issued, so the hardware reordering
+ should be covered.

+ Whether the software reordering is covered or not is another matter. I
+ suspect that this is left to "common practice", i.e. blind luck, such as
+ not reordering across function calls.

I've not carefully read the POSIX spec, but from the discussions I've
followed and the quotes I've seen posted by Terekhov and Butenhof, I'm
confident that the wording covers both software and hardware.

You have a good point however. Hardware folks have long been aware of
the need to suspend reordering at such points. Most software folks
are not so aware of that particular pitfall. For instance, I wonder
if the folks who are working on run-time reoptimizers are careful to
make sure that they don't move code past barrier instructions. Also,
what would happen if we made mutex operations inline?

Tom Payne

### Johan Johansson

Oct 8, 2002, 5:29:51 PM10/8/02
to
t...@cs.ucr.edu wrote:

[...]

> +> The specification for mutex operations should include the suspension
> +> of the default assumptions mentioned above for all variables, i.e.,
> +> mutex operations should be treated as super sequence points where all
> +> variables are treated as though they had volatile-qualified type.
>
> + I think we all in this subthread agree that this is the optimal
> + scenario.
>
> It is not only "optimal"; it's also necessary for correct behavior.

It is necessary to get away with not declaring variables volatile. It is
not necessary for correct behavior if the variables are declared volatile.

> + The question is merely whether this is indeed the case.
>
> It is the case but perhaps, in some implementations of threads, as a
> side-effect of separate compilation rather than precise specification.

This is what I call "blind luck". Separate compilation is not a
guarantee that you can't cache a value in a register. Calling
conventions normally dictate that a function must restore the contents
of certain registers before returning. It is just not normally done.

> +> That is not sufficient, however. Mutex operations must also involve
> +> barriers past which instructions may not be hoisted or sunk by
> +> hardware or software. We can't have accesses to shared objects
> +> leaking out the tops or bottoms of critical sections.

[...]

> + Whether the software reordering is covered or not is another matter. I
> + suspect that this is left to "common practice", i.e. blind luck, such as
> + not reordering across function calls.
>
> I've not carefully read the POSIX spec, but from the discussions I've
> followed and the quotes I've seen posted by Terekov and Butenhof, I'm
> confident that the wording covers both software and hardware.

I might be able to squeeze some reading hours in this weekend. Of
course, if someone could settle this issue definitely before then I
could play some squash instead.

> You have a good point however. Hardware folks have long been aware of
> the need to suspend reordering at such points. Most software folks
> are not so aware of that particular pitfall. For instance, I wonder
> if the folks who are working on run-time reoptimizers are careful to
> make sure that they don't move code past barrier instructions. Also,
> what would happen if we made mutex operations inline?

I obviously don't know, but I wish compiler writers would make more of
an effort to inform about matters such as these. Especially since it is
explicitly not covered by the language spec.

j

### t...@cs.ucr.edu

Oct 9, 2002, 6:08:30 AM10/9/02
to
Johan Johansson <joh...@ipunplugged.com> wrote:
+ t...@cs.ucr.edu wrote:

+ [...]

+> +> The specification for mutex operations should include the suspension
+> +> of the default assumptions mentioned above for all variables, i.e.,
+> +> mutex operations should be treated as super sequence points where all
+> +> variables are treated as though they had volatile-qualified type.
+>
+> + I think we all in this subthread agree that this is the optimal
+> + scenario.
+>
+> It is not only "optimal"; it's also necessary for correct behavior.

+ It is necessary to get away with not declaring variables volatile. It is
+ not necessary for correct behavior if the variables are declared volatile.

Other than the fact that it is:
* expensive
* unnecessary
* insufficient
there's no reason not to give thread-shared variables volatile-qualified
types.

[...]
+ Separate compilation is not a
+ guarantee that you can't cache a value in a register. Calling
+ conventions normally dictate that a function must restore the contents
+ of certain registers before returning. It is just not normally done.

Quite true. A conforming implementation of C/C++ can hoist accesses
to thread-shared objects past mutex operations. A conforming
implementation of POSIX cannot. (Note that the C compiler is part of
the POSIX implementation.)

[...]
+ I might be able to squeeze some reading hours in this weekend. Of
+ course, if someone could settle this issue definitely before then I
+ could play some squash instead.

Alexander, god bless him, can cite chapter and verse, and save you a

+> You have a good point however. Hardware folks have long been aware of
+> the need to suspend reordering at such points. Most software folks
+> are not so aware of that particular pitfall. For instance, I wonder
+> if the folks who are working on run-time reoptimizers are careful to
+> make sure that they don't move code past barrier instructions. Also,
+> what would happen if we made mutex operations inline?

+ I obviously don't know, but I wish compiler writers would make more of
+ an effort to inform about matters such as these. Especially since it is
+ explicitly not covered by the language spec.

The problem is the language spec. C/C++ should define enough primitives
in its library that one could implement POSIX without resorting to
assembly language.
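As it happens, later standards did exactly this (a note from well after this thread): C++11 added std::atomic and std::thread, whose acquire/release operations give portable barriers with no volatile and no assembly. A sketch:

```cpp
#include <atomic>
#include <thread>

int payload = 0;                 // plain data, not volatile
std::atomic<bool> ready(false);  // the synchronization primitive

int handoff() {
    std::thread producer([] {
        payload = 7;  // ordinary write...
        ready.store(true, std::memory_order_release);  // ...published here
    });
    // The acquire load pairs with the release store: once it sees
    // true, the earlier write to payload is guaranteed visible, and
    // neither compiler nor hardware may reorder across the pair.
    while (!ready.load(std::memory_order_acquire)) {}
    int seen = payload;
    producer.join();
    return seen;
}
```

This is precisely a library-level statement of the hoisting/sinking barriers discussed in this thread, expressed so that no assembly language is needed.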

Tom Payne

### Johan Johansson

Oct 9, 2002, 4:53:44 PM10/9/02
to
Alexander Terekhov wrote:

>>> All these functions would have to be recognized by advanced
>>> compilation systems so that memory operations and calls to
>>> these functions are not reordered by optimization.
>>
>>Should this be interpreted as saying that "advanced compilation
>>systems" *are* required to not reorder across calls to the functions
>>that are required to synchronize threads of control?
>
>
> Yep.

Then I suppose there is another statement somewhere to that effect? Or
is this the clearest requirement of this?

>>I suppose that this in turn could mean that you can't optimize them
>>away.
>
>
> Nope (see "9 Optimizations" in the paper below, for example).

But this is based on the Java memory model and not the POSIX one. As far
as I can tell optimization 5 is in direct violation with POSIX semantics.

I will. Thanks.

j

### Johan Johansson

Oct 9, 2002, 4:55:08 PM10/9/02
to
t...@cs.ucr.edu wrote:

[...]

> +> +> The specification for mutex operations should include the suspension
> +> +> of the default assumptions mentioned above for all variables, i.e.,
> +> +> mutex operations should be treated as super sequence points where all
> +> +> variables are treated as though they had volatile-qualified type.
> +>
> +> + I think we all in this subthread agree that this is the optimal
> +> + scenario.
> +>
> +> It is not only "optimal"; it's also necessary for correct behavior.
>
> + It is necessary to get away with not declaring variables volatile. It is
> + not necessary for correct behavior if the variables are declared volatile.
>
> Other than the fact that it is:
> * expensive
> * unnecessary
> * insufficient
> there's no reason not to give thread-shared variables volatile-qualified
> types.

I still won't argue against it being expensive and insufficient.
However, it is only unnecessary if we are guaranteed that the compiler
works the way you describe in the most quoted statement above. I thought
we agreed that this was more by chance than because of any guarantee by
a standard (as far as we know anyway). I described that as being optimal
since it would still allow optimizations such as register caching
between memory barriers/mutex operations.

> + Separate compilation is not a
> + guarantee that you can't cache a value in a register. Calling
> + conventions normally dictate that a function must restore the contents
> + of certain registers before returning. It is just not normally done.
>
> Quite true. A conforming implementation of C/C++ can hoist accesses
> to thread-shared objects past mutex operations. A conforming
> implementation of POSIX cannot. (Note that the C compiler is part of
> the POSIX implementation.)

Again, it would be interesting to see the statement(s) in POSIX that
guarantee(s) that.

j