
The Semantics of 'volatile'


Tim Rentsch

Jun 1, 2009, 7:43:11 PM
The Semantics of 'volatile'
===========================

I've been meaning to get to this for a while; finally there's a
suitable chunk of free time available to do so.

To explain the semantics of 'volatile', we consider several
questions about the concept and how volatile variables behave,
etc. The questions are:

1. What does volatile do?
2. What guarantees does using volatile provide? (What memory
regimes must be affected by using volatile?)
3. What limits does the Standard set on how using volatile
can affect program behavior?
4. When is it necessary to use volatile?

We will take up each question in the order above. The comments
are intended to address both developers (those who write C code)
and implementors (those who write C compilers and libraries).


What does volatile do?
----------------------

This question is easy to answer if we're willing to accept an
answer that may seem somewhat nebulous. Volatile allows contact
between execution internals, which are completely under control
of the implementation, and external regimes (processes or other
agents) not under control of the implementation. To provide such
contact, and provide it in a well-defined way, using volatile
must ensure a common model for how memory is accessed by the
implementation and by the external regime(s) in question.

Subsequent answers will fill in the details around this
high-level one.


What guarantees does using volatile provide?
--------------------------------------------

The short answer is "None." That deserves some elaboration.

Another way of asking this question is, "What memory regimes must
be affected by using volatile?" Let's consider some possibilities.

1. Accesses occur not just to registers but to process virtual
   memory (which might be just cache); threads running in the
   same process affect and are affected by these accesses.

2. Accesses occur not just to cache but are forced out into the
   inter-process memory (or "RAM"); other processes running on
   the same CPU core affect and are affected by these accesses.

3. Accesses occur not just to memory belonging to the one core
   but to memory shared by all the cores on a die; other
   processes running on the same CPU (but not necessarily the
   same core) affect and are affected by these accesses.

4. Accesses occur not just to memory belonging to one CPU but to
   memory shared by all the CPUs on the motherboard; processes
   running on the same motherboard (even if on another CPU on
   that motherboard) affect and are affected by these accesses.

5. Accesses occur not just to fast memory but also to some
   slower, more permanent memory (such as a "swap file"); other
   agents that access the "swap file" affect and are affected by
   these accesses.

The different examples are intended informally, and in many cases
there is no distinction between several of the different layers.
The point is that different choices of regime are possible (and
I'm sure many readers can provide others, such as not only which
memory is affected but what ordering guarantees are provided).
Now the question again: which (if any) of these different
regimes are /guaranteed/ to be included by a 'volatile' access?

The answer is none of the above. More specifically, the Standard
leaves the choice completely up to the implementation. This
specification is given in one sentence in 6.7.3 p 6, namely:

What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

So a volatile access could be defined as coordinating with any of
the different memory regime alternatives listed above, or other,
more exotic, memory regimes, or even (in the claims of some ISO
committee participants) no particular other memory regimes at all
(so a compiler would be free to ignore volatile completely)[*].
How extreme this range is may be open to debate, but I note that
Larry Jones, for one, has stated unequivocally that the possibility
of ignoring volatile completely is allowed under the proviso given
above. The key point is that the Standard does not identify which
memory regimes must be affected by using volatile, but leaves that
decision to the implementation.

A corollary to the above is that any volatile-qualified access
automatically introduces an implementation-defined aspect to a
program.

[*] Possibly not counting the specific uses of 'volatile' as it
pertains to setjmp/longjmp and signals that the Standard
identifies, but these are side issues.


What limits are there on how volatile access can affect program behavior?
-------------------------------------------------------------------------

More properly this question is "What limits does the Standard
impose on how volatile access can affect program behavior?".

Again the short answer is None. The first sentence in 6.7.3 p 6
says:

An object that has volatile-qualified type may be modified
in ways unknown to the implementation or have other unknown
side effects.

Nowhere in the Standard are any limitations stated as to what
such side effects might be. Since they aren't defined, the
rules of the Standard identify the consequences as "undefined
behavior". Any volatile-qualified access results in undefined
behavior (in the sense that the Standard uses the term).

Some people are bothered by the idea that using volatile produces
undefined behavior, but there really isn't any reason to be. At
some level any C statement (or variable access) might behave in
ways we don't expect or want. Program execution can always be
affected by peculiar hardware, or a buggy OS, or cosmic rays, or
anything else outside the realm of what the implementation knows
about. It's always possible that there will be unexpected
changes or side effects, in the sense that they are unexpected by
the implementation, whether volatile is used or not. The
difference is, using volatile interacts with these external
forces in a more well-defined way; if volatile is omitted, there
is no guarantee as to how external forces on particular parts
of the physical machine might affect (or be affected by) changes
in the abstract machine.

Somewhat more succinctly: using volatile doesn't affect the
semantics of the abstract machine; it admits undefined behavior
by unknown external forces, which isn't any different from the
non-volatile case, except that using volatile adds some
(implementation-defined) requirements about how the abstract
machine maps onto the physical machine in the external forces'
universe. However, since the Standard mentions unknown side
effects explicitly, such things seem more "expectable" when
volatile is used. (volatile == Expect the unexpected?)


When is it necessary to use volatile?
-------------------------------------

In terms of pragmatics this question is the most interesting of
the four. Of course, as phrased the question asked is more of a
developer question; for implementors, the phrasing would be
something more like "What requirements must my implementation
meet to satisfy developers who are using 'volatile' as the
Standard expects?"

To get some details out of the way, there are two specific cases
where it's necessary to use volatile, called out explicitly in
the Standard, namely setjmp/longjmp (in 7.13.2.1 p 3) and
accessing static objects in a signal handler (in 7.14.1.1 p 5).
If you're a developer writing code for one of these situations,
either use volatile, code around it so volatile isn't needed
(this can be done for setjmp), or be sure that the particular
code you're writing is covered by some implementation-defined
guarantees (extensions or whatever). Similarly, if you're an
implementor, be sure that using volatile in the specific cases
mentioned produces code that works; what this means is that the
volatile-using code should behave just like it would under
regular, non-exotic control structures. Of course, it's even
better if the implementation can do more than the minimum, such
as: define and document some additional cases for signal
handling code; make variable access in setjmp functions work
without having to use volatile, or give warnings for potential
transgressions (or both).

The two specific cases are easy to identify, but of course the
interesting cases are everything else! This area is one of the
murkiest in C programming, and it's useful to take a moment to
understand why. For implementors, there is a tension between
code generation and what semantic interpretation the Standard
requires, mostly because of optimization concerns. Nowhere is
this tension felt more keenly than in translating 'volatile'
references faithfully, because volatile exists to make actions in
the abstract machine align with those occurring in the physical
machine, and such alignment prevents many kinds of optimization.
To appreciate the delicacy of the question, let's look at some
different models for how implementations might behave.

The first model is given as an Example in 5.1.2.3 p 8:

EXAMPLE 1 An implementation might define a one-to-one
correspondence between abstract and actual semantics: at
every sequence point, the values of the actual objects would
agree with those specified by the abstract semantics.

We call this the "White Box model". When using implementations
that follow the White Box model, it's never necessary to use
volatile (as the Standard itself points out: "The keyword
volatile would then be redundant.").

At the other end of the spectrum, a "Black Box model" can be
inferred based on the statements in 5.1.2.3 p 5. Consider an
implementation that secretly maintains "shadow memory" for all
objects in a program execution. Regular memory addresses are
used for address-taking or index calculation, but any actual
memory accesses would access only the shadow memory (which is at
a different location), except for volatile-qualified accesses
which would load or store objects in the regular object memory
(ie, at the machine addresses produced by pointer arithmetic or
the & operator, etc). Only the implementation would know how to
turn a regular address into a "shadow" object access. Under the
Black Box model, volatile objects, and only volatile objects, are
usable in any useful way by any activity outside of or not under
control of the implementation.

At this point we might stop and say, well, let's just make a
conservative assumption that the implementation is following the
Black Box model, and that way we'll always be safe. The problem
with this assumption is that it's too conservative; no sensible
implementation would behave this way. Consider some of the
ramifications:

1. Couldn't use a debugger to examine variables (except
volatile variables);

2. Couldn't call an externally defined function written
in assembly or another language, unless the function
is declared with a prototype having volatile-qualified
parameters (and even that case isn't completely clear,
because of the rule at the end of 6.7.5.3 p 15 about
how function types are compared and composited);

3. Couldn't call ordinary OS functions like read() and
write() unless the memory buffers were accessed
using volatile-qualified expressions.

These "impossible" conditions never happen because no
implementation is silly enough to take the Black Box model
literally. Technically, it would be allowed, but no one would
use it because it breaks too many deep assumptions about how a C
runtime interacts with its environment.

A more realistic model is one of many "Gray Box models" such
as the example implementation mentioned in 5.1.2.3 p 9:

Alternatively, an implementation might perform various
optimizations within each translation unit, such that the
actual semantics would agree with the abstract semantics
only when making function calls across translation unit
boundaries. In such an implementation, at the time of each
function entry and function return where the calling
function and the called function are in different
translation units, the values of all externally linked
objects and of all objects accessible via pointers therein
would agree with the abstract semantics. Furthermore, at
the time of each such function entry the values of the
parameters of the called function and of all objects
accessible via pointers therein would agree with the
abstract semantics. In this type of implementation, objects
referred to by interrupt service routines activated by the
signal function would require explicit specification of
volatile storage, as well as other implementation-defined
restrictions.

Here the implementation has made a design choice that makes
volatile superfluous in many cases. To get variable values to
store-synchronize, we need only call an appropriate function:

extern void okey_dokey( void );
extern void foo( int );
extern int v;

...
v = 49; // storing into v is a "volatile" access
okey_dokey(); // call across a translation unit boundary
foo( v ); // this access is also "volatile"

Note that these "volatile" accesses work the way an actual
volatile access does because of an implementation choice about
calling functions defined in other translation units; obviously
that's implementation dependent.

Let's look at one more model, of interest because it comes up in
operating systems, which are especially prone to want to do
things that won't work without 'volatile'. In our hypothetical
kernel code, we access common blocks by surrounding the access
code with mutexes, which for simplicity are granted with spin
locks. Access code might look like this:

while( block_was_locked() ) { /*spin*/ }
// getting here means we have the lock
// access common block elements here
// ... and access some more
// ... and access some more
// ... and access some more
unlock_block();

Here it's understood that locking ('block_was_locked()') and
unlocking ('unlock_block()') will be done using volatile, but the
accesses inside the critical region of the mutex just use regular
variable access, since the block access code is protected by the
mutex.

If one is implementing a compiler to be used on operating system
kernels, this model (only partially described, but I think the
salient aspects are clear enough) is one worth considering. Of
course, the discussion here is very much simplified, there are
lots more considerations when designing actual operating system
locking mechanisms, but the basic scheme should be evident.

Looking at a broader perspective, is it safe to assume this model
holds in some unknown implementation(s) on our platforms of
choice? No, of course it isn't. The behavior of volatile is
implementation dependent. The model here is relevant because
many kernel developers unconsciously expect their assumptions
about locks and critical regions, etc., to be satisfied by using
volatile in this way. Any sensible implementation would be
foolish to ignore such assumptions, especially if kernel
developers were known to be in the target audience.

Returning to the original question, what answers can we give?

If you're an implementor, know that the Standard offers great
latitude in what volatile is required to do, but choosing any of
the extreme points is likely to be a losing strategy no matter what
your target audience is. Think about what other execution
regime(s) your target audience wants/needs to interact with;
choose an appropriate model that allows volatile to interact with
those regimes in a convenient way; document that model (as 6.7.3p6
requires for this implementation-defined aspect) and follow it
faithfully in producing code for volatile access. Remember that
you're implementing volatile to provide access to alternative
execution regimes, not just because the Standard requires it, and
it should work to provide that access, conveniently and without
undue mental contortions. Depending on the extent of the regimes
or the size of the target audience, several different models might
be given under different compiler options (if so it would help to
record which model is being followed in each object file, since the
different models are likely not to intermix in a constructive way).

If you're a developer, and are intent on being absolutely
portable across all implementations, the only safe assumption is
the Black Box model, so just make every single variable and
object access be volatile-qualified, and you'll be safe. More
practically, however, a Gray Box model like one of the two
described above probably holds for the implementation(s) you're
using. Look for a description of what the safe assumptions are
in the implementations' documentation, and follow that; and, it
would be good to let the implementors know if a suitable
description isn't there or doesn't describe the requirements
adequately.

Franken Sense

Jun 1, 2009, 11:27:14 PM
In Dread Ink, the Grave Hand of Tim Rentsch Did Inscribe:

> The Semantics of 'volatile'
> ===========================

I studied this recently both in C and Fortran. I have a small, embedded
job that motivates this interest.

The original post is 350 lines long, and I snipped it not out of spite; the
length makes it really hard to quote.

A question for OP: are you using C89 or C99?
--
Frank

And just like in 1984, where the enemy is switched from Eurasia to
Eastasia, Bush switched our enemy from al Qaeda to Iraq. Bush's War on
Terror is a war against whomever Bush wants to be at war with.
~~ Al Franken,

Thad Smith

Jun 1, 2009, 10:49:23 PM
Tim Rentsch wrote:

> 1. What does volatile do?
> 2. What guarantees does using volatile provide? (What memory
> regimes must be affected by using volatile?)
> 3. What limits does the Standard set on how using volatile
> can affect program behavior?
> 4. When is it necessary to use volatile?

The C Standard only addresses the single thread program model, except
for external signal processing. Interacting threads aren't addressed.
Apparently Posix incorporates a version of the C Standard (I know zilch
of Posix). Since it does support multiple threads, etc., that may be a
better standard to explore those issues of volatile.

> A corollary to the above that any volatile-qualified access
> automatically introduces an implementation-defined aspect to a
> program.

Yes, but when you interface with memory-mapped hardware or concurrent
threads you are stepping outside the realm of Standard C's purview. The
implementation is the appropriate level to define that support.

> More properly this question is "What limits does the Standard
> impose on how volatile access can affect program behavior?".
>
> Again the short answer is None. The first sentence in 6.7.3 p 6
> says:
>
> An object that has volatile-qualified type may be modified
> in ways unknown to the implementation or have other unknown
> side effects.

To me, this is another way of saying that since the implementation
can't see all the relevant accesses in the source code, it has to do
reads and writes to the volatile objects when the code says to. It's
also saying that the operation of the program may rely on features not
expressed in Standard C, such as DMA hardware, which might not be fully
known to the specific implementation.

> Nowhere in the Standard are any limitations stated as to what
> such side effects might be. Since they aren't defined, the
> rules of the Standard identify the consequences as "undefined
> behavior". Any volatile-qualified access results in undefined
> behavior (in the sense that the Standard uses the term).

This makes sense within the context of Standard C.

> Some people are bothered by the idea that using volatile produces
> undefined behavior, but there really isn't any reason to be. At
> some level any C statement (or variable access) might behave in
> ways we don't expect or want. Program execution can always be
> affected by peculiar hardware, or a buggy OS, or cosmic rays, or
> anything else outside the realm of what the implementation knows
> about.

While true, I interpret the primary meaning that mechanisms not fully
understood by the compiler are at work. As a programmer that addresses
these features, such as hardware registers, I need to understand them,
but the compiler doesn't have to.

> 1. Couldn't use a debugger to examine variables (except
> volatile variables);

Not true. Debuggers can have magic powers to know details that are not
promised by the language. Compilers and debuggers can collude to
provide the debugger all the information needed to show variable values.

> 2. Couldn't call an externally defined function written
> in assembly or another language, unless the function
> is declared with a prototype having volatile-qualified
> parameters (and even that case isn't completely clear,
> because of the rule at the end of 6.7.5.3 p 15 about
> how function types are compared and composited);

This isn't promised because the C Standard doesn't discuss functions not
provided to the implementation in source form, except for implied OS
operations, etc., to support the standard library routines.

> These "impossible" conditions never happen because no
> implementation is silly enough to take the Black Box model
> literally. Technically, it would be allowed, but no one would
> use it because it breaks too many deep assumptions about how a C
> runtime interacts with its environment.

It would only be usable for programs that only call standard library
functions or functions supplied to the implementation at the time of
program translation.

> A more realistic model is one of many "Gray Box models" such
> as the example implementation mentioned in 5.1.2.3 p 9:
>
> Alternatively, an implementation might perform various
> optimizations within each translation unit, such that the
> actual semantics would agree with the abstract semantics
> only when making function calls across translation unit
> boundaries. In such an implementation, at the time of each
> function entry and function return where the calling
> function and the called function are in different
> translation units, the values of all externally linked
> objects and of all objects accessible via pointers therein
> would agree with the abstract semantics. Furthermore, at
> the time of each such function entry the values of the
> parameters of the called function and of all objects
> accessible via pointers therein would agree with the
> abstract semantics. In this type of implementation, objects
> referred to by interrupt service routines activated by the
> signal function would require explicit specification of
> volatile storage, as well as other implementation-defined
> restrictions.

This makes sense.

> Let's look at one more model, of interest because it comes up in
> operating systems, which are especially prone to want to do
> things that won't work without 'volatile'. In our hypothetical
> kernel code, we access common blocks by surrounding the access
> code with mutexes, which for simplicity are granted with spin
> locks. Access code might look like this:
>
> while( block_was_locked() ) { /*spin*/ }
> // getting here means we have the lock
> // access common block elements here
> // ... and access some more
> // ... and access some more
> // ... and access some more
> unlock_block();
>
> Here it's understood that locking ('block_was_locked()') and
> unlocking ('unlock_block()') will be done using volatile, but the
> accesses inside the critical region of the mutex just use regular
> variable access, since the block access code is protected by the
> mutex.
>
> If one is implementing a compiler to be used on operating system
> kernels, this model (only partially described, but I think the
> salient aspects are clear enough) is one worth considering.

What are the differences between the gray box and this later one? I
understand your description of typical use, but don't see the required
difference in generated code.

Here's a question for the OP: what issues come up in actual
implementations and use that make this an important issue? An actual
implementation issue would help illuminate the various choices.


--
Thad

Beej Jorgensen

Jun 1, 2009, 11:14:26 PM
Thad Smith <Thad...@acm.org> wrote:
>Apparently Posix incorporates a version of the C Standard

For the latest revision, it incorporates C99, FWIW.

-Beej

Richard Heathfield

Jun 2, 2009, 2:06:34 AM
Beej Jorgensen said:

...thus instantly rendering most POSIX installations non-conforming.
What an odd strategy.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Forged article? See
http://www.cpax.org.uk/prg/usenet/comp.lang.c/msgauth.php
"Usenet is a strange place" - dmr 29 July 1999

Beej Jorgensen

Jun 2, 2009, 4:07:58 AM
Richard Heathfield <r...@see.sig.invalid> wrote:

>Beej Jorgensen said:
>> For the latest revision, it incorporates C99, FWIW.
>
>...thus instantly rendering most POSIX installations non-conforming.
>What an odd strategy.

Any more or less odd than C99 rendering most compilers non-conforming?
:)

A C compiler isn't necessary for a system to be POSIX conformant. From
the description of the "c99" command in the Single Unix Spec:

# On systems providing POSIX Conformance (see the Base Definitions
# volume of IEEE Std 1003.1-2001, Chapter 2, Conformance), c99 is
# required only with the C-Language Development option; XSI-conformant
# systems always provide c99.

WRT strategy:

http://www.opengroup.org/austin/papers/backgrounder.html

# This revision tries to minimize the number of changes required to
# implementations which conform to the earlier versions of the approved
# standards to bring them into conformance with the current standard.
#
# [...]
#
# However, since it references the 1999 version of the ISO C standard,
# and no longer supports "Common Usage C", there are a number of
# unavoidable changes. Applications portability is similarly affected.

If you want to be POSIX conformant with the C-Language Development
Option, you need an ISO-compliant C compiler.

It's not so insane, when thought about in those terms. :)

-Beej

Erik Trulsson

Jun 2, 2009, 11:32:47 AM
In comp.lang.c Thad Smith <Thad...@acm.org> wrote:
> Tim Rentsch wrote:
>
>> 1. What does volatile do?
>> 2. What guarantees does using volatile provide? (What memory
>> regimes must be affected by using volatile?)
>> 3. What limits does the Standard set on how using volatile
>> can affect program behavior?
>> 4. When is it necessary to use volatile?
>
> The C Standard only addresses the single thread program model, except
> for external signal processing. Interacting threads aren't addressed.
> Apparently Posix incorporates a version of the C Standard (I know zilch
> of Posix). Since it does support multiple threads, etc., that may be a
> better standard to explore those issues of volatile.

No, because Posix specifies that one does not need to use volatile when
doing communication between threads.

--
<Insert your favourite quote here.>
Erik Trulsson
ertr...@student.uu.se

FreeRTOS.org

Jun 2, 2009, 1:31:47 PM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnvdnf...@alumnus.caltech.edu...

> The Semantics of 'volatile'
> ===========================
>


I once wrote an article on compiler validation for safety critical systems.
In return somebody sent me a paper they had published regarding different
compilers implementation of volatile. I forget the numbers now, but the
conclusion of their paper was that most compilers don't implement it
correctly anyway!

--
Regards,
Richard.

+ http://www.FreeRTOS.org
Designed for Microcontrollers. More than 7000 downloads per month.

+ http://www.SafeRTOS.com
Certified by TÜV as meeting the requirements for safety related systems.


John Devereux

Jun 2, 2009, 1:49:51 PM
"FreeRTOS.org" <noe...@given.com> writes:

> "Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
> news:kfnvdnf...@alumnus.caltech.edu...
>> The Semantics of 'volatile'
>> ===========================
>>
>
>
> I once wrote an article on compiler validation for safety critical systems.
> In return somebody sent me a paper they had published regarding different
> compilers implementation of volatile. I forget the numbers now, but the
> conclusion of their paper was that most compilers don't implement it
> correctly anyway!

If it was the same one posted here a few months ago, it started out with
a very basic false assumption about what volatile *means*. Casting the
rest of the paper into doubt as far as I can see.

--

John Devereux

MikeWhy

Jun 2, 2009, 5:34:38 PM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnvdnf...@alumnus.caltech.edu...
> The Semantics of 'volatile'
> ===========================
>
> I've been meaning to get to this for a while, finally there's a
> suitable chunk of free time available to do so.
>
> To explain the semantics of 'volatile', we consider several
> questions about the concept and how volatile variables behave,
> etc. The questions are:
...

Speaking of undue mental contortions, and without intending to slight or
denigrate your well written explanation, it might be more straightforward to
explain what volatile might mean to the developer, not implementor. To my
knowledge, in context of C++ not C, volatile has meaning only in relation to
optimizations. Specifically, the compiler is made aware that the referenced
value can change from external actions. It implies acquire and release
semantics, and limits reordering operations to not violate the operation
order before and after the volatile access. But, you tell me... This
simplistic view is as much thought as I had given the topic in quite a long
while.


CBFalconer

Jun 2, 2009, 9:22:40 PM
Erik Trulsson wrote:
> Thad Smith <Thad...@acm.org> wrote:
>> Tim Rentsch wrote:
>>
>>> 1. What does volatile do?
>>> 2. What guarantees does using volatile provide? (What memory
>>> regimes must be affected by using volatile?)
>>> 3. What limits does the Standard set on how using volatile
>>> can affect program behavior?
>>> 4. When is it necessary to use volatile?
>>
>> The C Standard only addresses the single thread program model,
>> except for external signal processing. Interacting threads
>> aren't addressed. Apparently Posix incorporates a version of the
>> C Standard (I know zilch of Posix). Since it does support
>> multiple threads, etc., that may be a better standard to explore
>> those issues of volatile.
>
> No, because Posix specifies that one does not need to use volatile
> when doing communication between threads.

The OP didn't specify any conditions for using volatile. He may
well be referring to the West Podunk fire safety standards. Mr
Smith brought in the C standard and Posix. We can't trust the
posting to c.l.c, since that has been seriously affected by trolls.

In West Podunk anything volatile is easily ignited, and requires
special storage.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Tim Rentsch

Jun 3, 2009, 2:24:27 AM
Franken Sense <fr...@example.invalid> writes:

> In Dread Ink, the Grave Hand of Tim Rentsch Did Inscribe:
>
> > The Semantics of 'volatile'
> > ===========================
>
> I studied this recently both in C and Fortran. I have a small, embedded
> job that motivates this interest.
>
> The original post is 350 lines long, and I snipped it not out of spite; the
> length makes it really hard to quote.
>
> A question for OP: are you using C89 or C99?

My comments were based on the most recent draft of the current
(C99) standard, more specifically n1256. However I think most of
what was said still applies to C89/C90, because the basic
decisions about volatile were made pretty early. Of course, to be
sure it would be necessary to look at the C89/C90 documents and
verify that.

Tim Rentsch

Jun 3, 2009, 2:43:05 AM
"FreeRTOS.org" <noe...@given.com> writes:

> "Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
> news:kfnvdnf...@alumnus.caltech.edu...
> > The Semantics of 'volatile'
> > ===========================
> >
>
>
> I once wrote an article on compiler validation for safety critical systems.
> In return somebody sent me a paper they had published regarding different
> compilers implementation of volatile. I forget the numbers now, but the
> conclusion of their paper was that most compilers don't implement it
> correctly anyway!

It's important to realize the "implementation-defined" clause
for what constitutes a volatile access provides a loophole so
large that almost no implementation of volatile can be called
incorrect categorically. As long as the implementation accurately
documents the decision it made here, arbitrarily lax choices are
allowed (or so the claim has been made), and these implementations
aren't incorrect. Example: "An volatile access is the same as
any other access in the abstract machine, except for the first
billionth billionth billionth second of program execution, in
which case the volatile access is done as early as it possibly
can but after all necessarily earlier accesses (in the abstract
machine sense).

Tim Rentsch

Jun 3, 2009, 2:51:13 AM
John Devereux <jo...@devereux.me.uk> writes:

To give the paper its due, I have looked it over, and I think its
conclusions are basically right. There are some details that are
either oversimplified or slightly wrong, depending on how certain
statements in the Standard are read; that's a whole other big
discussion, and I don't want to say any more about that right
now. But I would like to repeat that the paper mentioned is,
IMO, quite a good paper, and it contributes some important
observations on volatile and how it is implemented.

Tim Rentsch

Jun 3, 2009, 3:02:41 AM
"MikeWhy" <boat042...@yahoo.com> writes:

My intention was to provide discussion for both developers and
implementors, because I think it's important for both sets of
people to understand the thinking of the other. Having said that,
I concede that the comments could have been better on the developer
side.

Certainly it is true that using volatile will (usually) inhibit
certain optimizations, but the question is, which optimizations and
under what conditions? The basic answer is that this is
implementation-defined, and there is no single right answer.
/Usually/ volatile implies canonical ordering of at least those
accesses that are volatile-qualified, but does/should/must it imply
more? Depending on implementation choice, volatile /might/ (for
example) cause a write-barrier to be put into the storage pipeline,
but that isn't absolutely required. Other considerations similarly.

The lack of a single model for which other memory regimes are
synchronized is part of the murkiness of volatile; my main point
is that each implementation must identify which memory regime(s)
are coordinated with by using volatile (for that implementation).

Tim Rentsch

Jun 3, 2009, 3:05:58 AM
Erik Trulsson <ertr...@student.uu.se> writes:

And therefore Posix implicitly imposes requirements on what
implementations must do to be Posix-compliant, because
otherwise volatile would be necessary in such cases.

Tim Rentsch

Jun 3, 2009, 3:57:03 AM
Thad Smith <Thad...@acm.org> writes:

> Tim Rentsch wrote:
>
> > 1. What does volatile do?
> > 2. What guarantees does using volatile provide? (What memory
> > regimes must be affected by using volatile?)
> > 3. What limits does the Standard set on how using volatile
> > can affect program behavior?
> > 4. When is it necessary to use volatile?
>
> The C Standard only addresses the single thread program model, except
> for external signal processing. Interacting threads aren't addressed.
> Apparently Posix incorporates a version of the C Standard (I know zilch
> of Posix). Since it does support multiple threads, etc., that may be a
> better standard to explore those issues of volatile.

The Posix model is only one model. Other environments may
choose to use different models. It's important to understand
these possibilities also.


> > A corollary to the above is that any volatile-qualified access
> > automatically introduces an implementation-defined aspect to a
> > program.
>
> Yes, but when you interface with memory-mapped hardware or concurrent
> threads you are stepping outside the realm of Standard C's purview. The
> implementation is the appropriate level to define that support.

I think you missed the point here. The implementation-defined
aspect doesn't have to do with memory-mapped hardware or
concurrent threads. Typically an implementation doesn't define
these things at all. It may (or may not) be aware of them, but
usually it doesn't define them.


> > More properly this question is "What limits does the Standard
> > impose on how volatile access can affect program behavior?".
> >
> > Again the short answer is None. The first sentence in 6.7.3 p 6
> > says:
> >
> > An object that has volatile-qualified type may be modified
> > in ways unknown to the implementation or have other unknown
> > side effects.
>
> To me, this is another way of saying that the since the implementation
> can't see all the relevant accesses in the source code, it has to do
> reads and writes to the volatile objects when the code says to. It's
> also saying that the operation of the program may rely on features not
> expressed in Standard C, such as DMA hardware, which might not be fully
> known to the specific implementation.

It's true that volatile-qualified accesses must occur strictly
according to the abstract semantics, but it isn't this sentence
that requires that. Nor does it say that the program can rely
on features not specified in the Standard; in fact, what it is
saying is that the Standard isn't sure /what/ can be relied on,
and the implementation isn't sure either. It's /because/ the
implementation can't be sure how volatile will affect program
behavior that it imposes such severe requirements on program
evaluation in the presence of volatile.


> > Nowhere in the Standard are any limitations stated as to what
> > such side effects might be. Since they aren't defined, the
> > rules of the Standard identify the consequences as "undefined
> > behavior". Any volatile-qualified access results in undefined
> > behavior (in the sense that the Standard uses the term).
>
> This makes sense within the context of Standard C.
>
> > Some people are bothered by the idea that using volatile produces
> > undefined behavior, but there really isn't any reason to be. At
> > some level any C statement (or variable access) might behave in
> > ways we don't expect or want. Program execution can always be
> > affected by peculiar hardware, or a buggy OS, or cosmic rays, or
> > anything else outside the realm of what the implementation knows
> > about.
>
> While true, I interpret the primary meaning that mechanisms not fully
> understood by the compiler are at work. As a programmer that addresses
> these features, such as hardware registers, I need to understand them,
> but the compiler doesn't have to.

The key point is that the Standard doesn't place any limitations
on what might happen.

Here you have again missed the point. Certainly a /cooperative/
implementation could make its information available to a
debugger, but conversely an arbitrarily perverse implementation
could make it arbitrarily difficult for a debugger to get out
this information; at least practically impossible, even if
perhaps not theoretically impossible. And that's the point:
no implementation chooses to be that perverse, which means
it doesn't really implement the Black Box model.


> > 2. Couldn't call an externally defined function written
> > in assembly or another language, unless the function
> > is declared with a prototype having volatile-qualified
> > parameters (and even that case isn't completely clear,
> > because of the rule at the end of 6.7.5.3 p 15 about
> > how function types are compared and composited);
>
> This isn't promised because the C Standard doesn't discuss functions not
> provided to the implementation in source form, except for implied OS
> operations, etc., to support the standard library routines.

It isn't promised by the Standard, but essentially every
implementation provides it, because almost no one would
choose an implementation that didn't provide it.


> > These "impossible" conditions never happen because no
> > implementation is silly enough to take the Black Box model
> > literally. Technically, it would be allowed, but no one would
> > use it because it breaks too many deep assumptions about how a C
> > runtime interacts with its environment.
>
> It would only be usable for programs that only call standard library
> functions or functions supplied to the implementation at the time of
> program translation.

Yes, and for that reason no one would use it (or at least almost
no one).

The earlier Gray Box model synchronized on all external function
calls. The "OS/kernel" Gray Box model doesn't have to synchronize
on function calls, only around volatile accesses (and so it could
do inter-TU optimization that would reorder function calls if
there were no volatile accesses close by).
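A concrete sketch of the difference (the function and variable names below are hypothetical, not from the thread): under the earlier Gray Box model the external calls themselves act as synchronization points, while under the "OS/kernel" model only the volatile access does, so the stores to 'scratch' are fair game for inter-TU reordering but nothing may move across the store to 'device_cmd'.

```c
/* Sketch of the "OS/kernel" Gray Box model; all names are hypothetical.
   Imagine log_event() being defined in another translation unit. */
static int events;
void log_event(int code) { events += code; }   /* stand-in definition */

volatile int device_cmd;   /* volatile-qualified access point */
int scratch;               /* ordinary object */

void f(void) {
    scratch = 1;      /* inter-TU optimization may move this across the
                         calls below if it can prove no interference... */
    log_event(10);
    log_event(20);
    device_cmd = 1;   /* ...but no access may be moved across this
                         volatile access */
    scratch = 2;
}
```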


> Here's a question for the OP: what issues come up in actual
> implementations and use that makes this a important issue? An actual
> implementation issue would help illuminate the various choices.

Consider the example mutex/spinlocking code. Here is a slightly
more specific version:

while( v = my_process_id, v != my_process_id ) {/*spin*/}
shared = shared + 1;
v = 0;

Here 'v' is a volatile variable (that incidentally has a magic
property that makes it work in the spinlock example shown here,
but that's a side issue). The variable 'shared' is not volatile.

Question: can the accesses to 'shared' be reordered so that
they come before the 'while()' loop (or after the subsequent
assignment to v)?

This kind of question comes up frequently in writing OS code.
It's not a simple question either, because it can depend on
out-of-order memory storage units in a highly parallel
multi-processor system. Clearly if we're trying to do locking
in the OS we care about the answer, because if the accesses
to 'shared' can be reordered then the locking just won't work.
We want the implementation to choose a model where the
accesses to 'shared' are guaranteed to be in the same order
as the statements given above. Make sense?
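One conventional source-level way of pinning that ordering, sketched here with the caveat that the hardware-level effect is still implementation-defined, is to volatile-qualify 'shared' as well, so both accesses become part of the abstract semantics the implementation must preserve (5.1.2.3). Whether the storage pipeline also preserves the order remains the implementation's choice.

```c
volatile int v;          /* the lock word from the example above */
volatile int shared;     /* volatile-qualified here, unlike the example */
int my_process_id = 1;

void locked_increment(void) {
    /* same spinlock idiom as above, relying on v's "magic" property */
    while (v = my_process_id, v != my_process_id) { /* spin */ }
    shared = shared + 1;   /* may not be moved outside the accesses to v */
    v = 0;
}
```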

MikeWhy

Jun 3, 2009, 6:47:08 AM

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfn63fd...@alumnus.caltech.edu...

If you want a discussion, you'll have to write more directly and less
generally.

Do you mean a cache- or bus-lock? Good heavens, no. I'd throw out that
compiler with yesterday's bad news. If the developer wants a cache lock,
he'll have to code one explicitly. Volatile means nothing more than to limit
the optimizations.

>
> The lack of a single model for which other memory regimes are
> synchronized is part of the murkiness of volatile; my main point
> is that each implementation must identify which memory regime(s)
> are coordinated with by using volatile (for that implementation).

Again, I must be misunderstanding your intent. Please write more concretely.

A compiler should compile the developer's code with the least amount of
surprises. Synchronization remains the purview of the hardware
(read-modify-write, for example) or operating system (mutual exclusion).
Volatile doesn't imply any of that. It means simply to not make assumptions
about the value referenced by the variable. If I wanted the compiler to do
more for me, I would write in some other language, Java or VisualBasic
perhaps. Volatile, in fact, is just the opposite of exciting. It is dull and
boring. It tells the compiler to not do any of that fancy shuffling around
stuff.
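The classic optimization being limited can be sketched as follows (the flag names are hypothetical): without volatile, a compiler is entitled to read the flag once, cache it in a register, and spin forever on the copy; with volatile, each iteration must perform a fresh access.

```c
/* Hypothetical flags set asynchronously, e.g. by an interrupt handler. */
int plain_flag;
volatile int vol_flag;

void wait_plain(void) {
    /* the compiler may hoist the load and loop on a stale register copy */
    while (!plain_flag) { }
}

void wait_volatile(void) {
    /* each test must re-read vol_flag from its storage location */
    while (!vol_flag) { }
}
```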


Eric Sosman

Jun 3, 2009, 8:18:30 AM

"Necessary?" Why? Didn't you just finish telling us that
`volatile' means almost nothing, in the sense that nearly all
its semantics are implementation-defined? If we can't say what
effects `volatile' has (and I agree that we mostly can't), I
don't see how we can class those unknown effects as "necessary."

Further discussion of threading topics should probably occur
on comp.programming.threads, <OT> where one can learn that
`volatile' is neither necessary nor sufficient for data shared
by multiple threads. </OT>

--
Eric Sosman
eso...@ieee-dot-org.invalid

Tim Rentsch

Jun 3, 2009, 11:24:23 AM
Eric Sosman <eso...@ieee-dot-org.invalid> writes:

> Tim Rentsch wrote:
> > Erik Trulsson <ertr...@student.uu.se> writes:
> >
> >> In comp.lang.c Thad Smith <Thad...@acm.org> wrote:
> >>> Apparently Posix incorporates a version of the C Standard (I know zilch
> >>> of Posix). Since it does support multiple threads, etc., that may be a
> >>> better standard to explore those issues of volatile.
> >> No, because Posix specifies that one does not need to use volatile when
> >> doing communication between threads.
> >
> > And therefore Posix implicitly imposes requirements on what
> > implementations must do to be Posix-compliant, because
> > otherwise volatile would be necessary in such cases.
>
> "Necessary?" Why? Didn't you just finish telling us that
> `volatile' means almost nothing, in the sense that nearly all
> its semantics are implementation-defined? If we can't say what
> effects `volatile' has (and I agree that we mostly can't), I
> don't see how we can class those unknown effects as "necessary."

In the absence of information to the contrary, using volatile is
essentially always necessary when making use of extralinguistic
mechanisms. This conclusion follows from 5.1.2.3 p 5, which gives
the minimum requirements that a conforming implementation must
meet (and is explained in more detail in the earlier comments on the
Black Box model). What I mean by 'necessary' is that, if volatile
is omitted, a developer has no grounds for complaint if something
doesn't work, just as there are no grounds for complaint if one
expects 'sizeof(int) <= sizeof(long)' and it isn't, or that signed
arithmetic will wrap on overflow (to pick an example at each end of
the spectrum).

For a long time, the "information to the contrary" was supplied
implicitly by a largely shared (and mostly correct) understanding of
how optimization is done and of the various machine environments in
which programs execute. Basically, volatile worked the same way
everywhere, within certain error bars. As time went on, optimizers
got smarter, and machine environments got more diverse, to the point
where the shared understanding is no longer a reliable indicator.
That's why there's confusion about what volatile means, and why
the community is starting to have discussions about what it does
mean and what it should mean. The Posix stance on threads is one
example of that.


> Further discussion of threading topics should probably occur
> on comp.programming.threads, <OT> where one can learn that
> `volatile' is neither necessary nor sufficient for data shared
> by multiple threads. </OT>

I concur, except that I think the <OT></OT> marking isn't necessary
in this case. As a generic statement, an observation that volatile
is neither necessary nor sufficient for thread-shared data is (IMO)
quite apropos in comp.lang.c. It's only when the discussion starts
being limited to particular threading models that it becomes a
significant impedance mismatch for CLC.

Tim Rentsch

Jun 3, 2009, 11:56:21 AM
"MikeWhy" <boat042...@yahoo.com> writes:

Probably good advice. About the best I can offer in response is
that I did what I could considering the subject matter and the
time available.


> Do you mean a cache- or bus-lock? Good heavens, no. I'd throw out that
> compiler with yesterday's bad news. If the developer wants a cache lock,
> he'll have to code one explicitly. Volatile means nothing more than to limit
> the optimizations.

What volatile means is determined almost entirely by a decision
that is implementation-defined. If an implementation chooses to
interpret volatile so that it supplies a cache lock, it may do
so. Or a bus lock. Or no lock at all. The Standard grants an
enormous amount of latitude to the implementation in this area.


> > The lack of a single model for which other memory regimes are
> > synchronized is part of the murkiness of volatile; my main point
> > is that each implementation must identify which memory regime(s)
> > are coordinated with by using volatile (for that implementation).
>
> Again, I must be misunderstanding your intent. Please write more concretely.
>
> A compiler should compile the developer's code with the least amount of
> surprises. Synchronization remains the purview of the hardware
> (read-modify-write, for example) or operating system (mutual exclusion).
> Volatile doesn't imply any of that. It means simply to not make assumptions
> about the value referenced by the variable. If I wanted the compiler to do
> more for me, I would write in some other language, Java or VisualBasic
> perhaps. Volatile, in fact, is just the opposite of exciting. It is dull and
> boring. It tells the compiler to not do any of that fancy shuffling around
> stuff.

The choice of "synchronized" was a poor choice; "aligned" probably
would have been better.

In the absence of volatile, the only limits on how the abstract
machine can map onto a physical machine are the "as if" rule and
the minimum requirements on a conforming implementation as stated
in 5.1.2.3. Using volatile tightens those limits, but (and here
is the point), the Standard /doesn't say what physical machine
model must match the abstract machine at volatile access points/.
Your comment, "[volatile] means simply to not make assumptions
about the value referenced by the variable", isn't completely
wrong, but it's an oversimplification. The reason is, when one
talks about "the" value referenced by a variable, those comments
must take place in the context of a particular memory model.
Since the Standard doesn't specify one, we can't really talk about
what volatile does without taking that implementation-defined
choice into account. A cache lock memory model is different from
a bus lock memory model, etc.

Do those comments make more sense now?

MikeWhy

Jun 3, 2009, 2:35:23 PM

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfny6s9...@alumnus.caltech.edu...

> "MikeWhy" <boat042...@yahoo.com> writes:
>
>> Do you mean a cache- or bus-lock? Good heavens, no. I'd throw out that
>> compiler with yesterday's bad news. If the developer wants a cache lock,
>> he'll have to code one explicitly. Volatile means nothing more than to
>> limit
>> the optimizations.
>
> What volatile means is determined almost entirely by a decision
> that is implementation-defined. If an implementation chooses to
> interpret volatile so that it supplies a cache lock, it may do
> so. Or a bus lock. Or no lock at all. The Standard grants an
> enormous amount of latitude to the implementation in this area.
>
...

>
> In the absence of volatile, the only limits on how the abstract
> machine can map onto a physical machine are the "as if" rule and
> the minimum requirements on a conforming implementation as stated
> in 5.1.2.3. Using volatile tightens those limits, but (and here
> is the point), the Standard /doesn't say what physical machine
> model must match the abstract machine at volatile access points/.
> Your comment, "[volatile] means simply to not make assumptions
> about the value referenced by the variable", isn't completely
> wrong, but it's an oversimplification. The reason is, when one
> talks about "the" value referenced by a variable, those comments
> must take place in the context of a particular memory model.
> Since the Standard doesn't specify one, we can't really talk about
> what volatile does without taking that implementation-defined
> choice into account. A cache lock memory model is different from
> a bus lock memory model, etc.
>
> Do those comments make more sense now?

Give me an example. Given a volatile qualifier, what memory context makes it
reasonable and useful for the compiler to generate a cache lock where the
developer didn't specifically code one? I can't think of one. If the
compiler can generate opcodes to do so, the developer can also write code to
do so if he had wanted one. Or are we talking something much simpler,
something like flipping a bit in a R/W register, as on a baby PIC?

The point of standardizing a language is to give some assurance that the
same code run through different, compliant compilers will generate
functionally identical behavior. This puts a limit on just how unspecified
"implementation defined" loopholes are in the language spec. Volatile
limits the actions of the compiler; it does not grant more freedom
through the implementation-defined loophole.

Tim Rentsch

Jun 3, 2009, 3:24:01 PM
"MikeWhy" <boat042...@yahoo.com> writes:

Oh, I never said it was reasonable, only that it's allowed.


> The point of standardizing a language is to give some assurance that the
> same code run through different, compliant compilers will generate
> functionally identical behavior. This puts a limit on just how unspecified
> "implementation defined" loopholes are in the language spec. Volatile
> limits the actions of the compiler; it does not grant more freedom
> through the implementation-defined loophole.

You're right that volatile limits the actions of the compiler, but how
much they are limited is determined by an implementation-defined
choice, and that choice has so much leeway that we cannot in general
make any guarantees about restrictions volatile imposes.

I'm not trying to advocate that the Standard adopt a particular
memory model for volatile, or even that it limit the set of
choices of memory model that an implementation can choose amongst.
Maybe that's a good idea, maybe it isn't, I just don't know.
I /do/ think what memory models each implementation supports should
be described both more explicitly and more specifically, for the
benefit of both developers and implementors.

MikeWhy

Jun 3, 2009, 4:30:39 PM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnmy8p...@alumnus.caltech.edu...

> "MikeWhy" <boat042...@yahoo.com> writes:
>
>> Give me an example. Given a volatile qualifier, what memory context makes
>> it
>> reasonable and useful for the compiler to generate a cache lock where the
>> developer didn't specifically code one? I can't think of one. If the
>> compiler can generate opcodes to do so, the developer can also write code
>> to
>> do so if he had wanted one. Or are we talking something much simpler,
>> something like flipping a bit in a R/W register, as on a baby PIC?
>
> Oh, I never said it was reasonable, only that it's allowed.

Yeah, I also can't dream up a scenario where that would be justified.

As to what is allowed in context of conforming to the standard... The
specific example takes an extreme, unwarranted liberty that has noticeable
side effects and performance implications. That's a defect. A bug.


Chris M. Thomasson

Jun 3, 2009, 7:58:40 PM

"MikeWhy" <boat042...@yahoo.com> wrote in message
news:MnzVl.18280$%54....@nlpi070.nbdc.sbc.com...


FWIW, MSVC compilers, versions 8 and above, automatically insert
load-acquire/store-release barriers for volatile loads and stores on certain
architectures (e.g., PowerPC):
_______________________________________________________________
void* volatile_load(void** p) {
    void* v = ATOMIC_LOAD(p);
    MEMBAR #LoadStore | #LoadLoad;
    return v;
}


void* volatile_store(void** p, void* v) {
    MEMBAR #LoadStore | #StoreStore;
    ATOMIC_STORE(p, v);
    return v;
}
_______________________________________________________________


Please note that this is NOT strong enough for mutual exclusion: you
cannot use these memory barrier guarantees to build Peterson's algorithm.
You would need to insert a stronger membar (e.g., MEMBAR #StoreLoad |
#StoreStore) in order to prevent a subsequent load from hoisting up above
the previous store in the lock-acquire portion of the algorithm. However,
it is strong enough for a DCL algorithm or a producer/consumer example:
_______________________________________________________________
int data = 0;
atomic_word volatile g_flag = 0;


void single_producer() {
    data = 1234;
    g_flag = 1;
}


void multiple_consumers() {
    while (! g_flag) backoff();
    assert(data == 1234);
}
_______________________________________________________________


However, these barriers are actually too strong here: the #LoadStore
constraint is not required for this example... On the SPARC you
could write it as:
_______________________________________________________________
int data = 0;
atomic_word g_flag = 0;


void single_producer() {
    data = 1234;
    MEMBAR #StoreStore;
    g_flag = 1;
}


void multiple_consumers() {
    while (! g_flag) backoff();
    MEMBAR #LoadLoad;
    assert(data == 1234);
}
_______________________________________________________________


Even then, the barriers can be too strong for architectures that support
implicit data-dependent load barriers (e.g., all but the DEC Alpha). You can
write it this way:
_______________________________________________________________
int data = 0;
int* g_flag = 0;


void single_producer() {
    data = 1234;
    g_flag = &data;
}


void multiple_consumers() {
    while (! g_flag) backoff();
    assert(*g_flag == 1234);
}
_______________________________________________________________

Tim Rentsch

Jun 5, 2009, 2:17:56 AM
"MikeWhy" <boat042...@yahoo.com> writes:

I don't agree. There are lots of ways that the Standard gives
license to implementations to do extreme, absurd, or ridiculous
things, but practically speaking the effects are minimal. For
example, there's nothing stopping an implementation from choosing
sizeof(int) == 1000000000. It's hard to grant /some/ freedoms
like this to implementations without also allowing extreme
choices that seem ridiculous.

Should the ridiculous choices be forbidden (even assuming that
suitable language could be found to do that)? I would say no,
because what seems ridiculous in one context might very well have
useful implications in another context.

Where I think the Standard falls short here is in explaining what
it is that must be documented when an implementation adopts a
particular "volatile access" model. Also, the question of what
consequences must follow given a particular choice of memory
model, which I believe the Standard /does/ mean to specify, is
not spelled out very clearly in the current Standard text.

Tim Rentsch

Jun 5, 2009, 2:20:34 AM
"Chris M. Thomasson" <n...@spam.invalid> writes:

> "MikeWhy" <boat042...@yahoo.com> wrote in message

>[snip]


> > Give me an example. Given a volatile qualifier, what memory context makes
> > it reasonable and useful for the compiler to generate a cache lock where
> > the developer didn't specifically code one? I can't think of one.
>
>
> FWIW, MSVC compilers, versions 8 and above, automatically insert
> load-acquire/store-release barriers for volatile loads and stores on certain
> architectures (e.g., PowerPC):

> [...examples omitted...]

Thank you! An excellent set of examples.

MikeWhy

Jun 5, 2009, 12:42:11 PM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnbpp3...@alumnus.caltech.edu...

I didn't say that wasn't conforming. Whether it is or isn't is a separate
issue. I said I consider it a bug, a product defect, and if uncorrected
makes it unusable for my needs.


Tim Rentsch

Jun 6, 2009, 9:38:00 AM
"MikeWhy" <boat042...@yahoo.com> writes:

Did we both misunderstand each other? I was talking about
whether or not the Standard should be considered to have a bug
(and in this regard I don't think it should), not whether
the example implementation should be considered to have a bug.

In fact, I wouldn't say this particular implementation decision
merits the term "bug" either. It may be a poor decision in terms
of your needs. Possibly it's a poor decision in terms of most
other people's needs also. But as long as the implementation is
conforming and does what its developers intended, at worst the
"feature" represents poor judgment. And who knows, it may be
just what someone is looking for, and they may be quite glad
to find an implementation that implements volatile "correctly".

MikeWhy

Jun 6, 2009, 12:46:42 PM

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfn3aad...@alumnus.caltech.edu...

I think we're agreeing. Conformance to the standard is orthogonal to
correctness in the implementation.


James Kuyper

Jun 6, 2009, 7:40:50 PM
MikeWhy wrote:
...

> I think we're agreeing. Conformance to the standard is orthogonal to
> correctness in the implementation.

Conformance with the standard is part of correctness, not orthogonal to
it, at least for any compiler which is intended to implement that standard.

Tim Rentsch

Jun 7, 2009, 11:18:58 AM
"MikeWhy" <boat042...@yahoo.com> writes:

I take your point, although I wouldn't use the term "correctness"
to label the attribute that (I think) you mean to reference.
Perhaps "quality" or "appropriateness" or something along those
lines. "Correctness" is measured relative to a specification,
but what (I think) you're talking about is some sort of independent
judgment, which has no specification.

luserXtrog

Jun 7, 2009, 7:19:03 PM
On Jun 7, 10:18 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:

> "MikeWhy" <boat042-nos...@yahoo.com> writes:
> > "Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
> >news:kfn3aad...@alumnus.caltech.edu...
> > > "MikeWhy" <boat042-nos...@yahoo.com> writes:
>
> > >> "Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
> > >>news:kfnbpp3...@alumnus.caltech.edu...
> > >> > "MikeWhy" <boat042-nos...@yahoo.com> writes:
>
> > >> >> "Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
> > >> >>news:kfnmy8p...@alumnus.caltech.edu...

Perhaps according to the "right thinking" of the eightfold path?

Now that you're awake, this gem from the archive might be interesting.
It appears to be the genesis of "volatile".

-- \nlxt

Message-ID: <bnews.n44a.144>
Newsgroups: net.lang.c
Path: utzoo!decvax!cca!ima!n44a!dan
X-Path: utzoo!decvax!cca!ima!n44a!dan
From: n44a!dan
Date: Thu Apr 28 05:47:51 1983
Subject: C and real hardware
Posted: Wed Apr 27 15:34:30 1983
Received: Thu Apr 28 05:47:51 1983


In C code such as device drivers, one often encounters code like:

    while (ADDR->c_reg & BUSY)
        ;

where one is testing bits of a hardware device register, and waiting
for a change.

Is there anything in the C specification which would prevent a good
(but obviously not perfect) optimizer from reading the value of the
device register just once, stuffing it into a CPU general-purpose
register, and looping on its value?

I have noticed that at least some of the 4xbsd makefiles seem to
purposely avoid the optimizer - is this sort of thing the reason?

Might not C benefit from some additional type (e.g. "unregister")
which would tell an optimizer what was up. I believe DEC's MicroPower
Pascal has a type called "volatile" which prevents this sort of thing.

Dan Ts'o
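For the record, the answer C eventually gave to Dan's question is the very qualifier he mentions: volatile-qualifying the register forces a fresh read on every test. A minimal sketch (the register layout and BUSY bit here are hypothetical):

```c
#include <stdint.h>

#define BUSY 0x01u

/* Hypothetical memory-mapped device register block; in a real driver the
   address would come from the hardware manual, e.g. (dev_regs *)0xFFFF0000. */
typedef struct {
    volatile uint32_t c_reg;
} dev_regs;

void wait_until_idle(dev_regs *dev) {
    /* volatile forces a fresh read of c_reg on every iteration, so the
       optimizer cannot hoist the load into a CPU register */
    while (dev->c_reg & BUSY)
        ;
}
```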

Tim Rentsch

Jun 8, 2009, 1:03:47 AM
luserXtrog <mij...@yahoo.com> writes:

> On Jun 7, 10:18 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> > "MikeWhy" <boat042-nos...@yahoo.com> writes:

>>>[SNIP]


> > >
> > > I think we're agreeing. Conformance to the standard is orthogonal to
> > > correctness in the implementation.
> >
> > I take your point, although I wouldn't use the term "correctness"
> > to label the attribute that (I think) you mean to reference.
> > Perhaps "quality" or "appropriateness" or something along those
> > lines. "Correctness" is measured relative to a specification,
> > but what (I think) you're talking about is some sort of independent
> > judgment, which has no specification.
>
> Perhaps according to the "right thinking" of the eightfold path?

Or whatever criteria the judger chooses -- that's why it's
an /independent/ judgment (and explains why there is no
specification, because each judger makes their own decision
about what requirements should be satisfied).


> Now that you're awake, this gem from the archive might be interesting.
> It appears to be the genesis of "volatile".

> [snip historical message]

There are two pertinent aspects that may be considered relevant
here: the choice of term ('volatile'), and what it does (ie, the
intended meaning, or what it's for).

For choice of term, I think the ordinary English meaning is
close enough so that it's reasonable to consider the ordinary
word as the origin of the term. That it was used earlier in
another language simply means they found the standard meaning
close enough for their purposes -- no different than 'if',
'break', 'while', etc.

For what it does, people more familiar with the history than I am
have said (and I believe there's a fair amount of truth in this)
that members of the committee were divided between two related
but distinct meanings/uses, namely, for accessing special memory
locations or hardware registers, and for inhibiting optimization
for some sort of inter-process synchronization. Rather than
settle on a single purpose, the two different uses were conflated
into a single one, both brought into play using the keyword
'volatile'. So even if the earlier precedent may have played a
role in motivating a need, I think 'volatile' in its current form
has a broader base than just the "we need to access a special
memory location" kind of purpose. Remember, language features
put in to enable/inhibit various low-level machine details
weren't new even in the 1980's. To give just one example, PL/I
had such things going back to the 1960's.

luserXtrog

unread,
Jun 8, 2009, 1:20:51โ€ฏAM6/8/09
to
On Jun 8, 12:03 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:

Solid.

> For what it does, people more familiar with the history than I am
> have said (and I believe there's a fair amount of truth in this)
> that members of the committee were divided between two related
> but distinct meanings/uses, namely, for accessing special memory
> locations or hardware registers, and for inhibiting optimization
> for some sort of inter-process synchronization.

Aren't these the same thing: To inhibit optimizations so as to enforce
memory access? Is the distinction to be made between sharing the
memory with another process vs. a piece of hardware?

Rather than
> settle on a single purpose, the two different uses were conflated
> into a single one, both brought into play using the keyword
> 'volatile'. So even if the earlier precedent may have played a
> role in motivating a need, I think 'volatile' in its current form
> has a broader base than just the "we need to access a special
> memory location" kind of purpose. Remember, language features
> put in to enable/inhibit various low-level machine details
> weren't new even in the 1980's. To give just one example, PL/I
> had such things going back to the 1960's.

That's so much clearer: nice and vague with an anecdotey feel.

I know I'm shouting from the back row here, so feel free to
be as curt as you wish. I really don't know what I'm talking
about, never having used the thing. But I thought I had this
pretty-well gelled; now it's all runny.

--
lxt


MikeWhy

unread,
Jun 8, 2009, 10:41:25โ€ฏAM6/8/09
to
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfneitv...@alumnus.caltech.edu...

> luserXtrog <mij...@yahoo.com> writes:
>
>> On Jun 7, 10:18 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>> > "MikeWhy" <boat042-nos...@yahoo.com> writes:
>>>>[SNIP]
>> > >
>> > > I think we're agreeing. Conformance to the standard is orthogonal to
>> > > correctness in the implementation.
>> >
>> > I take your point, although I wouldn't use the term "correctness"
>> > to label the attribute that (I think) you mean to reference.
>> > Perhaps "quality" or "appropriateness" or something along those
>> > lines. "Correctness" is measured relative to a specification,
>> > but what (I think) you're talking about is some sort of independent
>> > judgment, which has no specification.
>>
>> Perhaps according to the "right thinking" of the eightfold path?
>
> Or whatever criteria the judger chooses -- that's why it's
> an /independent/ judgment (and explains why there is no
> specification, because each judger makes their own decision
> about what requirements should be satisfied).

No more so than any other part of the standard. "Volatile" means "volatile",
not "do what you damned well please".


Tim Rentsch

unread,
Jun 8, 2009, 11:19:20โ€ฏAM6/8/09
to
luserXtrog <mij...@yahoo.com> writes:

> On Jun 8, 12:03 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
[snip]


> > For what it does, people more familiar with the history than I am
> > have said (and I believe there's a fair amount of truth in this)
> > that members of the committee were divided between two related
> > but distinct meanings/uses, namely, for accessing special memory
> > locations or hardware registers, and for inhibiting optimization

> > for some sort of inter-process synchronization.


>
> Aren't these the same thing: To inhibit optimizations so as enforce
> memory access? Is the distinction to be made between sharing the
> memory with another process vs. a piece of hardware?

I suppose they could be, depending on how generically the terms
were meant. In the particular case, I don't think they were,
because of different assumptions about what determines the
behavior on the other accesses. For hardware memory registers, it
could be just about anything; for inter-process synchronization,
the natural assumption would be that the other process would also
be a C program (and so would view 'volatile' in the same way as
this program). There's a big difference between those two.

Like I said, I wasn't there, but that's basically how I took
the comments about what actually happened.

Tim Rentsch

unread,
Jun 8, 2009, 11:28:55โ€ฏAM6/8/09
to
"MikeWhy" <boat042...@yahoo.com> writes:

I think you need to make up your mind. If what you mean by
"correctness in the implementation" depends on implementing
volatile in the way the Standard says, then it is not orthogonal
to "conformance to the standard". Either the two conditions
/are/ orthogonal, in which case judging correctness is independent
of the Standard, or judging correctness depends on the Standard,
in which case they are /not/ orthogonal. It can't be both.
Or maybe you mean something different by "orthogonal" than
how the term is usually meant? If so maybe you could explain
that.

Boon

unread,
Jun 8, 2009, 11:53:11โ€ฏAM6/8/09
to
John Devereux wrote:

> FreeRTOS.org wrote:
>
>> I once wrote an article on compiler validation for safety critical systems.
>> In return somebody sent me a paper they had published regarding different
>> compilers implementation of volatile. I forget the numbers now, but the
>> conclusion of their paper was that most compilers don't implement it
>> correctly anyway!
>
> If it was the same one posted here a few months ago, it started out with
> a very basic false assumption about what volatile *means*. Casting the
> rest of the paper into doubt as far as I can see.

Are you both referring to the following paper?

"Volatiles Are Miscompiled, and What to Do about It"
http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

Regards.

MikeWhy

unread,
Jun 8, 2009, 1:24:15โ€ฏPM6/8/09
to
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnab4i...@alumnus.caltech.edu...

If the implementation can conform to the standard and not generate the
desired result, conformance is orthogonal to correctness. Isn't that your
argument? That implementation-defined means anything and everything? (Don't
bother. That was rhetorical. I'm outa this one.)


John Devereux

unread,
Jun 8, 2009, 1:38:21โ€ฏPM6/8/09
to
Boon <root@localhost> writes:

That is the one I meant. Their first example (2.1) is wrong I think:

======================================================================

volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
  int i;
  for (i=0; i<BUF_SIZE; i++)
    buffer[i] = 0;
  buffer_ready = 1;
}

"The for-loop does not access any volatile locations, nor does it
perform any side-effecting operations. Therefore, the compiler is free
to move the loop below the store to buffer_ready, defeating the
developer's intent."

======================================================================

The problem is that the compiler is *not* free to do this (as far as I
can see). Surely clearing the buffer *is* a side effect?

The example is meant to illustrate "what does volatile mean". If it does
not mean what they think it does, the other claims seem suspect.

--

John Devereux

kid joe

unread,
Jun 8, 2009, 6:09:30โ€ฏPM6/8/09
to

Hi Tim,

It's been said that volatile is the multithreaded programmer's best friend.

Cheers,
Joe


--
...................... o _______________ _,
` Good Evening! , /\_ _| | .-'_|
`................, _\__`[_______________| _| (_|
] [ \, ][ ][ (_|


Nobody

unread,
Jun 8, 2009, 10:46:44โ€ฏPM6/8/09
to
On Mon, 08 Jun 2009 18:38:21 +0100, John Devereux wrote:

> The problem is that the compiler is *not* free to do this (as far as I can
> see). Surely clearing the buffer *is* a side effect?

That seems to me to be what C99 says:

5.1.2.3 Program execution

...

[#2] Accessing a volatile object, modifying an object,
modifying a file, or calling a function that does any of
those operations are all side effects,10) which are changes
in the state of the execution environment. Evaluation of an
expression may produce side effects. At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place. (A summary of the sequence points is given in annex
C.)

This implies that all writes are side-effects, as are reads of volatile
objects.

If buffer originally contained non-zero values, and another thread
was monitoring its contents via a "volatile char *", it should be
guaranteed to see the elements being cleared in ascending order, with
buffer_ready only being set after all elements of buffer were cleared.

Tim Rentsch

unread,
Jun 8, 2009, 11:44:32โ€ฏPM6/8/09
to
"MikeWhy" <boat042...@yahoo.com> writes:

I wasn't making an argument. I was only clarifying my earlier
comments. Those comments were based just on your initial comment
(the most nested quoted portion above); they weren't offering
any conclusions about implementation-defined behavior.

Is what you're saying any different than the specification of
volatile bothers you because it allows things that you think it
shouldn't allow? I'm having trouble getting any other meaning
out of it.

Tim Rentsch

unread,
Jun 9, 2009, 12:15:50โ€ฏAM6/9/09
to
John Devereux <jo...@devereux.me.uk> writes:

Clearing the buffer is a side-effect, certainly, as the Standard uses
the term. More generally, the writes to buffer[i] are accesses, and
so at the end of 'buffer_ready = 1;' those accesses must be complete (at
least, that's one way of reading 5.1.2.3 p 5, although some people
consider another reading to be more consistent with how that clause
is supposed to be read).


> The example is meant to illustrate "what does volatile mean". If it does
> not mean what they think it does, the other claims seem suspect.

Points to consider:

1. What they are saying may be right, but they just may have said
it poorly.

2. Certainly some knowledgeable people consider an alternative
reading of 5.1.2.3 p 5 more appropriate, which would agree with
the conclusion following the above example, even though the
phrasing about side-effects is wrong (or at least misleading).

3. The other claims may not depend on the mistakes made in this
example.

Certainly I would agree that other comments in the paper deserve
scrutiny. But if you're asking whether the statements cited
above make it reasonable to simply dismiss the rest of the paper,
I would have to say No. At the very least, it is useful to
consider their model for what 'volatile' must imply, and
see what evidence gets turned up under those assumptions.

Tim Rentsch

unread,
Jun 9, 2009, 12:32:08โ€ฏAM6/9/09
to
Nobody <nob...@nowhere.com> writes:

The statement about seeing elements cleared in ascending order is
wrong. Even under the most stringent reading of 5.1.2.3 p 5 and the
description of volatile in 6.7.3 p 6, the stores into buffer[i] are
not guaranteed to occur in any particular order, because the
assignments to buffer[i] are not made through a volatile-qualified
type. Hence the reads in the other thread, even though made through
a volatile-qualified access, might not see the same storage order.

Furthermore, there is disagreement about whether the changes to buffer
must occur before the change to the volatile variable 'buffer_ready'
completes, in /every/ conforming implementation. Certainly they must
in some implementations, but the more general statement is open to
different interpretations, even assuming the same model for what
constitutes a volatile-qualified access.

nick_keigh...@hotmail.com

unread,
Jun 9, 2009, 4:06:38โ€ฏAM6/9/09
to
On 7 June, 16:18, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> "MikeWhy" <boat042-nos...@yahoo.com> writes:

<snip>

> > I think we're agreeing. Conformance to the standard is orthogonal to
> > correctness in the implementation.
>
> I take your point, although I wouldn't use the term "correctness"
> to label the attribute that (I think) you mean to reference.
> Perhaps "quality" or "appropriateness" or something along those
> > lines.

I think Quality of Implementation (QoI) is the usual term.

Though I consider the entire "Quality" industry to be based on a
willfully wrong re-definition of "quality" ("compliance with a standard").


> "Correctness" is measured relative to a specification,
> but what (I think) you're talking about is some sort of independent
> judgment, which has no specification


--
Nick Keighley

"The quality I have in mind is all-absorbing,
not just a way of doing things but a way of being."
a PHB

nick_keigh...@hotmail.com

unread,
Jun 9, 2009, 4:11:56โ€ฏAM6/9/09
to
On 8 June, 16:19, Tim Rentsch <t...@alumnus.caltech.edu> wrote:

> luserXtrog <mijo...@yahoo.com> writes:
> > > On Jun 8, 12:03 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:

> > > For what it does, people more familiar with the history than I am
> > > have said (and I believe there's a fair amount of truth in this)
> > > that members of the committee were divided between two related
> > > but distinct meanings/uses, namely, for accessing special memory
> > > locations or hardware registers, and for inhibiting optimization
> > > for some sort of inter-process synchronization.
>
> > Aren't these the same thing: To inhibit optimizations so as enforce
> > memory access? Is the distinction to be made between sharing the
> > memory with another process vs. a piece of hardware?
>
> I suppose they could be, depending on how generically the terms
> were meant. In the particular case, I don't think they were,
> because of different assumptions about what determines the
> behavior on the other accesses. For hardware memory registers, it
> could be just about anything; for inter-process synchronization,
> the natural assumption would be that the other process would also
> be a C program

why? People have been programming in multiple languages since about
1959; what evidence was there that they were going to stop in 1989?

Richard Bos

unread,
Jun 9, 2009, 7:37:44โ€ฏAM6/9/09
to
John Devereux <jo...@devereux.me.uk> wrote:

> That is the one I meant. Their first example (2.1) is wrong I think:

It is, but not for the reason you think it is.

> volatile int buffer_ready;
> char buffer[BUF_SIZE];
> void buffer_init() {
>   int i;
>   for (i=0; i<BUF_SIZE; i++)
>     buffer[i] = 0;
>   buffer_ready = 1;
> }
>
> "The for-loop does not access any volatile locations, nor does it
> perform any side-effecting operations. Therefore, the compiler is free
> to move the loop below the store to buffer_ready, defeating the
> developer's intent."
> ======================================================================
>
> The problem is that the compiler is *not* free to do this (as far as I
> can see). Surely clearing the buffer *is* a side effect?

No. That is, yes, it's a side effect, but it's not a side effect _on the
volatile object_. buffer_ready is volatile, so all accesses to it must
be done according to the abstract machine; but no other objects are
volatile, so they may be shuffled as you like.

The error in the example is that the developer has not properly
described his own intent. Clearly, from the text, his intent was that
buffer_ready was volatile _with respect to the buffer_; equally clearly,
from the code, that's not what he has written. If he wants the relative
accesses of buffer_ready _and_ buffer itself to be done in the exact
order of the abstract machine, he should make them both volatile, not
just one or the other.

Richard

Boudewijn Dijkstra

unread,
Jun 9, 2009, 7:44:23โ€ฏAM6/9/09
to
On Tue, 09 Jun 2009 10:06:38 +0200,
<nick_keigh...@hotmail.com> wrote:

> On 7 June, 16:18, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>> "MikeWhy" <boat042-nos...@yahoo.com> writes:
>
<snip>

> Though I consider the entire "Quality" industry to be based on a
> willfully wrong re-definition of "quality" ("compliance with a
> standard").

Tool vendors cannot sell tools that translate developer intentions into
quality code (yet). They can however sell tools that help automate the
part of a quality _process_ that deals with standards compliance. The
fact that some people make the mistake of thinking that "compliance with a
standard" automagically causes quality, doesn't make the tool vendors the
bad guys.

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/

nick_keigh...@hotmail.com

unread,
Jun 9, 2009, 8:25:53โ€ฏAM6/9/09
to
On 9 June, 12:44, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
> > On Tue, 09 Jun 2009 10:06:38 +0200,
> > <nick_keighley_nos...@hotmail.com> wrote:
> > > On 7 June, 16:18, Tim Rentsch <t...@alumnus.caltech.edu> wrote:

> > Though I consider the entire "Quality" industry to be based on a
> > willfully wrong re-definition of "quality" ("compliance with a
> > standard").
>
> Tool vendors cannot sell tools that translate developer intentions into
> quality code (yet). They can however sell tools that help automate the
> part of a quality _process_ that deals with standards compliance. The
> fact that some people make the mistake of thinking that "compliance with a
> standard" automagically causes quality, doesn't make the tool vendors the
> bad guys.

I wasn't particularly picking on tool vendors.
What is ISO 9000 all about?

Rich Webb

unread,
Jun 9, 2009, 8:48:19โ€ฏAM6/9/09
to
On Tue, 09 Jun 2009 11:37:44 GMT, ral...@xs4all.nl (Richard Bos) wrote:

>John Devereux <jo...@devereux.me.uk> wrote:
>
>> That is the one I meant. Their first example (2.1) is wrong I think:
>
>It is, but not for the reason you think it is.
>
>> volatile int buffer_ready;
>> char buffer[BUF_SIZE];
>> void buffer_init() {
>>   int i;
>>   for (i=0; i<BUF_SIZE; i++)
>>     buffer[i] = 0;
>>   buffer_ready = 1;
>> }
>>
>> "The for-loop does not access any volatile locations, nor does it
>> perform any side-effecting operations. Therefore, the compiler is free
>> to move the loop below the store to buffer_ready, defeating the
>> developer's intent."
>> ======================================================================
>>
>> The problem is that the compiler is *not* free to do this (as far as I
>> can see). Surely clearing the buffer *is* a side effect?
>
>No. That is, yes, it's a side effect, but it's not a side effect _on the
>volatile object_. buffer_ready is volatile, so all accesses to it must
>be done according to the abstract machine; but no other objects are
>volatile, so they may be shuffled as you like.

No.

5.1.2.3 Para 2 "... Evaluation of an expression may produce side
effects. At certain specified points in the execution sequence called
sequence points, all side effects of previous evaluations shall be
complete and no side effects of subsequent evaluations shall have taken
place."

There's a sequence point at the end of "buffer[i] = 0;"

*If* the compiler can determine that there are no other accesses to
buffer[] then it would be allowed to optimize-away the entire
expression, since "[a]n actual implementation need not evaluate part of
an expression if it can deduce that its value is not used and that no
needed side effects are produced."

However, it is not permitted to arbitrarily change the order of
evaluation of successive sequence points. Otherwise the compiler could,
legally, re-order the expressions in, say, alphabetical order and still
be conforming.

--
Rich Webb Norfolk, VA

Boudewijn Dijkstra

unread,
Jun 9, 2009, 10:20:30โ€ฏAM6/9/09
to
On Tue, 09 Jun 2009 14:25:53 +0200,
<nick_keigh...@hotmail.com> wrote:

> On 9 June, 12:44, "Boudewijn Dijkstra" <boudew...@indes.com> wrote:
>> On Tue, 09 Jun 2009 10:06:38 +0200,
>> <nick_keighley_nos...@hotmail.com> wrote:
>> > On 7 June, 16:18, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>
>> > Though I consider the entire "Quality" industry to be based on a
>> > willfully wrong re-definition of "quality" ("compliance with a
>> > standard").
>>
>> Tool vendors cannot sell tools that translate developer intentions into
>> quality code (yet). They can however sell tools that help automate the
>> part of a quality _process_ that deals with standards compliance. The
>> fact that some people make the mistake of thinking that "compliance
>> with a standard" automagically causes quality, doesn't make the tool
>> vendors the bad guys.
>
> I wasn't particularly picking on tool vendors.
> What is ISO 9000 all about?

Don't know, but seems like the same difference to me.

Chris M. Thomasson

unread,
Jun 9, 2009, 7:46:37โ€ฏPM6/9/09
to
"John Devereux" <jo...@devereux.me.uk> wrote in message
news:87ab4ir...@cordelia.devereux.me.uk...

The code is totally busted if you're on a compiler that does not automatically
insert a store-release memory barrier before volatile stores, and
load-acquire membars after volatile loads. I assume another thread will
eventually try to do something like:


int check_and_process_buffer() {
  if (buffer_ready) {
    /* use buffer */
    return 1;
  }
  return 0;
}

AFAICT, MSVC 8 and above is the only compiler I know about that
automatically inserts membars on volatile accesses:


http://groups.google.com/group/comp.lang.c/msg/54d730b2650c996c

Otherwise, you would need to manually insert the correct barriers for a
particular architecture. Here is a portable version for Solaris that will
work on all architectures supported by said OS:


#include <atomic.h>


volatile int buffer_ready;
char buffer[BUF_SIZE];
void buffer_init() {
  int i;
  for (i=0; i<BUF_SIZE; i++)
    buffer[i] = 0;

  membar_producer();
  buffer_ready = 1;
}


int check_and_process_buffer() {
  if (buffer_ready) {
    membar_consumer();
    /* use buffer */
    return 1;
  }
  return 0;
}

Nobody

unread,
Jun 9, 2009, 7:49:40โ€ฏPM6/9/09
to

I never mentioned 5.1.2.3 p 5 or 6.7.3 p 6. I did mention 5.1.2.3 p 2,
which you decline to address.

Maybe I'm misinterpreting it; if you think so, say so (saying *why* would
also be useful).

> the stores into buffer[i] are
> not guaranteed to occur in any particular order, because the
> assignements to buffer[i] are not made through a volatile-qualified
> type.

5.1.2.3 p2 (which no-one seems to want to mention) seems to imply that
volatile makes no difference to writes, only to reads:

>> [#2] Accessing a volatile object, modifying an object,

...
>> are all side effects
...


>> At certain specified
>> points in the execution sequence called sequence points, all
>> side effects of previous evaluations shall be complete and
>> no side effects of subsequent evaluations shall have taken
>> place.

Modifying an object is a side-effect, and side-effects are supposed to
have completed at the end of an expression statement (e.g. "buffer[i]=0;").

AFAICT, most of the problems with "volatile" appear to rely upon ignoring
5.1.2.3 p2, which may be why everyone seems to avoid mentioning 5.1.2.3 p2.

Furthermore 5.1.2.3 p3 says:

[#3] In the abstract machine, all expressions are evaluated
as specified by the semantics. An actual implementation
need not evaluate part of an expression if it can deduce
that its value is not used and that no needed side effects
are produced (including any caused by calling a function or
accessing a volatile object).

IOW, if an implementation wishes to elide any side-effects as "unneeded",
the onus is on the implementation to deduce that the side-effects really
are unneeded (e.g. if the value isn't used inside the translation unit and
there is no way it could be used from outside of the translation unit).

Tim Rentsch

unread,
Jun 9, 2009, 10:26:45โ€ฏPM6/9/09
to
nick_keigh...@hotmail.com writes:

> On 8 June, 16:19, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> > luserXtrog <mijo...@yahoo.com> writes:
> > > On Jun 8, 12:03 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>
> > > > For what it does, people more familiar with the history than I am
> > > > have said (and I believe there's a fair amount of truth in this)
> > > > that members of the committee were divided between two related
> > > > but distinct meanings/uses, namely, for accessing special memory
> > > > locations or hardware registers, and for inhibiting optimization
> > > > for some sort of inter-process synchronization.
> >
> > > Aren't these the same thing: To inhibit optimizations so as enforce
> > > memory access? Is the distinction to be made between sharing the
> > > memory with another process vs. a piece of hardware?
> >
> > I suppose they could be, depending on how generically the terms
> > were meant. In the particular case, I don't think they were,
> > because of different assumptions about what determines the
> > behavior on the other accesses. For hardware memory registers, it
> > could be just about anything; for inter-process synchronization,
> > the natural assumption would be that the other process would also
> > be a C program
>
> why? People have been programming in multiple languages since about
> 1959 what evidence was there that they were going to stop in 1989?

Because of the context in which the discussions were taking place,
namely, trying to standardize C. It's much easier to specify
inter-process synchronization if it's limited to processes all
implemented in C, because the C language is under the control of
those standardizing the language. I'm sure other environments
would have been considered, but the emphasis would be on just
intra-language semantics, because that's within their scope and
under their control.

Of course, this is just my guess based on second-hand information.
Other people may have other guesses, and I wouldn't want to argue
that one guess is better than another.

Tim Rentsch

unread,
Jun 9, 2009, 10:32:01โ€ฏPM6/9/09
to
nick_keigh...@hotmail.com writes:

> On 7 June, 16:18, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> > "MikeWhy" <boat042-nos...@yahoo.com> writes:
>
> <snip>
>
> > > I think we're agreeing. Conformance to the standard is orthogonal to
> > > correctness in the implementation.
> >
> > I take your point, although I wouldn't use the term "correctness"
> > to label the attribute that (I think) you mean to reference.
> > Perhaps "quality" or "appropriateness" or something along those

> > lines.


>
> I think Quality of Implementation (QoI) is the usual term.

Certainly QoI is /a/ term, and I think it makes up part of what
the earlier poster was talking about. But I don't think it
captures the whole story (of course, I don't know for sure since I
don't know exactly what he was thinking, but that's what I think).
Anyway that's why I used the less definite terms in my comments.

Tim Rentsch

unread,
Jun 9, 2009, 11:51:05โ€ฏPM6/9/09
to
Nobody <nob...@nowhere.com> writes:

Yes, 5.1.2.3 p 2 certainly bears on the discussion, and it would
be good to address it.

It's important to understand, when considering how 'volatile' behaves,
that there are two "machines" under consideration: the physical
machine, and the abstract machine.

The physical machine is the computer as we experience it: our
programs and how they behave. (Note: I'm speaking as though there is
only one perspective on a physical machine, but in actuality there are
(at least) several. I'm going to ignore these distinctions for the
moment.) A physical machine always does /something/ -- possibly only
probabilistically, but still something -- and we can find out what it
does through experimentation. The physical machine exists in the
physical universe, and we can discover what it does in different
situations.

The abstract machine is a conceptual notion; it has no physical
existence but "exists" mainly in the minds of implementors. The
abstract machine is sort of a mathematical tool for defining
behavior -- C is defined in terms of how the "abstract machine"
behaves, not how a physical machine behaves.

The first and most important point of contact between the abstract
machine and the physical machine is the so-called "as-if" rule.
What this rule says, basically, is that the physical machine can
do anything at all, as long as the 'outputs' of a program match
what would happen if the physical machine and abstract machine
were always in lock step agreement.

The second point of contact between the abstract machine and
the physical machine is volatile-qualified access. Basically,
using volatile places additional restrictions on how aligned
(or unaligned) the abstract machine and the physical machine
may be.

The question you raised (about another thread monitoring the state of
different elements in the 'buffer' array) is concerned with the
physical machine. The reason for this is that the abstract machine
concerns only what happens /inside/ an implementation, so what happens
for another thread is determined not by the abstract machine but by
the physical machine. Threads are not a part of C; the Standard
doesn't say anything about them (at least not directly).

The paragraph you mention (5.1.2.3 p 2) imposes a requirement on the
/abstract/ machine, not on the /physical/ machine. In the abstract
machine the writes to 'buffer[i]' must occur before the subsequent
assignment to 'buffer_ready'. However, they don't have to actually
occur that way in the physical machine. In fact, frequently they
don't, because (to name one example) stores done in a particular order
can be rearranged by the memory management unit. The stores are /in
order/ as seen by the abstract machine, but /out of order/ as seen by
the actual memory -- that is, the physical machine of the other
thread.

So, what the other thread sees has to match what the abstract machine
does (as explained in 5.1.2.3 p 2) /only if/ the physical machine is
required to match the abstract machine through additional requirements
that occur because of using 'volatile'. Because (as I explained
earlier) the use of 'volatile' in the example is not enough to make
the 'buffer[i]' writes in the /abstract/ machine match up with what
happens in the /physical/ machine, in the physical machine (which is
what the other thread sees) those writes can happen in any order.

Does that all make sense?


> > the stores into buffer[i] are
> > not guaranteed to occur in any particular order, because the
> > assignements to buffer[i] are not made through a volatile-qualified
> > type.
>
> 5.1.2.3 p2 (which no-one seems to want to mention) seems to imply that
> volatile makes no difference to writes, only to reads:
>
> >> [#2] Accessing a volatile object, modifying an object,
> ...
> >> are all side effects
> ...
> >> At certain specified
> >> points in the execution sequence called sequence points, all
> >> side effects of previous evaluations shall be complete and
> >> no side effects of subsequent evaluations shall have taken
> >> place.
>
> Modifying an object is a side-effect, and side-effects are supposed to
> have completed at the end of an expression statement (e.g. "buffer[i]=0;").
>
> AFAICT, most of the problems with "volatile" appear to rely upon ignoring
> 5.1.2.3 p2, which may be why everyone seems to avoid mentioning 5.1.2.3 p2.

Again, 5.1.2.3 p 2 is talking only about the abstract machine, not
about the physical machine. Using 'volatile' doesn't affect what
happens in the abstract machine (except for the two special cases
named explicitly in the Standard, setjmp/longjmp and signal
handlers). Using 'volatile' does impose additional requirements on
how and where the physical machine and the abstract machine must be
in alignment, but those requirements do not extend to imposing
5.1.2.3 p 2 in each previous statement (that doesn't use a
volatile-qualified access) before a volatile access. There are
different opinions about just how lax or how strict these additional
requirements are, but even in the most strict interpretation it's
only required that all the assignments to 'buffer[i]' be completed
before the store into the (volatile) buffer_ready; the previous
stores don't have to be done in any particular order in the
/physical/ machine, even though they must occur in a particular
order in the /abstract/ machine.


> Furthermore 5.1.2.3 p3 says:
>
> [#3] In the abstract machine, all expressions are evaluated
> as specified by the semantics. An actual implementation
> need not evaluate part of an expression if it can deduce
> that its value is not used and that no needed side effects
> are produced (including any caused by calling a function or
> accessing a volatile object).
>
> IOW, if an implementation wishes to elide any side-effects as "unneeded",
> the onus is on the implementation to deduce that the side-effects really
> are unneeded (e.g. if the value isn't used inside the translation unit and
> there is no way it could be used from outside of the translation unit).

In a sense this paragraph is just a special case of the "as if"
rule -- in the abstract machine certain operations are required
to happen, and in a particular order, but in the physical machine
they don't have to happen in that order, or even happen at all,
/provided/ the end result is "as if" they happened as the abstract
machine would do them.

Note that using 'volatile' either would, or might, (some people
would say "would", others would only say "might") force some
expressions to be evaluated that could remain unevaluated if
'volatile' weren't used. (I think most people would say "would",
and personally I believe that's the most defensible interpretation.
However I don't want to dismiss the considered statements of
those who have expressed the less restrictive viewpoint here.)
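
A small, hedged illustration of 5.1.2.3 p3: the names here are invented,
and whether the dead store is actually removed depends on the compiler,
but the licence to remove it applies only to the non-volatile object:

```c
#include <assert.h>

static int plain;            /* ordinary object */
static volatile int vobj;    /* stand-in for, e.g., a device register */

void update(void) {
    plain = 1;   /* value never used: may be elided under the
                    "as if" rule (5.1.2.3 p3) */
    plain = 2;
    vobj = 1;    /* accessing a volatile object is a needed side
                    effect; both stores must be performed */
    vobj = 2;
}
```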

MikeWhy

Jun 10, 2009, 1:19:56 AM
"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:hvCXl.16$mX2...@newsfe05.iad...

> AFAICT, MSVC 8 and above is the only compiler I know about that
> automatically inserts membars on volatile accesses:
>
>
> http://groups.google.com/group/comp.lang.c/msg/54d730b2650c996c
>
>
>
> Otherwise, you would need to manually insert the correct barriers for a
> particular architecture. Here is a portable version for Solaris that will
> work on all archs supported by said OS:

Yup. Definitely worth re-posting that for those who missed the relevance and
details the first time.


Phil Carmody

Jun 10, 2009, 2:51:45 AM

There is no "volatile with respect to something else". There's
"volatile", and that's it.

> from the code, that's not what he has written. What he should do, if he
> wants the relative accesses of buffer_ready _and_ buffer itself to be
> done in the exact order of the abstract machine, he should make them
> both volatile, not just one or the other.

I strongly disagree with your interpretation of the spec.

Phil
--
Marijuana is indeed a dangerous drug.
It causes governments to wage war against their own people.
-- Dave Seaman (sci.math, 19 Mar 2009)

Tim Rentsch

Jun 10, 2009, 11:06:49 AM
ral...@xs4all.nl (Richard Bos) writes:

I would like to offer some counterpoint.

First, 6.7.3 p 6 (defining volatile) says, in part:

An object that has volatile-qualified type may be modified
in ways unknown to the implementation or have other unknown
side effects. Therefore any expression referring to such an
object shall be evaluated strictly according to the rules of
the abstract machine, as described in 5.1.2.3.

5.1.2.3 includes this paragraph (p2):

Accessing a volatile object, modifying an object, modifying
a file, or calling a function that does any of those
operations are all side effects,11) which are changes in the
state of the execution environment. Evaluation of an
expression may produce side effects. At certain specified
points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and
no side effects of subsequent evaluations shall have taken
place.

Because access to volatile requires (per 6.7.3 p 6) that 5.1.2.3
be faithfully observed, all the stores to buffer[i] must be
completed (although in no particular order) before the (volatile)
store into buffer_ready.

Also, 5.1.2.3 says this (in p 8):

The least requirements on a conforming implementation are:

-- At sequence points, volatile objects are stable in the
sense that previous accesses are complete and subsequent
accesses have not yet occurred.

Notice the wording -- "previous accesses are complete". It
doesn't say "previous volatile accesses". It says "previous
accesses."

Admittedly, the wording here is ambiguous; it could mean that
previous accesses to the same volatile object be complete (and
similarly for subsequent accesses). However, access to a
volatile object requires evaluation per 6.7.3 p 6, and therefore
per 5.1.2.3 p 2. So, completing an access to a volatile object
also means that all the assignments done in all previous
expressions must have been completed before the volatile access
side-effect occurs.

Tim Rentsch

Jun 10, 2009, 11:31:04 AM

Again, thank you for posting some excellent specific examples.

I would like to add one comment. Despite the differences, both
the MSVC 8 implementation and the Solaris implementations can
be conforming. The reason is the last sentence in 6.7.3 p 6,

What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

Presumably the MSVC implementors and the Solaris implementors
reached different conclusions about how to define what
constitutes an access to a volatile-qualified object. Or, to put
that in the language I used earlier, what memory regime will be
aligned to under 'volatile'. It's possible, for example, that
the Solaris notion of volatile makes it work with some thread
implementations but not inter-process communication (or other,
differently implemented thread packages). (I'm only guessing
here; certainly I wouldn't call myself a Solaris expert.) In
any case, whichever choice is "better", both are allowed under
6.7.3 (provided of course the implementation-defined choice is
documented with the implementation).

MikeWhy

Jun 10, 2009, 2:13:39 PM

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnhbyo...@alumnus.caltech.edu...

I think you still missed the relevance and the point. Set aside the
implementation-defined part and its language for the moment. This specific
example points out that optimization takes place in both the hardware and
the compiler. The processor re-orders memory access, just as the compiler's
optimizations can as well. MEMBAR before and after volatile access enforces
at the hardware level that the specified operation order is maintained. In
other words -- that is, in the language of the standard -- it maintains the
state of the abstract machine to what the developer wrote. There doesn't seem
to me to be much room for interpretation. It would be instructive to review
the standard with this in mind as a specific, concrete example of what
implementation-defined might mean in context of volatile.


Chris M. Thomasson

Jun 10, 2009, 10:56:13 PM
"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:hvCXl.16$mX2...@newsfe05.iad...
> [...]

> Otherwise, you would need to manually insert the correct barriers for a
> particular architecture. Here is a portable version for Solaris that will
> work on all archs supported by said OS:
>
>
> #include <atomic.h>
>
>
> volatile int buffer_ready;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


volatile int buffer_ready = 0;


of course!


;^)


> char buffer[BUF_SIZE];
> void buffer_init() {
>     int i;
>     for (i = 0; i < BUF_SIZE; i++)
>         buffer[i] = 0;
>     membar_producer();
>     buffer_ready = 1;
> }
>
>
> int check_and_process_buffer() {
>     if (buffer_ready) {
>         membar_consumer();
>         /* use buffer */
>         return 1;
>     }
>     return 0;
> }

Now, there is another issue. On NUMA systems with a non-cache-coherent
network of processing nodes, the code above might not work. One may need to
issue a special instruction in order to force the store issued on
`buffer_ready' to propagate from the intra-node level, up to the inter-node
level. Think if `check_and_process_buffer()' was running on
`Node1-CpuA-Core2-Thread-3', and `buffer_init()' was running on
`Node4-CpuD-Core3-Thread-1', and the memory which makes up `buffer_ready'
and `buffer' was local to `Node4'. There is no guarantee that the store to
`buffer_ready' will become visible to the CPU's on `Node1'. You may need to
use special instructions, such as message passing via channel interface or
something. Think of the PPC wrt communication between the memory which
belongs to the main PowerPC's, and the local private memory that belong to
each SPU. volatile alone is not going to help here, in any way shape or
form...

Tim Rentsch

Jun 11, 2009, 1:39:48 AM
"MikeWhy" <boat042...@yahoo.com> writes:

Hmmmm... how can I say this gently? I think you may be confusing
the notions of abstract machine and physical machine.

You say, in part, "[MEMBAR] maintains the state of the abstract
machine to what the developer wrote." In fact MEMBAR is not
necessary for correct functioning of the abstract machine. If
MEMBAR is necessary at all, it's necessary only for producing
appropriate physical machine semantics for use of volatile. If all
MEMBAR's were taken out, and no variables were accessed externally,
the program would still execute correctly. In other words the
abstract machine would still get a faithful mapping -- it's only
external accesses that might be affected, and such accesses are not
part of the abstract machine.

Also, you talk about "the hardware level". There isn't a single
hardware level. There are at least two, namely, the hardware state
as seen by execution of a single instruction stream (where MEMBAR
isn't needed), and the hardware state as seen by execution of
another thread or process, perhaps on another CPU (where MEMBAR may
be necessary to preserve some sort of partial ordering as seen by
the single instruction stream "virtual machine"). It isn't required
that volatile take the latter perspective -- it could just as well
take the first perspective, under the provision that what
constitutes a volatile-qualified access is implementation-defined.
Depending on what environments the implementation is intended to
support, either choice might be a good one.
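
As a modern aside (anachronistic for this 2009 thread, and purely
illustrative): C11's explicit atomics let a program ask for the second,
cross-thread perspective directly, instead of hoping volatile provides
it:

```c
#include <assert.h>
#include <stdatomic.h>

#define BUF_SIZE 16

char buffer[BUF_SIZE];
atomic_int buffer_ready;

void publish(void) {
    int i;
    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 1;
    /* Release store: orders the plain stores above before the flag
       store as seen by other threads -- the inter-thread guarantee
       that volatile alone need not give. */
    atomic_store_explicit(&buffer_ready, 1, memory_order_release);
}

int consume(void) {
    if (atomic_load_explicit(&buffer_ready, memory_order_acquire)) {
        return buffer[0];  /* guaranteed visible after the acquire */
    }
    return -1;
}
```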

I fully understand that hardware plays a role in "optimization"
(used in a slightly different sense here) -- I mentioned store
reordering in another posting, and there is also out-of-order
execution, and even speculative execution, etc. However these
"optimizations" are irrelevant as far as the implementation is
concerned (for non-volatile access), because the state as viewed by
the single executing instruction stream is carefully maintained to
appear exactly as though storage ordering is preserved, instruction
ordering is preserved, speculative branches that end up not being
taken are suppressed, etc[*]. It's only when 'volatile' is involved
that these effects might matter, because the program execution
state is being viewed by an agent (process, thread, device logic,
etc) external to this program's execution state.

Finally, to repeat/restate my earlier comment, it's only true that
these effects /might/ matter, and not that they /must/ matter,
because an implementation isn't obligated to take the other-process
perspective as to what constitutes a volatile-qualified access.
Depending on what choice is made for this, the hardware-level
"optimizations" might or might not need to be taken into
account for what volatile does.

Does that make a little more sense now?


[*] It's a different story for machines like MIPS where the
underlying pipeline stages are exposed at the architectural
level. But that's not important for this discussion.

Tim Rentsch

Jun 11, 2009, 2:11:12 AM
"Chris M. Thomasson" <n...@spam.invalid> writes:

Right, the different types of memory actions correspond to
different memory regimes -- the intra-node level is one memory
regime, and the inter-node level is another memory regime.


> Think if `check_and_process_buffer()' was running on
> `Node1-CpuA-Core2-Thread-3', and `buffer_init()' was running on
> `Node4-CpuD-Core3-Thread-1', and the memory which makes up `buffer_ready'
> and `buffer' was local to `Node4'. There is no guarantee that the store to
> `buffer_ready' will become visible to the CPU's on `Node1'.

I take what you're saying here to mean that the implementation
shown above, with MEMBAR's but not special instructions for
inter-node stores, will not guarantee that the store to
'buffer_ready' will be visible, because so much depends on
the specific memory architectures and how they interact.


> You may need to
> use special instructions, such as message passing via channel interface or
> something. Think of the PPC wrt communication between the memory which
> belongs to the main PowerPC's, and the local private memory that belong to
> each SPU.

Yes -- arbitrarily diverse memory architectures mean potentially
arbitrarily complicated memory coherence mechanisms.


> volatile alone is not going to help here, in any way shape or
> form...

Most likely it won't, but in principle it could. Assuming first
that the necessary memory linkage could be established, so memory
in the 'buffer_init()' process could be accessed by code in the
'check_and_process_buffer()' process (such linkage could exist), an
implementation could choose to implement volatile so it
synchronized the two memories appropriately when the volatile
accesses are done.

In practical terms I agree this sort of implementation isn't
likely, but the Standard allows it -- in particular, as to
how 'volatile' would behave in this respect, because that
choice is implementation-defined.

Chris M. Thomasson

Jun 11, 2009, 4:48:51 AM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfneitr...@alumnus.caltech.edu...

Well, I ___expect___ MEMBAR to behave well within an intra-node point of
view. AFAICT, for inter-node communications, on NON ccNUMA (e.g., real cache
incoherent NUMA), if you can find MEMBAR pushing out coherency "pings"
across inter-node boundaries, well, that would not be good, IMVVVHO at
least!

However, what does ANY of that have to do with volatile?


A: NOTHING!

;^o


>> You may need to
>> use special instructions, such as message passing via channel interface
>> or
>> something. Think of the PPC wrt communication between the memory which
>> belongs to the main PowerPC's, and the local private memory that belong
>> to
>> each SPU.
>
> Yes -- arbitrarily diverse memory architectures mean potentially
> arbitrarily complicated memory coherence mechanisms.

Indeed. Well, I totally disagree on a vibe I am getting from your statement.
I get the vibe that you seem to think diverse highly specific memory models
seem to potentially require complicated coherence... Well, the term
`complicated' is in the eye/ear of the individual beholder, or perhaps
softened across a plurality of a specific local group of beholders...
Statistics are so precise!

Jesting of course... Perhaps? ;^D

Anyway, you're 100% correct. Sometimes a parallelization of an algorithm might
simply require so many rendezvous of some, perhaps "dubious", sort that
they simply cannot ever be made to scale in their present form.


>> volatile alone is not going to help here, in any way shape or
>> form...
>
> Most likely it won't, but in principle it could.

Yes. Absolutely.


> Assuming first
> that the necessary memory linkage could be established, so memory
> in the 'buffer_init()' process could be accessed by code in the
> 'check_and_process_buffer()' process (such linkage could exist), an
> implementation could choose to implement volatile so it
> synchronized the two memories appropriately when the volatile
> accesses are done.

An implementation can do its thing and define volatile accordingly.


> In practical terms I agree this sort of implementation isn't
> likely, but the Standard allows it -- in particular, as to
> how 'volatile' would behave in this respect, because that
> choice is implementation-defined.

I PERSONALLY WANT volatile to be restricted to compiler optimizations wrt
the context of the abstract virtual machine. Of course the abstract machine
is single-threaded. Great! That means a physical machine can implement a
million threads that each implement a single local abstract C machine. They
never communicate until the end of computation. Let's say that takes a month.
That whole month is governed by the local abstract machines. Let's say they
have the ability to network and cleverly rendezvous in a NUMA system after
they were finished? I say yes... No volatile needed; well, volatile can
probably be efficiently used by node-local only code...

As for optimizations on loop conditions, well, that's newbie stuff...

;^o
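
The "loop conditions" remark refers to the classic beginner case:
without volatile, a compiler may read the flag once, keep it in a
register, and hoist the test out of the loop. A minimal sketch (the
names are invented):

```c
#include <assert.h>

volatile int done = 0;   /* set by a signal handler or other agent */

/* Spin until 'done' becomes nonzero.  Because 'done' is volatile the
   compiler must re-read it on every iteration; were it non-volatile,
   the loop could legally be compiled as 'if (!done) for (;;);'. */
void wait_for_done(void) {
    while (!done)
        ;  /* busy-wait */
}
```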

Tim Rentsch

Jun 19, 2009, 11:59:16 AM

Sorry, I guess I wasn't quite clear enough. My comment was meant
basically as an implicit question, trying to clarify your intended
meaning. Rephrasing, the two salient features are, one, using
MEMBAR is enough to guarantee intra-node access consistency, and
two, using MEMBAR (and nothing else) is not enough to guarantee
inter-node access consistency. That's what I thought you meant
before, and I read this response as confirming that.



> However, what does ANY of that have to do with volatile?
>
> A: NOTHING!
>
> ;^o

I think it's relevant to the discussion (ie, of volatile) because
there are two clearly distinct memory regimes (intra-node and
inter-node), and it's perfectly reasonable to consider an
implementation's volatile supporting one but not the other.
Indeed, I take what you're saying to mean it's reasonable to /expect/
volatile to support intra-node access but not inter-node access
in this case. And I think that's right, in the sense that many
people experienced in such architectures would expect the same
thing. (And I wouldn't presume to contradict them, even if I
expected something else, which actually I don't.)


> >> You may need to
> >> use special instructions, such as message passing via channel interface
> >> or
> >> something. Think of the PPC wrt communication between the memory which
> >> belongs to the main PowerPC's, and the local private memory that belong
> >> to
> >> each SPU.
> >
> > Yes -- arbitrarily diverse memory architectures mean potentially
> > arbitrarily complicated memory coherence mechanisms.
>
> Indeed. Well, I totally disagree on a vibe I am getting from your statement.
> I get the vibe that you seem to think diverse highly specific memory models
> seem to potentially require complicated coherence... Well, the term
> `complicated' is in the eye/ear of the individual beholder, or perhaps
> softened across a plurality of a specific local group of beholders...
> Statistics are so precise!

My statement was more in the nature of an abstract, "mathematical"
conclusion than a comment on what architectures are actually out
there. I think we're actually pretty much on the same page here.
(OOPS! No pun intended...)

> Jesting of course... Perhaps? ;^D
>
> Anyway, you're 100% correct. Sometimes a parallelization of an algorithm might
> simply require so many rendezvous of some, perhaps "dubious", sort that
> they simply cannot ever be made to scale in their present form.

To say this another way, parallelizing an algorithm in a particular
way might work well for one kind of synchronization (eg, intra-node
coherence) but not for another kind of synchronization (eg, inter-node
coherence).


> >> volatile alone is not going to help here, in any way shape or
> >> form...
> >
> > Most likely it won't, but in principle it could.
>
> Yes. Absolutely.
>
>
>
>
> > Assuming first
> > that the necessary memory linkage could be established, so memory
> > in the 'buffer_init()' process could be accessed by code in the
> > 'check_and_process_buffer()' process (such linkage could exist), an
> > implementation could choose to implement volatile so it
> > synchronized the two memories appropriately when the volatile
> > accesses are done.
>
> An implementation can do its thing and define volatile accordingly.
>
>
>
>
> > In practical terms I agree this sort of implementation isn't
> > likely, but the Standard allows it -- in particular, as to
> > how 'volatile' would behave in this respect, because that
> > choice is implementation-defined.
>
> I PERSONALLY WANT volatile to be restricted to compiler optimizations wrt
> the context of the abstract virtual machine.

I had to read this sentence over several times to try to make sense of
it. I think I understand what you're saying; let me try saying it a
different way and see if we're in sync. Optimizations don't happen in
the abstract machine -- it's a single thread, one-step-at-a-time model,
exactly faithful to the original program source. However, in the
course of running a program on an actual computer, there needs to be a
degree of coherence between the abstract machine's "memory system" and
the computer's memory system. (The abstract machine's "memory system"
doesn't really exist except in some sort of conceptual sense, but it
seems useful to pretend it exists, to talk about coherence between it
and the actual computer memory). The coherence between the abstract
machine's memory system and the actual computer's memory doesn't have
to be exact, it only has to match up to the point where the "as if"
rule holds. Does that make sense?

Under this model, I think you're saying that you would like volatile
to impose coherence between the abstract machine memory system and
the "most local" physical machine memory system (ie, the same thread
executing on the same CPU), and not more than that. This coherence
is stronger than the non-volatile coherence, because the two memory
systems must be completely in sync (and not just "as if" in sync)
at points of volatile access.

In other words, the memory regime you're identifying (that volatile
would or should align with) is the same thread, same CPU memory
regime. Anything more than that, including inter-core (but still
intra-CPU), or even inter-thread (but still intra-core and intra-CPU)
would not be covered just by volatile. Is that what you mean, or
do you mean to say something different?

To come at this a different way, let me ask it this way: which level
of communication/coherence do you mean to say volatile should
support?

(a) only same-thread, same core, same CPU, same node
(b) inter-thread, intra-core, intra-CPU, intra-node
(c) inter-thread, inter-core, intra-CPU, intra-node
(d) inter-thread, inter-core, inter-CPU, intra-node
(e) inter-thread, inter-core, inter-CPU, inter-node
(f) something else? (I didn't even mention intra/inter-process...)

I first thought you meant (a), but now I'm not so sure.


> Of course the abstract machine
> is single-threaded. Great! That means a physical machine can implement a
> million threads that each implement a single local abstract C machine. They
> never communicate until the end of computation. Let's say that takes a month.
> That whole month is governed by the local abstract machines. Let's say they
> have the ability to network and cleverly rendezvous in a NUMA system after
> they were finished? I say yes... No volatile needed; well, volatile can
> probably be efficiently used by node-local only code...

I'm not sure what a same-thread/same-core/same-CPU/same-node
definition of volatile buys us, except some sort of guarantee
for variable access in intra-thread signal handlers. (There
is also setjmp()/longjmp(), but I think that's incidental
since whatever guarantees there are there will be true no
matter what memory regime volatile identifies.)

Certainly it's possible to do inter-thread or inter-process
communication/synchronization using extra-linguistic mechanisms and
not using volatile, even under model (a) above. Ideally an
implementation would support several different choices of which
volatile model it follows (eg, selected by a compiler flag), and
developers could choose the model appropriate to the needs of the
program being developed. Before that can happen, however, we
have to have a language to talk about what the different choices
mean. My intention and hope in this thread has been to start
to develop that language, so that different choices can be
identified, discussed, compared, and ideally selected -- easily.

Chris M. Thomasson

Jun 19, 2009, 1:11:58 PM
"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfn3a9w...@alumnus.caltech.edu...
[...]

> To come at this a different way, let me ask it this way: which level
> of communication/coherence do you mean to say volatile should
> support?
>
> (a) only same-thread, same core, same CPU, same node
> (b) inter-thread, intra-core, intra-CPU, intra-node
> (c) inter-thread, inter-core, intra-CPU, intra-node
> (d) inter-thread, inter-core, inter-CPU, intra-node
> (e) inter-thread, inter-core, inter-CPU, inter-node
> (f) something else? (I didn't even mention intra/inter-process...)
>
> I first thought you meant (a), but now I'm not so sure.
[...]

First of all I need to read your entire detailed response carefully in order
to give a complete response. However, I can answer the question above:

I choose `a'

I do not agree with the fact that MSVC automatically inserts memory barriers
on volatile accesses because it can create unnecessary overhead. What
happens if I don't need to use any membars at all, but still need to use
volatile? Well, the damn MSVC compiler will insert the membars right under
my nose. Also, what if I need a membar, but something not as strict as
store-release and load-acquire? Again, the MSVC compiler will force the more
expensive membars down my neck. Or, what if I need the membar, but in a
different place than the compiler automatically inserts them at? I am
screwed and have to code custom synchronization primitives in assembly
language, turn link time optimizations off, and use external function
declarations so they are accessible to a C program.

So, I want volatile to only inhibit certain compiler optimizations. I do not
want volatile to automatically stick in any membars!

;^o

Tim Rentsch

Jun 20, 2009, 10:17:27 AM
"Chris M. Thomasson" <n...@spam.invalid> writes:

Good, this makes clear (or at least mostly clear) what you want.
I also think I understand why you want it; not that that's
important necessarily, but to some degree the why clarifies the
what in this case.

At the same time, I think many other developers would prefer
other choices, including most of b-f above. It would be good to
support other choices also, perhaps through compiler options or
by using #pragma's. There isn't a single "right" choice for what
volatile should do -- it depends a lot on what kind of program is
being developed and on what assumptions hold for the environments
in which the program, or programs, will run. Ideally both the
development community and the implementation community will start
to realize this (or, realize it more fully). After that happens,
there needs to be a common language describing different possible
meanings for volatile -- more specifically, language more precise
than the kind of informal prose that's been used in the past --
so that developers and implementors can talk about the different
choices, and identify which choices are available in which
implementations.

karthikbalaguru

Jun 22, 2009, 7:46:15 AM
On Jun 2, 4:43 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> The Semantics of 'volatile'
> ===========================
>
> I've been meaning to get to this for a while, finally there's a
> suitable chunk of free time available to do so.
>
> To explain the semantics of 'volatile', we consider several
> questions about the concept and how volatile variables behave,
> etc. The questions are:
>
> 1. What does volatile do?
> 2. What guarantees does using volatile provide? (What memory
> regimes must be affected by using volatile?)
> 3. What limits does the Standard set on how using volatile
> can affect program behavior?
> 4. When is it necessary to use volatile?
>
> We will take up each question in the order above. The comments
> are intended to address both developers (those who write C code)
> and implementors (those who write C compilers and libraries).
>
> What does volatile do?
> ----------------------
>
> This question is easy to answer if we're willing to accept an
> answer that may seem somewhat nebulous. Volatile allows contact
> between execution internals, which are completely under control
> of the implementation, and external regimes (processes or other
> agents) not under control of the implementation. To provide such
> contact, and provide it in a well-defined way, using volatile
> must ensure a common model for how memory is accessed by the
> implementation and by the external regime(s) in question.
>
> Subsequent answers will fill in the details around this more
> high level one.
>
> What guarantees does using volatile provide?
> --------------------------------------------
>
> The short answer is "None." That deserves some elaboration.
>
> Another way of asking this question is, "What memory regimes must
> be affected by using volatile?" Let's consider some possibilities.
> One: accesses occur not just to registers but to process virtual
> memory (which might be just cache); threads running in the same
> process affect and are affected by these accesses. Two: accesses
> occur not just to cache but are forced out into the inter-process
> memory (or "RAM"); other processes running on the same CPU core
> affect and are affected by these accesses. Three: accesses occur
> not just to memory belonging to the one core but to memory shared
> by all the cores on a die; other processes running on the same CPU
> (but not necessarily the same core) affect and are affected by
> these accesses. Four: accesses occur not just to memory belonging
> to one CPU but to memory shared by all the CPUs on the motherboard;
> processes running on the same motherboard (even if on another CPU
> on that motherboard) affect and are affected by these accesses.
> Five: accesses occur not just to fast memory but also to some slow
> more permanent memory (such as a "swap file"); other agents that
> access the "swap file" affect and are affected by these accesses.
>
> The different examples are intended informally, and in many cases
> there is no distinction between several of the different layers.
> The point is that different choices of regime are possible (and
> I'm sure many readers can provide others, such as not only which
> memory is affected but what ordering guarantees are provided).
> Now the question again: which (if any) of these different
> regimes are /guaranteed/ to be included by a 'volatile' access?
>
> The answer is none of the above. More specifically, the Standard
> leaves the choice completely up to the implementation. This
> specification is given in one sentence in 6.7.3 p 6, namely:
>
> What constitutes an access to an object that has
> volatile-qualified type is implementation-defined.
>
> So a volatile access could be defined as coordinating with any of
> the different memory regime alternatives listed above, or other,
> more exotic, memory regimes, or even (in the claims of some ISO
> committee participants) no particular other memory regimes at all
> (so a compiler would be free to ignore volatile completely)[*].
> How extreme this range is may be open to debate, but I note that
> Larry Jones, for one, has stated unequivocally that the possibility
> of ignoring volatile completely is allowed under the proviso given
> above. The key point is that the Standard does not identify which
> memory regimes must be affected by using volatile, but leaves that
> decision to the implementation.
>
> A corollary to the above is that any volatile-qualified access
> automatically introduces an implementation-defined aspect to a
> program.
>
> [*] Possibly not counting the specific uses of 'volatile' as it
> pertains to setjmp/longjmp and signals that the Standard
> identifies, but these are side issues.
>
> What limits are there on how volatile access can affect program behavior?
> -------------------------------------------------------------------------
>
> More properly this question is "What limits does the Standard
> impose on how volatile access can affect program behavior?".
>
> Again the short answer is None. The first sentence in 6.7.3 p 6
> says:
>
> An object that has volatile-qualified type may be modified
> in ways unknown to the implementation or have other unknown
> side effects.
>
> Nowhere in the Standard are any limitations stated as to what
> such side effects might be. Since they aren't defined, the
> rules of the Standard identify the consequences as "undefined
> behavior". Any volatile-qualified access results in undefined
> behavior (in the sense that the Standard uses the term).
>
> Some people are bothered by the idea that using volatile produces
> undefined behavior, but there really isn't any reason to be. At
> some level any C statement (or variable access) might behave in
> ways we don't expect or want. Program execution can always be
> affected by peculiar hardware, or a buggy OS, or cosmic rays, or
> anything else outside the realm of what the implementation knows
> about. It's always possible that there will be unexpected
> changes or side effects, in the sense that they are unexpected by
> the implementation, whether volatile is used or not. The
> difference is, using volatile interacts with these external
> forces in a more well-defined way; if volatile is omitted, there
> is no guarantee as to how external forces on particular parts
> of the physical machine might affect (or be affected by) changes
> in the abstract machine.
>
> Somewhat more succinctly: using volatile doesn't affect the
> semantics of the abstract machine; it admits undefined behavior
> by unknown external forces, which isn't any different from the
> non-volatile case, except that using volatile adds some
> (implementation-defined) requirements about how the abstract
> machine maps onto the physical machine in the external forces'
> universe. However, since the Standard mentions unknown side
> effects explicitly, such things seem more "expectable" when
> volatile is used. (volatile == Expect the unexpected?)
>
> When is it necessary to use volatile?
> -------------------------------------
>
> In terms of pragmatics this question is the most interesting of
> the four. Of course, as phrased the question asked is more of a
> developer question; for implementors, the phrasing would be
> something more like "What requirements must my implementation
> meet to satisfy developers who are using 'volatile' as the
> Standard expects?"
>
> To get some details out of the way, there are two specific cases
> where it's necessary to use volatile, called out explicitly in
> the Standard, namely setjmp/longjmp (in 7.13.2.1 p 3) and
> accessing static objects in a signal handler (in 7.14.1.1 p 5).
> If you're a developer writing code for one of these situations,
> either use volatile, code around it so volatile isn't needed
> (this can be done for setjmp), or be sure that the particular
> code you're writing is covered by some implementation-defined
> guarantees (extensions or whatever). Similarly, if you're an
> implementor, be sure that using volatile in the specific cases
> mentioned produces code that works; what this means is that the
> volatile-using code should behave just like it would under
> regular, non-exotic control structures. Of course, it's even
> better if the implementation can do more than the minimum, such
> as: define and document some additional cases for signal
> handling code; make variable access in setjmp functions work
> without having to use volatile, or give warnings for potential
> transgressions (or both).
>
> The two specific cases are easy to identify, but of course the
> interesting cases are everything else! This area is one of the
> murkiest in C programming, and it's useful to take a moment to
> understand why. For implementors, there is a tension between
> code generation and what semantic interpretation the Standard
> requires, mostly because of optimization concerns. Nowhere is
> this tension felt more keenly than in translating 'volatile'
> references faithfully, because volatile exists to make actions in
> the abstract machine align with those occurring in the physical
> machine, and such alignment prevents many kinds of optimization.
> To appreciate the delicacy of the question, let's look at some
> different models for how implementations might behave.
>
> The first model is given as an Example in 5.1.2.3 p 8:
>
> EXAMPLE 1 An implementation might define a one-to-one
> correspondence between abstract and actual semantics: at
> every sequence point, the values of the actual objects would
> agree with those specified by the abstract semantics.
>
> We call this the "White Box model". When using implementations
> that follow the White Box model, it's never necessary to use
> volatile (as the Standard itself points out: "The keyword
> volatile would then be redundant.").
>
> At the other end of the spectrum, a "Black Box model" can be
> inferred based on the statements in 5.1.2.3 p 5. Consider an
> implementation that secretly maintains "shadow memory" for all
> objects in a program execution. Regular memory addresses are
> used for address-taking or index calculation, but any actual
> memory accesses would access only the shadow memory (which is at
> a different location), except for volatile-qualified accesses
> which would load or store objects in the regular object memory
> (ie, at the machine addresses ...
>

One of the biggest analyses of 'volatile' I have come across :):)

Karthik Balaguru

Keith Thompson

Jun 22, 2009, 12:10:56 PM
karthikbalaguru <karthikb...@gmail.com> writes:
> On Jun 2, 4:43 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>> The Semantics of 'volatile'
>> ===========================
>>
>> I've been meaning to get to this for a while, finally there's a
>> suitable chunk of free time available to do so.
[199 lines deleted]

>
> One of the biggest analyses of 'volatile' I have come across :):)

Why did you feel the need to re-post the whole thing just to add a
fairly meaningless one-line comment?

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Richard Bos

Jun 22, 2009, 4:53:32 PM
karthikbalaguru <karthikb...@gmail.com> wrote:

> On Jun 2, 4:43 am, Tim Rentsch <t...@alumnus.caltech.edu> wrote:
> > The Semantics of 'volatile'

> One of the biggest analyses of 'volatile' I have come across :):)

And you _had_ to quote it in its entirety, adding nothing but that
trivial remark, for _what_ reason?

Furrfu.

Richard

CBFalconer

Jun 22, 2009, 6:40:56 PM
Keith Thompson wrote:
> karthikbalaguru <karthikb...@gmail.com> writes:
>
... snip ...

>
>>> I've been meaning to get to this for a while, finally there's a
>>> suitable chunk of free time available to do so.
> [199 lines deleted]
>>
>> One of the biggest analyses of 'volatile' I have come across :)
>
> Why did you feel the need to re-post the whole thing just to add
> a fairly meaningless one-line comment?

Agreed. Why do people do these silly things? Who said 'They do it
only to annoy' in Alice in Wonderland?

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.


David Brown

Jun 23, 2009, 5:28:33 AM
CBFalconer wrote:
> Keith Thompson wrote:
>> karthikbalaguru <karthikb...@gmail.com> writes:
>>
> ... snip ...
>>>> I've been meaning to get to this for a while, finally there's a
>>>> suitable chunk of free time available to do so.
>> [199 lines deleted]
>>> One of the biggest analyses of 'volatile' I have come across :)
>> Why did you feel the need to re-post the whole thing just to add
>> a fairly meaningless one-line comment?
>
> Agreed. Why do people do these silly things? Who said 'They do it
> only to annoy' in Alice in Wonderland?
>

I believe it was the cook, in her song about sneezing children (from
memory, so it might not be word-perfect):

Speak roughly to your little boy,
And beat him when he sneezes,
He only does it to annoy
Because he knows it teases.

Boudewijn Dijkstra

Jun 23, 2009, 5:48:39 AM
On Tue, 23 Jun 2009 11:28:33 +0200, David Brown
<da...@westcontrol.removethisbit.com> wrote:

Word-perfect, but failed on punctuation.

http://en.wikisource.org/wiki/Alice's_Adventures_in_Wonderland/Chapter_6
