Sequence points and volatile objects

ScottM

Sep 30, 2011, 9:50:59 AM

Someone pointed me to an article about the dangers of C (parts 2 and 3
of http://blog.regehr.org/archives/213) and that article made the
surprising-to-me claim, that an assignment to a volatile object wasn't
guaranteed to occur, if the compiler can determine that the assignment
won't matter in subsequent execution. In the case of the article, it
was discussing optimization in the face of undefined behaviour - if a
divide by zero MIGHT occur right after an assignment to a volatile
object, the compiler is considered within its rights to haul the
divide operation up before the assignment to the volatile, because if
division by 0 happens, everything is undefined anyway and it won't
matter if the volatile operation happens. That's a surprise; it
matters to me.

This raises an awful possibility to me. If I create a "volatile
unsigned char*", assign it an address, modify memory through it, and
at no other point in the program reference any global object
(including dereferencing that pointer again)... then what stops an
optimizer from deciding that my assignment amounts to a "side effect
which has no effect", and optimizing it away?

Clearly no compiler I've used does this - one of the main uses of
volatile objects is memory-mapped I/O ports, and just because that
address is never touched again, doesn't mean the invisible side effect
of touching it once isn't critical to the functioning of the program.
But it suddenly sounds as if the standard doesn't support this
thinking. I'm looking specifically at 1989 C, 2.1.2.3, 3rd paragraph.

I do a fair amount of threaded programming and some bare-metal
programming, and I've commonly used volatile to insist that some read
or write operation happen NOW, regardless of the optimizer's views.
It's looking like I've placed my faith on a weak reed. Can anyone
comment?

If C really doesn't have a mechanism for "do this now, here, no
argument, just do it", then something essential is missing.
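
A minimal sketch of the scenario described above, for readers who want something concrete. The names write_reg, fake_port and poke_port are made up for illustration, and an ordinary volatile object stands in for the device register so the code can run on a hosted system; on real hardware the address would come from the platform documentation.

```c
/* Hypothetical helper: store one byte through a volatile-qualified lvalue.
 * Because the lvalue is volatile, the store is a side effect the
 * implementation is expected to perform, even if the program never
 * touches that location again. */
void write_reg(volatile unsigned char *reg, unsigned char value)
{
    *reg = value;
}

/* Stand-in for a memory-mapped port (assumption: a real port would be a
 * fixed address converted to the pointer type). */
volatile unsigned char fake_port;

/* Write once, then read back through the volatile object. The read-back
 * exists only so the sketch has something to check. */
unsigned char poke_port(unsigned char value)
{
    write_reg(&fake_port, value);
    return fake_port;
}
```

What the write actually does on the bus is still implementation-defined; the sketch only shows the source-level shape of the access.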

James Kuyper

Sep 30, 2011, 10:47:09 AM

On 09/30/2011 09:50 AM, ScottM wrote:
> Someone pointed me to an article about the dangers of C (parts 2 and 3
> of http://blog.regehr.org/archives/213) and that article made the
> surprising-to-me claim, that an assignment to a volatile object wasn't
> guaranteed to occur, if the compiler can determine that the assignment
> won't matter in subsequent execution. ...

That sounds reasonable, if "won't matter" is interpreted broadly enough.
If such an optimization could possibly cause problems, for strictly
conforming code, then "won't matter" is false.

> ... In the case of the article, it
> was discussing optimization in the face of undefined behaviour - if a
> divide by zero MIGHT occur right after an assignment to a volatile
> object, the compiler is considered within its rights to haul the
> divide operation up before the assignment to the volatile, because if
> division by 0 happens, everything is undefined anyway and it won't
> matter if the volatile operation happens.

If the division by 0 is only possible, and not certain, and if it's
separated from the volatile assignment by at least one sequence point,
then the program must behave exactly as if the assignment actually
occurred, and more specifically, as if it occurred before the possible
division by 0. However, if the compiler can be certain that the
assignment has no effects, then optimizing the assignment away will, in
fact, produce exactly the same behavior as executing it.
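
As a rough illustration of the point about the possible division by zero, here is a sketch in which the division is guarded, so no path through the function has undefined behaviour and the volatile store has to appear to happen on every call. The names status_flag and mark_then_divide, and the fallback value 0, are assumptions made for this example only.

```c
#include <limits.h>

/* Hypothetical status word; in the article's scenario this would be
 * something visible outside the program, such as a device register. */
volatile int status_flag;

/* Store to the volatile object, then divide only when the division is
 * defined. INT_MIN / -1 is excluded as well, since it overflows. */
int mark_then_divide(int n, int d)
{
    status_flag = 1;
    if (d == 0 || (n == INT_MIN && d == -1))
        return 0; /* arbitrary fallback chosen for this sketch */
    return n / d;
}
```

Without the guard, a call with d equal to 0 would have undefined behaviour, and that is the only case in which the ordering of the store and the division stops being constrained.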

> ... That's a surprise; it
> matters to me.
> This raises an awful possibility to me. If I create a "volatile
> unsigned char*", assign it an address, modify memory through it, and
> at no other point in the program reference any global object
> (including dereferencing that pointer again)... then what stops an
> optimizer from deciding that my assignment amounts to a "side effect
> which has no effect", and optimizing it away?

Any level of uncertainty by the compiler about whether or not assignment
to the memory referred to by that pointer has any consequences outside
your program. Such uncertainty would mean that the side-effect qualifies
as "needed", and section 5.1.2.3p3 would not apply.

> Clearly no compiler I've used does this - one of the main uses of
> volatile objects is memory-mapped I/O ports, and just because that
> address is never touched again, doesn't mean the invisible side effect
> of touching it once isn't critical to the functioning of the program.
> But it suddenly sounds as if the standard doesn't support this
> thinking. I'm looking specifically at 1989 C, 2.1.2.3, 3rd paragraph.

I don't have a copy of that version of the standard. The current version
is 1999, and has three Technical Corrigenda that have been officially
approved. I do have n1256.pdf, which contains C99 with all three TC's
applied. It is technically only a draft, but it's free, and more
convenient than the official documents, and has only two known defects
(one of which is a misspelling: "Septermber").

I suspect that 5.1.2.3p3 is the corresponding section of the C99
standard. When the ANSI standard, C89, was adapted to become the ISO
standard, C90, the only change was the addition of 3 sections at the
beginning, which caused all of the section numbers to be increased by 3.

5.1.2.3p3 says: "In the abstract machine, all expressions are evaluated
as specified by the semantics. An actual implementation need not
evaluate part of an expression if it can deduce that its value is not
used and that no needed side effects are produced (including any caused
by calling a function or accessing a volatile object)."

I don't see any cause for alarm in that clause. If you care for some
reason about whether or not the assignment actually occurred, then the
reason you care is also a reason why that side effect is "needed", so
that side effect would not be covered by that clause.
--
James Kuyper

Richard Kettlewell

Sep 30, 2011, 12:38:16 PM

ScottM <scott....@gmail.com> writes:
> Someone pointed me to an article about the dangers of C (parts 2 and 3
> of http://blog.regehr.org/archives/213) and that article made the
> surprising-to-me claim, that an assignment to a volatile object wasn't
> guaranteed to occur, if the compiler can determine that the assignment
> won't matter in subsequent execution.

I think you've mis-stated the problem in this paragraph.

> In the case of the article, it was discussing optimization in the face
> of undefined behaviour - if a divide by zero MIGHT occur right after
> an assignment to a volatile object, the compiler is considered within
> its rights to haul the divide operation up before the assignment to
> the volatile, because if division by 0 happens, everything is
> undefined anyway and it won't matter if the volatile operation
> happens. That's a surprise; it matters to me.

The compiler does not determine that the volatile assignment "won't
subsequently matter". The compiler optimizes as if the UB case never
arises at runtime. You can find other examples of this that don't
involve volatile, for instance:

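struct foo {
    int x, y;
};

int spong(struct foo *s) {
    int z = s->x;   /* dereference comes first: undefined if s is null */
    if(!s)          /* so the compiler may assume s is non-null and drop this */
        return -1;
    return z;
}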
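
For contrast, a sketch of the same function with the test placed before the dereference. The name spong_checked is made up for this example; the struct is repeated so the fragment compiles on its own.

```c
#include <stddef.h>

struct foo {
    int x, y;
};

/* Variant of spong() that tests the pointer before using it. A null
 * argument never reaches s->x, so the compiler has no undefined
 * behaviour from which to infer that the test is dead code. */
int spong_checked(const struct foo *s)
{
    if (s == NULL)
        return -1;
    return s->x;
}
```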

> This raises an awful possibility to me. If I create a "volatile
> unsigned char*", assign it an address, modify memory through it, and
> at no other point in the program reference any global object
> (including dereferencing that pointer again)... then what stops an
> optimizer from deciding that my assignment amounts to a "side effect
> which has no effect", and optimizing it away?

If your program has no undefined behavior then the answer is the
requirement that all references to volatile objects strictly match the
abstract machine (in s6.7.3 in the copy of C99 that I have, and of
course assuming the compiler has no bugs). If it does invoke UB all
bets are off (and not just regarding volatile).

> Clearly no compiler I've used does this - one of the main uses of
> volatile objects is memory-mapped I/O ports, and just because that
> address is never touched again, doesn't mean the invisible side effect
> of touching it once isn't critical to the functioning of the program.
> But it suddenly sounds as if the standard doesn't support this
> thinking. I'm looking specifically at 1989 C, 2.1.2.3, 3rd paragraph.
>
> I do a fair amount of threaded programming and some bare-metal
> programming, and I've commonly used volatile to insist that some read
> or write operation happen NOW, regardless of the optimizer's views.
> It's looking like I've placed my faith on a weak reed. Can anyone
> comment?

In the case of threads the rules will depend on the threading API, but
in at least some popular ones volatile is neither necessary nor
sufficient.

--
http://www.greenend.org.uk/rjk/

Marcin Grzegorczyk

Sep 30, 2011, 5:56:00 PM

ScottM wrote:
[snip]

> I do a fair amount of threaded programming and some bare-metal
> programming, and I've commonly used volatile to insist that some read
> or write operation happen NOW, regardless of the optimizer's views.
> It's looking like I've placed my faith on a weak reed. Can anyone
> comment?

Yes, it's true that (as you put it) you've placed your faith on a weak
reed; volatile accesses are actually guaranteed to be ordered only with
respect to one another, not to ordinary accesses, and within a single
thread only. In particular, volatile accesses do not provide memory
ordering (which is important in multiprocessor environments in which the
hardware does not enforce cache coherency between CPUs), and are not
guaranteed to be atomic (with the exception of volatile sig_atomic_t).
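
The sig_atomic_t exception can be sketched as follows. The helper names install_handler and signal_seen are hypothetical, and SIGINT is chosen only because it is one of the signals standard C defines.

```c
#include <signal.h>

/* volatile sig_atomic_t is the object type a signal handler may
 * portably write. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: install the handler; returns 0 on success. */
int install_handler(void)
{
    return signal(SIGINT, on_signal) == SIG_ERR ? -1 : 0;
}

/* Hypothetical helper: report whether the handler has run. */
int signal_seen(void)
{
    return got_signal != 0;
}
```

Calling raise(SIGINT) after install_handler() runs the handler synchronously, after which signal_seen() reports nonzero. This covers a handler and the code it interrupts within one thread; it says nothing about sharing between threads.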

In general, both C89 and C99 are completely silent on the issue of
multi-threaded execution. C1X is going to rectify this problem with the
introduction of atomic types and synchronization primitives; until then,
the safest option is to use whatever the platform provides (like GCC's
__sync_* atomic built-ins, or the Win32 interlocked API).
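
For the atomicity half of that, a minimal sketch assuming a compiler that ships <stdatomic.h>, the header in which the C1X atomic types ended up when the revision was published as C11. The names hits, record_hit and hit_count are made up for illustration.

```c
#include <stdatomic.h>

/* Shared counter. atomic_int makes each read-modify-write indivisible,
 * which a plain "volatile int" does not promise. Static atomic objects
 * start out validly zero-initialized. */
static atomic_int hits;

/* Hypothetical helper: add one and return the new value. */
int record_hit(void)
{
    return atomic_fetch_add(&hits, 1) + 1;
}

/* Hypothetical helper: read the current count. */
int hit_count(void)
{
    return atomic_load(&hits);
}
```

Both calls use the default sequentially consistent ordering, which is the simplest choice and not necessarily the cheapest one.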
--
Marcin Grzegorczyk

Phil Carmody

Oct 13, 2011, 4:12:05 PM

Marcin Grzegorczyk <mgrz...@poczta.onet.pl> writes:
> ScottM wrote:
> [snip]
> > I do a fair amount of threaded programming and some bare-metal
> > programming, and I've commonly used volatile to insist that some read
> > or write operation happen NOW, regardless of the optimizer's views.
> > It's looking like I've placed my faith on a weak reed. Can anyone
> > comment?
>
> Yes, it's true that (as you put it) you've placed your faith on a weak
> reed; volatile accesses are actually guaranteed to be ordered only
> with respect to one another, not to ordinary accesses,

C&V for that please?

"Actions on objects so declared shall not be [...] reordered except as
permitted by the rules of evaluating expressions" is what footnote 116
says. I.e. not "Actions on objects so declared shall not be [...]
reordered with respect to one another except as permitted by the rules
of evaluating expressions". I take that as meaning any re-ordering,
including with respect to ordinary accesses, is something that shall
not be done.

Phil
--
"Religion is what keeps the poor from murdering the rich."
-- Napoleon

Hans-Bernhard Bröker

Oct 13, 2011, 5:07:14 PM

On 13.10.2011 22:12, Phil Carmody wrote:

> "Actions on objects so declared shall not be [...] reordered except as
> permitted by the rules of evaluating expressions" is what footnote 116
> says.

Careful with arguing by footnotes. They're not normative.

> I.e. not "Actions on objects so declared shall not be [...]
> reordered with respect to one another except as permitted by the rules
> of evaluating expressions". I take that as meaning any re-ordering,
> including with respect to ordinary accesses, is something that shall
> not be done.

The root of the problem with that argument is that you assume
"reordering" to be a symmetric operation. Following the actual wording,
nothing forbids reordering non-volatile accesses any which way, even past
volatile ones. So there's no need to reorder any volatile accesses to
go from

volatile int a;
int b;

a = 5;
b = 3;
a = 1;
b = 0;

to, say,

b = 0;
a = 5;
a = 3;

So yes, the net effect is that volatile objects only get protection from
being reordered among one another, but not with regard to others.

Antoine Leca

Oct 17, 2011, 5:19:44 AM

Hans-Bernhard Bröker wrote:
> to reorder any volatile accesses to go from
>
> volatile int a;
> int b;
>
> a = 5;
> b = 3;
> a = 1;
> b = 0;
>
> to, say,
>
> b = 0;
> a = 5;
> a = 3;

<NIT-PICKING>
a = 1;
</NIT-PICKING>

Antoine

Tim Rentsch

Mar 7, 2012, 7:53:04 PM

Phil Carmody <thefatphi...@yahoo.co.uk> writes:

> Marcin Grzegorczyk <mgrz...@poczta.onet.pl> writes:
>> ScottM wrote:
>> [snip]
>> > I do a fair amount of threaded programming and some bare-metal
>> > programming, and I've commonly used volatile to insist that some read
>> > or write operation happen NOW, regardless of the optimizer's views.
>> > It's looking like I've placed my faith on a weak reed. Can anyone
>> > comment?
>>
>> Yes, it's true that (as you put it) you've placed your faith on a weak
>> reed; volatile accesses are actually guaranteed to be ordered only
>> with respect to one another, not to ordinary accesses,
>
> C&V for that please?

Marcin's statement is wrong. Accessing a volatile requires _all_
side-effects (not just those on volatiles) of any previous
statements to be complete, and no side-effects of subsequent
statements to be started.

The catch to that is that what is done (ie, in real hardware) to
access a volatile-qualified reference is implementation-defined.
So, for example, a write could be issued to a memory controller
that might do out-of-order writes. But that doesn't remove the
requirement to have finished all previous side-effects (and not
started subsequent ones); it just means you can't be sure what
the effect of the requirement will be without consulting the
implementation's documentation on memory access.

Tim Rentsch

Mar 7, 2012, 8:09:14 PM

Hans-Bernhard Broeker <HBBr...@t-online.de> writes:

> On 13.10.2011 22:12, Phil Carmody wrote:
>
>> "Actions on objects so declared shall not be [...] reordered except as
>> permitted by the rules of evaluating expressions" is what footnote 116
>> says.
>
> Careful with arguing by footnotes. They're not normative.
>
>> I.e. not "Actions on objects so declared shall not be [...]
>> reordered with respect to one another except as permitted by the rules
>> of evaluating expressions". I take that as meaning any re-ordering,
>> including with respect to ordinary accesses, is something that shall
>> not be done.
>
> The root of the problem with that argument is that you assume
> "reordering" to be a symmetric operation. Following the actual
> wording, nothing forbids reordering non-volatile accesses any which
> way, even past volatile ones. [snip elaboration]

That's wrong. 6.7.3p7 (previously 6.7.3p6), in conjunction with
the referenced section 5.1.2.3, requires _all_ evaluations -- not
just those related to volatiles -- before the last sequence point
to be complete, and no subsequent evaluations to have started, on
each volatile-qualified access.

Wojtek Lerch

Mar 7, 2012, 10:25:56 PM

On 07/03/2012 7:53 PM, Tim Rentsch wrote:
> Marcin's statement is wrong. Accessing a volatile requires _all_
> side-effects (not just those on volatiles) of any previous
> statements to be complete, and no side-effects of subsequent
> statements to be started.

Yes, in the abstract machine.

In the real hardware, the only requirement is that accesses to the
abstract machine's volatile objects must match accesses to the
corresponding real hardware objects (in an implementation-defined way).
The abstract machine's accesses to non-volatile objects don't have to
map to anything at all in the real hardware, and the compiler is free to
optimize them away, reorder them, or do whatever it feels necessary to
ensure that volatile accesses and file contents match those of the
abstract machine.

Tim Rentsch

Mar 20, 2012, 5:24:03 PM

Wojtek Lerch <wojt...@yahoo.ca> writes:

> On 07/03/2012 7:53 PM, Tim Rentsch wrote:
>> Marcin's statement is wrong. Accessing a volatile requires _all_
>> side-effects (not just those on volatiles) of any previous
>> statements to be complete, and no side-effects of subsequent
>> statements to be started.
>
> Yes, in the abstract machine.

This is a silly statement. It's _always_ true in the abstract
machine that every expression, including every subexpression, is
evaluated strictly in terms of the abstract semantics. Inferring
that a requirement to evaluate in accordance with abstract
semantics is meant to refer to evaluation in the abstract machine
is like saying a contractual clause "to obey all applicable laws"
is meant to refer only to the laws of physics. There is no point
in putting in such a requirement, because there is no way it
could be violated.

> In the real hardware, the only requirement is that accesses to the
> abstract machine's volatile objects must match accesses to the
> corresponding real hardware objects (in an implementation-defined
> way). The abstract machine's accesses to non-volatile objects don't
> have to map to anything at all in the real hardware, and the compiler
> is free to optimize them away, reorder them, or do whatever it feels
> necessary to ensure that volatile accesses and file contents match
> those of the abstract machine.

This interpretation doesn't agree with what the Standard requires.
Describing the semantics for 'volatile', 6.7.3 p7 (unchanged from
N1256 6.7.3 p6) says in part:

    [A]ny expression referring to such an object shall be evaluated
    strictly according to the rules of the abstract machine, as
    described in 5.1.2.3.

Obviously the word "evaluated" here refers to actual evaluation, not
evaluation in the abstract machine. Looking at 5.1.2.3, paragraph 3
says in part:

    /Sequenced before/ is an asymmetric, transitive, pair-wise
    relation between evaluations executed by a single thread,
    which induces a partial order among those evaluations. Given
    any two evaluations A and B, if A is sequenced before B, then
    the execution of A shall precede the execution of B. [...]
    The presence of a /sequence point/ between the evaluation of
    expressions A and B implies that every value computation and
    side effect associated with A is sequenced before every value
    computation and side effect associated with B.

This description requires the execution of all value computations
and all side effects before the last sequence point (and none of
the subsequent ones) prior to each volatile-qualified access. The
statement in 6.7.3 p7 requires the actual evaluation (and not just
that of the abstract machine, to which they always apply), to obey
these rules. They are not selective: they apply to all previous
expressions (and forbid all subsequent expressions), not just those
related to volatile access.

It's true that previous computations can be reordered (as long as
they are all still previous), and those that have no downstream
effect don't have to be executed at all. But any computation
sequenced before the volatile access that has a downstream effect
(ie, one sequenced after the volatile access) must be executed in the
actual compiled code _before_ accessing the volatile, and all
downstream-effect computations must be executed _after_ accessing
the volatile.

If you continue to disagree, please cite portions of the Standard
that support an alternate interpretation and that contradict the
reasoning given above.

Marcin Grzegorczyk

Mar 25, 2012, 5:46:51 PM

Tim Rentsch wrote:

> Wojtek Lerch<wojt...@yahoo.ca> writes:
>> In the real hardware, the only requirement is that accesses to the
>> abstract machine's volatile objects must match accesses to the
>> corresponding real hardware objects (in an implementation-defined
>> way). The abstract machine's accesses to non-volatile objects don't
>> have to map to anything at all in the real hardware, and the compiler
>> is free to optimize them away, reorder them, or do whatever it feels
>> necessary to ensure that volatile accesses and file contents match
>> those of the abstract machine.
>
> This interpretation doesn't agree with what the Standard requires.
> Describing the semantics for 'volatile', 6.7.3 p7 (unchanged from
> N1256 6.7.3 p6) says in part:
>
> [A]ny expression referring to such an object shall be evaluated
> strictly according to the rules of the abstract machine, as
> described in 5.1.2.3.
>
> Obviously the word "evaluated" here refers to actual evaluation, not
> evaluation in the abstract machine. Looking at 5.1.2.3, paragraph 3
> says in part:

[snip the description of sequencing]

> This description requires the execution of all value computations
> and all side effects before the last sequence point (and none of
> the subsequent ones) prior to each volatile-qualified access. The
> statement in 6.7.3 p7 requires the actual evaluation (and not just
> that of the abstract machine, to which they always apply), to obey
> these rules. They are not selective: they apply to all previous
> expressions (and forbid all subsequent expressions), not just those
> related to volatile access.

Note, however, that by 5.1.2.3p6, accesses to non-volatile objects are
not a part of the observable behaviour of a program. Therefore, whether
they actually occur in the order prescribed by the abstract machine or
not is a moot point, as long as the requirements of 5.1.2.3p6 are not
violated.

Now, the argument started with the question of using volatile for
inter-thread access synchronization to non-volatile objects. I say that
it is not guaranteed to work, because in the absence of other explicit
ordering guarantees (like C11 5.1.2.4), the only thing that threads can
infer about each other is their observable behaviour, which does not
include non-volatile object accesses. This is what I meant when I said
volatile accesses are not guaranteed to be ordered with respect to
ordinary accesses -- ordered in a way observable *outside* of a thread
of execution. (For the record, even making all objects volatile would
not necessarily work for inter-thread synchronization, due to the
implications of 6.7.3p7).
--
Marcin Grzegorczyk

Tim Rentsch

Mar 28, 2012, 1:47:25 AM

The first item in 5.1.2.3 p6 says

    Accesses to volatile objects are evaluated strictly
    according to the rules of the abstract machine.

This does not say, like the next item does about data written
into files, that the sequence of values written to or read from
volatile objects must be identical to the sequence of values
that would be written/read if execution proceeded according to the
abstract semantics. What it does say is that accesses to
volatile objects _are evaluated strictly according to the rules
of the abstract machine_ (my emphasis), and that is what is
observable. In other words, this stipulation gives a license to
examine the state of the actual machine, at each volatile
access, and verify that it matches the state of an abstract
machine (modulo of course some unspecified state related to
computations since the last sequence point). We aren't allowed
to observe what the actual machine does _between_ accesses to
volatile objects, but we are allowed to observe the state
of the actual machine at each point _of_ a volatile access, and
the actual machine state must match those things that must be
true in the abstract machine.

> Now, the argument started with the question of using volatile for
> inter-thread access synchronization to non-volatile objects. I say
> that it is not guaranteed to work, because in the absence of other
> explicit ordering guarantees (like C11 5.1.2.4), the only thing that
> threads can infer about each other is their observable behaviour,
> which does not include non-volatile object accesses. This is what I
> meant when I said volatile accesses are not guaranteed to be ordered
> with respect to ordinary accesses -- ordered in a way observable
> *outside* of a thread of execution. (For the record, even making all
> objects volatile would not necessarily work for inter-thread
> synchronization, due to the implications of 6.7.3p7).

By itself volatile never guarantees anything, for exactly the reason
you point out - what constitutes a volatile-qualified access is
implementation-defined.

However, if the implementation-defined behavior for volatile-qualified
access includes imposing the necessary memory barriers, then volatile
can serve to delimit data transmission between threads, even for other
non-volatile accesses, as long as those accesses are sequenced before
the volatile access in the sending process, and sequenced after the
corresponding volatile access in the receiving process. This result
is a consequence of the semantics for volatile given in 6.7.3 p7, and
reinforced in 5.1.2.3 p6.
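
The sending/receiving pattern described above can be spelled portably with C11 atomics instead of relying on what an implementation defines for volatile. This is only a sketch under that assumption; the names payload, ready, publish and try_consume are invented for the example, and it swaps in C11 release/acquire ordering for the barrier-carrying volatile access discussed in the thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary, non-volatile data */
static atomic_bool ready;  /* flag whose accesses carry the ordering */

/* Sender: write the data, then publish it with a release store.
 * The write to payload is sequenced before the store to ready. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Receiver: an acquire load that observes the flag set also observes
 * the write to payload that preceded the release store. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

The sketch handles a single hand-off only; reusing the slot for a second message would need the receiver to signal back before the sender writes payload again.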

Marcin Grzegorczyk

Mar 29, 2012, 2:41:32 PM

Even if that is true (which I find debatable, but I am not willing to
venture further in this territory of advanced hair-splitting), we are
not guaranteed to have any /means/ of examining the state of the actual
machine beyond the minimum requirements specified in 5.1.2.3p6.

>> Now, the argument started with the question of using volatile for
>> inter-thread access synchronization to non-volatile objects. I say
>> that it is not guaranteed to work, because in the absence of other
>> explicit ordering guarantees (like C11 5.1.2.4), the only thing that
>> threads can infer about each other is their observable behaviour,
>> which does not include non-volatile object accesses. This is what I
>> meant when I said volatile accesses are not guaranteed to be ordered
>> with respect to ordinary accesses -- ordered in a way observable
>> *outside* of a thread of execution. (For the record, even making all
>> objects volatile would not necessarily work for inter-thread
>> synchronization, due to the implications of 6.7.3p7).
>
> By itself volatile never guarantees anything, for exactly the reason
> you point out - what constitutes a volatile-qualified access is
> implementation-defined.
>
> However, if the implementation-defined behavior for volatile-qualified
> access includes imposing the necessary memory barriers, then volatile
> can serve to delimit data transmission between threads, even for other
> non-volatile accesses, as long as those accesses are sequenced before
> the volatile access in the sending process, and sequenced after the
> corresponding volatile access in the receiving process. This result
> is a consequence of the semantics for volatile given in 6.7.3 p7, and
> reinforced in 5.1.2.3 p6.

Yes, *if* the implementation defines volatile accesses to have the
appropriate semantics. If it doesn't (and conforming implementations
are not required to; in fact, some are known that don't), then all bets
are off.
--
Marcin Grzegorczyk

Joshua Maurice

Mar 30, 2012, 4:08:13 PM

On Mar 27, 10:47 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
> We aren't allowed
> to observe what the actual machine does _between_ accesses to
> volatile objects, but we are allowed to observe the state
> of the actual machine at each point _of_ a volatile access, and
> the actual machine state must match those things that must be
> true in the abstract machine.

I think this is patently silly. Are you saying that you can't do basic
optimizations like common subexpression elimination, power reductions,
and so on when volatiles are involved? You can't examine the state of
an object if the object is optimized out of existence. I hesitate to
even contemplate the implications this would have for threaded
programs. Your interpretation more or less outlaws all optimizations
for a program with threads.

Tim Rentsch

Apr 4, 2012, 10:04:23 PM

You say that like it is some sort of shortcoming, but at some level
that's the point of being an /observable/ - we are allowed to examine
such properties any way we choose, independent of any mechanisms the
standard might supply. The requirement about contents of files, for
example, isn't just about how Standard-supplied operations work, but
what the contents of files are if examined by a non-C program. There
is no guarantee that the underlying OS provides a means of examining
the contents of files, but _however_ they are examined they must match
what would be written by straightforward translation (ie, "by the
abstract machine"), even if (and in fact especially if) the means
of examination doesn't rely on mechanisms defined in the Standard.

It sounds like you're just repeating the point about non-portable
behavior, which I already agreed to. Using volatile does not, in
and of itself, guarantee any inter-thread semantics either for
volatiles or for non-volatiles. But if a volatile access puts in a
standard memory barrier, there are sequencing implications for
non-volatiles as well as volatiles -- the claim that only volatiles
must be subject to sequencing is not correct (which basically has
been my point all along).

Tim Rentsch

Apr 4, 2012, 10:47:09 PM

Joshua Maurice <joshua...@gmail.com> writes:

> On Mar 27, 10:47 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>> We aren't allowed
>> to observe what the actual machine does _between_ accesses to
>> volatile objects, but we are allowed to observe the state
>> of the actual machine at each point _of_ a volatile access, and
>> the actual machine state must match those things that must be
>> true in the abstract machine.
>
> I think this is patently silly. Are you saying that you can't do
> basic optimizations like common subexpression elimination, power
> reductions, and so on when volatiles are involved? You can't examine
> the state of an object if the object is optimized out of existence.

I think you're overstating the case. The situation with volatiles
isn't really very different from what happens with calls to external
functions. Basically you can do all the optimization you want up
to the point of the function call, but then everything has to "look
right" when the function is called. The presence of function calls
reduces the opportunity for optimization but certainly doesn't
eliminate it. The effect is somewhat larger with volatile, but
not at all totally inhibiting.

> I hesitate to even contemplate the implications this would have for
> threaded programs. Your interpretation more or less outlaws all
> optimizations for a program with threads.

Nonsense. Programs with threads don't have to use volatile, and
volatile doesn't make any guarantees about inter-thread semantics.
Certainly it _could_ make a difference in _some_ implementation,
but equally certainly there is no reason it _must_ make a
difference in _all_ implementations. Because what constitutes
volatile access is implementation-defined, it's not especially a
good choice for effecting inter-thread synchronization anyway, but
even assuming it is used, the costs will likely be due mostly to
putting in the necessary memory barriers (which are substantial),
not to the comparatively minor effect of less freedom in
optimization choices.

Joshua Maurice

Apr 5, 2012, 7:52:25 PM

On Apr 4, 7:47 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:

> Joshua Maurice <joshuamaur...@gmail.com> writes:
> > On Mar 27, 10:47 pm, Tim Rentsch <t...@alumni.caltech.edu> wrote:
> >> We aren't allowed
> >> to observe what the actual machine does _between_ accesses to
> >> volatile objects, but we are allowed to observe the state
> >> of the actual machine at each point _of_ a volatile access, and
> >> the actual machine state must match those things that must be
> >> true in the abstract machine.
>
> > I think this is patently silly. Are you saying that you can't do
> > basic optimizations like common subexpression elimination, power
> > reductions, and so on when volatiles are involved? You can't examine
> > the state of an object if the object is optimized out of existence.
>
> I think you're overstating the case. The situation with volatiles
> isn't really very different from what happens with calls to external
> functions. Basically you can do all the optimization you want up
> to the point of the function call, but then everything has to "look
> right" when the function is called. The presence of function calls
> reduces the opportunity for optimization but certainly doesn't
> eliminate it. The effect is somewhat larger with volatile, but
> not at all totally inhibiting.

Let's be clear. Consider:

    int main()
    {
        int volatile x;
        int y;
        y = 1;
        x = 1;
        y = y * 2;
        return y;
    }

By your argument, that program cannot be transformed by the compiler
to:

    int main()
    {
        int volatile x;
        x = 1;
        return 2;
    }

Which I think is highly silly, and (in my admittedly near-complete
ignorance of actual implementations on this issue) I doubt that any
commercial implementation shares your view.

As a trite test, I just tested this on gcc version 4.1.2 20080704 (Red
Hat 4.1.2-44). I took both programs, compiled as:
gcc -O2 source.c
and did a binary diff on the output executable. Same contents.
(Simpler than comparing the assembly in this case.) At least this
version of gcc shares my views.

Do you really want to disallow that particular optimization, and
similar optimizations? Reference escape analysis can show that an
optimization on a local variable is safe to do across a function call,
function calls can be expanded inline, several implementations even do
link-time inlining and further optimization, and so on. With that in
mind, your proposed volatile semantics are actually a lot more
restrictive than function calls.
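
As a rough sketch of the escape-analysis point: in the function below, the address of y is never taken, so no conforming callee can read or modify it, and a compiler may fold the arithmetic across the call. The names compute, note_call and calls are made up for this example, and whether a given compiler actually performs the fold is an assumption, not something shown here.

```c
/* Hypothetical names, chosen only for illustration. */
int calls;   /* counts how often the callback ran */

void note_call(void)
{
    calls++;
}

int compute(void (*callback)(void))
{
    int y = 1;     /* address never taken, so y cannot escape */
    callback();    /* opaque call: it has no way to reach y */
    y = y * 2;
    return y;      /* a compiler may reduce this to "return 2" */
}
```

Calling compute(note_call) returns 2 and performs exactly one call. The call itself has to stay, but nothing requires y to exist in memory while it runs.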

> > I hesitate to even contemplate the implications this would have for
> > threaded programs. Your interpretation more or less outlaws all
> > optimizations for a program with threads.
>
> Nonsense. Programs with threads don't have to use volatile, and
> volatile doesn't make any guarantees about inter-thread semantics.
> Certainly it _could_ make a difference in _some_ implementation,
> but equally certainly there is no reason it _must_ make a
> difference in _all_ implementations. Because what constitutes
> volatile access is implementation-defined, it's not especially a
> good choice for effecting inter-thread synchronization anyway, but
> even assuming it is used, the costs will likely be due mostly to
> putting in the necessary memory barriers (which are substantial),
> not to the comparatively minor effect of less freedom in
> optimization choices.

I'm sorry, I wasn't clear enough in my problem statement. I agree that
volatile as spec-ed, especially w.r.t. C1x (has the name been
formalized yet? sorry), is useless for inter-thread communication. (As
you note, perhaps an implementation could make stronger guarantees,
but the C standard does not give you enough on its own.)

That was not my point at all.

You made the interesting claim that at any volatile read or write, you
could examine the entire machine, and ensure that the abstract
semantics are followed. The logical extension IMHO would be to examine
the states /of other threads/. And as other threads execute
asynchronously, all optimizations are basically prohibited in the face
of a single volatile operation.

Of course, I think this is all rather academic, because I subscribe to
Marcin Grzegorczyk's view. I believe this is the consensus view.
The only guarantees you have are the specified visible behavior of the
abstract machine. This includes volatile operations, IO, and the
return value of main. No guarantees are given about the values, or
even the existence, of non-volatile objects, except insofar as they
affect the aforementioned visible behavior.
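
A small sketch of what that reading permits, with made-up names (fake_port stands in for a memory-mapped register, and pulse and scratch are hypothetical): the two volatile writes must both happen, in this order, while scratch is only constrained through the value returned.

```c
/* Hypothetical stand-in for a memory-mapped register. */
volatile unsigned char fake_port;

int pulse(void)
{
    int scratch = 0;   /* non-volatile: need not exist in memory at all */

    fake_port = 1;     /* volatile write: must occur */
    scratch += 5;
    fake_port = 0;     /* volatile write: must occur, after the first */
    scratch *= 2;

    return scratch;    /* only the result (10) is constrained */
}
```

On this view an implementation could compute the 10 at compile time, as long as both stores to fake_port are still issued in order.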

Tim Rentsch

May 5, 2012, 1:29:13 PM5/5/12

When the Standard says that referencing a volatile-qualified object
"shall be evaluated strictly according to the rules of the abstract
machine, as described in 5.1.2.3", what is unreasonable about thinking
that requirement means that accesses to volatile-qualified objects
shall be evaluated strictly according to the rules of the abstract
machine, as described in 5.1.2.3? Because that's all I'm saying;
the implications about optimization are just a consequence of that.

> As a trite test, I just tested this on gcc version 4.1.2 20080704 (Red
> Hat 4.1.2-44). I took both programs, compiled as:
> gcc -O2 source.c
> and did a binary diff on the output executable. Same contents.
> (Simpler than comparing the assembly in this case.) At least this
> version of gcc shares my views.

Note that the wording describing the requirements of access
to volatile-qualified objects changed between C99 and C11.
Whether this difference represents a clarification or a change
I don't know, but either way it's easy to believe compilers
haven't caught up yet.

> Do you really want to disallow that particular optimization, and
> similar optimizations? Reference escape analysis can show that an
> optimization on a local variable is safe to do across a function call,
> function calls can be expanded inline, several implementations even do
> link-time inlining and further optimization, and so on. With that in
> mind, your proposed volatile semantics are actually a lot more
> restrictive than function calls.

It sounds like you're arguing implicitly that (you think) this isn't a
good idea, and therefore the Standard couldn't have required it.
Personally I don't think excluding such optimizations in the case of
volatile access (which after all is very rare) is any big deal, but
all that matters to me in this discussion is whether the Standard
requires it.

What you call a "logical extension" doesn't follow, because the
guarantees for volatile access apply only to a single thread, not to
all threads at once. I meant only that the execution state of a
single thread of execution could be examined; I'm sorry if it came
across otherwise.

However, since you bring it up, examining the state of other threads
is no big deal, because the Standard doesn't impose ordering
requirements on expressions and statements evaluated in other threads
(not counting things like calls to thread synchronization functions,
which clearly affect optimization whether volatile is used or not).
So using volatile in one thread doesn't prevent any optimization
in any other thread.

> Of course, I think this is all rather academic, because I subscribe to
> Marcin Grzegorczyk's view. I believe this is the consensus view IMHO.
> The only guarantees you have are the specified visible behavior of the
> abstract machine. This includes volatile operations, IO, and the
> return value of main. No guarantees are given at all about the values,
> or even existence, of non-volatile objects, at all, except insofar as
> the behavior is visible from the aforementioned visible behavior.

Two problems. First, the requirement that volatile accesses be
evaluated strictly according to the rules of the abstract machine is
itself in the given list of observable behaviors. So we are allowed
to observe it.

Second, any volatile access may have side-effects unknown to the
implementation. Writing to a file is a side-effect. So any volatile
access could, unbeknownst to the implementation, write a core file
capturing the thread's execution state. The Standard requires that
the contents of files must be the same as if done by running on the
abstract machine. Hence, what's in the thread memory for that core
file must match the abstract machine at each point of volatile access,
to satisfy that requirement of observable behavior.

Wojtek Lerch

May 6, 2012, 10:34:01 PM5/6/12

On 05/05/2012 1:29 PM, Tim Rentsch wrote:
> Second, any volatile access may have side-effects unknown to the
> implementation. Writing to a file is a side-effect. So any volatile
> access could, unbeknownst to the implementation, write a core file
> capturing the thread's execution state.

It depends on what exactly you mean by "execution state". If you mean
the contents of memory of the hardware running the program, then yes;
but this proves nothing, because the standard doesn't have any
requirements about how those contents represent the objects of the
abstract machine. If, on the other hand, by "execution state" you were
referring to the objects of the abstract machine, then no, in general a
core-capturing mechanism could not do that without the implementation's
help, again because the mapping between the state of the hardware
running the program and the state of non-volatile objects in the
abstract machine is completely unspecified. As far as the C standard is
concerned, non-volatile objects are not observable.

> The Standard requires that
> the contents of files must be the same as if done by running on the
> abstract machine.

The contents of files produced by a C program must be the same as if
done by running that program in the abstract machine. But if you have
some core-dumping mechanism that is not part of the C program or even of
the C implementation, then the behaviour of that mechanism is out of
scope of the C standard. If the core-dumping mechanism promises you to
save all the non-volatiles of the abstract machine (rather than just the
memory of the hardware that runs it), then it's not the implementation's
responsibility to make that possible by making them all observable in
the real hardware.

> Hence, what's in the thread memory for that core
> file must match the abstract machine at each point of volatile access,
> to satisfy that requirement of observable behavior.

Only if the implementation was foolish enough to promise you that
volatile access dumps non-volatile objects of the abstract machine to a
file. If that promise was made by something that is not part of the
implementation, then failing to keep that promise does not affect the
implementation's conformance. And if the promise was just about the
state of the hardware that your program runs on, then this doesn't prove
your point either.

Tim Rentsch

Dec 21, 2012, 10:46:51 AM12/21/12

These comments strike me as either nonsensical or a long non-response.
Obviously the description regarding volatile is referring to what
happens during execution in an actual machine. The requirement that
'any expression referring to such an object shall be evaluated
strictly according to the rules of the abstract machine, as described
in 5.1.2.3.' isn't selective about which rules apply -- all must be
observed, again during actual execution. Whatever the mapping between
objects in an abstract machine and memory in the actual execution,
each object must be represented somewhere, and obviously there is a
difference between a change to a particular memory location and no
change (i.e., whether a store has taken place in the abstract machine).
That the Standard doesn't describe the contents of execution memory
state is irrelevant: because the requirements of volatile are imposed
on actual execution, observing what occurs during actual execution (by
instruction-level simulation on a VM, for example) can tell whether
the rules in 5.1.2.3 are being (or have been) followed. Saying, "oh,
the Standard doesn't let you look" is silly; since the requirements
are imposed on actual execution, the only way to tell if they are
being met is by examining what happens during actual execution, and
there is no point in giving a requirement if meeting it can't be
tested. But certainly this can be done, so there's no problem.