
Pointer Arithmetic & UB


Edward Rutherford

Dec 10, 2012, 12:14:18 PM
Hello

Would the following code invoke an undefined behavior?

char a[10];
size_t i=20,j=15;
*(a+i-j)=42;

It potentially constructs the invalid pointer a+i as an intermediate
value. But overall the access is inbounds.

christian.bau

Dec 10, 2012, 12:35:15 PM
On Dec 10, 5:14 pm, Edward Rutherford
Undefined behaviour. It is _very_ likely, and also legal, that it will
behave exactly the same as a [i - j] = 42; but not guaranteed.

James Kuyper

Dec 10, 2012, 12:55:05 PM
On 12/10/2012 12:14 PM, Edward Rutherford wrote:
> Hello
>
> Would the following code invoke an undefined behavior?

"invoke" is a bad term to use for this purpose; it implies that there's
some particular kind of behavior which is called "undefined behavior".
You should ask "Would the following code have undefined behavior?"

> char a[10];
> size_t i=20,j=15;
> *(a+i-j)=42;
>
> It potentially constructs the invalid pointer a+i as an intermediate
> value. But overall the access is inbounds.

Yes, it does have undefined behavior.
To make this seem more reasonable, consider a platform with the
following real-world characteristics: there are registers specialized
for storing addresses, and when an invalid address is stored in one of
those registers, the current process aborts immediately, as a safety
measure - it doesn't wait for the invalid address to be used. On such a
platform, a conforming implementation could translate your code so
that 'a' is allocated near the end of a block of valid memory addresses,
so that adding 20 to 'a' gives an invalid address. It could generate
instructions that load 'a' into an address register, then add 'i' to
it. Execution of those instructions would result in the register
containing an invalid address, thus causing your program to be aborted.
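
A minimal sketch of one way to sidestep the problem, assuming the offsets are available as plain integers: do the subtraction on the indices first and only then touch the array, so that no pointer such as a + i is ever formed. The helper name `store_at` and its parameters are hypothetical and do not come from the original question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores v at base[i - j] only when i - j is a
 * valid index into an array of len elements. All arithmetic is done on
 * size_t values, so no out-of-range pointer such as base + i is formed.
 * Returns 1 on success, 0 if the index would be out of bounds. */
int store_at(char *base, size_t len, size_t i, size_t j, char v)
{
    assert(base != NULL);
    if (i < j)            /* i - j would wrap around for unsigned types */
        return 0;
    size_t idx = i - j;   /* plain integer subtraction, always defined */
    if (idx >= len)
        return 0;
    base[idx] = v;        /* base + idx is within the array */
    return 1;
}
```

With the values from the question (a 10-element array, i = 20, j = 15) the helper writes to element 5 and never computes an address past the end of the array.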

Eric Sosman

Dec 10, 2012, 1:25:05 PM
On 12/10/2012 12:55 PM, James Kuyper wrote:
> On 12/10/2012 12:14 PM, Edward Rutherford wrote:
>> Hello
>>
>> Would the following code invoke an undefined behavior?
>
> "invoke" is a bad term to use for this purpose; it implies that there's
> some particular kind of behavior which is called "undefined behavior".
> You should ask "Would the following code have undefined behavior?"
>
>> char a[10];
>> size_t i=20,j=15;
>> *(a+i-j)=42;
>>
>> It potentially constructs the invalid pointer a+i as an intermediate
>> value. But overall the access is inbounds.
>
> Yes, it does have undefined behavior.
> To make this seem more reasonable, consider a platform with the
> following real-world characteristics: [...]

A colleague who did some work on IBM's AS/400 (they've
changed the name; I forget the new one) told me that simply
trying to calculate an out-of-range pointer yielded a null
pointer as a result. In the O.P.'s case, the intermediate
steps would go something like

a // OK so far
a + i // too big: result = NULL
NULL - j // not sure, but surely not good
*(NULL - j) // really Really REALLY not good

--
Eric Sosman
eso...@comcast-dot-net.invalid
Message has been deleted

James Kuyper

Dec 10, 2012, 3:29:18 PM
On 12/10/2012 03:04 PM, Prathamesh Kulkarni wrote:
> Instead, of i and j
> would *(a + 20 - 15) still have
> undefined behaviour ? Probably any C compiler
> will evaluate 20 - 15 during translation, but
> does the standard require constant expression
> to be evaluated during translation
> even if it is a sub-expression
> as in the above case ?

No, it does not. It would be an unusual compiler that failed to do so,
but it is not required.

Message has been deleted

Prathamesh Kulkarni

Dec 10, 2012, 3:47:11 PM
On Monday, 10 December 2012 15:44:37 UTC-5, Prathamesh Kulkarni wrote:
> Yes, I got that, that's why I deleted my post immediately
>
> after posting. How were you able to see it ?
>
> How

It repeated the word "How" ;)
Is there some problem with the google interface ?

glen herrmannsfeldt

Dec 10, 2012, 4:01:58 PM
Well, constant expressions in context where constants
are needed are required to be evaluated at compile time,
but this (whole) expression isn't constant.

In Fortran 66 and 77 you couldn't

DIMENSION X(10+10)

but in C you always could. (Not counting VLAs)

It used to be suggested that one write Fortran constants with
the appropriate type, such as 0.0 instead of 0, to avoid
run-time conversion. One hoped that compilers would figure it
out, but you were never quite sure enough.

-- glen

James Kuyper

Dec 10, 2012, 4:02:39 PM
On 12/10/2012 03:47 PM, Prathamesh Kulkarni wrote:
...
>> Yes, I got that, that's why I deleted my post immediately

Forged cancellation notices used to be common; as a result, most news
servers now ignore cancellation requests that come from other news
servers. Your request presumably deleted the message from Google's
archive, which is one of the largest and most widely used on the planet
- but it probably didn't delete many of the thousands of other copies
stored in various places around the world.

>> after posting. How were you able to see it ?
>>
>> How
>
> It repeated the word "How" ;)
> Is there some problem with the google interface ?

That's sort of like asking if the sea is wet. I recommend getting a better
news server (I can strongly recommend eternalseptember.org, which is
free) and monitoring it using a decent newsreader (my favorite is
Mozilla Thunderbird, but many people prefer other news readers).

Prathamesh Kulkarni

Dec 10, 2012, 4:08:28 PM
Thanks, I will give a try to eternalseptember.org,

Sjouke Burry

Dec 10, 2012, 4:18:43 PM
Prathamesh Kulkarni <bilbothe...@gmail.com> wrote in news:232fe1b0-
d8fa-42e2-85d...@googlegroups.com:

> On Monday, 10 December 2012 15:29:18 UTC-5, James Kuyper wrote:
> Yes, I got that, that's why I deleted my post immediately
> after posting. How were you able to see it ?
> How
>

By looking carefully.....

Greg Martin

Dec 10, 2012, 5:55:23 PM
I think that should be eternal-september.org.

Edward A. Falk

Dec 10, 2012, 7:46:58 PM
In article <ka59e6$mq7$1...@dont-email.me>,
Eric Sosman <eso...@comcast-dot-net.invalid> wrote:
>
> A colleague who did some work on IBM's AS/400 (they've
>changed the name; I forget the new one) told me that simply
>trying to calculate an out-of-range pointer yielded a null
>pointer as a result.

Heh; learn something new every day. I never would have guessed
that there was an actual architecture that would blow up with
this construct.

I assume that *(a+(i-j)) would be ok?

--
-Ed Falk, fa...@despams.r.us.com
http://thespamdiaries.blogspot.com/

Edward A. Falk

Dec 10, 2012, 7:48:34 PM
In article <232fe1b0-d8fa-42e2...@googlegroups.com>,
Prathamesh Kulkarni <bilbothe...@gmail.com> wrote:
>
>Yes, I got that, that's why I deleted my post immediately
>after posting. How were you able to see it ?
>How

News servers are not required to honor cancels, and many do
not. Also, cancels can take time to propagate through the
network, so the previous poster could have seen your article and
responded to it before it was deleted.

pete

Dec 10, 2012, 8:20:44 PM
Prathamesh Kulkarni wrote:
>
> Instead, of i and j
> would *(a + 20 - 15) still have
> undefined behaviour ? Probably any C compiler
> will evaluate 20 - 15 during translation, but
> does the standard require constant expression
> to be evaluated during translation
> even if it is a sub-expression
> as in the above case ?

(20 - 15) isn't a subexpression in the above case.

(a + 20 - 15) means (a + 20) - 15
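
The grouping rule can be sketched in C as follows. The function names are made up for this illustration; the integer pair uses subtraction so that the two groupings give visibly different results, and the pointer version shows the parenthesized form, in which only a + 5 is formed.

```c
#include <assert.h>
#include <stddef.h>

/* x - 20 - 15 parses as (x - 20) - 15, because binary - groups left
 * to right. */
int default_grouping(int x)  { return x - 20 - 15; }

/* Parentheses make 20 - 15 an operand in its own right. */
int explicit_grouping(int x) { return x - (20 - 15); }

/* Same idea for the pointer case: 20 - 15 is an integer constant
 * expression with value 5, so only a + 5 is ever formed. */
char *fifth(char *a)
{
    assert(a != NULL);
    return a + (20 - 15);
}
```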

--
pete

James Kuyper

Dec 10, 2012, 9:28:15 PM
Context:
char a[10];
size_t i=20,j=15;
*(a+i-j)=42;

On 12/10/2012 07:46 PM, Edward A. Falk wrote:
...
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

That should be safe for all conforming implementations of C.
--
James Kuyper

Noob

Dec 11, 2012, 5:36:58 AM
Edward A. Falk wrote:

> I assume that *(a+(i-j)) would be ok?

Please correct me if I am wrong,

*(a+(i-j)) is strictly equivalent to a[i-j]

(I find the latter clearer.)
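
A small sketch of that equivalence, with hypothetical function names: the standard defines E1[E2] as (*((E1)+(E2))), so both functions below yield the address of the same element whenever i - j is a valid index.

```c
#include <assert.h>
#include <stddef.h>

/* Subscript form: &a[i - j] */
char *via_subscript(char *a, size_t i, size_t j)
{
    assert(a != NULL && i >= j);
    return &a[i - j];
}

/* Pointer form: a + (i - j), with the integer subtraction done first */
char *via_pointer(char *a, size_t i, size_t j)
{
    assert(a != NULL && i >= j);
    return a + (i - j);
}
```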

christian.bau

Dec 11, 2012, 1:00:56 PM
Yes, it's the same. But there are also cases where * (a + i - j) would
be fine and * (a + (i - j)) or a [i - j] wouldn't: If you have 64 bit
pointers and 32 bit ints, then i - j might overflow, while a + i - j
could be correct.
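
A related case can be sketched with unsigned operands, where the subtraction wraps around rather than overflowing. This is an assumption of the example (the post above talks about int), and the function names are hypothetical. Starting from a pointer into the middle of an array, stepping forward by i and then back by j stays inside the array, while i - j on its own is a huge positive value.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Unsigned subtraction wraps modulo UINT_MAX + 1, so 2u - 5u is a huge
 * positive value rather than -3. */
unsigned wrapped_difference(unsigned i, unsigned j) { return i - j; }

/* Hypothetical helper: steps p forward by i and then back by j, one
 * step at a time, exactly as p + i - j is grouped. The caller must
 * ensure that both p + i and the final result stay within the same
 * array (or one past its end). */
char *step_forward_then_back(char *p, unsigned i, unsigned j)
{
    assert(p != NULL);
    return p + i - j;
}
```

For a 10-element array arr, `step_forward_then_back(&arr[5], 2u, 5u)` passes through &arr[7] and ends at &arr[2], whereas `&arr[5] + (2u - 5u)` would try to add UINT_MAX - 2 to the pointer, which is far outside the array.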

Eric Sosman

Dec 11, 2012, 3:27:40 PM
On 12/10/2012 7:46 PM, Edward A. Falk wrote:
> In article <ka59e6$mq7$1...@dont-email.me>,
> Eric Sosman <eso...@comcast-dot-net.invalid> wrote:
>>
>> A colleague who did some work on IBM's AS/400 (they've
>> changed the name; I forget the new one) told me that simply
>> trying to calculate an out-of-range pointer yielded a null
>> pointer as a result.
>
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

Assuming `i-j' in range, yes.

More on my colleague's tale: The code maintained a buffer
in which items of various sizes accumulated, and which drained
to disk when it got too full or too old. To decide whether a
newly-offered item would fit, the code did something like

itemEndPtr = nextBufferSpacePtr + itemSize;
if (itemEndPtr < bufferStart + bufferSize) ...

This worked as intended on all the other target systems, but
failed on AS/400. I suspect the failure had something to do
with the fact that the buffer was in a shared memory area, so
stepping off the end also meant stepping outside of mapped
address space; the problem might not have shown up with the
`auto' array in your example.
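
A hedged sketch of how such a fit test could be written without ever forming a pointer past the end of the buffer: compare sizes instead of pointers. The function name and parameter names are hypothetical, and the sketch assumes an item that exactly fills the remaining space is acceptable (hence `<=`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rewrite of the fit test. It compares sizes instead of
 * pointers, so next + item_size is never computed.
 * Preconditions: buffer_start <= next <= buffer_start + buffer_size,
 * all pointing into (or one past the end of) the same buffer. */
int item_fits(const char *buffer_start, size_t buffer_size,
              const char *next, size_t item_size)
{
    assert(buffer_start != NULL && next != NULL);
    size_t used = (size_t)(next - buffer_start);  /* valid pointer difference */
    assert(used <= buffer_size);
    size_t remaining = buffer_size - used;
    return item_size <= remaining;
}
```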

Still, perhaps a salutary lesson for the folks who still
believe "All the world's a VAX^H^H^Hx86^H^H^Hx64^H^H^H..."

--
Eric Sosman
eso...@comcast-dot-net.invalid

glen herrmannsfeldt

Dec 11, 2012, 4:09:05 PM
Eric Sosman <eso...@comcast-dot-net.invalid> wrote:

(previous snip on pointer offsets)

>>> A colleague who did some work on IBM's AS/400 (they've
>>> changed the name; I forget the new one) told me that simply
>>> trying to calculate an out-of-range pointer yielded a null
>>> pointer as a result.

>> Heh; learn something new every day. I never would have guessed
>> that there was an actual architecture that would blow up with
>> this construct.

>> I assume that *(a+(i-j)) would be ok?

> Assuming `i-j' in range, yes.

> More on my colleague's tale: The code maintained a buffer
> in which items of various sizes accumulated, and which drained
> to disk when it got too full or too old. To decide whether a
> newly-offered item would fit, the code did something like

> itemEndPtr = nextBufferSpacePtr + itemSize;
> if (itemEndPtr < bufferStart + bufferSize) ...

Might fail in x86 (especially the 80286) in huge model.

You can't load arbitrary data into segment selector registers
in protected mode x86. In large model, though, an offset isn't
tested until an actual access is attempted. (The offset is in
an ordinary register, such as AX.)

In huge model, the system allocates a series of segments,
such that one can address through them in order.

Still, I believe that the compilers are careful not to load
a segment selector until needed to actually access something,
maybe partly to allow such faulty C code.

> This worked as intended on all the other target systems, but
> failed on AS/400. I suspect the failure had something to do
> with the fact that the buffer was in a shared memory area, so
> stepping off the end also meant stepping outside of mapped
> address space; the problem might not have shown up with the
> `auto' array in your example.

I believe that could happen with protected mode x86, too.

> Still, perhaps a salutary lesson for the folks who still
> believe "All the world's a VAX^H^H^Hx86^H^H^Hx64^H^H^H..."

In the 80286 days, I had OS/2 1.0 and then 1.2 running, when
just about everyone else was running MS-DOS. Instead of using
malloc(), I would directly allocate segments from OS/2 of exactly
the needed length. The hardware will then interrupt for an access,
even read, either before or just after the end of the allocated
space. (Unless the register wraps, and it is back into the
allocated space again.)

As usual in C, a 2D array was allocated as an array of pointers,
each pointing to its own OS/2 allocated segment.

Fortunately, the C compilers were always good at not using segment
selector registers when copying pointers that might not point to
anything.

I don't know AS/400 that well, but there have been systems that relied
on the compiler to generate the appropriate code, instead of run-time
memory protection. I believe some Burroughs ALGOL systems worked that
way. (Maybe still do.)

As far as I know, they never had a C compiler, but if one did it might
also have problems with out of range pointers.

-- glen

Ken Brody

Dec 12, 2012, 2:24:05 PM
When I went to college, there was a FORTRAN compiler aimed specifically at
teaching FORTRAN programming. Since it was expected that there would be
more time spent compiling (and recompiling, after fixing errors) the program
than in running it (the resulting program would in all likelihood be run
only once), there was zero optimization done on the resulting code. This
made the compiler itself "lean and mean", allowing it to finish compiling
(or getting to the errors) faster. It didn't make sense to spend X amount
of time optimizing a program, only to save less than X time once the program
actually ran the one time it would be run.

I could see such an implementation of C being developed.

Ken Brody

Dec 12, 2012, 2:26:50 PM
On 12/10/2012 4:08 PM, Prathamesh Kulkarni wrote:
> On Monday, 10 December 2012 16:02:39 UTC-5, James Kuyper wrote:
>> On 12/10/2012 03:47 PM, Prathamesh Kulkarni wrote:
[...]
>>> Is there some problem with the google interface ?
>>
>>
>>
>> That's sort of like asking if the sea is wet. I recommend getting better
>> news server (I can strongly recommend eternalseptember.org, which is
>> free) and a monitoring it using decent newsreader (my favorite is
>> Mozilla thunderbird, but many people prefer other news readers).
>
> Thanks, I will give a try to eternalseptember.org,

There's a hyphen there:

eternal-september.org


Ken Brody

Dec 12, 2012, 2:30:33 PM
On 12/10/2012 7:46 PM, Edward A. Falk wrote:
> In article <ka59e6$mq7$1...@dont-email.me>,
> Eric Sosman <eso...@comcast-dot-net.invalid> wrote:
>>
>> A colleague who did some work on IBM's AS/400 (they've
>> changed the name; I forget the new one) told me that simply
>> trying to calculate an out-of-range pointer yielded a null
>> pointer as a result.
>
> Heh; learn something new every day. I never would have guessed
> that there was an actual architecture that would blow up with
> this construct.
>
> I assume that *(a+(i-j)) would be ok?

No. There is no requirement that the value of "i-j" be calculated prior to
adding it to "a". (Check the numerous threads here involving using
parentheses to "fix" UB in things involving such constructs as "i + (i++)".)
Operator precedence only guarantees how the expression is to be
interpreted, not the actual order of evaluation.


Ken Brody

Dec 12, 2012, 2:36:37 PM
Are you sure? Does anything in the Standard *require* that "i-j" be
evaluated prior to adding it to "a"?

Haven't we had this discussion earlier, related to other forms of UB, with
the questioner asking if adding parentheses would "fix" the problem?


Keith Thompson

Dec 12, 2012, 2:42:47 PM
True, but the expression `a+(i-j)` is evaluated *in the abstract
machine* by subtracting j from i and then adding the result to a.
A compiler is free to evaluate it by computing a+i and then
subtracting j from the result *only* if it can guarantee that the
result is the same, or if the canonical order has undefined behavior.

`INT_MAX + (1 - 1)` has well defined behavior.
`INT_MAX + 1 - 1` does not.
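
The well-defined form, plus one way to test for overflow before adding, might look like this. It is a sketch only; `grouped` and `add_would_overflow` are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <limits.h>

/* 1 - 1 is evaluated first in the abstract machine, so this is
 * INT_MAX + 0 and never overflows. */
int grouped(void) { return INT_MAX + (1 - 1); }

/* Hypothetical checked addition: reports whether a + b would be
 * unrepresentable as an int, without performing the addition. */
int add_would_overflow(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return 0;
}
```

`add_would_overflow(INT_MAX, 1)` reports the overflow that the first step of `INT_MAX + 1 - 1` would hit.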

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Robert Wessel

Dec 12, 2012, 3:34:52 PM
On Wed, 12 Dec 2012 14:24:05 -0500, Ken Brody <kenb...@spamcop.net>
wrote:
Good old Watfor and Watfiv... In addition to doing no optimization
(although the IBM compiler did little or none in basic mode), it
allowed you to skip the heavyweight link step. I remember on a
370/138 (maybe it was a 4341 by that point, memories fade), typical
student jobs (~100 cards in first semester Fortran), taking a couple
of seconds of CPU time, whereas the IBM compiler took half a minute.
Better diagnostics too.


>I could see such an implementation of C being developed.

But on what current platform would student size jobs take too long to
compile, with or without optimization turned on?

glen herrmannsfeldt

Dec 12, 2012, 3:35:06 PM
Ken Brody <kenb...@spamcop.net> wrote:
> On 12/10/2012 9:28 PM, James Kuyper wrote:
>> Context:
>> char a[10];
>> size_t i=20,j=15;
>> *(a+i-j)=42;

(snip)
>>> I assume that *(a+(i-j)) would be ok?

>> That should be safe for all conforming implementations of C.

> Are you sure? Does anything in the Standard *require* that "i-j" be
> evaluated prior to adding it to "a"?

On many systems, the result is the same until you try to dereference
the result.

Seems to me that on any system where the result isn't the same,
the compiler had better do it in the appropriate order.

With the appropriate wrap on overflow characteristic, fixed point
arithmetic is associative. If the compiler knows that, it can compute
i+(j-k) as (i+j)-k, knowing the result is the same.

If something else happens on overflow, the compiler shouldn't do that.

It gets more interesting with floating point.
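
Both halves of that remark can be sketched in C, using unsigned int as the wrapping fixed-point type and double for floating point. The function names are hypothetical, and the floating-point comment assumes IEEE 754 doubles without value-changing options such as fast-math.

```c
#include <assert.h>
#include <limits.h>

/* Unsigned arithmetic wraps modulo UINT_MAX + 1, so regrouping never
 * changes the result, even when an intermediate value wraps. */
unsigned sum_then_sub(unsigned i, unsigned j, unsigned k) { return (i + j) - k; }
unsigned sub_then_sum(unsigned i, unsigned j, unsigned k) { return i + (j - k); }

/* Floating-point addition rounds after every step, so the grouping
 * can change the result. */
double fp_left(double a, double b, double c)  { return (a + b) + c; }
double fp_right(double a, double b, double c) { return a + (b + c); }
```

For example, with a = 1e20, b = -1e20 and c = 1.0, the left grouping cancels the large terms first and keeps the 1.0, while the right grouping loses the 1.0 when it is added to -1e20.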

> Haven't we had this discussion earlier, related to other forms of UB, with
> the questioner asking if adding parentheses would "fix" the problem?

-- glen

glen herrmannsfeldt

Dec 12, 2012, 3:45:04 PM
Ken Brody <kenb...@spamcop.net> wrote:

(snip on expression evaluation order)

> When I went to college, there was a FORTRAN compiler aimed specifically at
> teaching FORTRAN programming. Since it was expected that there would be
> more time spent compiling (and recompiling, after fixing errors) the program
> than in running it (the resulting program would in all likelihood be run
> only once), there was zero optimization done on the resulting code. This
> made the compiler itself "lean and mean", allowing it to finish compiling
> (or getting to the errors) faster. It didn't make sense to spend X amount
> of time optimizing a program, only to save less than X time once the program
> actually ran the one time it would be run.

Maybe about that time, there was a story about benchmarks for Fortran
compilers. There was one that evaluated many complicated expressions,
mostly using statement functions.

This benchmark was then given to the OS/360 Fortran H compiler with
optimization level of 2, where it compiled very slowly and ran in
pretty much no time at all.

It turns out that the compiler expands statement functions inline
(seems to have been rare at the time), and then evaluates constant
expressions. The compiler did the whole calculation down to one
number to print out!

> I could see such an implementation of C being developed.

The favorite Fortran compilers I knew of for student programs were
WATFOR and WATFIV. Both are in-core compilers. They don't write out
an object program, but instead generate instructions in memory, ready
to run.

I believe that there is a WAT-C, but don't know anything about it.

The WATCOM compilers don't seem to be in-core compilers.

I do remember MS had a Quick-C which might have been (and still be)
an in-core compiler.

-- glen

James Kuyper

Dec 12, 2012, 3:57:23 PM
On 12/12/2012 02:36 PM, Ken Brody wrote:
> On 12/10/2012 9:28 PM, James Kuyper wrote:
>> Context:
>> char a[10];
>> size_t i=20,j=15;
>> *(a+i-j)=42;
>>
>> On 12/10/2012 07:46 PM, Edward A. Falk wrote:
>> ...
>>> Heh; learn something new every day. I never would have guessed
>>> that there was an actual architecture that would blow up with
>>> this construct.
>>>
>>> I assume that *(a+(i-j)) would be ok?
>>
>> That should be safe for all conforming implementations of C.
>
> Are you sure? Does anything in the Standard *require* that "i-j" be
> evaluated prior to adding it to "a"?

Yes, I'm sure; if you aren't, perhaps there's been a miscommunication of
some kind?

Check the grammar rules. The right operand of a binary '+' expression
must be a multiplicative-expression (6.5.6p1). '(' doesn't qualify;
neither does '(i', or '(i-' or '(i-j'; the only thing that can be parsed
as the right operand of the '+' operator in that expression is (i-j),
which parses as a primary-expression (6.5.1p1), and therefore as a
postfix-expression (6.5.2p1), a unary-expression (6.5.3p1), a
cast-expression (6.5.4p1), and a multiplicative-expression (6.5.5p1), in
that order.

For C99, I would have stopped the explanation at that point, considering
that my point had already been made. However, in order to support
multi-threaded code, C2011 had to be more explicit about when the order
of two events is specified, and when it isn't, so there's a couple of
additional citations that are relevant. I believe that what they say was
inherently true even in C99, where it was not explicitly said:

"The value computations of the operands of an operator
are sequenced before the value computation of the result of the
operator." 6.5p1.

"An evaluation A happens before an evaluation B if A is sequenced before
B." 5.1.2.4p9

> Haven't we had this discussion earlier, related to other forms of UB, with
> the questioner asking if adding parentheses would "fix" the problem?

The problem with *(a+i-j) is that the standard mandates that 'i' be
added to 'a' before 'j' is subtracted from the result. Putting
parentheses around 'i - j' converts those three tokens into a single
primary-expression. That's why *(a+(i-j)) fixes the problem. It forces
the value computation of the subtraction expression to happen before
the value computation of the binary addition expression.

Kenneth gives i+(i++) as an example of a case where parentheses do
nothing to resolve the underlying problem. That is because the problem
is the absence of a sequence point separating 'i' from 'i++'.
Parentheses do not insert a sequence point, and therefore do NOT solve
that problem.
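
For the i+(i++) case, the fix has to come from sequencing rather than grouping. A minimal sketch, assuming the intent was "add i to itself, then increment i" (the expression is ambiguous, so this is only one possible reading, and the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* One possible intended meaning of i + (i++), written so that the reads
 * and the modification of *i are in separate full expressions. The
 * semicolons supply the sequencing that parentheses cannot. */
int double_then_increment(int *i)
{
    assert(i != NULL);
    int result = *i + *i;  /* both reads happen before the update */
    *i += 1;               /* modification in its own statement */
    return result;
}
```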

Eric Sosman

Dec 12, 2012, 4:02:20 PM
On 12/12/2012 2:36 PM, Ken Brody wrote:
> On 12/10/2012 9:28 PM, James Kuyper wrote:
>> Context:
>> char a[10];
>> size_t i=20,j=15;
>> *(a+i-j)=42;
>>
>> On 12/10/2012 07:46 PM, Edward A. Falk wrote:
>> ...
>>> Heh; learn something new every day. I never would have guessed
>>> that there was an actual architecture that would blow up with
>>> this construct.
>>>
>>> I assume that *(a+(i-j)) would be ok?
>>
>> That should be safe for all conforming implementations of C.
>
> Are you sure? Does anything in the Standard *require* that "i-j" be
> evaluated prior to adding it to "a"?

No, but the Standard requires that the thing added to `a'
be the value of `i-j'. The "as if" rule still applies, so an
actual implementation might calculate something that might be
written as `a-j+i' or `i+a-j' or `a-(j-i)' or a host of other
possibilities. Still, the result -- including the definedness
of the result -- must be as for "`a' plus `i-j'".

> Haven't we had this discussion earlier, related to other forms of UB,
> with the questioner asking if adding parentheses would "fix" the problem?

Nitpick: Since this isn't UB, "other" is out of place.

The usual misunderstanding is that the association of
operators with their operands -- "expression tree order" --
dictates evaluation order, which it doesn't. (Except for
certain special operators like ||, and even then only in
part.)

--
Eric Sosman
eso...@comcast-dot-net.invalid

James Kuyper

Dec 12, 2012, 4:13:53 PM
On 12/12/2012 04:02 PM, Eric Sosman wrote:
...
> The usual misunderstanding is that the association of
> operators with their operands -- "expression tree order" --
> dictates evaluation order, which it doesn't. (Except for
> certain special operators like ||, and even then only in
> part.)

The expression tree does not impose an evaluation order on its branches
at the same level (with the exceptions that you noted), but it does
impose a requirement that the operands be evaluated before the
expression itself. I believe that this requirement has always been
implied by the semantics of each expression, but C2011 has made this
requirement explicit for all expressions in 6.5p1 and 5.1.2.4p18 (which I
just mis-cited in my response to Kenneth as 5.1.2.4p9).

Robert Wessel

Dec 12, 2012, 4:29:46 PM
QuickC was MS's IDE, a precursor to Visual C++ and Visual Studio.
Under the hood it used the contemporary MSC compilers.

Ken Brody

Dec 12, 2012, 5:12:45 PM
As noted in the replies to my post, I stand corrected. Because of the
"as-if" rule, if evaluating "i-j" first would not cause an overflow in
"a+(i-j)", then the compiler must guarantee that any rearranging of the code
will give an identical result, even if an overflow does occur.

glen herrmannsfeldt

Dec 12, 2012, 9:02:06 PM
Robert Wessel <robert...@yahoo.com> wrote:

(snip, I wrote)

>>I do remember MS had a Quick-C which might have been (and still be)
>>an in-core compiler.

> QuickC was MS's IDE, a precursor to Visual C++ and Visual Studio.
> Under the hood it used the contemporary MSC compilers.

I had one, as it came with the MSC compiler, but don't think I
ever used it.

Still, I always thought it had a different compiler inside.
Maybe not a lot different.

-- glen

Eric Sosman

Dec 12, 2012, 9:18:50 PM
Although I haven't studied the C11 stuff in detail, I'd
be surprised (and disappointed!) if in

#define WHICH 1
...
int r = WHICH * (x + y) + (1 - WHICH) * (z - x);

... the Standard required that `z - x' be evaluated at all,
much less "before" the entire expression.

However, neither surprise nor disappointment is entirely
strange to me. Embarrassment is an old pal, too ...

--
Eric Sosman
eso...@comcast-dot-net.invalid

James Kuyper

Dec 12, 2012, 10:42:38 PM
On 12/12/2012 09:18 PM, Eric Sosman wrote:
...
> Although I haven't studied the C11 stuff in detail, I'd
> be surprised (and disappointed!) if in
>
> #define WHICH 1
> ...
> int r = WHICH * (x + y) + (1 - WHICH) * (z - x);
>
> ... the Standard required that `z - x' be evaluated at all,
> much less "before" the entire expression.

Well, the as-if rule always trumps any other requirements, when it
applies - if a strictly conforming program can't determine whether or
not sub-expressions were evaluated in the required order, evaluating
them in that order isn't really required. If it can't even determine
whether they were evaluated, they don't even have to be evaluated.

> However, neither surprise nor disappointment is entirely
> strange to me. Embarrassment is an old pal, too ...

Yep, I know him well myself.
--
James Kuyper

Tim Rentsch

Dec 17, 2012, 5:18:22 AM
The phrasing here is a little funny. The rules for evaluation
order are not a requirement but rather (part of) a definition
of the semantics of the abstract machine. That definition has
always expressed these constraints on evaluation order; the
description in C11 simply expresses them more clearly.

Phil Carmody

Dec 17, 2012, 5:04:57 AM
Eric Sosman <eso...@comcast-dot-net.invalid> writes:
> On 12/12/2012 4:13 PM, James Kuyper wrote:
> > On 12/12/2012 04:02 PM, Eric Sosman wrote:
> > ...
> >> The usual misunderstanding is that the association of
> >> operators with their operands -- "expression tree order" --
> >> dictates evaluation order, which it doesn't. (Except for
> >> certain special operators like ||, and even then only in
> >> part.)
> >
> > The expression tree does not impose an evaluation order on it's branches
> > at the same level (with the exceptions that you noted), but it does
> > impose a requirement that the operands be evaluated before the
> > expression itself. I believe that this requirement has always been
> > implied by the semantics of each expression, but C2011 has made this
> > requirement explicit for all expression in 6.5p1 and 5.1.2.4p18 (which I
> > just mis-cited in my response to Kenneth as 5.1.2.4p9).
>
> Although I haven't studied the C11 stuff in detail, I'd
> be surprised (and disappointed!) if in
>
> #define WHICH 1
> ...
> int r = WHICH * (x + y) + (1 - WHICH) * (z - x);
>
> ... the Standard required that `z - x' be evaluated at all,
> much less "before" the entire expression.

I am deliriously happy that the Standard requires that (the implementation
behave as if) `z - x' is evaluated. That would be, and is, consistent behaviour.

Pulling out the big cannon - if z and x are volatile, of course you
want x read twice, and z read once.

If you meant to say

int r = WHICH ? (x+y) : (z-x);

then write that, not some other silly expression which does arithmetic rather
than conditional evaluation.
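
The difference between the two forms can be sketched with a run-time selector in place of the WHICH macro. The function names and the `which` parameter are hypothetical; `which` is assumed to be 0 or 1.

```c
#include <assert.h>
#include <limits.h>

/* Arithmetic selection: in the abstract machine both x + y and z - x
 * are evaluated, whatever the value of which. */
int select_arith(int which, int x, int y, int z)
{
    return which * (x + y) + (1 - which) * (z - x);
}

/* Conditional selection: only the chosen operand is evaluated. */
int select_cond(int which, int x, int y, int z)
{
    return which ? (x + y) : (z - x);
}
```

With x = -1, y = 0 and z = INT_MAX, `select_cond(1, x, y, z)` is well defined because z - x is never evaluated, whereas `select_arith` with the same arguments would overflow in z - x.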

Phil
--
I'm not saying that google groups censors my posts, but there's a strong link
between me saying "google groups sucks" in articles, and them disappearing.

Oh - I guess I might be saying that google groups censors my posts.

Phil Carmody

Dec 17, 2012, 5:11:40 AM
i and j are not int but size_t. What do you mean by "overflow" in that context?
Can you come up with a concrete example of failure which doesn't have UB
in the "correct" version?

Phil Carmody

Dec 17, 2012, 5:16:20 AM
The "*(a+(i-j))" expression has *nothing* in common with the part of the
"i+(i++)" that pertains to UB. That brackets fail to do something in an
unrelated situation is basically irrelevant.

Tim Rentsch

Dec 17, 2012, 5:33:34 AM
Eric Sosman <eso...@comcast-dot-net.invalid> writes:

> On 12/12/2012 4:13 PM, James Kuyper wrote:
>> On 12/12/2012 04:02 PM, Eric Sosman wrote:
>> ...
>>> The usual misunderstanding is that the association of
>>> operators with their operands -- "expression tree order" --
>>> dictates evaluation order, which it doesn't. (Except for
>>> certain special operators like ||, and even then only in
>>> part.)
>>
>> The expression tree does not impose an evaluation order on it's branches
>> at the same level (with the exceptions that you noted), but it does
>> impose a requirement that the operands be evaluated before the
>> expression itself. I believe that this requirement has always been
>> implied by the semantics of each expression, but C2011 has made this
>> requirement explicit for all expression in 6.5p1 and 5.1.2.4p18 (which I
>> just mis-cited in my response to Kenneth as 5.1.2.4p9).
>
> Although I haven't studied the C11 stuff in detail, I'd
> be surprised (and disappointed!) if in
>
> #define WHICH 1
> ...
> int r = WHICH * (x + y) + (1 - WHICH) * (z - x);
>
> ... the Standard required that `z - x' be evaluated at all,
> much less "before" the entire expression.

The statements about evaluation order are not requirements but
part of defining the abstract machine. In the abstract machine
all expressions are evaluated (subject of course to the semantic
rules about execution flow, sizeof, etc). The requirement you're
asking about is on program _execution_, for which there is only
one: that it produce the same results as the semantics described
for the abstract machine would. There is nothing that says a
program execution has to evaluate any expressions, or even have
something corresponding to an "expression" in its execution
state. So it's not really a meaningful question.
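
As a rough illustration of that point (a sketch only; the function names are made up here, and the ints are assumed small enough that nothing overflows), the two functions below cannot be told apart by their results, so an implementation is free to emit the second for the first:

```c
#define WHICH 1

/* As written: in the abstract machine both products are evaluated. */
int r_as_written(int x, int y, int z)
{
    return WHICH * (x + y) + (1 - WHICH) * (z - x);
}

/* What an implementation may emit instead: with WHICH == 1 the second
   product is always zero, so z - x need never be computed. */
int r_as_emitted(int x, int y, int z)
{
    (void)z; /* unused once the dead term is dropped */
    return x + y;
}
```

For example, both return 5 for the arguments (2, 3, 10).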

Eric Sosman

unread,
Dec 17, 2012, 8:47:57 AM12/17/12
to
On 12/17/2012 5:04 AM, Phil Carmody wrote:
> Eric Sosman <eso...@comcast-dot-net.invalid> writes:
>>[...]
>> Although I haven't studied the C11 stuff in detail, I'd
>> be surprised (and disappointed!) if in
>>
>> #define WHICH 1
>> ...
>> int r = WHICH * (x + y) + (1 - WHICH) * (z - x);
>>
>> ... the Standard required that `z - x' be evaluated at all,
>> much less "before" the entire expression.
>
> I am deliriously happy that the Standard requires that (the implementation
> behave as if) `z - x' is evaluated. That would be, and is, consistent behaviour.
>
> Pulling out the big cannon - if z and x are volatile, of course you
> want x read twice, and z read once.

If `x' is volatile, the behavior is undefined.

--
Eric Sosman
eso...@comcast-dot-net.invalid

Malcolm McLean

unread,
Dec 17, 2012, 8:57:45 AM12/17/12
to
Sometimes when you're developing code, you need to know that it's correct.
Other times you need to know that it's correct and fast enough.

For instance if you are developing a video game, then typically you'll
write a bit of code, then run the game to see what effect your changes have
on the player's experience. So normally the program will have to run at the
same speed the end-user will be receiving.

Ben Bacarisse

unread,
Dec 17, 2012, 9:58:17 AM12/17/12
to
Are you referring to the new C11 wording of 6.5 p2? If so, I wonder if
this is intended because it looks to me like a change between C99 and
C11. If you mean that it's undefined even in C99, I'd welcome an
explanation.

--
Ben.

Eric Sosman

unread,
Dec 17, 2012, 10:59:08 AM12/17/12
to
I may be wrong (it's happened before and will happen again),
but here's my reasoning:

- If `x' is volatile, any use of it is a side-effect (5.1.2.3,
C99 p2 or C11 p3).

- The side-effect may change the value of `x' (6.7.3, C99 p6
or C11 p7).

- The two uses of `x' are not separated by sequence points
(Cxx, passim).

- Two modifications to one object without an intervening
sequence point is U.B. (6.5, C99 p2 or C11 p2, with different
wordings).

--
Eric Sosman
eso...@comcast-dot-net.invalid

Ben Bacarisse

unread,
Dec 17, 2012, 12:26:11 PM12/17/12
to
I'd call that "may be undefined in C99". If you know that the
side-effect does not change x, it would be defined would it not?

In contrast, the new wording in C11 (6.5 p2) makes it undefined
regardless of the nature of the side-effect which seems a little odd.

--
Ben.

Shao Miller

unread,
Dec 17, 2012, 7:33:48 PM12/17/12
to
The C11 footnote 84 says that

a[i] = i;

is fine. It doesn't mention whether or not 'i' is volatile. 6.5.16p3
("Assignment operators") has that "The evaluations of the operands are
unsequenced."

It might've been nice if the footnote had mentioned 'volatile', or had a
declaration for 'i' that excluded it.

- Shao Miller

Ken Brody

unread,
Dec 20, 2012, 11:00:01 AM12/20/12
to
On 12/17/2012 7:33 PM, Shao Miller wrote:
[...]

Condensed version of the discussion so far, in this subthread:

=====

Given:

extern volatile int x;
int i = x + x;

And citing several C&V from different standards regarding the fact merely
accessing a volatile is a "side effect".

Does the above invoke UB? (No sequence point between the two "side effects"
of accessing "x".)

=====

The Standard also requires (5.1.2.3p2) that all side effects be "complete"
at the next sequence point.

Given that whatever side effect the access of "x" may have is outside the
control of the abstract machine, I fail to see how the sequence point
requirement applies to the side effect of accessing a volatile.

Consider, for example, a memory-mapped I/O system, where reading from a
given address causes the printer to start printing whatever is in its
buffer. How can C enforce the "shall be complete" requirement of 5.1.2.3p2?
How is "i=x;i+=x;" any better than "i=x+x;"?


Tim Rentsch

unread,
Dec 20, 2012, 1:03:54 PM12/20/12
to
Ken Brody <kenb...@spamcop.net> writes:

> On 12/17/2012 7:33 PM, Shao Miller wrote:
> [...]
>
> Condensed version of the discussion so far, in this subthread:
>
> =====
>
> Given:
>
> extern volatile int x;
> int i = x + x;
>
> And citing several C&V from different standards regarding the fact
> merely accessing a volatile is a "side effect".
>
> Does the above invoke UB? (No sequence point between the two "side
> effects" of accessing "x".)

The answer is no. Arbitrary behavior may arise as a result, but
implementations are obliged to behave as they would if the side
effects were ordinary, non-interfering side effects. The term
"undefined behavior" is just a shorthand for saying something
about what an implementation may do. The actual behavior for
this example (and indeed for any volatile-qualified access at
all) is _unconstrained_, but the behavior is not _undefined_ in
the sense that the Standard uses the term 'undefined behavior'.

> =====
>
> The Standard also requires (5.1.2.3p2) that all side effects be
> "complete" at the next sequence point.
>
> Given that whatever side effect the access of "x" may have is outside
> the control of the abstract machine, I fail to see how the sequence
> point requirement applies to the side effect of accessing a volatile.
>
> Consider, for example, a memory-mapped I/O system, where reading from
> a given address causes the printer to start printing whatever is in
> its buffer. How can C enforce the "shall be complete" requirement of
> 5.1.2.3p2? How is "i=x;i+=x;" any better than "i=x+x;"?

Implementations may assume that completing the access (and
remembering that what it means to "access" a volatile-qualified
object is implementation defined) also completes the side effect.
Implementations are never responsible for the behavior of any
extra-linguistic functionality; that falls under the category
of a data-processing system on which the implementation is
running, which is explicitly left unaddressed (section 1 p 2)
by the Standard.

Robert Wessel

unread,
Dec 20, 2012, 1:21:30 PM12/20/12
to
On Thu, 20 Dec 2012 11:00:01 -0500, Ken Brody <kenb...@spamcop.net>
wrote:

>On 12/17/2012 7:33 PM, Shao Miller wrote:
>[...]
>
>Condensed version of the discussion so far, in this subthread:
>
>=====
>
>Given:
>
> extern volatile int x;
> int i = x + x;
>
>And citing several C&V from different standards regarding the fact merely
>accessing a volatile is a "side effect".
>
>Does the above invoke UB? (No sequence point between the two "side effects"
>of accessing "x".)


I think it may. Consider a memory mapped I/O device that returns some
externally supplied status/data value when you read x, and those
values were sometimes different on consecutive reads. There's no
constraint on what order the two reads of x occur. In your example,
it would make no difference, but if you had "i=(3*x)+ x;" instead,
which read of x occurred first would definitely make a difference. The
order of accesses is often critical with I/O devices.
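
If the order does matter, one way to pin it down is to give each read its own full expression. A minimal sketch, using a hypothetical ordinary volatile variable status_reg as a stand-in for the device register (a real one would be mapped by the platform):

```c
/* Hypothetical stand-in for a memory-mapped status register. */
volatile int status_reg;

/* Each initializer is a full expression, so there is a sequence point
   after the first read and before the second. */
int combine_in_fixed_order(void)
{
    int first = status_reg;   /* read #1 */
    int second = status_reg;  /* read #2, sequenced after read #1 */
    return (3 * first) + second;
}
```

With a register that changes between reads, the first value read is now always the one multiplied by 3.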

James Kuyper

unread,
Dec 20, 2012, 1:40:22 PM12/20/12
to
That would be covered by "unspecified result". "undefined behavior" is
stronger. It's not just saying that the value of x can change, but that
the two changes to the value of x, without an intervening sequence
point, are allowed to interfere with each other in ways that might be
fatal for the continued execution of the program. This was definitely
intentional in the case of ordinary writes to x; the committee wanted to
discourage the writing of such code. It's less clear that it was
intended to apply to the change in the value of x that is allowed, for
volatile-qualified variables, to occur as a result of reading x.

Lanarcam

unread,
Dec 20, 2012, 1:49:15 PM12/20/12
to
On 20/12/2012 19:21, Robert Wessel wrote:
If I may, the only safe, readable, bug-free way of programming
safety devices, for instance for control, is:

Read (once)
Process
Write (once)

extern volatile int x; // Memory mapped
extern volatile int y; // Memory mapped
int i; // Local

i = x; // Read
i = f(i); // Process
y = i; // Write


Robert Wessel

unread,
Dec 20, 2012, 2:32:09 PM12/20/12
to
I'm not sure it is. Well, this example might be, but if we had x and
y referring to two different I/O ports on the same device, accessing
them in the wrong order could quite certainly have fatal results for
the continued execution of the program. IIRC, there was a screw up of
that nature (programming the video controller in the wrong order) that
caused many monochrome (PC era) monitors to burn out when they got
driven by an excessively fast signal. Now, I don't think that
was the result of a C compiler generating the unfortunate order of
references because order of evaluation is unspecified (in fact I know
it wasn't), but the concept applies. IOW we are in the *actual* realm
of "might cause your computer to catch fire".

So perhaps accessing more than one volatile between sequence points is
actually undefined? At least in the case of I/O devices, I'd have to
say it is. I'm not sure the C standard covers access to I/O devices at
all, although the possible side effects associated with volatiles
certainly come close.

James Kuyper

unread,
Dec 20, 2012, 2:58:59 PM12/20/12
to
On 12/20/2012 02:32 PM, Robert Wessel wrote:
> On Thu, 20 Dec 2012 13:40:22 -0500, James Kuyper
> <james...@verizon.net> wrote:
>
>> On 12/20/2012 01:21 PM, Robert Wessel wrote:
>>> On Thu, 20 Dec 2012 11:00:01 -0500, Ken Brody <kenb...@spamcop.net>
>>> wrote:
>>>
>>>> On 12/17/2012 7:33 PM, Shao Miller wrote:
>>>> [...]
>>>>
>>>> Condensed version of the discussion so far, in this subthread:
>>>>
>>>> =====
>>>>
>>>> Given:
>>>>
>>>> extern volatile int x;
>>>> int i = x + x;
>>>>
>>>> And citing several C&V from different standards regarding the fact merely
>>>> accessing a volatile is a "side effect".
>>>>
>>>> Does the above invoke UB? (No sequence point between the two "side effects"
>>>> of accessing "x".)
...
>> stronger. It's not just saying that the value of x can change, but that
>> the two changes to the value of x, without an intervening sequence
>> point, are allowed to interfere with each other in ways that might be
>> fatal for the continued execution of the program. This was definitely
>> intentional in the case of ordinary writes to x; the committee wanted to
>> discourage the writing of such code. It's less clear that it was
>> intended to apply to the change in the value of x that is allowed, for
>> volatile-qualified variables, to occur as a result of reading x.
>
>
> I'm not sure it is. Well, this example might be, but if we had x and
> y referring to two different I/O ports on the same device, accessing
> them in the wrong order could quite certainly have fatal results for
> the continued execution of the program. IIRC, there was a screw up of
> that nature (programming the video controller in the wrong order) that
> caused many monochrome (PC era) monitors to burn out when they got
> driven by an excessively fast signal. Now, I don't think that
> was the result of a C compiler generating the unfortunate order of
> references because order of evaluation is unspecified (in fact I know
> it wasn't), but the concept applies. IOW we are in the *actual* realm
> of "might cause your computer to catch fire".
>
> So perhaps accessing more than one volatile between sequence points is
> actually undefined?

No, the only relevant rule applies to accessing the SAME object twice
without an intervening sequence point. The problem you're talking about
is only covered by the generically implementation-defined aspects of
what 'volatile' means.

glen herrmannsfeldt

unread,
Dec 20, 2012, 4:54:57 PM12/20/12
to
Robert Wessel <robert...@yahoo.com> wrote:
> On Thu, 20 Dec 2012 13:40:22 -0500, James Kuyper
> <james...@verizon.net> wrote:

(snip, someone wrote)
>>> I think it may. Consider a memory mapped I/O device that returns some
>>> externally supplied status/data value when you read x, and those
>>> values were sometimes different on consecutive reads.

(snip)

>>That would be covered by "unspecified result". "undefined behavior" is
>>stronger. It's not just saying that the value of x can change, but that
>>the two changes to the value of x, without an intervening sequence
>>point, are allowed to interfere with each other in ways that might be
>>fatal for the continued execution of the program.

Seems to me that "volatile" automatically adds an implementation
dependence. The implementation should either define that dependence
or leave it unspecified (that is, defining that you can't rely on it).

The standard provides "volatile" but doesn't specifically define
the results of using it.

(snip)

>>intentional in the case of ordinary writes to x; the committee wanted
>>to discourage the writing of such code. It's less clear that it was
>>intended to apply to the change in the value of x that is allowed, for
>>volatile-qualified variables, to occur as a result of reading x.

In the case of:
volatile int x;
i = x + x;

unless the implementation defines the order, the compiler can evaluate
the two reads of x in either order. Well, in this case it doesn't matter,
but there are cases where it would.

But volatile requires, as I understand it, the compiler to fetch x
twice from memory. Even without x being an I/O port, if it could
be changed otherwise, the program may test whether i is odd.

volatile int x;
while(x ^ ~x) ;

I probably could have just done x==x, but this was more fun.

The program, then, relies on the fact that at some point x will
change between the two evaluations. Volatile requires that x be
fetched twice. The actual change to x is implementation dependent.
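
For what it's worth, a sketch of what that loop condition computes, with the two reads written as separate parameters (the function name is made up, and two's complement int is assumed): if both reads see the same value the result is all ones, so the loop keeps going; it stops only when the first value read is the complement of the second.

```c
/* first and second stand for the two reads of x in "x ^ ~x". */
int xor_with_complement(int first, int second)
{
    return first ^ ~second;
}
```

So xor_with_complement(5, 5) is -1, while xor_with_complement(5, ~5) is 0.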

> I'm not sure it is. Well, this example might be, but if we had x and
> y referring to two different I/O ports on the same device, accessing
> them in the wrong order could quite certainly have fatal results for
> the continued execution of the program. IIRC, there was a screw up of
> that nature (programming the video controller in the wrong order) that
> caused many monochrome (PC era) monitors to burn out when they got
> driven by an excessively fast signal.

Well, in addition to the fact that x can have side effects, there
might also be an interrupt (task switch) between the two.

> Now, I don't think that
> was the result of a C compiler generating the unfortunate order of
> references because order of evaluation is unspecified (in fact I know
> it wasn't), but the concept applies. IOW we are in the *actual* realm
> of "might cause your computer to catch fire".

> So perhaps accessing more than one volatile between sequence points is
> actually undefined? At least in the case of I/O devices, I'd have to
> say it is. I'm not sure the C standard covers access to I/O devices at
> all, although the possible side effects associated with volatiles
> certainly come close.

As someone else said, there is a difference between undefined
and implementation defined.

-- glen

Ben Bacarisse

unread,
Dec 20, 2012, 5:02:40 PM12/20/12
to
Tim Rentsch <t...@alumni.caltech.edu> writes:

> Ken Brody <kenb...@spamcop.net> writes:
>
>> On 12/17/2012 7:33 PM, Shao Miller wrote:
>> [...]
>>
>> Condensed version of the discussion so far, in this subthread:
>>
>> =====
>>
>> Given:
>>
>> extern volatile int x;
>> int i = x + x;
>>
>> And citing several C&V from different standards regarding the fact
>> merely accessing a volatile is a "side effect".
>>
>> Does the above invoke UB? (No sequence point between the two "side
>> effects" of accessing "x".)
>
> The answer is no. Arbitrary behavior may arise as a result, but
> implementations are obliged to behave as they would if the side
> effects were ordinary, non-interfering side effects. The term
> "undefined behavior" is just a shorthand for saying something
> about what an implementation may do. The actual behavior for
> this example (and indeed for any volatile-qualified access at
> all) is _unconstrained_, but the behavior is not _undefined_ in
> the sense that the Standard uses the term 'undefined behavior'.

I'm having trouble squaring this with the wording of 6.5 p2:

"If a side effect on a scalar object is unsequenced relative to either
a different side effect on the same scalar object or a value
computation using the value of the same scalar object, the behavior is
undefined."

Or maybe I'm having trouble understanding the point you are making.

<snip>
--
Ben.

Phil Carmody

unread,
Dec 20, 2012, 7:38:26 PM12/20/12
to
Lanarcam <lana...@yahoo.fr> writes:
*Precisely*, how is

extern volatile int x; // Memory mapped
extern volatile int y; // Memory mapped
y = f(x); // Read, Process, Write

less "safe", less "readable", or less "bug free"?

Shao Miller

unread,
Dec 21, 2012, 3:18:20 AM12/21/12
to
Yes, it invokes undefined behaviour. If a read of 'x' is a side effect,
then two reads of 'x' are two side effects that could conflict if they
occur simultaneously. That is, since an implementation can say that a
read of 'x' increments a foo counter elsewhere, then two simultaneous
reads can result in two simultaneous increments of the foo counter,
which blows up the computer.

- Shao Miller


Ken Brody

unread,
Dec 21, 2012, 2:33:15 PM12/21/12
to
To me, that's not UB. That might be "unspecified" (or, as Tim Rentsch says,
"unconstrained").

Consider, for example:

i = (3*foo()) + foo();

where:

int foo(void)
{
static int i=0;
return i++;
}

Certainly not UB in any sense of the word.
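
A self-contained version of that example, as a sketch (the wrapper name combine is invented here). The two calls are indeterminately sequenced rather than unsequenced, so the result is one of two values, depending on which call the implementation makes first:

```c
/* Returns 0, 1, 2, ... on successive calls. */
int foo(void)
{
    static int i = 0;
    return i++;
}

/* First use yields 3*0 + 1 == 1 if the left call runs first,
   or 3*1 + 0 == 3 if the right call runs first. */
int combine(void)
{
    return (3 * foo()) + foo();
}
```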

Tim Rentsch

unread,
Dec 21, 2012, 4:09:11 PM12/21/12
to
The difference is subtle, so let me take another run at explaining
it.

Suppose C had a provision for "magic functions". A magic function
is declared and defined just the same way that ordinary C functions
are (perhaps with an additional 'magic' keyword), and is called
the same way as ordinary functions. However, outside of the
language, including both the Standard itself and also any aspects
known to any implementation, there is a way of setting magic
functions so that they activate a logic control wire whose purpose
(and consequences) are unknown as far as the Standard is concerned
(again including both portable behavior and implementation-defined
behavior). This activation takes place whenever a magic function
is called. Even though the Standard doesn't know what will happen,
it wants to guarantee that the associated logic control wires are
activated, so it stipulates that calling a magic function is always
required, even if, eg, the function body is empty, and if that is
known at every point of call.

Under these hypothetical conditions, calling a magic function has,
at least potentially, the same consequences as undefined behavior
does. However, the act of calling a magic function does not give
an implementation any license to ignore or violate requirements.
This is true because, even though calling a magic function might
do something horrible, _it also might not_, and implementations
must proceed just as they would if nothing bad has happened,
because as far as they know that could be true.

It's important to remember what 'undefined behavior' means, which
is a statement about how implementations may behave. Calling a
magic function has unlimited consequences in how an /execution/ may
behave, but that doesn't eliminate any requirements for how the
/implementation/ must behave. Statements in the Standard are
really about what implementations (ie, mostly compilers) do, not
about what happens during execution. Even though calling a magic
function has potentially unlimited consequences during execution,
it doesn't change what an implementation is obliged to do to
conform to the Standard's requirements. That is the key point.

To get back to volatile, the consequences of accessing a volatile
are just the same as calling a hypothetical magic function. It is
true that performing a volatile-qualified access counts as a side
effect, but not necessarily a side effect on the object being
accessed. The consequences of accessing a volatile object are
potentially horrible, and as a result something really bad might
happen, but here again _it also might not_. Because it is unknown,
and indeed unknowable, by fiat in the Standard, what the
consequences of a volatile-qualified access will be, implementations
must behave just as they would if the accesses in question did
nothing more than what an ordinary access would do. Ergo the
term 'undefined behavior' does not apply.

Finally, about 6.5 p2. What's being referred to here are side
effects on scalar objects that occur because of language-defined
program actions. Accessing a volatile object is a side effect,
but it is not, for the purpose of 6.5 p2, a side effect on any
particular scalar object. Otherwise, the mere _declaration_ of
a volatile object would potentially provoke undefined behavior,
since such objects may be modified at any time, and nothing in
the Standard defines their sequencing. Also it may be good to
remember that the notion of sequencing is defined only for
evaluations done as part of defined C semantics (per 5.1.2.3 p3).
Any consequences of volatile-access-induced side effects fall
outside the domain of the sequencing rules, because those rules
pertain only to evaluations of program expressions (and then
only those in a single thread). Doing two read accesses of
a single volatile-qualified object might produce horrible
consequences (then again, so might only a single read access),
but even so there is no 'undefined behavior', in the sense
that the Standard uses the term, ie, about what is further
required of the implementation: the execution may go completely
askew, but that doesn't let, eg, the compiler off the hook for
generating bad code.

I hope this explanation helped because I am going to stop now. :)

Tim Rentsch

unread,
Dec 21, 2012, 4:16:41 PM12/21/12
to
IMO this conclusion is wrong. The consequences of volatile access (ie,
the extra-linguistic side effects) are outside the domain of 6.5p2,
because it is concerned only with program expressions, not other
unknown memory changes. This view is explained in more detail in my
response to Ben Bacarisse in this thread.

glen herrmannsfeldt

unread,
Dec 21, 2012, 5:40:51 PM12/21/12
to
Tim Rentsch <t...@alumni.caltech.edu> wrote:
> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>> Tim Rentsch <t...@alumni.caltech.edu> writes:

(snip)
>>> The answer is no. Arbitrary behavior may arise as a result, but
>>> implementations are obliged to behave as they would if the side
>>> effects were ordinary, non-interfering side effects. The term
>>> "undefined behavior" is just a shorthand for saying something
>>> about what an implementation may do. The actual behavior for
>>> this example (and indeed for any volatile-qualified access at
>>> all) is _unconstrained_, but the behavior is not _undefined_ in
>>> the sense that the Standard uses the term 'undefined behavior'.

>> I'm having trouble squaring this with the wording of 6.5 p2:

>> "If a side effect on a scalar object is unsequenced relative to either
>> a different side effect on the same scalar object or a value
>> computation using the value of the same scalar object, the behavior is
>> undefined."

>> Or maybe I'm having trouble understanding the point you are making.

Before there was "volatile", there was PL/I and the ABNORMAL attribute.

Now, PL/I had multitasking pretty much from the beginning, so there
was always a way that a variable could change at unexpected times.

A compiler wasn't supposed to optimize, for example, A+A as 2*A,
as A might change.

I don't know C11 well at all. Has multitasking, or multithreading,
been added now? Is there a way, within a C program (not counting
I/O registers and such) for a variable to change within a statement,
other than as side effects of that statement?

If so, then the compiler has to allow for that.

Otherwise, it seems to me that "volatile" has to be implementation
defined, as a door that an implementation can use in
implementation-specific ways. If an implementation allows for variables to be I/O
registers, then the compiler has to compile as appropriate for
that case.

> The difference is subtle, so let me take another run at explaining
> it.

> Suppose C had a provision for "magic functions". A magic function
> is declared and defined just the same way that ordinary C functions
> are (perhaps with an additional 'magic' keyword), and is called
> the same way as ordinary functions. However, outside of the
> language, including both the Standard itself and also any aspects
> known to any implementation, there is a way of setting magic
> functions so that they activate a logic control wire whose purpose
> (and consequences) are unknown as far as the Standard is concerned
> (again including both portable behavior and implementation-defined
> behavior).

Stretching this farther than I probably should, consider a function
in the same file as its call, and that the function doesn't do
anything, as the compiler can plainly see. Now, consider that
one might use a linkage editor (the OS/360 linker can do this)
to later replace that function with a different one.

> This activation takes place whenever a magic function
> is called. Even though the Standard doesn't know what will happen,
> it wants to guarantee that the associated logic control wires are
> activated, so it stipulates that calling a magic function is always
> required, even if, eg, the function body is empty, and if that is
> known at every point of call.

There is much discussion on comp.lang.fortran on what compilers
might do when optimizing function calls. Seems like in Fortran,
a compiler is allowed to optimize out the call with very little
reason, consider:

x=0*fclose(out);

> Under these hypothetical conditions, calling a magic function has,
> at least potentially, the same consequences as undefined behavior
> does. However, the act of calling a magic function does not give
> an implementation any license to ignore or violate requirements.
> This is true because, even though calling a magic function might
> do something horrible, _it also might not_, and implementations
> must proceed just as they would if nothing bad has happened,
> because as far as they know that could be true.

OK, but there is no "magic" keyword to apply to functions.
So, should compilers always call functions?

> It's important to remember what 'undefined behavior' means, which
> is a statement about how implementations may behave. Calling a
> magic function has unlimited consequences in how an /execution/ may
> behave, but that doesn't eliminate any requirements for how the
> /implementation/ must behave. Statements in the Standard are
> really about what implementations (ie, mostly compilers) do, not
> about what happens during execution. Even though calling a magic
> function has potentially unlimited consequences during execution,
> it doesn't change what an implementation is obliged to do to
> conform to the Standard's requirements. That is the key point.

OK, but magic functions aren't defined (last I knew) in the
standard. The "volatile" attribute is, but, as I understand,
not well enough to say what it actually does.

> To get back to volatile, the consequences of accessing a volatile
> are just the same as calling a hypothetical magic function. It is
> true that performing a volatile-qualified access counts as a side
> effect, but not necessarily a side effect on the object being
> accessed. The consequences of accessing a volatile object are
> potentially horrible, and as a result something really bad might
> happen, but here again _it also might not_. Because it is unknown,
> and indeed unknowable, by fiat in the Standard, what the
> consequences of a volatile-qualified access will be, implementations
> must behave just as they would if the accesses in question did
> nothing more than what an ordinary access would do. Ergo the
> term 'undefined behavior' does not apply.

OK, but it seems to me that the standard, separate from implementations,
should only cover what standard conforming programs can do.

An implementation may allow variables to be I/O ports, and use the
"volatile" keyword, but the standard does not have any such wording.

> Finally, about 6.5 p2. What's being referred to here are side
> effects on scalar objects that occur because of language-defined
> program actions. Accessing a volatile object is a side effect,
> but it is not, for the purpose of 6.5 p2, a side effect on any
> particular scalar object. Otherwise, the mere _declaration_ of
> a volatile object would potentially provoke undefined behavior,
> since such objects may be modified at any time, and nothing in
> the Standard defines their sequencing.

Yes. My feeling is that the keyword is there to allow for
implementation defined behavior. But implementation defined,
is distinct from undefined.

> Also it may be good to
> remember that the notion of sequencing is defined only for
> evaluations done as part of defined C semantics (per 5.1.2.3 p3).
> Any consequences of volatile-access-induced side effects fall
> outside the domain of the sequencing rules, because those rules
> pertain only to evaluations of program expressions (and then
> only those in a single thread). Doing two read accesses of
> a single volatile-qualified object might produce horrible
> consequences (then again, so might only a single read access),
> but even so there is no 'undefined behavior', in the sense
> that the Standard uses the term, ie, about what is further
> required of the implementation: the execution may go completely
> askew, but that doesn't let, eg, the compiler off the hook for
> generating bad code.

It does seem that the compiler should follow the code as written.
If there is one reference to a variable, it should be referenced once,
for the implementation dependent definition of reference.

volatile int x;
y=2*x;
z=x+x;

In this case, y should always be even, z has the possibility
of not being even, and the compiler should allow for that.
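
A sketch of that difference, modelling the changing object with a hypothetical function read_x() that returns a new value on every read (a real volatile object would be changed by something outside the program):

```c
/* Hypothetical model of a volatile int that changes on each read:
   successive reads see 0, 1, 2, ... */
static int next_value = 0;

int read_x(void)
{
    return next_value++;
}

/* y = 2*x: one read, so the result is always even. */
int twice_one_read(void)
{
    return 2 * read_x();
}

/* z = x + x: two reads, which here see consecutive values n and n+1,
   so the sum 2n+1 is odd whichever read happens first. */
int sum_two_reads(void)
{
    return read_x() + read_x();
}
```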

> I hope this explanation helped because I am going to stop now. :)

-- glen

Shao Miller

unread,
Dec 21, 2012, 5:57:58 PM12/21/12
to
I read the other response. And what about if 'x' and "foo counter" are
the same scalar? Regardless of that or any other example, N1570's
informative Annex I, point 2 includes:

"An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
side effect to an lvalue in one operand, and a side effect to, or an
access to the value of, the identical lvalue in the other operand (6.5)."

Informative Annex J "Portability issues", J.2 "Undefined behavior",
point 1 includes:

"A side effect on a scalar object is unsequenced relative to either a
different side effect on the same scalar object or a value computation
using the value of the same scalar object (6.5)."

Why wouldn't the Standard discuss "stored value" or "modification"
instead of using the looser "side effect"? 5.1.2.3p2:

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,[footnote 12] which are changes in the state of the execution environment.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

There is no need for any extra-linguistic side effect, since "accessing
a volatile object" is a side effect by its own right. I think examples
can still be helpful, though.

6.7.3p7:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects.
Therefore any expression referring to such an object shall be evaluated
strictly according to the rules of the abstract machine, as described in
5.1.2.3. Furthermore, at every sequence point the value last stored in
the object shall agree with that prescribed by the abstract machine,
except as modified by the unknown factors mentioned previously.134) What
constitutes an access to an object that has volatile-qualified type is
implementation-defined."

If the last sentence here means that an implementation has license to
redefine that one or both of a read or a store is _not_ an access to a
volatile-qualified type, then it cannot abide by the second sentence,
which involves the semantics of reading and modifying. If one accepts
that the final sentence is not allowing for a redefinition of "access,"
then it must be allowing for a definition of what [else] _constitutes_
an access, such as:

- A clock tick increments the stored value of a tick-counter
- A read of a usage-counter increments the stored value of the usage-counter
- A read of the value of an object used for obtaining random values
causes the object's stored value to change to a new random value

"Undefined" by the Standard can obviously be defined by the
implementation, but need not be, which is why I'd suggest that it's
undefined. (The implementation doesn't have to know, document, or be
consistent about it.)

- Shao Miller

glen herrmannsfeldt

Dec 21, 2012, 6:52:46 PM
Shao Miller <sha0....@gmail.com> wrote:

> On 12/21/2012 16:16, Tim Rentsch wrote:
(snip)
>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>> because it is concerned only with program expressions, not other
>> unknown memory changes. This view is explained in more detail in my
>> response to Ben Bacarisse in this thread.

> I read the other response. And what about if 'x' and "foo counter" are
> the same scalar? Regardless of that or any other example, N1570's
> informative Annex I, point 2 includes:

> "An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
> side effect to an lvalue in one operand, and a side effect to, or an
> access to the value of, the identical lvalue in the other operand (6.5)."

This seems to be for expressions like (x++)+(x).
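
For concreteness, a small C sketch (the function names are made up for illustration). The unsequenced form appears only in a comment, since executing it would be undefined; the function shows one sequenced rewrite that picks the left-to-right reading.

```c
#include <assert.h>

/* Undefined per 6.5p2: the side effect of x++ is unsequenced relative
   to the value computation of the other operand:
       int r = (x++) + (x);
   One well-defined rewrite, choosing the left-to-right reading: */
int post_increment_then_add(int *x)
{
    int old = *x;      /* the value that x++ would yield */
    *x = old + 1;      /* the store completes before the next statement */
    return old + *x;   /* x is read only after the increment */
}

void demo_sequenced(void)
{
    int x = 5;
    assert(post_increment_then_add(&x) == 11);
    assert(x == 6);
}
```

The rewrite commits to one of the possible orders; the original expression commits to none, which is why the Standard leaves it undefined.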

Also, it seems to indicate that access isn't a side effect.

> Informative Annex J "Portability issues", J.2 "Undefined behavior",
> point 1 includes:

> "A side effect on a scalar object is unsequenced relative to either a
> different side effect on the same scalar object or a value computation
> using the value of the same scalar object (6.5)."

> Why wouldn't the Standard discuss "stored value" or "modification"
> instead of using the looser "side effect"? 5.1.2.3p2:

> "Accessing a volatile object, modifying an object, modifying a file,
> or calling a function that does any of those operations are all side
> effects,12) which are changes in the state of the execution environment.
> Evaluation of an expression in general includes both value computations
> and initiation of side effects. Value computation for an lvalue
> expression includes determining the identity of the designated object."

Is there any indication in the standard on what "side effect" means
for volatile data? I/O registers have been mentioned in the discussion,
but is that in the standard?

There have been systems with self-incrementing or self-decrementing
memory locations. If one was used for a volatile variable, then
there is an obvious side effect.

> There is no need for any extra-linguistic side effect, since "accessing
> a volatile object" is a side effect in its own right. I think examples
> can still be helpful, though.

> 6.7.3p7:

> "An object that has volatile-qualified type may be modified in ways
> unknown to the implementation or have other unknown side effects.
> Therefore any expression referring to such an object shall be evaluated
> strictly according to the rules of the abstract machine, as described in
> 5.1.2.3. Furthermore, at every sequence point the value last stored in
> the object shall agree with that prescribed by the abstract machine,
> except as modified by the unknown factors mentioned previously.134) What
> constitutes an access to an object that has volatile-qualified type is
> implementation-defined."

Sometimes "access" means fetch but not store. I am not so sure
in this case either way.

> If the last sentence here means that an implementation has license to
> redefine that one or both of a read or a store is _not_ an access to a
> volatile-qualified type, then it cannot abide by the second sentence,
> which involves the semantics of reading and modifying. If one accepts
> that the final sentence is not allowing for a redefinition of "access,"
> then it must be allowing for a definition of what [else] _constitutes_
> an access, such as:

> - A clock tick increments the stored value of a tick-counter
> - A read of a usage-counter increments the stored value of the usage-counter
> - A read of the value of an object used for obtaining random values
> causes the object's stored value to change to a new random value

Seems to me that these all have to be implementation dependent
(or implementation defined), in which case the effects and meaning
of "volatile" should also be so defined.

> "Undefined" by the Standard can obviously be defined by the
> implementation, but need not be, which is why I'd suggest that it's
> undefined. (The implementation doesn't have to know, document, or be
> consistent about it.)

-- glen

Shao Miller

Dec 21, 2012, 7:47:05 PM
On 12/21/2012 18:52, glen herrmannsfeldt wrote:
> Shao Miller <sha0....@gmail.com> wrote:
>
>> On 12/21/2012 16:16, Tim Rentsch wrote:
> (snip)
>>> IMO this conclusion is wrong. The consequences of volatile access (ie,
>>> the extra-linguistic side effects) are outside the domain of 6.5p2,
>>> because it is concerned only with program expressions, not other
>>> unknown memory changes. This view is explained in more detail in my
>>> response to Ben Bacarisse in this thread.
>
>> I read the other response. And what about if 'x' and "foo counter" are
>> the same scalar? Regardless of that or any other example, N1570's
>> informative Annex I, point 2 includes:
>
>> "An ‘‘unordered’’ binary operator (not comma, &&, or ||) contains a
>> side effect to an lvalue in one operand, and a side effect to, or an
>> access to the value of, the identical lvalue in the other operand (6.5)."
>
> This seems to be for expressions like (x++)+(x).
>
> Also, it seems to indicate that access isn't a side effect.
>

Down below in 5.1.2.3p2, we see that for volatile-qualified types of
objects, it really is. The upthread question about volatile 'x' in 'x +
x' has been snipped.

>> Informative Annex J "Portability issues", J.2 "Undefined behavior",
>> point 1 includes:
>
>> "A side effect on a scalar object is unsequenced relative to either a
>> different side effect on the same scalar object or a value computation
>> using the value of the same scalar object (6.5)."
>
>> Why wouldn't the Standard discuss "stored value" or "modification"
>> instead of using the looser "side effect"? 5.1.2.3p2:
>
>> "Accessing a volatile object, modifying an object, modifying a file,
>> or calling a function that does any of those operations are all side
>> effects,12) which are changes in the state of the execution environment.
>> Evaluation of an expression in general includes both value computations
>> and initiation of side effects. Value computation for an lvalue
>> expression includes determining the identity of the designated object."
>
> Is there any indication in the standard on what "side effect" means
> for volatile data? I/O registers have been mentioned in the discussion,
> but is that in the standard?
>

Yes. [Albeit non-normative] Footnote 134 in 6.7.3p7:

"A volatile declaration may be used to describe an object
corresponding to a memory-mapped input/output port or an object accessed
by an asynchronously interrupting function. Actions on objects so
declared shall not be ‘‘optimized out’’ by an implementation or
reordered except as permitted by the rules for evaluating expressions."
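
The second case in that footnote can be sketched in portable C with a signal handler. This is only an illustration under stated assumptions: the names are mine, and raise() delivers the signal synchronously, so it merely models an asynchronously interrupting function.

```c
#include <assert.h>
#include <signal.h>

/* volatile sig_atomic_t is the type the Standard lets a signal
   handler assign to. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises the signal, and reports whether the
   handler's store was observed afterwards. Returns -1 on failure. */
int handler_store_is_observed(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    return got_signal == 1;
}
```

Because got_signal is volatile, the read in the return statement may not be replaced by the 0 stored a few lines earlier.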

> There have been systems with self-incrementing or self-decrementing
> memory locations. If one was used for a volatile variable, then
> there is an obvious side effect.
>

I believe that there are _two_ side effects; one by 5.1.2.3p2 and one by
either "other unknown side effects" or due to an
"implementation-defined", additional "access". (Both of these latter
from 6.7.3p7.)

>> There is no need for any extra-linguistic side effect, since "accessing
>> a volatile object" is a side effect in its own right. I think examples
>> can still be helpful, though.
>
>> 6.7.3p7:
>
>> "An object that has volatile-qualified type may be modified in ways
>> unknown to the implementation or have other unknown side effects.
>> Therefore any expression referring to such an object shall be evaluated
>> strictly according to the rules of the abstract machine, as described in
>> 5.1.2.3. Furthermore, at every sequence point the value last stored in
>> the object shall agree with that prescribed by the abstract machine,
>> except as modified by the unknown factors mentioned previously.134) What
>> constitutes an access to an object that has volatile-qualified type is
>> implementation-defined."
>
> Sometimes "access" means fetch but not store. I am not so sure
> in this case either way.
>

"Access" is the very first definition under section 3 "Terms,
definitions, and symbols". 3.1p1:

"access
〈execution-time action〉 to read or modify the value of an object"

Then 3.1p2 further addresses your query:

"NOTE 1 Where only one of these two actions is meant, ‘‘read’’ or
‘‘modify’’ is used."

>> If the last sentence here means that an implementation has license to
>> redefine that one or both of a read or a store is _not_ an access to a
>> volatile-qualified type, then it cannot abide by the second sentence,
>> which involves the semantics of reading and modifying. If one accepts
>> that the final sentence is not allowing for a redefinition of "access,"
>> then it must be allowing for a definition of what [else] _constitutes_
>> an access, such as:
>
>> - A clock tick increments the stored value of a tick-counter
>> - A read of a usage-counter increments the stored value of the usage-counter
>> - A read of the value of an object used for obtaining random values
>> causes the object's stored value to change to a new random value
>
> Seems to me that these all have to be implementation dependent
> (or implementation defined), in which case the effects and meaning
> of "volatile" should also be so defined.
>

Why? 3.4.1p1:

"implementation-defined behavior
unspecified behavior where each implementation documents how the choice
is made"

Once again, 6.7.3p7 includes:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects. ..."

So each of the three examples could belong to either of these two
categories. Or, they could be implementation-defined, additional accesses.

>> "Undefined" by the Standard can obviously be defined by the
>> implementation, but need not be, which is why I'd suggest that it's
>> undefined. (The implementation doesn't have to know, document, or be
>> consistent about it.)
>

- Shao Miller

Ken Brody

Dec 21, 2012, 11:39:23 PM
On 12/21/2012 5:40 PM, glen herrmannsfeldt wrote:
[...]
> There is much discussion on comp.lang.fortran on what compilers
> might do when optimizing function calls. Seems like in Fortran,
> a compiler is allowed to optimize out the call with very little
> reason, consider:
>
> x=0*fclose(out);

However, because the compiler cannot know if the function itself will have
side effects, the fact that x will always be zero is irrelevant to a C
compiler -- the function call cannot be removed. If Fortran allows it to be
removed, I would consider that part of the language to be "broken".

[...]
> OK, but there is no "magic" keyword to apply to functions.
> So, should compilers always call functions?

Yes, because that function may have side effects. And, in C at least, the
function must therefore be called. This is no different than:

(void)fclose(out);
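
A small C sketch of the same point. fake_close() is a hypothetical stand-in for fclose(out): its only side effect is counting the call, which makes the call visible afterwards.

```c
#include <assert.h>

static int close_calls = 0;

/* Hypothetical stand-in for fclose(out): it has a side effect
   (counting the call) and returns 0 for success. */
static int fake_close(void)
{
    close_calls++;
    return 0;
}

int multiply_by_zero(void)
{
    int x = 0 * fake_close();   /* the call must still be evaluated */
    return x;
}

int calls_so_far(void)
{
    return close_calls;
}
```

After one call to multiply_by_zero(), the result is 0 and calls_so_far() reports 1: the product is known in advance, but the side effect of the call is not discarded.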

[...]

glen herrmannsfeldt

Dec 22, 2012, 1:50:14 AM
Ken Brody <kenb...@spamcop.net> wrote:

(snip, I wrote)
>> There is much discussion on comp.lang.fortran on what compilers
>> might do when optimizing function calls. Seems like in Fortran,
>> a compiler is allowed to optimize out the call with very little
>> reason, consider:

>> x=0*fclose(out);

> However, because the compiler cannot know if the function itself will have
> side effects, the fact that x will always be zero is irrelevant to a C
> compiler -- the function call cannot be removed. If Fortran allows it to be
> removed, I would consider that part of the language to be "broken".

Well, comp.lang.fortran people might consider lots of parts of C
broken, but there is some debate as to what Fortran says on this.

There is also the PURE attribute for functions which don't have
side effects (and mostly the compiler can check for that).

Otherwise, it is usual to use SUBROUTINEs when you actually
want side effects.

>> OK, but there is no "magic" keyword to apply to functions.
>> So, should compilers always call functions?

> Yes, because that function may have side effects. And, in C at least, the
> function must therefore be called. This is no different than:

> (void)fclose(out);

-- glen

Tim Rentsch

Dec 22, 2012, 1:54:53 AM
> I read the other response. [...snip...]

Then apparently you didn't understand it.

Tim Rentsch

Dec 22, 2012, 3:15:43 AM
glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:

> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>>> Tim Rentsch <t...@alumni.caltech.edu> writes:
>
> (snip)
>>>> The answer is no. Arbitrary behavior may arise as a result, but
>>>> implementations are obliged to behave as they would if the side
>>>> effects were ordinary, non-interfering side effects. The term
>>>> "undefined behavior" is just a shorthand for saying something
>>>> about what an implementation may do. The actual behavior for
>>>> this example (and indeed for any volatile-qualified access at
>>>> all) is _unconstrained_, but the behavior is not _undefined_ in
>>>> the sense that the Standard uses the term 'undefined behavior'.
>
>>> I'm having trouble squaring this with the wording of 6.5 p2:
>
>>> "If a side effect on a scalar object is unsequenced relative to either
>>> a different side effect on the same scalar object or a value
>>> computation using the value of the same scalar object, the behavior is
>>> undefined."
>
>>> Or maybe I'm having trouble understanding the point you are making.
>
> Before there was "volatile", there was PL/I and the ABNORMAL attribute.
>
> Now, PL/I had multitasking pretty much from the beginning, so there
> was always a way that a variable could change at unexpected times.
>
> A compiler wasn't supposed to optimize, for example, A+A as 2*A,
> as A might change.

I don't have a PL/I specification readily available, so
I won't try to compare ABNORMAL to volatile.

> I don't know C11 well at all, has multitasking, or multithreading,
> been added now?

Yes but volatile has been in standard C since before multithreading
was included, so it isn't really affected by that.

> Is there a way, within a C program (not counting I/O registers
> and such) for a variable to change within a statement, other
> than as side effects of that statement?

Not in a way that's defined by the Standard, no. (And ignoring
threading, which doesn't bear on the current discussion.)

> If so, then the compiler has to allow for that.
>
> Otherwise, it seems to me, that "volatile" has to be
> implementation defined.

It might seem that way but the Standard is very clear that this
isn't so. The consequences, or even potential consequences, of
any volatile access are unknown to the implementation, and this
is explicit in the Standard.

> As a door that an implementation can use in implementation-specific
> ways. If an implementation allows for variables to be I/O
> registers, then the compiler has to compile as appropriate for
> that case.

The only rule is that access to an object through a volatile-qualified
type must be done "naively", ie, according to straightforward rules
of expression evaluation and not optimized out. (The Standard has
a more precise definition, but this is the gist.) Anything beyond
that must be managed by the programmer, not the implementation.
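
As a rough illustration of "not optimized out", here is a C sketch of a polling loop (the names are hypothetical). Because the lvalue *flag is volatile-qualified, every iteration has to perform its own read; nothing here says what might change the flag, which is exactly the part left to the programmer.

```c
#include <assert.h>

/* Reads *flag afresh on every iteration because the lvalue is
   volatile-qualified; gives up once max_spins reads have found it zero. */
int spin_until_set(const volatile int *flag, int max_spins)
{
    int spins = 0;
    while (spins < max_spins && *flag == 0)
        spins++;
    return spins;
}

void demo_spin(void)
{
    volatile int ready = 1;
    volatile int never = 0;
    assert(spin_until_set(&ready, 100) == 0);
    assert(spin_until_set(&never, 100) == 100);
}
```

Without the qualifier, an implementation could read the flag once and loop on the cached value.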

>> The difference is subtle, so let me take another run at explaining
>> it.
>
>> Suppose C had a provision for "magic functions". A magic function
>> is declared and defined just the same way that ordinary C functions
>> are (perhaps with an additional 'magic' keyword), and are called
>> the same way as ordinary functions. However, outside of the
>> language, including both the Standard itself and also any aspects
>> known to any implementation, there is a way of setting magic
>> functions so that they activate a logic control wire whose purpose
>> (and consequences) are unknown as far as the Standard is concerned
>> (again including both portable behavior and implementation-defined
>> behavior).
>
> Stretching this farther than I probably should, consider a function
> in the same file as its call, and that the function doesn't do
> anything, as the compiler can plainly see. Now, consider that
> one might use a linkage editor (the OS/360 linker can do this)
> to later replace that function with a different one.

A similar idea. I don't think the exact mechanism is important,
as long as it is outside the domain of what the implementation
(eg, compiler) knows.

>> This activation takes place whenever a magic function
>> is called. Even though the Standard doesn't know what will happen,
>> it wants to guarantee that the associated logic control wires are
>> activated, so it stipulates that calling a magic function is always
>> required, even if, eg, the function body is empty, and if that is
>> known at every point of call.
>
> There is much discussion on comp.lang.fortran on what compilers
> might do when optimizing function calls. Seems like in Fortran,
> a compiler is allowed to optimize out the call with very little
> reason, consider:
>
> x=0*fclose(out);

The C Standard apparently is more demanding; optimizations
are allowed only when they don't change the results of what
the program naively does (again the Standard defines this
condition more precisely).

>> Under these hypothetical conditions, calling a magic function has,
>> at least potentially, the same consequences as undefined behavior
>> does. However, the act of calling a magic function does not give
>> an implementation any license to ignore or violate requirements.
>> This is true because, even though calling a magic function might
>> do something horrible, _it also might not_, and implementations
>> must proceed just as they would if nothing bad has happened,
>> because as far as they know that could be true.
>
> OK, but there is no "magic" keyword to apply to functions.
> So, should compilers always call functions?

In the hypothetical example language there is a special keyword
for magic functions, so this question doesn't apply.

>> It's important to remember what 'undefined behavior' means, which
>> is a statement about how implementations may behave. Calling a
>> magic function has unlimited consequences in how an /execution/ may
>> behave, but that doesn't eliminate any requirements for how the
>> /implementation/ must behave. Statements in the Standard are
>> really about what implementations (ie, mostly compilers) do, not
>> about what happens during execution. Even though calling a magic
>> function has potentially unlimited consequences during execution,
>> it doesn't change what an implementation is obliged to do to
>> conform to the Standard's requirements. That is the key point.
>
> OK, but magic functions aren't defined (last I knew) in the
> standard.

Did you miss the word "Suppose" in what I wrote before? Magic
functions are a hypothetical construct, defined as described
above and putatively added to C, for the purpose of illustration.

> The "volatile" attribute is, but, as I understand, not well enough
> to say what it actually does.

The point of volatile is to impose additional requirements, or
limitations really, on how a C program may be compiled. Beyond these
requirements, no semantics are defined (beyond those of the access
itself) either by the Standard or by the implementation. Indeed, the
Standard says that volatile objects "may be modified in ways unknown
to the implementation or have other unknown side effects." What
happens upon accessing such objects is very explicitly outside the
domain both of the Standard and of the implementation.

>> To get back to volatile, the consequences of accessing a volatile
>> are just the same as calling a hypothetical magic function. It is
>> true that performing a volatile-qualified access counts as a side
>> effect, but not necessarily a side effect on the object being
>> accessed. The consequences of accessing a volatile object are
>> potentially horrible, and as a result something really bad might
>> happen, but here again _it also might not_. Because it is unknown,
>> and indeed unknowable, by fiat in the Standard, what the
>> consequences of a volatile-qualified access will be, implementations
>> must behave just as they would if the accesses in question did
>> nothing more than what an ordinary access would do. Ergo the
>> term 'undefined behavior' does not apply.
>
> OK, but it seems to me that the standard, separate from
> implementations, should only cover what standard conforming
> programs can do.

I understand this reaction, but it isn't really right. The
Standard is a specification for how implementations must
behave. How programs will behave is a consequence of what
the implementation must do (and also on being run on a
data-processing system capable of supporting a conforming
implementation, but this also is outside the scope of the
Standard).

> An implementation may allow variables to be I/O ports, and use
> the "volatile" keyword, but the standard does not have any such
> wording.

The point of declaring something 'volatile' is that how it
behaves when accessed is outside the domain of what the
implementation knows, and hence the implementation must
treat it in a particular way. In a sense, the word 'volatile'
says to the implementation, "You don't know what's going on
here, so don't imagine that you do."

>> Finally, about 6.5 p2. What's being referred to here are side
>> effects on scalar objects that occur because of language-defined
>> program actions. Accessing a volatile object is a side effect,
>> but it is not, for the purpose of 6.5 p2, a side effect on any
>> particular scalar object. Otherwise, the mere _declaration_ of
>> a volatile object would potentially provoke undefined behavior,
>> since such objects may be modified at any time, and nothing in
>> the Standard defines their sequencing.
>
> Yes. My feeling is that the keyword is there to allow for
> implementation defined behavior. [snip]

The Standard contradicts this idea.

>> Also it may be good to
>> remember that the notion of sequencing is defined only for
>> evaluations done as part of defined C semantics (per 5.1.2.3 p3).
>> Any consequences of volatile-access-induced side effects fall
>> outside the domain of the sequencing rules, because those rules
>> pertain only to evaluations of program expressions (and then
>> only those in a single thread). Doing two read accesses of
>> a single volatile-qualified object might produce horrible
>> consequences (then again, so might only a single read access),
>> but even so there is no 'undefined behavior', in the sense
>> that the Standard uses the term, ie, about what is further
>> required of the implementation: the execution may go completely
>> askew, but that doesn't let, eg, the compiler off the hook for
>> generating bad code.
>
> It does seem that the compiler should follow the code as
> written. If there is one reference to a variable, it should be
> referenced once, for the implementation dependent definition of
> reference.
>
> volatile int x;
> y=2*x;
> z=x+x;
>
> In this case, y should always be even, z has the possibility
> of not being even, and the compiler should allow for that.

Not even that. Even ignoring the possible undefined behavior
because of overflow, after the second statement is done
y could have any value at all, because accessing 'x' in
'z = x + x;' might have the side effect of storing into
y, and the compiler isn't allowed to know that or assume
that it doesn't happen. Any use of y after the second
assignment statement must refetch y, for just this reason.

Shao Miller

Dec 22, 2012, 3:31:01 AM
Well in that case, I'd really like to enhance that understanding, if
possible. :)

On 12/21/2012 16:09, Tim Rentsch wrote:
> Suppose C had a provision for "magic functions". A magic function
> is declared and defined just the same way that ordinary C functions
> are (perhaps with an additional 'magic' keyword), and are called
> the same way as ordinary functions. However, outside of the
> language, including both the Standard itself and also any aspects
> known to any implementation, there is a way of setting magic
> functions so that they activate a logic control wire whose purpose
> (and consequences) are unknown as far as the Standard is concerned
> (again including both portable behavior and implementation-defined
> behavior). This activation takes place whenever a magic function
> is called. Even though the Standard doesn't know what will happen,
> it wants to guarantee that the associated logic control wires are
> activated, so it stipulates that calling a magic function is always
> required, even if, eg, the function body is empty, and if that is
> known at every point of call.

This seems pretty clear. It also seems analogous to the semantics of
'volatile'; helping to ensure that any non-standard side effects are
correctly "wired," regardless of their existence.

> Under these hypothetical conditions, calling a magic function has,
> at least potentially, the same consequences as undefined behavior
> does. However, the act of calling a magic function does not give
> an implementation any license to ignore or violate requirements.
> This is true because, even though calling a magic function might
> do something horrible, _it also might not_, and implementations
> must proceed just as they would if nothing bad has happened,
> because as far as they know that could be true.

As in, the computer could melt, but that possibility is entirely outside
of the scope of C, and we must discuss and behave as though it won't.
Seems pretty reasonable. That bit about "license" and "violate
requirements" is interesting. One apparent requirement from the
definitions is that a read or store be considered an access. If that's
a mistaken interpretation, so be it.

> It's important to remember what 'undefined behavior' means, which
> is a statement about how implementations may behave. Calling a
> magic function has unlimited consequences in how an /execution/ may
> behave, but that doesn't eliminate any requirements for how the
> /implementation/ must behave. Statements in the Standard are
> really about what implementations (ie, mostly compilers) do, not
> about what happens during execution. Even though calling a magic
> function has potentially unlimited consequences during execution,
> it doesn't change what an implementation is obliged to do to
> conform to the Standard's requirements. That is the key point.

I remember there was a toy that was a box and a switch. When you
flipped the switch, a false hand would come slowly creeping out of the
box, with a finger aimed at the switch. At some point it'd flip the
switch and a release would cause the hand to be instantly retracted back
into the box. If this toy was programmed in C, we might hope that the
implementation was conforming even though the Nth write to the
"keep_going" volatile scalar was always mysteriously tied to a
power-loss event.

> To get back to volatile, the consequences of accessing a volatile
> are just the same as calling a hypothetical magic function. It is
> true that performing a volatile-qualified access counts as a side
> effect, but not necessarily a side effect on the object being
> accessed.

This would seem to me to be more of a key point than the last one. If
one accepts this, then it is easy to agree that there _might_ not be any
"trouble" with the Standard's discussion of unsequenced side effects on
the same scalar.

But why not? Is the suggestion that the wording of N1570's 5.1.2.3p2
does not match the authors' intentions?

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,12) which are changes in the state of the execution environment.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

Maybe something more like:

"Modifying an object, modifying a file, or calling a function that
does any of those operations are all side effects,12) which are changes
in the state of the execution environment. In addition, accessing a
volatile object may cause side effects that may result in
implementation-defined or undefined behavior, but such side effects are
either implementation-defined or outside of the scope of this standard.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object."

> The consequences of accessing a volatile object are
> potentially horrible, and as a result something really bad might
> happen, but here again _it also might not_. Because it is unknown,
> and indeed unknowable, by fiat in the Standard, what the
> consequences of a volatile-qualified access will be, implementations
> must behave just as they would if the accesses in question did
> nothing more than what an ordinary access would do. Ergo the
> term 'undefined behavior' does not apply.

What about ensuring that all volatile reads (side effects, if
interpreted literally) enjoy the fantastic new sequencing semantics,
such as the discussions of side effects having been completed before a
sequence point?

If a volatile _read_ was not a side effect, but might possibly initiate
one that we shouldn't worry about, then perhaps given:

volatile int x;
x = 13;
int y = x + x;
x = 42;

we could actually get away with:

1. set value of x to 13
2. side effect for 1
3. sequence point
4. read value of x
5. side effect for 4
6. read value of x and add the value from 4 to compute initial value
for y
7. set value of y
8. sequence point
9. side effect that was skipped for 6 (hey, we don't know!)
10. set value of x to 42
11. side effect for 10

Now the 9 and 11 might happen simultaneously. They might have nothing
to do with the value of the scalar 'x' itself, but they are able to
collide. Same potential for collision if a read _is_ always a side
effect (even if the value of scalar 'x' still isn't influenced by that
side effect):

1. set value of x to 13
2. side effect for 1
3. sequence point
4. read value of x
5. side effect for 4
6. read value of x
7. side effect for 6
8. set value of y
9. sequence point
10. set value of x to 42
11. side effect for 10

Here, 5 and 7 might conflict, except now we can point one finger at the
programmer and another to the Standard and with raised voice and
eyebrows low, tell the programmer they're engaging in undefined
behaviour. By keeping side effects nice and separate, don't we enjoy a
little more safety, or else take our chances?
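
To make the "a read can itself be a side effect" idea concrete, here is a
minimal sketch that models a read-sensitive device in ordinary C rather
than with volatile. The struct fifo type and both function names are
hypothetical, invented purely for illustration; nothing here comes from
the Standard. Each call to fifo_read changes the device state, and putting
the two reads in separate statements places a sequence point between them,
so their order is fixed.

```c
#include <stddef.h>

/* Hypothetical model of a read-sensitive device: each read consumes a value. */
struct fifo {
    const int *data;  /* values the "device" will hand out */
    size_t len;       /* number of values in data */
    size_t next;      /* index of the next value to hand out */
};

/* Returns the next value, or 0 once the device is empty.
   Reading changes the device state, so the read is itself a side effect. */
int fifo_read(struct fifo *f)
{
    return f->next < f->len ? f->data[f->next++] : 0;
}

/* Reads twice in separate statements, so the first read (and its side
   effect) is complete before the second read starts. */
int first_minus_second(struct fifo *f)
{
    int first = fifo_read(f);
    int second = fifo_read(f);
    return first - second;
}
```

With a device holding 13 and then 42, first_minus_second always sees 13
first. Writing fifo_read(f) - fifo_read(f) in a single expression would
leave the order of the two calls unspecified.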

I'm hoping that understanding and agreement are two different things,
here. It's a lengthy post, but please do clarify, if and only if you
have an opportunity. :) Thank you for your time.

- Shao Miller

glen herrmannsfeldt

unread,
Dec 22, 2012, 5:34:00 AM12/22/12
to
Tim Rentsch <t...@alumni.caltech.edu> wrote:

(snip, I wrote)
>> Before there was "volatile", there was PL/I and the ABNORMAL attribute.

>> Now, PL/I had multitasking pretty much from the beginning, so there
>> was always a way that a variable could change at unexpected times.

>> A compiler wasn't supposed to optimize, for example, A+A as 2*A,
>> as A might change.

> I don't have a PL/I specification readily available, so
> I won't try to compare ABNORMAL to volatile.

OK, I now downloaded:

http://bitsavers.trailing-edge.com/pdf/ibm/360/pli/C28-6571-1_PL_I_Language_Specifications_Jul65.pdf

This is, as I understand it, how PL/I is supposed to work, separate from
any specific implementation. Separate manuals describe it as
implemented.

Among other things that I had forgotten, it applies to procedures and to
variables, or at least it did in 1965.

"Rules for abnormality in procedures:

1. Abnormality is a property of both external and internal
procedures. Blocks invoking procedures that are abnormal must be
within the scope of an ABNORMAL, USES, or SETS declaration for the
invoked entry name. However, the invocation of an abnormal procedure
does not make the invoking procedure itself abnormal. These
attributes enable program optimization to be performed.

2. An external procedure is abnormal if it or any procedure invoked
by it:

a. Access, modify, allocate, or free external data.
b. Modify, allocate, or free their arguments.
c. Return inconsistent function values for the same argument values.
d. Maintain any kind of history.
e. Perform input/output operations.
f. Return control from the procedure by means of a GOTO statement.
3. An internal procedure is abnormal:
a. Under any condition listed above for external procedures.
b. If it, or any procedure called by it, access, modify, allocate,
or free variables declared in an outer block.
4. Abnormal external procedures invoked as functions must be declared
with at least one of the attributes, ABNORMAL, USES, or SETS. The
scope of this declaration must include the invoking block.
5. ABNORMAL used alone specifies that all possible types of abnormality
should be assumed. It is unnecessary to specify ABNORMAL for the
built-in functions, TIME and DATE.
6. The NORMAL attribute specifies that the entry name is for a
procedure that is not abnormal."

That part is pretty interesting. I know that many programs did those
things without the ABNORMAL attribute, but then maybe it is the default
for procedures.

Onto variables:

"Rules for abnormal data:

1. The ABNORMAL attribute may be declared for any variable.
2. The ABNORMAL attribute specifies that a variable may be altered or
otherwise accessed at an unpredictable time during execution of a
program. The situation might occur, for example, during the
execution of an ON-unit as described in "The ON Statement," in
Chapter 8.
3. Every time ABNORMAL data is referred to, its associated storage
contains its current value."

Much simpler than for procedures. Anyway:

"Default for abnormality of procedures:

If an external entry name appears only as a function reference, the
entry name is assumed to have the NORMAL attribute; otherwise, the
entry name is assumed to be ABNORMAL. Entry names of all internal
procedures and entry names of external procedures invoked in a CALL
statement are assumed to have the ABNORMAL attribute.

Default for abnormality of data:

Variables are assumed to be NORMAL, except structures containing
ABNORMAL elements; such structures may not be declared to be NORMAL."
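
For comparison with C: rule 3 for ABNORMAL data (every reference sees the
current contents of storage) resembles the common use of volatile for a
flag that something else may set. The following is only a sketch under
that analogy; the function name and the spin limit are hypothetical, and
it says nothing about atomicity or about what actually sets the flag.

```c
/* Polls *flag until it becomes nonzero or max_spins polls have been made.
   Returns the number of polls that saw zero before success, or -1 if the
   flag never became nonzero within max_spins polls. */
int wait_for_flag(volatile int *flag, int max_spins)
{
    for (int spins = 0; spins < max_spins; ++spins) {
        if (*flag != 0)   /* volatile read: performed on every iteration */
            return spins;
    }
    return -1;
}
```

Because the read goes through a volatile-qualified lvalue, the compiler
may not hoist it out of the loop and test a cached copy.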


>> I don't know C11 well at all, has multitasking, or multithreading,
>> been added now?

> Yes but volatile has been in standard C since before multithreading
> was included, so it isn't really affected by that.

>> Is there a way, within a C program (not counting I/O registers
>> and such) for a variable to change within a statement, other
>> than as side effects of that statement?

> Not in a way that's defined by the Standard, no. (And ignoring
> threading, which doesn't bear on the current discussion.)

But it is convenient that multithreading does allow variables to change
at unexpected (in the statement where they might be used) times.

>> If so, then the compiler has to allow for that.

>> Otherwise, it seems to me, that "volatile" has to be
>> implementation defined.

> It might seem that way but the Standard is very clear that this
> isn't so. The consequences, or even potential consequences, of
> any volatile access are unknown to the implementation, and this
> is explicit in the Standard.

In that case, compilers should just give up.

To make the discussion more interesting, are "volatile" variables
allowed to be partially modified? Consider one that is more than one
byte long, and the bytes are not written with any interlock. An
interrupt could occur while one is only partly updated.

S/370 has CAS, Compare and Swap, for interlocked updating of memory.
Other architectures have similar ways of updating storage. Maybe
compilers should use that?
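
For what it's worth, C11 does expose a compare-and-swap through
<stdatomic.h>, quite separately from volatile. A minimal sketch, assuming
a C11 implementation that provides <stdatomic.h> (the header is
optional); the function name cas_add is hypothetical and this is not a
claim about what any compiler emits for volatile accesses.

```c
#include <stdatomic.h>

/* Adds delta to *target with a compare-and-swap retry loop and returns
   the value that was stored. */
int cas_add(atomic_int *target, int delta)
{
    int expected = atomic_load(target);
    /* On failure the call writes the current value of *target into
       expected, so the loop retries with fresh data. */
    while (!atomic_compare_exchange_weak(target, &expected, expected + delta)) {
        /* nothing to do here; expected has already been refreshed */
    }
    return expected + delta;
}
```

The update either happens as a whole or is retried, so an observer using
atomic loads never sees a partly updated value.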

>> As a door that an implementation can use in implementation-specific
>> ways. If an implementation allows for variables to be I/O
>> registers, then the compiler has to compile as appropriate for
>> that case.

> The only rule is that access to an object through a volatile-qualified
> type must be done "naively", ie, according to straightforward rules
> of expression evaluation and not optimized out. (The Standard has
> a more precise definition, but this is the gist.) Anything beyond
> that must be managed by the programmer, not the implementation.

But if you can't say which ways data might be modified, then it is
pretty hard to expect compilers to account for those ways.

(snip, I wrote)
>> Stretching this farther than I probably should, consider a function
>> in the same file as its call, and that the function doesn't do
>> anything, as the compiler can plainly see. Now, consider that
>> one might use a linkage editor (the OS/360 linker can do this)
>> to later replace that function with a different one.

> A similar idea. I don't think the exact mechanism is important,
> as long as it is outside the domain of what the implementation
> (eg, compiler) knows.

OK, sounds good to me.

(snip)

>> x=0*fclose(out);

> The C Standard apparently is more demanding; optimizations
> are allowed only when they don't change the results of what
> the program naively does (again the Standard defines this
> condition more precisely).

Would be interesting to have "volatile" and "nonvolatile" attribute
for functions.

(snip)

> Did you miss the word "Suppose" in what I wrote before? Magic
> functions are a hypothetical construct, defined as described
> above and putatively added to C, for the purpose of illustration.

Maybe.

>> The "volatile" attribute is, but, as I understand, not well enough
>> to say what it actually does.

> The point of volatile is to impose additional requirements, or
> limitations really, on how a C program may be compiled. Beyond these
> requirements, no semantics are defined (beyond those of the access
> itself) either by the Standard or by the implementation. Indeed, the
> Standard says that volatile objects "may be modified in ways unknown
> to the implementation or have other unknown side effects." What
> happens upon accessing such objects is very explicitly outside the
> domain both of the Standard and of the implementation.

(snip)

>> An implementation may allow variables to be I/O ports, and use
>> the "volatile" keyword, but the standard does not have any such
>> wording.

> The point of declaring something 'volatile' is that how it
> behaves when accessed is outside the domain of what the
> implementation knows, and hence the implementation must
> treat it in a particular way. In a sense, the word 'volatile'
> says to the implementation, "You don't know what's going on
> here, so don't imagine that you do."

But the implementation has to make some assumptions.

As I mentioned above, one might be atomic access to memory,
or that other accesses are atomic.

(snip)

>> Yes. My feeling is that the keyword is there to allow for
>> implementation defined behavior. [snip]

> The Standard contradicts this idea.

How about in earlier times?

(snip)

>> It does seem that the compiler should follow the code as
>> written. If there is one reference to a variable, it should be
>> referenced once, for the implementation dependent definition of
>> reference.

>> volatile int x;
>> y=2*x;
>> z=x+x;

>> In this case, y should always be even, z has the possibility
>> of not being even, and the compiler should allow for that.

> Not even that. Even ignoring the possible undefined behavior
> because of overflow, after the second statement is done
> y could have any value at all, because accessing 'x' in
> 'z = x + x;' might have the side effect of storing into
> y, and the compiler isn't allowed to know that or assume
> that it doesn't happen. Any use of y after the second
> assignment statement must refetch y, for just this reason.

Even if y isn't volatile?

(I specifically only made x volatile.)

-- glen


Tim Rentsch

unread,
Dec 22, 2012, 12:16:53 PM12/22/12
to
Shao Miller <sha0....@gmail.com> writes:

> On 12/22/2012 01:54, Tim Rentsch wrote:
>> Shao Miller <sha0....@gmail.com> writes:
>>
>>>> [..snip..snip..snip..]
>>>
>>> I read the other response. [...snip...]
>>
>> Then apparently you didn't understand it.
>
> Well in that case, I'd really like to enhance that understanding,
> if possible. :) [snip]

My suggestions are: read more carefully; think more deeply; try
to organize your thoughts more systematically; and make an effort
in your writing to express ideas more clearly and more concisely.

Tim Rentsch

unread,
Dec 22, 2012, 12:32:19 PM12/22/12
to
glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:

> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>>
>> [snip]
>>
>> The only rule is that access to an object through a volatile-qualified
>> type must be done "naively", ie, according to straightforward rules
>> of expression evaluation and not optimized out. (The Standard has
>> a more precise definition, but this is the gist.) Anything beyond
>> that must be managed by the programmer, not the implementation.
>
> But if you can't say which ways data might be modified, then it is
> pretty hard to expect compilers to account for those ways.

Right, they can't. The Standard effectively prohibits them from
even trying.

>>> [snip]
>>
>> The point of declaring something 'volatile' is that how it
>> behaves when accessed is outside the domain of what the
>> implementation knows, and hence the implementation must
>> treat it in a particular way. In a sense, the word 'volatile'
>> says to the implementation, "You don't know what's going on
>> here, so don't imagine that you do."
>
> But the implementation has to make some assumptions. [snip
> elaboration]

Exactly the opposite: effectively the Standard requires that
the implementation make _no_ assumptions, and simply proceed
blindly doing naive code generation.

>>> Yes. My feeling is that the keyword is there to allow for
>>> implementation defined behavior. [snip]
>
>> The Standard contradicts this idea.
>
> How about in earlier times?

The writing in the Standard giving the semantics of volatile is
unchanged since C90.

>>> [these lines from glen herrmannsfeldt]
>>> It does seem that the compiler should follow the code as
>>> written. If there is one reference to a variable, it should be
>>> referenced once, for the implementation dependent definition of
>>> reference.
>>>
>>> volatile int x;
>>> y=2*x;
>>> z=x+x;
>>>
>>> In this case, y should always be even, z has the possibility
>>> of not being even, and the compiler should allow for that.
>>
>> Not even that. Even ignoring the possible undefined behavior
>> because of overflow, after the second statement is done
>> y could have any value at all, because accessing 'x' in
>> 'z = x + x;' might have the side effect of storing into
>> y, and the compiler isn't allowed to know that or assume
>> that it doesn't happen. Any use of y after the second
>> assignment statement must refetch y, for just this reason.
>
> Even if y isn't volatile? [noting only x is volatile]

Yes. Accessing a volatile object may have any side effect(s)
whatsoever, including modifying unrelated variables; moreover
the Standard stipulates that such effects are unknown, in
particular to the implementation (which includes the compiler).
That's why no assumptions are safe across accessing a volatile.
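
A small sketch of what that reading would require, with hypothetical
names. It only marks where the refetch of y would have to happen under
the interpretation above; it does not settle whether that interpretation
is correct.

```c
/* Reads *y, performs one volatile read of *x, then reads *y again.
   Returns nonzero if the two reads of *y differ. */
int y_changed_across_volatile_read(volatile int *x, int *y)
{
    int before = *y;
    int ignored = *x;   /* volatile access with unknown side effects */
    (void)ignored;
    int after = *y;     /* under the reading above, must be fetched again */
    return after != before;
}
```

On ordinary memory with nothing else touching y, the function returns 0.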

Ben Bacarisse

unread,
Dec 22, 2012, 10:48:09 PM12/22/12
to
Yes, I think I understand your position now, but I am not 100% in
agreement. Unfortunately I have very little time, but a long, helpful
reply like this deserves *some* reply so I will do what I can...

Your argument that the side effects referred to in 6.5 p2 can't include
those that occur due to volatile access does not hold water for me. The
standard defines what a side effect is as far as volatile objects are
concerned, and it's the access, not any other action that might occur at
some other time. With my reading of things, simply declaring a volatile
object would not provoke undefined behaviour because it is only the
access that is the side effect -- not any consequences of it, not any
"out of band" updates to it that might occur.

Yes, the consequences of a volatile access are unconstrained, and the
standard has no business restricting what an implementation may do in
such circumstances, but the standard *does* have a legitimate role in
determining that some programs are undefined as far as it is concerned.
One thing that makes the behaviour undefined is the situation described
in 6.5 p2.

--
Ben.

glen herrmannsfeldt

unread,
Dec 23, 2012, 4:38:04 AM12/23/12
to
Tim Rentsch <t...@alumni.caltech.edu> wrote:
> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
>> Tim Rentsch <t...@alumni.caltech.edu> wrote:

(snip)
>>> The only rule is that access to an object through a volatile-qualified
>>> type must be done "naively", ie, according to straightforward rules
>>> of expression evaluation and not optimized out. (The Standard has
>>> a more precise definition, but this is the gist.) Anything beyond
>>> that must be managed by the programmer, not the implementation.

But are the rules that straightforward? Exactly when is the
implementation allowed to sample the value? Is the access atomic or not?
What about metastability?

>> But if you can't say which ways data might be modified, then it is
>> pretty hard to expect compilers to account for those ways.

> Right, they can't. The Standard effectively prohibits them from
> even trying.

(snip, I wrote)

>> But the implementation has to make some assumptions. [snip
>> elaboration]

> Exactly the opposite: effectively the Standard requires that
> the implementation make _no_ assumptions, and simply proceed
> blindly doing naive code generation.

A very common problem in doing HDL (Verilog or VHDL) design with
asynchronous inputs is that some bits get latched on one clock
cycle and others don't. In some cases, a Gray code counter can
be used, such that only one bit changes, and either the previous
value or new value is latched, but no other value.

Yet I believe that the implementation doesn't have to worry
about that.
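
The Gray code trick mentioned above can be sketched in C. The function
names are hypothetical; the conversion itself is the standard reflected
binary Gray code, n XOR (n shifted right by one).

```c
/* Converts a binary count to its reflected Gray code. */
unsigned to_gray(unsigned n)
{
    return n ^ (n >> 1);
}

/* Counts how many bit positions differ between a and b. */
unsigned bits_changed(unsigned a, unsigned b)
{
    unsigned diff = a ^ b;
    unsigned count = 0;
    while (diff != 0) {
        count += diff & 1u;
        diff >>= 1;
    }
    return count;
}
```

Going from 7 to 8 in plain binary flips four bits, any subset of which
might be latched late. The Gray codes of 7 and 8 differ in exactly one
bit, so a sampler sees either the old value or the new one.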

That is not quite the same as atomic access. If, for example,
one was doing asynchronous I/O the device may write bytes, and
only some bytes of a value may have changed.

Values may load or store in bytes or words smaller than the
data type in question, even though it is a single operation
at the C level. Should the implementation assume that it
does or doesn't?

Finally, should the implementation assume that no alpha particles
come through and flip bits stored in memory?

This is reminding me of my least favorite part of mathematics,
where you figure out the axioms, the things that you can assume
without need to prove them. You do have to start somewhere, and
those may seem so obvious, but maybe not always so obvious.

(And, if I remember right, you were a math major.)

-- glen

Tim Rentsch

unread,
Dec 23, 2012, 12:55:19 PM12/23/12
to
glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:

> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
>>> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>>>
>>>> The only rule is that access to an object through a volatile-qualified
>>>> type must be done "naively", ie, according to straightforward rules
>>>> of expression evaluation and not optimized out. (The Standard has
>>>> a more precise definition, but this is the gist.) Anything beyond
>>>> that must be managed by the programmer, not the implementation.
>
> But are the rules that straightforward?

Yes, the rules of expression evaluation are straightforward.
Statements are done one at a time, in order; values of operands
are calculated before the operation on those operands starts;
side effects of an operation start when the operation starts and
may finish any time before finishing the (largest) containing
expression (or reaching the next sequence point, whichever comes
first). I'm sure there are some details that I've left out, but
the character of those is similar to the ones here.

> Exactly when is the implementation allowed to sample the value?

The same as any other object access under a naive implementation
of the abstract machine.

> Is the access atomic or not? What about metastability?

Implementations are allowed, and in fact required, to define
how accesses through volatile-qualified references are done.
So /how/ a volatile object is read, or written, is up to the
implementation. But /what happens/ as a consequence of any
such access is outside the domain of what the implementation
is allowed to know.

> Values may load or store in bytes or words smaller than the
> data type in question, even though it is a single operation
> at the C level. Should the implementation assume that it
> does or doesn't?

Neither. The implementation defines how access to a volatile
object is to be done; if that method of access gets good values,
then it gets good values, and if it gets bad values, then it gets
bad values. The implementation is not responsible for hardware
that behaves in ways the implementation doesn't expect; in
effect, the implementation states its expectations, and if those
expectations are met by the underlying hardware then everything's
jake, and if not, then, well, the program shouldn't be using
volatile to access those memory locations. There is no guarantee
that volatile access maps onto something useful in the face of
peculiar hardware mechanisms.


> Finally, should the implementation assume that no alpha particles
> come through and flip bits stored in memory?

An implementation may expect it is running on a data-processing
system capable of executing the compiled code faithfully. Whether
or not that is true (including any effects of alpha particles) is
explicitly outside the scope of what the Standard addresses.

> This is reminding me of my least favorite part of mathematics,
> where you figure out the axioms, the things that you can assume
> without need to prove them. You do have to start somewhere, and
> those may seem so obvious, but maybe not always so obvious.

I think the problem you're having is either, one, thinking the
Standard is defining how actual program execution will behave, or
two, that implementations are responsible for what actually occurs
during program execution. Neither of these is true. The Standard
gives requirements for how implementations (ie, mostly compilers)
must behave, and what they are obligated to provide in their
documentation, but this does not extend to managing arbitrarily
unruly hardware. An implementation of C does what it does; if
that is helpful for a particular operating environment, fine, but
if not then C expects developers to go outside the language and
use other mechanisms. C is not expected to be a solution to all
problems, and I believe that is a strength, not a weakness.

================

P.S. I hate the way your newsreader does its quoting. The
blank lines put in the middle of blocks of quoted text make
the earlier conversations harder to follow, not easier.

Tim Rentsch

unread,
Dec 23, 2012, 1:22:59 PM12/23/12
to
(Since you have responded briefly I will also, on just one
point. We can get back to the others sometime later if
that's important.)

Can you identify for me which scalar object is modified
twice as a result of executing 'int i = x + x;', with x
being declared volatile? Remember, a volatile-qualified
access is a side effect, but it is a non-specific one, not
one that modifies the object being accessed. There is no
guarantee that the side effect of accessing a particular
volatile object will modify any given scalar object, or even
any object at all.

If an implementation cannot identify a scalar object that is
going to be modified twice by the volatile accesses, then it
must behave as if 6.5 p2 has not been violated (by those
accesses), for indeed it may not have been.

Shao Miller

unread,
Dec 24, 2012, 4:23:04 AM12/24/12
to
On 12/22/2012 12:16, Tim Rentsch wrote:
> Shao Miller <sha0....@gmail.com> writes:
>
>> On 12/22/2012 01:54, Tim Rentsch wrote:
>>> Shao Miller <sha0....@gmail.com> writes:
>>>
>>>>> [..snip..snip..snip..]
>>>>
>>>> I read the other response. [...snip...]
>>>
>>> Then apparently you didn't understand it.
>>
>> Well in that case, I'd really like to enhance that understanding,
>> if possible. :) [snip]
>
> My suggestions are: read more carefully; think more deeply; try
> to organize your thoughts more systematically; and make an effort
> in your writing to express ideas more clearly and more concisely.
>

The response immediately above does not appear to be an instance of
valuable discussion. A response to the points, instead of the person,
would be more valuable.

"Read more carefully":

3.1
1 access
〈execution-time action〉 to read or modify the value of an object
2 NOTE 1 Where only one of these two actions is meant, ‘‘read’’ or
‘‘modify’’ is used.
...

5.1.2.3 Program execution
...
2 Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,12) which are changes in the state of the execution environment.
Evaluation of an expression in general includes both value computations
and initiation of side effects. Value computation for an lvalue
expression includes determining the identity of the designated object.
...

An access to a volatile object, not just a modification, is a side
effect, given just these. If only a modification of a volatile is
always to be considered a side effect in all cases, then either 3.1p2
has missed 5.1.2.3p2, or 5.1.2.3p2 requires further refinement.

6.7.3 Type qualifiers
...
7 An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects.
Therefore any expression referring to such an object shall be evaluated
strictly according to the rules of the abstract machine, as described in
5.1.2.3. Furthermore, at every sequence point the value last stored in
the object shall agree with that prescribed by the abstract machine,
except as modified by the unknown factors mentioned previously.134) What
constitutes an access to an object that has volatile-qualified type is
implementation-defined.
...

If you believe that this last sentence means that an implementation has
license to define that a read of a volatile-qualified scalar's value
does not constitute a side effect, please say so. If you believe that
the last sentence is a refinement of 5.1.2.3p2, please say so. If you
mean to support such an argument by noting that such a read might not
(or does not) result in a change in the state of the execution
environment (from the perspective of the abstract machine, perhaps),
please say so.

6.5 Expressions
...
2 If a side effect on a scalar object is unsequenced relative to
either a different side effect on the same scalar object or a value
computation using the value of the same scalar object, the behavior is
undefined. If there are multiple allowable orderings of the
subexpressions of an expression, the behavior is undefined if such an
unsequenced side effect occurs in any of the orderings.84)
...

If a read of a volatile scalar is always a side effect, then two reads
of the same volatile scalar are two side effects. If two such reads are
unsequenced, then two such reads are two unsequenced side effects on the
same scalar object. If this is the case, the behaviour is undefined.
If, mysteriously, only one of two such reads is a side effect, the other
is a value computation of the same scalar object, and the behaviour is
undefined. If neither such read is a side effect, then the behaviour is
not undefined. If you disagree with this logic, please say so.

volatile int x = 42;
int y = x + x;

If the behaviour is always undefined, then we can be careful to avoid it
and implementations might choose to warn about it. This seems
reasonable, to me. I suspect this is the case.

If the behaviour is always implementation-defined, then we can always
determine the behaviour by reading the implementation's definition for
what constitutes a volatile access. This seems reasonable, to me. Is
this what you suggest?

Are there other possibilities?
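
Whichever reading is right, the question can be sidestepped in practice
by never placing two reads of the same volatile object in one expression.
A minimal sketch, with a hypothetical function name: each read is its own
full expression, so the two reads are sequenced and 6.5p2 is not in play.

```c
/* Reads *p twice, with a sequence point between the two reads, and
   returns the sum of the two values read. */
int sum_two_reads(volatile int *p)
{
    int first = *p;    /* first volatile read; the full expression ends here */
    int second = *p;   /* second volatile read, sequenced after the first */
    return first + second;
}
```

If nothing modifies the object between the two reads, this yields the
same value that x + x was meant to compute.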

- Shao Miller

Ken Brody

unread,
Dec 26, 2012, 1:29:12 PM12/26/12
to
On 12/22/2012 3:15 AM, Tim Rentsch wrote:
> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
[... mega-snip ...]
>> volatile int x;
>> y=2*x;
>> z=x+x;
>>
>> In this case, y should always be even, z has the possibility
>> of not being even, and the compiler should allow for that.
>
> Not even that. Even ignoring the possible undefined behavior
> because of overflow, after the second statement is done
> y could have any value at all, because accessing 'x' in
> 'z = x + x;' might have the side effect of storing into
> y, and the compiler isn't allowed to know that or assume
> that it doesn't happen. Any use of y after the second
> assignment statement must refetch y, for just this reason.

I agree with just about everything you've said, up until this point.
Assuming that "y" is a non-volatile value, then the accesses of "x" in
"z=x+x" are not allowed to change the value of "y". (And, if "y" is a
volatile value, then it can change at any time, for any reason, and the
accesses of "x" are irrelevant to this fact.)

Ken Brody

unread,
Dec 26, 2012, 1:35:58 PM12/26/12
to
On 12/22/2012 12:32 PM, Tim Rentsch wrote:
> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
>
>> Tim Rentsch <t...@alumni.caltech.edu> wrote:
[...]
>>>> volatile int x;
>>>> y=2*x;
>>>> z=x+x;
>>>>
>>>> In this case, y should always be even, z has the possibility
>>>> of not being even, and the compiler should allow for that.
>>>
>>> Not even that. Even ignoring the possible undefined behavior
>>> because of overflow, after the second statement is done
>>> y could have any value at all, because accessing 'x' in
>>> 'z = x + x;' might have the side effect of storing into
>>> y, and the compiler isn't allowed to know that or assume
>>> that it doesn't happen. Any use of y after the second
>>> assignment statement must refetch y, for just this reason.
>>
>> Even if y isn't volatile? [noting only x is volatile]
>
> Yes. Accessing a volatile object may have any side effect(s)
> whatsoever, including modifying unrelated variables; moreover
> the Standard stipulates that such effects are unknown, in
> particular to the implementation (which includes the compiler).
> That's why no assumptions are safe across accessing a volatile.

If "y" is not defined as "volatile", then the compiler is free to assume
that its value does not change unless explicitly changed. Otherwise, the
mere presence of an access to a volatile object would mean that every byte
of memory could change, and all optimization based on "this value won't
change" must be thrown out the window.

Ben Bacarisse

unread,
Dec 27, 2012, 5:34:56 AM12/27/12
to
Tim Rentsch <t...@alumni.caltech.edu> writes:
<snip -- I hope not too dramatic a snip>
>> Yes, the consequences of a volatile access are unconstrained, and the
>> standard has no business restricting what an implementation may do in
>> such circumstances, but the standard *does* have a legitimate role in
>> determining that some programs are undefined as far as it is concerned.
>> One thing that makes the behaviour undefined is the situation described
>> in 6.5 p2.
>
> (Since you have responded briefly I will also, on just one
> point. We can get back to the others sometime later if
> that's important.)
>
> Can you identify for me which scalar object is modified
> twice as a result of executing 'int i = x + x;', with x
> being declared volatile? Remember, a volatile-qualified
> access is a side effect, but it is a non-specific one, not
> one that modifies the object being accessed. There is no
> guarantee that the side effect of accessing a particular
> volatile object will modify any given scalar object, or even
> any object at all.

Agreed, but I don't see why that is relevant. The wording has changed, I
thought, so that side-effects of any type are the issue. So 6.5 p2
talks only about side-effects on scalar objects, and simply accessing a
volatile object constitutes a side effect. What, if anything, changes
is not obviously relevant.

> If an implementation cannot identify a scalar object that is
> going to be modified twice by the volatile accesses, then it
> must behave as if 6.5 p2 has not been violated (by those
> accesses), for indeed it may not have been.

My view is that 6.5 p2 makes it undefined if there are two unsequenced
accesses, no matter what the actual consequences are.

--
Ben.

Tim Rentsch

unread,
Dec 27, 2012, 10:55:20 AM12/27/12
to
Ken Brody <kenb...@spamcop.net> writes:

> On 12/22/2012 3:15 AM, Tim Rentsch wrote:
>> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
> [... mega-snip ...]
>>> volatile int x;
>>> y=2*x;
>>> z=x+x;
>>>
>>> In this case, y should always be even, z has the possibility
>>> of not being even, and the compiler should allow for that.
>>
>> Not even that. Even ignoring the possible undefined behavior
>> because of overflow, after the second statement is done
>> y could have any value at all, because accessing 'x' in
>> 'z = x + x;' might have the side effect of storing into
>> y, and the compiler isn't allowed to know that or assume
>> that it doesn't happen. Any use of y after the second
>> assignment statement must refetch y, for just this reason.
>
> I agree with just about everything you've said, up until this
> point. Assuming that "y" is a non-volatile value, then the
> accesses of "x" in "z=x+x" are not allowed to change the value
> of "y". [snip elaboration]

Look at the wording of 6.7.3 p 6, specifically the first
sentence:

An object that has volatile-qualified type may be modified
in ways unknown to the implementation _or have other unknown
side effects_. [my emphasis]

The Standard puts no limitations on what the unknown side effects
might be, only that they are unknown (ie, both to the Standard
and to the implementation). Hence those unknown side effects may
include changing (some of) the memory locations holding 'y';
whatever occurs is, by definition, outside the ability of the
implementation to control or even be aware of.

Tim Rentsch

unread,
Dec 27, 2012, 11:01:05 AM12/27/12
to
Ken Brody <kenb...@spamcop.net> writes:

> On 12/22/2012 3:15 AM, Tim Rentsch wrote:
>> glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
> [... mega-snip ...]
>>> volatile int x;
>>> y=2*x;
>>> z=x+x;
>>>
>>> In this case, y should always be even, z has the possibility
>>> of not being even, and the compiler should allow for that.
>>
>> Not even that. Even ignoring the possible undefined behavior
>> because of overflow, after the second statement is done
>> y could have any value at all, because accessing 'x' in
>> 'z = x + x;' might have the side effect of storing into
>> y, and the compiler isn't allowed to know that or assume
>> that it doesn't happen. Any use of y after the second
>> assignment statement must refetch y, for just this reason.
>
> I agree with just about everything you've said, up until this
> point. Assuming that "y" is a non-volatile value, then the
> accesses of "x" in "z=x+x" are not allowed to change the value
> of "y". [snip elaboration]

[correcting the bad reference in my last reply]

Look at the wording of 6.7.3 p 7, specifically the first

Tim Rentsch

unread,
Dec 27, 2012, 12:03:21 PM12/27/12
to
Actually the implementation is prohibited from making any
such assumption (or more accurately, acting on one). Again
look at the wording of 6.7.3 p 7, this time the second
sentence:

Therefore any expression referring to such an object shall
be evaluated strictly according to the rules of the abstract
machine, as described in 5.1.2.3.

The expression 'z=x+x;' must be evaluated (ie, in the actual
machine) strictly according to the rules of 5.1.2.3. Part of
those rules require that the evaluations (again, in the actual
machine) of all previous expressions be done /before/ evaluating
'z=x+x;', because of the sequence point between this full
expression and the one before, and the evaluations (again, in the
actual machine) of all subsequent expressions be done /after/
evaluating 'z=x+x;', because of the sequence point between this
full expression and the one after. These evaluations include any
access to y; in particular, any use of y's value in an expression
subsequent to 'z=x+x;' must actually read the corresponding
object, and not rely on some cached or assumed value (for the
first such subsequent access; naturally others may be optimized).
Otherwise the stipulations of 6.7.3 p 7 and 5.1.2.3 are not being
followed.

> Otherwise, the mere presence of an access to a volatile object
> would mean that every byte of memory could change, and all
> optimization based on "this value won't change" must be thrown
> out the window.

If we take the Standard at its word, then Yes: any access to a
volatile object prevents all optimizations from flowing across
said access. (Optimizations in between two volatile accesses are
allowed, of course.) Do you have some reason to believe the
meaning intended for this part of the Standard is different from
what the text apparently implies? More specifically, do you have
any evidence to offer that might support such a belief?

Tim Rentsch

unread,
Dec 27, 2012, 1:16:54 PM12/27/12
to
First let me be sure I understand you correctly. The wording
describing sequencing and access rules obviously has changed
between, well, let's be specific, N1256 and N1570. Do you
believe the intended meaning of these passages is significantly
different between these two versions of the Standard? I don't
think they are (or maybe I should say I believe they are not)
significantly different. AFAIAA the intended meanings of N1256
and N1570 are (for these areas) basically the same, except that
N1570 uses more precise language and eliminates some potential
ambiguities that were present in N1256. (There is more to say
on this topic but I expect it's a side issue so I will stop
here--can expand later if that's needed.)

Second, sticking just to N1570, obviously which object is
modified makes a difference. Consider:

    int i, j, k, *p = &k;
    volatile int u, v;

    ++i + ++j;       // okay
    ++i + ++i;       // UB
    ++i + ++*p;      // okay
    ++k + ++*p;      // UB
    i = j + u;       // okay (I think most will agree)
    i = j + u + v;   // okay (IMO although some may argue)
    i = j + u + u;   // the case at issue (or much like it)

All statements have multiple side effects in them. For the lines
not involving any volatiles, the lines marked UB have undefined
behavior because a scalar object (i in one case, k in the other)
is being modified twice in unsequenced evaluations.

For the lines with volatiles, how are the conditions of 6.5 p 2
satisfied? As far as the implementation knows the only scalar
object being modified is i. Since the implementation cannot know
what (other) scalar objects (or indeed if any) are modified, or if
i is otherwise accessed, we cannot conclude there is undefined
behavior in these statements. Indeed on any actual data processing
system I'm aware of, the only object that would be modified is i,
nor would i be read. Ergo the conditions of 6.5 p 2 are not met in
such cases. Hence it does not apply, and the implementation (which
cannot know if it does) must act accordingly -- ie, as if the
volatile accesses don't do anything but access the variables
involved. (Of course the other rules about strictly following
5.1.2.3 must be followed also.)
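
As a minimal sketch of the two non-volatile "okay" lines above: the helper names below are made up for illustration, and the UB lines and the volatile lines are deliberately left out, since nothing can be promised about the former and the latter are the point under dispute.

```c
/* Hypothetical helper: i and j are distinct scalar objects, so the two
   unsequenced increments never touch the same object (6.5 p2 is not
   triggered). */
static int sum_distinct(void)
{
    int i = 0, j = 10;
    return ++i + ++j;   /* 1 + 11 */
}

/* Hypothetical helper: p points at k, not at i, so again two different
   scalar objects are modified. */
static int sum_via_pointer(void)
{
    int i = 0, k = 10, *p = &k;
    return ++i + ++*p;  /* 1 + 11 */
}
```

Both helpers return 12. Changing `p` to point at `i` would turn the second one into the `++k + ++*p` pattern, which is undefined.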

>> If an implementation cannot identify a scalar object that is
>> going to be modified twice by the volatile accesses, then it
>> must behave as if 6.5 p2 has not been violated (by those
>> accesses), for indeed it may not have been.
>
> My view is that 6.5 p2 makes it undefined if there are two
> unsequenced accesses, no matter what the actual consequences
> are.

How do you reconcile this viewpoint with 6.5 p 2 talking only
about cases where there is a side effect on a scalar object?
Certainly there are other kinds of side effects, such as
modifying floating point status bits, that are known not to
involve modifying any scalar object. Surely 6.5 p 2 is meant
to apply only to accesses that modify a (particular) scalar
object, and also have another unsequenced access to that same
scalar object -- isn't it?

Maybe I see what you're getting at -- accessing a volatile object
is supposed to count as a side effect "on that scalar object". But
I don't see any text in the Standard that supports that conclusion.
Can you offer any? Or am I still misunderstanding you?

Tim Rentsch

unread,
Dec 27, 2012, 2:07:13 PM12/27/12
to
Shao Miller <sha0....@gmail.com> writes:

> On 12/22/2012 12:16, Tim Rentsch wrote:
>> Shao Miller <sha0....@gmail.com> writes:
>>
>>> On 12/22/2012 01:54, Tim Rentsch wrote:
>>>> Shao Miller <sha0....@gmail.com> writes:
>>>>
>>>>>> [..snip..snip..snip..]
>>>>>
>>>>> I read the other response. [...snip...]
>>>>
>>>> Then apparently you didn't understand it.
>>>
>>> Well in that case, I'd really like to enhance that understanding,
>>> if possible. :) [snip]
>>
>> My suggestions are: read more carefully; think more deeply; try
>> to organize your thoughts more systematically; and make an effort
>> in your writing to express ideas more clearly and more concisely.
>
> The response immediately above does not appear to be an instance of
> valuable discussion. A response to the points, instead of the person,
> would be more valuable. [snip]

What I was responding to was your statement. You said you'd like
a better understanding, and I gave some suggestions for what to
do to achieve that. I'm sorry if my ideas there weren't along
the same lines as your own. If you're still interested, how
about going back to my long posting in the other subthread and
responding to that by just paraphrasing what I was saying there
to make sure you're understanding my comments. I think that
would be a good way to continue.

Ken Brody

unread,
Dec 27, 2012, 9:10:17 PM12/27/12
to
Perhaps, but the compiler is still free to assume that anything *not* marked
"volatile" *won't* change unless explicitly changed.


Ken Brody

unread,
Dec 27, 2012, 9:31:18 PM12/27/12
to
(Sorry for not snipping, as there didn't appear to be anything to snip.)
Well, I will fully admit that I am no expert in Standardese. However, I
find it hard to believe that the intent was that *any* access to *any*
volatile object means that the compiler *must* assume that *all*
non-volatile objects may have changed value.

Consider:

    extern volatile int x;

    int foo(void)
    {
        const int y = x;
        int z = x + x;

        /* Must the compiler assume that "y" may have changed? */

        return y;
    }

I can tell you that "gcc -O9 -ansi" assumes that "y" has not changed.
(Ditto without the "const".)
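
For anyone who wants to try this, here is a self-contained sketch of the same idea. It rests on two assumptions that are not in the code above: x is defined here as an ordinary object that nothing external touches, and the two later reads of x are put in separate statements so that the 6.5 p2 question about 'x + x' does not get mixed in.

```c
/* Assumption: x is a plain object with no hardware, debugger, or other
   agent modifying it behind the program's back. */
volatile int x = 21;

int foo(void)
{
    const int y = x;   /* first volatile read */
    int z = x;         /* second volatile read */
    z += x;            /* third volatile read, sequenced after the second */
    (void)z;           /* z is only there to force the reads */

    /* Must the compiler assume that "y" may have changed? */
    return y;
}
```

On an ordinary hosted implementation I would expect foo() to return 21 whether or not y is refetched; the disagreement is about whether the Standard permits the compiler to rely on that.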

Philip Lantz

unread,
Dec 28, 2012, 4:39:00 AM12/28/12
to
Tim Rentsch wrote:
> Ken Brody writes:
> > Tim Rentsch wrote:
> object, ...

Okay, but who's to say where the corresponding object is? If y is not
volatile and its address is not taken, isn't the compiler free to store
its value wherever it likes? And isn't the compiler free to change that
location at its convenience? If whatever machinery lives behind the
volatile nature of x is able to find where the compiler is keeping y at
the present time, then let it go ahead and change it if it can. That
doesn't force the compiler to alter its behavior with respect to y.

Philip Lantz

unread,
Dec 28, 2012, 4:55:57 AM12/28/12
to
Ken Brody wrote:
> Tim Rentsch wrote:
Another way to look at this (as I alluded to in my previous post) is
that gcc chooses to store y in eax (assuming for the sake of discussion
an x86 compiler; substitute whatever location your favorite compiler
uses for an int return value). I assume that all will agree that it is
the compiler's prerogative where to store the values of variables. If
the subsequent accesses to x manage to change eax, then y will change.

(Not that it matters for this theoretical discussion, but this isn't
totally hypothetical: the access to x could trigger a virtualization
event or an SMI, and the VMM or SMM could easily change the value in a
register.)

glen herrmannsfeldt

unread,
Dec 28, 2012, 6:31:40 AM12/28/12
to
Philip Lantz <p...@canterey.us> wrote:
> Tim Rentsch wrote:
>> Ken Brody writes:
(snip)
It used to be that computers had front panels with switches and lights
that would let you stop programs, alter memory, and continue.

Now, many debuggers let us do that.

I suppose one could use volatile to allow one to change variables
at surprising times through the front panel (or virtualization
of one) or a debugger.

-- glen

Keith Thompson

unread,
Dec 28, 2012, 11:47:19 AM12/28/12
to
glen herrmannsfeldt <g...@ugcs.caltech.edu> writes:
[...]
> It used to be that computers had front panels with switches and lights
> that would let you stop programs, alter memory, and continue.
>
> Now, many debuggers let us do that.
>
> I suppose one could use volatile to allow one to change variables
> at surprising times through the front panel (or virtualization
> of one) or a debugger.

More likely you already *can* change variables through such means;
using volatile lets the program behave properly in the presence of
such changes.
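
A minimal sketch of that point, using a signal handler as the stand-in for the outside agent (an assumption for illustration; the front-panel and debugger cases cannot be shown portably). The names got_signal, on_signal and wait_for_flag are hypothetical.

```c
#include <signal.h>

/* The flag is changed outside the normal flow of the loop below, so it
   is declared volatile sig_atomic_t. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Spins until the handler has set the flag; returns the number of loop
   iterations executed. The signal is raised synchronously on the third
   iteration so the sketch terminates on its own. */
static int wait_for_flag(void)
{
    int spins = 0;

    signal(SIGINT, on_signal);
    while (!got_signal) {
        if (++spins == 3)
            raise(SIGINT);   /* handler runs before raise() returns */
    }
    return spins;
}
```

Without the volatile qualifier, a compiler would be entitled to read got_signal once and never look at it again inside the loop.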

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Tim Rentsch

unread,
Dec 29, 2012, 4:43:09 PM12/29/12
to
What was/is expected, or intended, by the various Standards' authors
is a much more complicated question. Certainly it seems likely that
different people had different understandings, and that in some
cases even those who read the exact wording carefully and agreed to
it didn't fully realize the implications. Putting the issue in the
form of questions, we might ask: Did the C11 committee intend a
(single) other interpretation? If so, what is it? Is there any
evidence (eg, in DR's, Rationale documents, or other committee
writings) that supports a different interpretation, or that some
particular alternative interpretation was generally intended? To me
the last question is the key one -- unless there is some evidence
supporting a different viewpoint, the phrasing used in the Standard
implies accessing a volatile must act as a strict demarcation point
between all previous evaluations and all subsequent evaluations,
even if that has severe consequences for optimization.

> Consider:
>
>     extern volatile int x;
>
>     int foo(void)
>     {
>         const int y = x;
>         int z = x + x;
>
>         /* Must the compiler assume that "y" may have changed? */
>
>         return y;
>     }
>
> I can tell you that "gcc -O9 -ansi" assumes that "y" has not
> changed. (Ditto without the "const".)

Discrepancies like this one have been observed and discussed both in
the newsgroups and in at least one widely cited paper. These
discussions clearly show that there are different opinions about
what the Standard is supposed to mean; however, that by itself
doesn't say anything about what the actual text of the Standard
requires when read at face value, or about how its authors expected
it to be read or what they intend it to mean.

That last paragraph didn't come out too well. Basically, what
the Standard says is not the same as what this or that group
thinks it says. I believe the behavior with gcc is an instance
of the latter, not the former.

Tim Rentsch

unread,
Dec 29, 2012, 8:45:06 PM12/29/12
to
I believe it's true that an independent scalar variable may be kept
in any single location, and even that the single location may change
as a function of the progress in executing the surrounding function.
But that's not really the issue here; the question is what happens
if y is "stored" in one place but its value is taken from another,
or more elaborate cases where the value could otherwise be assumed
to be unchanging, but now it can't be, eg,

    y = 7;
    z = x + x;        // x is volatile
    if( y & 1 ){      // body might be skipped!
        ...
    }

The if() expression must be re-evaluated because of the intermediate
volatile access, and that does not depend on which location holds
the "object" for y.

The rule is, whatever the compiler does, it must act like it is
storing y in a single, addressable object, and is accessing that
object for each evaluated expression that uses y. The rules for
volatile make it harder to disguise violations of that rule.

> If whatever machinery lives behind the volatile nature of x is
> able to find where the compiler is keeping y at the present time,
> then let it go ahead and change it if it can.

The 'volatile daemon' machinery always can, because the compiler has
to pick first -- the compiler is not allowed to know what the
machinery will do, but the machinery always can discover what the
compiler did, and then adjust its actions accordingly. This
'not-possible-for-the-compiler-to-know' is the essence of volatile.

> That doesn't force the compiler to alter its behavior with respect
> to y.

In fact it does, for some behaviors that would otherwise be
admissible in the absence of an intervening volatile access. Having
y be in a "surprising" location doesn't change the implications for
what the compiler may assume (or more pointedly, what it must not
assume) after the point of a volatile access.

Tim Rentsch

unread,
Dec 29, 2012, 9:03:18 PM12/29/12
to
This assertion doesn't jibe with what the Standard says about how
expressions that access volatiles must behave, as I have explained
in more detail elsethread.

glen herrmannsfeldt

unread,
Dec 29, 2012, 11:06:12 PM12/29/12
to
Tim Rentsch <t...@alumni.caltech.edu> wrote:
> Philip Lantz <p...@canterey.us> writes:

(snip)
>> Okay, but who's to say where the corresponding object is? If y is
>> not volatile and its address is not taken, isn't the compiler free
>> to store its value wherever it likes? And isn't the compiler free
>> to change that location at its convenience?

> I believe it's true that an independent scalar variable may be kept
> in any single location, and even that the single location may change
> as a function of the progress in executing the surrounding function.
> But that's not really the issue here; the question is what happens
> if y is "stored" in one place but its value is taken from another,
> or more elaborate cases where the value could otherwise be assumed
> to be unchanging, but now it can't be, eg,

> y = 7;
> z = x + x; // x is volatile
> if( y & 1 ){ // body might be skipped!
> ...
> }

> The if() expression must be re-evaluated because of the intermediate
> volatile access, and that does not depend on which location holds
> the "object" for y.

> The rule is, whatever the compiler does, it must act like it is
> storing y in a single, addressable object, and is accessing that
> object for each evaluated expression that uses y. The rules for
> volatile make it harder to disguise violations of that rule.

Well, with aliasing rules it almost has to do that anyway.
Fortran programmers complain about aliasing in C, and how it
reduces the ability for optimization.

Now, if z and x aren't dereferencing any pointers, then maybe not,
but then again how hard do C compilers work to figure out what can
be moved, when aliasing is so likely.

>> If whatever machinery lives behind the volatile nature of x is
>> able to find where the compiler is keeping y at the present time,
>> then let it go ahead and change it if it can.

> The 'volatile daemon' machinery always can, because the compiler has
> to pick first -- the compiler is not allowed to know what the
> machinery will do, but the machinery always can discover what the
> compiler did, and then adjust its actions accordingly. This
> 'not-possible-for-the-compiler-to-know' is the essence of volatile.

>> That doesn't force the compiler to alter its behavior with respect
>> to y.

> In fact it does, for some behaviors that would otherwise be
> admissible in the absence of an intervening volatile access. Having
> y be in a "surprising" location doesn't change the implications for
> what the compiler may assume (or more pointedly, what it must not
> assume) after the point of a volatile access.

-- glen

Tim Rentsch

unread,
Jan 2, 2013, 1:08:17 PM1/2/13
to
Sure, having pointers makes it harder to do optimization, but
C compilers still do a pretty good job, and don't seem to have
too much trouble doing so. Also, having those things be in the
compiler gives tremendous leverage - I'd much rather program
in C with pointers than in Fortran without.

Ben Bacarisse

unread,
Jan 10, 2013, 12:26:27 PM1/10/13
to
I find it hard enough to follow what is written without trying to deduce
what is intended. Others have said that no change in meaning is
intended but who am I to say either way? My remarks were based solely
on what is written in N1570.

You ask several questions, but it's clear from your last remark what the
difference of option is, so permit me to answer only there.
No, that's exactly it. The new wording -- the switch from talking about
modification to talking about side effects -- now includes accesses to
volatile objects, at least in my naive reading of 5.1.2.3 p2:

"Accessing a volatile object, modifying an object, modifying a file,
or calling a function that does any of those operations are all side
effects,12) which are changes in the state of the execution
environment. Evaluation of an expression in general includes both
value computations and initiation of side effects. Value computation
for an lvalue expression includes determining the identity of the
designated object."

So "accessing a volatile object" and "modifying an object" are both side
effects expressed in exactly the same terms. Neither is explicitly said
to be "on the object" so I don't see how you can say that one is "on the
object" but the other is not. I'd argue that if the intent is that
modification of an object is a side effect on the object, but accessing
a volatile object is not, then that distinction needs to be made
clear. The default understanding has to be that they are both
side effects on an object.

--
Ben.

Tim Rentsch

unread,
Jan 12, 2013, 3:38:13 PM1/12/13
to
For giving answers in comp.lang.c, I often (or usually?) find
that looking for the intended meaning results in better answers,
ie, answers that are more consistent with statements made in
Defect Reports and the Rationale document, etc, than reading just
what's written in (one revision of) the Standard. It's different
in comp.std.c -- there the whole point is to talk about whether
or how well the Standard expresses, or could express, what is
intended (or sometimes should be intended). Here though I asked
the question only to better understand your opinion or point of
view. Who are you to say? Just the foremost and most reliable
authority as to your own conscious thoughts, and without doubt
the person most qualified to answer the question I was asking.

> You ask several questions, but it's clear from your last remark what
> the difference of option is, so permit me to answer only there.

Likewise (presuming for 'option' you meant 'opinion').
I'm not sure if I'm shocked by this last statement or merely very
surprised. On the one hand it is clear from other passages in
the Standard that there are some kinds of side effects that are
not "on a scalar object"; but, the default understanding has to
be, when the Standard does _not_ say, that a side effect must be
a side effect on a scalar object (and not just some scalar object
but the same one)? Normally I'd expect the default assumption to
be just the opposite - that a particular condition is met only
when the Standard includes some sort of explicit statement as
to the condition in question. Beyond that general reaction,
however, let's look into some specifics.

The operators that directly change scalar objects are postfix ++
and --, prefix ++ and --, and assignment.

The semantics of postfix ++ (and --) are given in section 6.5.2.4,
which says in part:

As a side effect, the value of the operand object is
incremented (that is, the value 1 of the appropriate
type is added to it). [...] The value computation of
the result is sequenced before the side effect of
updating the stored value of the operand.

The semantics of prefix ++ (and --) are given in section 6.5.3.1,
which says in part:

The expression ++E is equivalent to (E+=1). See the
discussions of additive operators and compound assignment
for information on constraints, types, side effects, [and
some other things].

The semantics of assignment are given in section 6.5.16, which
says in part:

The side effect of updating the stored value of the left
operand is sequenced after the value computations of the
left and right operands.

To me all of these explicitly reference a particular object that
the side effect will modify. Compare the passages above to the
semantics of volatile access, given in 6.7.3 p7, which says in
part:

An object that has volatile-qualified type may be modified
in ways unknown to the implementation or have other unknown
side effects. Therefore any expression referring to such an
object shall be evaluated strictly according to the rules of
the abstract machine, as described in 5.1.2.3.

Here there is no statement about what the side effects of reading
a volatile object might be. They might modify the object being
read, or they might not; they might modify some other scalar
object, or they might not; they might change some other physically
detectable state of the machine running the program, or they might
not. (The modifications mentioned in the first sentence may occur
at any time, independent of any access in program expressions.)
Certainly there is no explicit statement that reading a volatile
object modifies (or updates, or has a side effect on) an object.
That is a clear distinction with every other operation that might
be affected by 6.5 p2.
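
For contrast with the volatile case, here is a minimal sketch of the three kinds of operator quoted earlier. Each one names the object whose stored value it updates. The helper names and the r*100 + n encoding are made up for illustration; they just pack the expression's result r and the operand's final value n into a single int.

```c
/* Postfix ++: the result is the old value; the update of n is the side
   effect, sequenced after the value computation. */
static int postfix_result(void)
{
    int n = 5;
    int r = n++;         /* r == 5, n == 6 */
    return r * 100 + n;
}

/* Prefix ++: ++n is equivalent to (n += 1), so the result is the new value. */
static int prefix_result(void)
{
    int n = 5;
    int r = ++n;         /* r == 6, n == 6 */
    return r * 100 + n;
}

/* Simple assignment: the side effect is updating the stored value of the
   left operand. */
static int assign_result(void)
{
    int n = 5;
    int r = (n = 9);     /* r == 9, n == 9 */
    return r * 100 + n;
}
```

The helpers return 506, 606 and 909 respectively. In every case the object being modified (n) can be read off the expression, which is what cannot be done for the unknown side effects of a volatile read.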

Finally, there is 5.1.2.3 p9:

EXAMPLE 1 An implementation might define a one-to-one
correspondence between abstract and actual semantics: at
every sequence point, the values of the actual objects would
agree with those specified by the abstract semantics. The
keyword volatile would then be redundant.

Under your theory, volatile would not be redundant in such cases,
because adding volatile would change the semantics of programs
like the sample code that started the discussion, and so the
above EXAMPLE would be contradicted. Admittedly, this paragraph
is informative rather than normative. But, considered along with
the other cited passages, the evidence favors the position that
reading a volatile-qualified object, even though it is a side
effect, is not in and of itself "a side effect on a scalar object"
as the phrase is used in 6.5 p2.

Does this convince you? If it doesn't, do you have any other
evidence to offer that reading a volatile object constitutes, for
the purpose of 6.5 p2, "a side effect on a scalar object"?