a[a[0]] = 1; when a[0] begins with the value 0.
The general opinion was that the above invokes undefined behaviour due
to the fact that there is no sequence point in the expression, and
that the value of a[0] is used in a "value computation" unsequenced
with respect to the side-effect of the assignment operator. This is
slightly surprising, with a naive viewpoint that the above "should" be
the same as t = a[0], a[t] = 1; However, code like the above
expression is rare (I only found a single instance in the Linux kernel
source) - so this obscure case can probably be ignored to maintain the
simplicity of the sequence point rules in the standard.
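For concreteness, the "naive" reading can be spelled out with an explicit temporary, so that the read of a[0] is complete before the store begins. This is only a sketch; the function name store_via_index is made up for illustration.

```c
/* Hypothetical helper: does what a[a[0]] = 1 is "supposed" to mean,
   but with the index read finished before the assignment starts. */
void store_via_index(int *a)
{
    int t = a[0];   /* read the index; the initializer is a full expression */
    a[t] = 1;       /* separate statement, so the store follows the read */
}
```

Written this way there is no question about ordering, even when a[0] starts out as 0 and the store lands on a[0] itself.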
However, there is another, much more common case similar to the above
which appears in real-world code:
node->next->next = something; when node->next begins with the value
'node'. (A circularly linked list one link long.)
Using the offsetof() macro, and the a[b] <---> *(a + b) identity, this
new expression may be transformed into the first undefined one. This
is a problem. Should such common practice be undefined? If not, then
there may be a defect in the standard.
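A rough sketch of that transformation, assuming a simple node type (struct node, NEXT_OFF and set_next_next are names invented here, not taken from any real code):

```c
#include <stddef.h>

struct node { int data; struct node *next; };

/* Byte offset of the next member within struct node. */
#define NEXT_OFF offsetof(struct node, next)

/* Performs the same store as n->next->next = value, as one expression,
   using only casts, pointer addition and unary *. */
void set_next_next(struct node *n, struct node *value)
{
    *(struct node **)((char *)*(struct node **)((char *)n + NEXT_OFF) + NEXT_OFF) = value;
}
```

The inner *(...) reads n->next and the outer *(...) = value stores through it, which has the same shape as *(a + *a) = 1. The questionable case is the one where n->next == n, so that the object read and the object written are the same.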
One possible solution is to make pointer dereferencing (or accessing
an array) contain a sequence point between the calculation of the
pointer value and its dereference. Most users of C probably have the
mistaken impression that such a sequence point already exists. (The
abstract machine "obviously" needs to evaluate the value of an address
before that address may be dereferenced.) However, an argument can be
made that the current undefined behaviour allows compiler
optimizations that otherwise wouldn't be possible due to the knowledge
that the undefined situation cannot occur.
Steven
A regular expression that finds them (along with a few false
positives) is
grep -Er "\->(.*)\->\1 .* ?= " *
Here's what that finds (removing the != , ==, >= etc. cases) in the
Linux kernel source
arch/sparc/kernel/pci_psycho.c: pbm->sibling->sibling = pbm;
arch/sparc/kernel/pci_schizo.c: pbm->sibling->sibling = pbm;
net/tipc/bcast.h: item->next->next = NULL;
drivers/message/fusion/mptbase.c: ioc->alt_ioc->alt_ioc = NULL;
drivers/message/fusion/mptbase.c: ioc->alt_ioc->alt_ioc = NULL;
drivers/scsi/aic7xxx/aic79xx_core.c: next_scb->col_scb->col_scb = next_scb;
drivers/scsi/ipr.c: ipr_cmd->sibling->sibling = NULL;
drivers/scsi/aic7xxx_old.c: prev_p->next->next = current_p;
drivers/scsi/aic7xxx_old.c: prev_p->next->next = current_p;
Some of the above are false positives because the pointer chain is
known never to loop back to the first pointer.
Steven
>However, there is another, much more common case similar to the above
>which appears in real-world code:
>
>node->next->next = something; when node->next begins with the value
>'node'. (A circularly linked list one link long.)
This seems like a very odd thing to do with a circular list. Can you
suggest why it might happen in a real program?
I conjecture that such code probably has a bug in it anyway.
-- Richard
--
Please remember to mention me / in tapes you leave behind.
It reminds me of a simple loop detection algorithm (use two pointers;
for every iteration, one pointer advances one step and the other two
steps; if there is a loop, the "fast one" will eventually catch up with
the "slow one" from behind)
However, you wouldn't advance the "fast pointer" two steps without an
intermediate NULL check - unless you already knew that the list
looped... which of course a circular linked list does, by design.
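For reference, a minimal sketch of that two-pointer check (usually credited to Floyd), with the intermediate NULL test included. The node type and the name has_loop are assumptions made for this example.

```c
#include <stddef.h>

struct node { int data; struct node *next; };

/* Returns 1 if following next pointers from head eventually loops,
   0 if the chain ends in NULL. */
int has_loop(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* one step */
        fast = fast->next->next;    /* two steps, guarded by the checks above */
        if (slow == fast)
            return 1;               /* the fast pointer caught up */
    }
    return 0;                       /* ran off the end: no loop */
}
```

Note that this only reads node->next->next; it never assigns to it.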
BTW, the a[a[0]] construct the OP mentioned earlier reminds me of RC4...
DES
--
Dag-Erling Smørgrav - d...@des.no
>> > node->next->next = something; when node->next begins with the value
>> > 'node'. (A circularly linked list one link long.)
>> This seems like a very odd thing to do with a circular list. Can you
>> suggest why it might happen in a real program?
>It reminds me of a simple loop detection algorithm (use two pointers;
>for every iteration, one pointer advances one step and the other two
>steps; if there is a loop, the "fast one" will eventually catch up with
>the "slow one" from behind)
>
>However, you wouldn't advance the "fast pointer" two steps without an
>intermediate NULL check - unless you already knew that the list
>looped... which of course a circular linked list does, by design.
Remember that this problem - if it in fact exists - only arises if you
*assign* to node->next->next when it's circular. Merely examining it
is certainly ok.
I don't see how the quoted statement can invoke undefined behaviour,
assuming next is a pointer to the next node in an ADT (if it were a
flexible array member, then yes, it is potentially dangerous, but why
anyone would dereference a flexible array member is beyond me). It is
not at all equivalent with the a[a[0]] = foo; statement, either. There
are no side effects with the use of the arrow operator, and there is
nothing obscure about it either.
Overall, nice attempt to troll.
Did you read what sfuerst wrote? Let me spell it out for you:
#include <stdlib.h>
void list(void)
{
struct node { int data; struct node *next; };
struct node *node;
node = calloc(1, sizeof *node);
node->next = node;
node->next->next = NULL;
}
void array(void)
{
int a[5] = { -1, -1, -1, -1, -1, };
a[0] = 0;
a[a[0]] = -1;
}
The first example is a classical singly-linked list with only one
element. The second can also be viewed as a singly-linked list where
each element in the array is the index of the next element in the list,
and -1 is the equivalent of NULL - although it's not very useful, since
there is nowhere to store any actual data, unless you add a second array
b of the same size, such that a[i] is the "next pointer" for the element
stored in b[i].
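A small sketch of that two-array variant, in case it helps. The names NIL and sum_list are made up for the example; a holds the "next" indices and b holds the data.

```c
#define NIL (-1)   /* plays the role of NULL */

/* Walks the list that starts at index head, following a[i] as the
   "next pointer", and adds up the data stored in b[i]. */
int sum_list(const int *a, const int *b, int head)
{
    int total = 0;
    for (int i = head; i != NIL; i = a[i])
        total += b[i];
    return total;
}
```

With a = { 2, NIL, 4, NIL, 1 } the list starting at index 0 visits elements 0, 2, 4 and 1 in that order.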
On a related note, a novice programmer could be excused for thinking
that the first two assignments in list() could be collapsed into one:
node->next = node = calloc(1, sizeof *node);
It seems logical, but invokes undefined behavior.
> Overall, nice attempt to troll.
Not very constructive. And people wonder why USENET is going the way of
the dodo...
a classical *circular* singly-linked list, that is. Never liked them,
myself. It's like spelling "banana" - you never know when to stop.
Check the standard, your system documentation, or any good book about C
(e.g. K&R 2) for an explanation of the difference between malloc() and
calloc().
C99 6.5p2: between two sequence points (here the whole assignment
expression), an assigned-to object (here, the "final" /next/ pointer)
shall have its value read only to determine the value to be stored
(here, clearly restricted to 'something', perhaps cast.)
And one may argue that in the above expression, the value is _also_ read
as part of the evaluation of the "intermediary" /next/ pointer, in order
to determine the object to be assigned.
> It is not at all equivalent with the a[a[0]] = foo; statement, either.
I believe Dag-Erling addressed this one.
> There are no side effects with the use of the arrow operator,
I fail to see any side effect with the [] operator either. Can you
elaborate?
> and there is nothing obscure about it either.
There is nothing obscure, just that we are in the grey areas of the
legal terms of the Standard.
> Overall, nice attempt to troll.
The thread is cross-posted to comp.lang.c and comp.std.c; in the latter
group (which is the one I read), I believe it is really on-topic.
Please take my post in this context, it might help you to understand my
point.
Antoine
An empty circular list with a sentinel node is exactly such a list.
Let's suppose ``node'' to be a pointer to such a list (specifically, to the
sentinel node).
Thus node->next is a pointer to the list head (the condition
node->next == node indicating that the list is empty).
We might use node->next->next to insert a new item just after the first
node.
newnode->next = node;
node->next->next = newnode;
If the list is empty, something meaningful happens: the item simply
becomes the only item in the list. That might be a bug, since
in that case the usual postcondition ``newnode is the second item in the
list'' isn't true. Or it might not be a bug. It depends on whether
a dependency on such a postcondition is coded into the program
elsewhere.
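Here is one way the sentinel scheme could look with the intermediate pointer named explicitly, which avoids the disputed expression altogether. It is a variant of the two lines above rather than the same code (it also handles lists with more than one item), and list_init and insert_after_first are names invented for this sketch.

```c
struct node { int data; struct node *next; };

/* Makes sentinel an empty circular list: it points at itself. */
void list_init(struct node *sentinel)
{
    sentinel->next = sentinel;
}

/* Inserts newnode just after the first node. On an empty list the
   "first node" is the sentinel itself, so newnode becomes the only item. */
void insert_after_first(struct node *sentinel, struct node *newnode)
{
    struct node *first = sentinel->next;  /* read completes here */
    newnode->next = first->next;
    first->next = newnode;                /* store is a separate statement */
}
```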
> Hello, back in 2002 there was a long discussion in these newsgroups
> about the undefinedness of the expression:
>
> a[a[0]] = 1; when a[0] begins with the value 0.
>
> The general opinion was that the above invokes undefined behaviour due
> to the fact that there is no sequence point in the expression, and
> that the value of a[0] is used in a "value computation" unsequenced
> with respect to the side-effect of the assignment operator.
These terms are used in the newer C1X drafts, N1336 and N1401.
Looking at N1401 under 6.5.16p3, it says that the side effect of
updating the left operand (of an assignment) is sequenced after
the value computations of the left and right operands. In other
words, under N1401 this assignment is well-defined.
So, you might want to reconsider the discussion in that thread
in light of Larry Jones's recent comment that the new "sequencing"
language was put in for threading support and doesn't change
the single thread semantics, which would mean both the 'a[a[0]] = 1;'
and the other case discussed below are well-defined, not undefined.
[my apologies if this got posted twice... newsreader confusion]
It is not me contending that undefined behaviour is invoked. That
answer seemed to be the consensus of the thread in 2002. Its start
is at:
http://groups.google.com/group/comp.std.c/browse_thread/thread/cffd61637f5520d/5ce672676f5ab8ef
and it is a particularly interesting read. The pointer-chain example
is introduced about halfway along.
I'd like to believe that most programmers assume that there is some
sort of temporal ordering between the calculation of the value of a
pointer, and that pointer's eventual dereference. Unfortunately, it
seems no such ordering is mandated by the standard itself. In the
thread in 2002, there were hypothetical C implementations discussed
that could crash, hang, or do other crazy things as a result of merely
executing compiled code corresponding to inserting something into a
circularly linked list. That such implementations could exist is a
rather obscure trap imho.
Steven
Thanks for this. I was reading N1336 (not N1401).
Of course the above still disallows tricks like:
a[a[0]=0]=1;
but I could live with that. :-)
Steven
I believe this is corrected by the new sequencing language in the C1X
draft (N1425 is the latest version).
--
Larry Jones
From now on, I'm devoting myself to the cultivation of
interpersonal relationships. -- Calvin
The assignment cannot modify a[0] until that assignment actually happens;
i.e. the modification of a[0] is the culmination of that assignment
(except for the part that the assignment expression also returns a value,
which is not relevant in the above).
The assignment cannot happen until the value being assigned is known (which it
always is since it is the constant 1), and it also cannot happen until the
location of the destination operand is established; if you don't yet know
/where/ to store the value, you cannot do the assignment.
To know where the assignment is going to be stored, we must evaluate
a[0], and so this must happen before the assignment occurs, which means
that it uses the prior value of a[0].
The access to a[0] is therefore properly sequenced with regard to the
assignment: not by explicit sequence points but by a data flow dependency which
follows the rules of causality in our observable universe.
Outside of science fiction fantasies involving time travel, parallel universes,
or time going backwards in localized regions of time-space, the determination
of where an assignment is going to happen cannot occur after the assignment has
already happened.
> The general opinion was that the above invokes undefined behaviour due
General or not, that is not a particularly well-informed opinion.
C expression evaluation (outside of the sequencing operators like comma or ||)
is not totally unordered. The standard says explicitly that the order
of evaluation of the /subexpressions/ of an operator is unspecified
and may even be parallel, and the order in which side effects take place
is unspecified:
6.5 Expressions
3 The grouping of operators and operands is indicated by the syntax.
Except as specified later (for the function-call (), &&, ||, ?:, and comma
operators), the order of evaluation of subexpressions and the order in which
side effects take place are both unspecified.
Note that this paragraph does not insist that there is never any
order among side effects and subexpressions. There are situations where
logic requires order.
And note that 6.5 doesn't say that the evaluation of operators may be reordered
with respect to their own subexpressions (that is illogical)! Only among the
subexpressions is the order unspecified. To evaluate an operator, you must
fetch the values of its subexpressions (except where short-circuiting permits
elision). Only after that can the operator be evaluated.
So what happens if the evaluation of a parent operator invokes a side effect,
which depends on the constituent subexpressions? In that case we cannot
conclude that the order is unspecified between that side effect and
the subexpressions.
The = operator in a[a[0]] = 1 has two subexpressions: a[a[0]] and 1.
The implementation may evaluate these in either order, in accordance with 6.5p3.
It also has a side effect: that of storing to the lvalue. Certainly, that side
effect might be ordered arbitrarily with regard to some side effects elsewhere
in the expression, in accordance with 6.5.
However, the two subexpressions of the assignment cannot be reordered with
regard to the store side effect /of that same assignment/ operator. That is a
logical impossibility, since the assignment must evaluate its two operands
before producing that side effect.
This doesn't need to be spelled out in the standard, because it's not the sort
of document for readers who cannot recognize logical impossibilities.
> On 2010-01-14, sfuerst <svfu...@gmail.com> wrote:
>> Hello, back in 2002 there was a long discussion in these newsgroups
>> about the undefinedness of the expression:
>>
>> a[a[0]] = 1; when a[0] begins with the value 0.
<snip discussion of sub-expression sequencing>
>> The general opinion was that the above invokes undefined behaviour due
>
> General or not, that is not a particularly well-informed opinion.
<snip more about sequencing>
I don't disagree with your arguments about sequencing, logic, and
causality, but I don't see how you can dismiss the conclusion above so
lightly -- particularly without any reference to the text that, in my
view, renders it undefined.
The oft-quoted 6.5 paragraph 2 reads:
"Between the previous and next sequence point an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be read only to
determine the value to be stored." [footnote numbers removed]
Surely the prior value of an object (a[0] -- which is having its
stored value modified exactly once) is being read for some purpose
other than to determine the value to be stored?
You could argue that this shouldn't be undefined, or that the new
sequencing wording in n1401.pdf makes it no longer undefined; but that
it was undefined (and still is undefined) looks like a perfectly
reasonable conclusion.
--
Ben.
> It is not me contending that undefined behaviour is invoked. That
> answer seemed to be the consensus of the thread in 2002. It's start
> is at:
> http://groups.google.com/group/comp.std.c/browse_thread/thread/cffd61637f5520d/5ce672676f5ab8ef
> and it is a particularly interesting read. The pointer-chain example
> is introduced about halfway along.
I've paged through all 329 messages and searched for the string "*a".
I've also searched the current thread. Since I haven't found what I was
looking for, I'd like to add a reformulation of the original
a[a[0]] = 1;
statement. I apologize if it's trivial.
C89 6.3.2.1 Array subscripting
"The definition of the subscript operator [] is that E1[E2] is identical
to (*(E1+(E2)))."
C99 6.5.2.1 Array subscripting
"The definition of the subscript operator [] is that E1[E2] is identical
to (*((E1)+(E2)))."
Going with the latter, the statement can be rewritten as
(*((a)+((*((a)+(0)))))) = 1;
which can be simplified to
*(a + *a) = 1;
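As a quick sanity check of the rewrite, here is a sketch (same_element and store_pointer_form are names invented for it) showing that both spellings designate the same element:

```c
/* Returns 1 when a[a[0]] and *(a + *a) designate the same element. */
int same_element(int *a)
{
    return &a[a[0]] == a + *a;
}

/* The pointer-arithmetic spelling of a[a[0]] = value. */
void store_pointer_form(int *a, int value)
{
    *(a + *a) = value;
}
```

Calling store_pointer_form with a[0] == 2 stores into a[2], which nobody disputes; the case argued about in this thread is a[0] == 0, where the store hits the element that was just read.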
I'm sorry if this doesn't add anything to the discussion; my hope is
that it would.
lacos
This text can be parsed as:
[ Furthermore, the prior value shall be read only ] to
determine the value to be stored."
Indeed, before the store, the prior value shall not be written, but only read,
and this is of course necessary for determining the value to be stored.
See, people are just reading it wrong. There is no document defect to see
here, people; move along.
So this sentence does not unambiguously grant implementors a license
to gratuitously break code with cunning diagnostics (at least in
a mode that claims to be conforming).
> On 2010-01-15, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
>> Kaz Kylheku <kkyl...@gmail.com> writes:
>>
>>> On 2010-01-14, sfuerst <svfu...@gmail.com> wrote:
>>>> Hello, back in 2002 there was a long discussion in these newsgroups
>>>> about the undefinedness of the expression:
>>>>
>>>> a[a[0]] = 1; when a[0] begins with the value 0.
>><snip discussion of sub-expression sequencing>
>>>> The general opinion was that the above invokes undefined behaviour due
>>>
>>> General or not, that is not a particularly well-informed opinion.
>><snip more about sequencing>
>>
>> I don't disagree with your arguments about sequencing, logic, and
>> causality, but I don't see how you can dismiss the conclusion above so
>> lightly -- particularly without any reference to the text that, in my
>> view, renders it undefined.
>>
>> The oft-quoted 6.5 paragraph 2 reads:
>>
>> "Between the previous and next sequence point an object shall have
>> its stored value modified at most once by the evaluation of an
>> expression. Furthermore, the prior value shall be read only to
>> determine the value to be stored." [footnote numbers removed]
>
> This text can be parsed as:
>
> [ Furthermore, the prior value shall be read only ] to
> determine the value to be stored."
Are you being serious? It seems unlikely. In case you are, I have
these arguments:
It is true that "only" can either follow or precede the thing it
limits, but not, I think, when followed by "to". The OED uses "only
to" as an example of only preceding the thing it limits (1899 Literary
Guide 1 Oct. 146/2 "Certain doctrines were imparted only to
initiates"). Your interpretation is more naturally written "the prior
value shall only be read to determine the value to be stored".
Your parse (if I have it right) is that to determine the value to be
stored, only reading of the prior value is permitted. What actions on
the prior value is your interpretation intended to prohibit? Would it
not prevent the value from being doubled to determine the value to
be stored (i = i*2;)?
In your reading of it, does the sentence have any purpose? I.e. what
kinds of expression would be defined were it not for that extra
sentence?
> Indeed, before the store, the prior value shall not be written, but only read,
> and this is of course necessary for determining the value to be
> stored.
Your phrase "the store" is a sleight of hand. Which is "the store"
before which the prior value can not be written in:
x = i++;
and why would the standard wish to say which store comes first? The
prior value of i may very well be written (to x) before the store to
i. If you intended to limit this remark to expressions with only one
object being modified, then you are saying that "the prior value can't
be written before it is written".
> See, people are just reading it wrong. There is no document defect to see
> here, people; move along.
If you are right, there is a defect because the second example in
footnote 73 is then wrong. a[i++] = i; is fine by your reading, is it
not?
> So this sentence does not unambiguously grant implementors a license
> to gratuitously break code with cunning diagnostics (at least in
> a mode that claims to be conforming).
--
Ben.
I'm sorry, English is not my first language and it's not obvious to me what
you meant to say there -- do you mean that the "only" does not apply to
"determine" (i.e. only to determine the new value, but not for any other
purpose), but to "read" (i.e. only read, but not written anywhere, added to
anything, or processed in any other way)? This wouldn't seem to make a lot
of sense to me, so perhaps that's not what you meant?
Yep.
And note that the parse that many people are assuming rules out
the assignment operator completely. Code such as
i = 1;
is undefined. Here i is modified, but it is also read, not for
computing the value to be stored. The value of an assignment expression
is that of the lvalue, after the assignment, you see. So the lvalue
is read by this expression to fetch this value. That ``after
the assignment'' bit appears like it imposes an order, but it's
not a sequence point. So the expression statement i = 1; contravenes the
naive parsing of paragraph 2.
> limits, but not, I think when followed by "to". The OED uses "only
> to" as an example of only preceding the thing it limits (1899 Literary
> Guide 1 Oct. 146/2 "Certain doctrines were imparted only to
> initiates").
Here ``to initiates'' is a very different kind of clause from ``to
determine ...'', because the ``determine ...'' part is a full sentence
in its own right. Also, ``read'' and ``impart'' are semantically
different. The relationship between ``doctrine'' and ``impart''
is limited. What else can you do with a doctrine and an initiate,
besides impart? Why ``only impart''? Doctrines were only imparted on the
initiates (but not also tattooed on their foreheads?). Nah; such
tattooing is subsumed under extended semantics of imparting, right?
Read is different. In computing, we even have the phrase ``read-only'';
there is a relationship between read and only, which means that
writing is excluded.
> Your interpretation is more naturally written "the prior
> value shall only be read to determine the value to be stored".
True. All kinds of things are more naturally written than some of the
long-winded gobbledygook in the ISO C standard.
> Your parse (if I have it right) is that to determine the value to be
> stored, only reading of the prior value is permitted.
Permitted, but of course not required.
> What actions on
> the prior value is your interpretation intended to prohibit? Would it
> not prevent the value from being doubled to determine the value to
> be stored (i = i*2;)?
Doubling does not destroy the value; it produces a new value which is
twice the previous one, in the absence of overflow.
Clearly the ``value'' here refers to manipulation of the object: the
stored value. The term value has multiple meanings; there is the stored
value in an object which can be read, or the value of an expression.
The value of the expression i*2 is no longer the stored value in the
object i. All that i*2 does to the stored value is read it.
> In your reading of it, does the sentence have any purpose?
I suspect that it doesn't, and that it's not alone in not having one.
> I.e. what
> kinds of expression would be defined were it not for that extra
> sentence?
None, but clearly, if a value modified by some store is written before
that store, that is not a good thing. This may not be; it may be read
only. :)
>> Indeed, before the store, the prior value shall not be written, but only read,
>> and this is of course necessary for determining the value to be
>> stored.
>
> Your phrase "the store" is a sleight of hand.
Not intended.
> Which is "the store"
> before which the prior value can not be written in:
>
> x = i++;
The prior value of x cannot be written prior to the assignment x =
and the prior value of i cannot be written prior to the increment i++.
Each modified object has a prior value. Each modified object is modified
only once, hence it cannot be modified prior to that one and only
modification.
> and why would the standard wish to say which store comes first? The
It doesn't. The order in which the side effects to x and i happen
is not specified.
> prior value of i may very well be written (to x) before the store to
> i. If you intended to limit this remark to expressions with only one
> object being modified, then you are saying that "the prior value can't
Paragraph 2 is not limited to expressions with just one object modified,
but it's clear that for some object that is modified, the prior value
and store refer to the same object.
> be written before it is written".
Yes, precisely. Isn't that clear? If a value is written before it is
written, then it's modified twice. So this, uh, reinforces the first
sentence of the same paragraph. That's it!
>> See, people are just reading it wrong. There is no document defect to see
>> here, people; move along.
>
> If you are right, there is a defect because the second example in
> footnote 73 is then wrong. a[i++] = i; is fine by your reading, is it
> not?
No, it's not fine.
This expression is ruled out by the unspecified order of evaluation of
subexpressions (except for the noted operators) and by
the unspecified order of side effect completion.
The major connective of this expression is the = operator, whose
constituents are a[i++] and i, which, being subexpressions of an
unsequenced operator, may be evaluated in either order. Moreover, the
completion of the side effect emanating from the one subexpression is
not required as a dependency for the computation of the other. Clearly,
this is ambiguous. The i++ effect may complete before the i is accessed,
after, or could be in progress while i is accessed. You will have
an actual portability problem with actual compilers if you write
this expression.
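For what it's worth, either intended meaning can be written without the ambiguity by splitting the statement. A sketch, with function names made up for the purpose:

```c
/* Meaning 1: store the old value of *i at the old index, then increment. */
void store_old_index(int *a, int *i)
{
    a[*i] = *i;
    ++*i;
}

/* Meaning 2: store the incremented value at the old index. */
void store_new_value(int *a, int *i)
{
    int old = (*i)++;   /* the increment is complete after this statement */
    a[old] = *i;
}
```

Both assume that i does not point into the array a.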
Now we could add some superfluous text to the standard to try to capture
this idea, but it's not necessary; the undefinedness follows straight
from paragraph 3.
The defect is that the example in the footnote is irrelevant to
the paragraph to which the footnote is attached. That paragraph
does not render it undefined; however, the next paragraph does.
The footnote could be moved to the next paragraph and reworded
to say that it pertains to ``this and the preceding
paragraph''.
Mr. Rentsch clarified my point quite well on that one, I believe.
OTOH, if you still believe those two to be semantically identical I
cannot continue arguing.
> > There are no side effects with the use of the arrow operator,
>
> I fail to see any side effect with the [] operator either. Can you
> elaborate?
I can elaborate on the OPs original point, if you'd like. We just
solved the [] operator issue a few posts ago, correct?
> > and there is nothing obscure about it either.
>
> There is nothing obscure, just that we are in the grey areas of the
> legal terms of the Standard.
Well, perhaps we are referring to different versions of the Standard
(sic). My understanding is that even under N1124 the statement
wouldn't invoke UB - Mr. Kylheku clarified on that.
> > Overall, nice attempt to troll.
>
> The thread is cross-posted to comp.lang.c and comp.std.c; in the latter
> group (which is the one I read), I believe it is really on-topic.
> Please take my post in this context, it might help you to understand my
> point.
I'd start a thread on trolling but that'd be off-topic, wouldn't it. :-P
@OP: I understand what the discussion's consensus was. Leaving aside
the fact that it was back in 2002 and the C language is still evolving
(thankfully), perhaps I'm interpreting the standard wrong, or I'm too
familiar with gcc features and extensions - it wouldn't be the first
time. However I've yet to see an argument that will convince me I'm
wrong through proper wording (maybe I'm just asking too much) or
perhaps a PoC program.
Did they add a new variety of sequence point?
--
frank
restored snip
****
>struct node { int data; struct node *next; };
>struct node *node;
>node = calloc(1, sizeof *node);
****
> > Just for curiosity: Does this have any benefits compared to
>
> > struct node { int data; struct node *next; } * node =
> > malloc( sizeof * node );
>
> > ? Especially, I refer to the use of »calloc«.
there are two changes.
1. compression of three statements into one
2. replacement of calloc() with malloc()
I can't see any point to 1. apart from increasing the obscurity of the
code.
> Check the standard, your system documentation, or any good book about C
> (e.g. K&R 2) for an explanation of the difference between malloc() and
> calloc().
so to repeat the question, does this have any benefits? We can all
read.
>> [ Furthermore, the prior value shall be read only ] to
>> determine the value to be stored."
>I'm sorry, English is not my first language and it's not obvious to me what
>you meant to say there -- do you mean that the "only" does not apply to
>"determine" (i.e. only to determine the new value, but not for any other
>purpose), but to "read" (i.e. only read, but not written anywhere, added to
>anything, or processed in any other way)? This wouldn't seem to make a lot
>of sense to me, so perhaps that's not what you meant?
It would not be a reasonable reading in English.
None. Of course, counting the initialization of the allocated buffer
to 0 as a gain is debatable.
It is not necessary to read i in order to obtain the assigned value.
tmp = 1; /* compute right hand side */
i = tmp; /* assign the computed value to i */
/* the assigned value is now available in tmp */
Can you point out to me the relevant post? (assuming it is not
<news:kfnfx68...@x-alumni2.alumni.caltech.edu>, which is not
cross-posted but that I spotted; and was written 4 hours after mine.)
> We just solved the [] operator issue a few posts ago, correct?
Sorry, I cannot follow your reasoning.
As I wrote, I am reading this thread from comp.std.c, and I believe you
are reading it exclusively from the comp.lang.c context. Also see below.
>>> and there is nothing obscure about it either.
>> There is nothing obscure, just that we are in the grey areas of the
>> legal terms of the Standard.
>
> Well, perhaps we are referring to different versions of the Standard
> (sic).
:-) Sorry for the pedantry, but please bear with me: ISO rules want
the officially approved standards to be spelled (in English)
"International Standards", and are quite strict about the capitalization;
I just follow that move; please remember once again that I am reading
you from comp.std.c context.
Other than that, no, we are not referring to different versions,
I quoted C99 which is the one you ought to refer to yourself. Having this
point clearer in C1x will not render the text of C99 more clear, much
the contrary: it highlights the fact that the area is grey in C90/C99.
I read <news:ooh427-...@jones.homeip.net> to be in the same line.
> My understanding is that even under N1124 the statement
> wouldn't invoke UB - Mr. Kylheku clarified on that.
I do not read <news:201001141...@gmail.com> as a clarification,
rather the contrary: it sparked more discussion (at least here.)
> However I've yet to see an argument that will convince me I'm
> wrong through proper wording (maybe I'm just asking too much) or
> perhaps a PoC program.
I believe there is a basic misunderstanding here.
There is one thing which I believe is applicable here: a Standard (note
the capital) has to be written in common language and have a reasonable
size, and sometimes this cannot afford to cover all the possible cases;
in such a case, necessarily the Standard has to be conservative, and
underspecifies, or if you prefer, it should "outlaw" the cases which are
not possible to spell out clearly.
Note this is clearly different from Roman Law.
It is the job of the people writing the next version of the standard
(lower case here ;-) ) to further refine the text, in order to cover
more potential corner cases and if possible include them in the accepted
and well-defined behaviours; you might notice that revisions of the
Standards (and laws in general) are increasing in size, and this is one
of the main reasons; also this is a main topic of comp.std.c, to iron
out those corner cases and to improve the words if possible (whether it
succeeds is a debate I shan't enter.) If you are concerned about the
words (as you might imply above), then comp.std.c is the correct group
to discuss it; please join.
On the other hand, comp.lang.c should focus on the practical situations;
as Kaz correctly pointed out in the first part of his post, there are no
situations (we presently know of) where those expressions can be
mishandled; as such, there are no "proofs" to be presented. So the fact
the current words of the relevant Standards may, or may not, qualify
them as "undefined behaviour" is pretty immaterial to C programmers,
since every compiler and every compiled program will end with the
expected behaviour, or, if you prefer, "/no undefined behaviour can be/
/invoked/." As such, I believe it was an initial mistake to keep the
thread cross-posted, since it raised biased discussion; it should rather
have been redirected (use Followup-To:); as you correctly pointed
out yesterday, the whole issue looks like a troll on comp.lang.c.
In fact, I had to keep yesterday's post cross-posted since
apparently you were asking for details while writing
>>> I don't see how the quoted statement can invoke undefined behaviour,
and I felt it was important to try to answer it, but I was not seeing
any sign of you reading comp.std.c. Maybe I was wrong and it was written
tongue-in-cheek; I do not know your style well enough. In such a case,
sorry for the overly lengthy explanations (and I reiterate my invitation
to debate these issues in comp.std.c!)
Antoine
> It reminds me of a simple loop detection algorithm (use two pointers;
> for every iteration, one pointer advances one step and the other two
> steps; if there is a loop, the "fast one" will eventually catch up with
> the "slow one" from behind)
What's the advantage of that over the simpler algorithm: one pointer
advances one step and the other zero steps?
>> It reminds me of a simple loop detection algorithm (use two pointers;
>> for every iteration, one pointer advances one step and the other two
>> steps; if there is a loop, the "fast one" will eventually catch up with
>> the "slow one" from behind)
>What's the advantage of that over the simpler algorithm: one pointer
>advances one step and the other zero steps?
That only works if the initial node is in the loop.
Thanks. But would it be a reasonable interpretation in the context of the C
standard? Clearly, the purpose of that sentence is to forbid things that
don't fit into the "only" -- this interpretation sounds as if the prior
value can be read from the object but then doing anything else with it is
forbidden. Except, it seems, when the goal is *not* to determine the value
to be stored. Is my parsing of the English correct here too?
As a native speaker of English, I don't see that as a valid parse of
that sentence. It's not even an ambiguous parse; the phrase "only to"
must apply forward, not backward. It restricts the ways in which the
value read may be used. It doesn't say that the stored value may
only be read, and that no other operations may be performed on it.
Writing the value was already prohibited by the immediately preceding
sentence "Between the previous and next sequence point an object shall
have its stored value modified at most once by the evaluation of an
expression.", so there's no need to repeat that prohibition. It was
the committee's intent that all other operations that might be
performed on the value (multiplication, addition, division, use as a
subscript, etc.) are permitted - so long as the end result of those
operations is to determine the value to be written. Because the new
value must be determined before the new value is written, code meeting
this requirement must have the read occur before the write. That was
the primary purpose for imposing this requirement.
A read of the prior value that determines where the value is to be
written must also necessarily precede the write, and the existing
wording doesn't cover that. That is one of the reasons why better
wording has been proposed for the next version of the standard.
> So this sentence does not unambiguously grant implementors a license
> to gratuitously break code with cunning diagnostics (at least in
> a mode that claims to be conforming).
The clause is poorly worded and overly-restrictive, and clearer
wording is planned for the next revision, but the 2002 discussion
described possible implementations (such as ones where reads are
destructive at the hardware level) where such breakage would not be
gratuitous, but a perfectly natural result of an implementation
optimized for such hardware.
Yes, but reading that documentation should be sufficient to answer the
question. The key difference between calloc() and malloc() is that the
memory is zeroed out. Whether or not this has any benefits depends
entirely upon whether or not the memory needs zeroing. If it does,
this difference counts as a benefit; if it does not, calling calloc()
rather than malloc() is a waste of time.
In this particular case, since the calloc() call was followed
immediately (without error checking!) by code that ensured that the
'next' field of the structure was initialized, the only effect of the
change to calloc() was to ensure that the 'data' field was also
zeroed. Personally, in that particular case, I'd prefer to write
node->data = 0;
but I think he was concerned about the more general case where there
was more than one field to be zeroed. As long as none of the fields to
be zeroed have a floating point or pointer type, using calloc()
strikes me as a reasonable way to initialize a large structure at the
same time as allocating memory for it.
A -> B -> C -> D --+
          ^        |
          |        |
          +--------+
--
Eric Sosman
eso...@ieee-dot-org.invalid
In the drawing you gave, if the unmoving pointer were to be at B,
then the loop wouldn't be detected.
Um sorry, I had missed the *not* there...
OK, now I'm confused -- did you mean that I misinterpreted his square
brackets added to the text quoted from the Standard, or did you mean that I
paraphrased his interpretation accurately but his interpretation was not a
reasonable one?
>>>> It would not be a reasonable reading in English.
>OK, now I'm confused -- did you mean that I misinterpreted his square
>brackets added to the text quoted from the Standard, or did you mean that I
>paraphrased his interpretation accurately but his interpretation was not a
>reasonable one?
The latter.
That's why Eric posted it!
Oh! I missed a level of followup nesting (perhaps I am colorblind
without coffee, or else slrn did not color it) and thought Eric
had posted both the question and the diagram.
Of course, that would be awfully silly of him - indeed, he's plenty
smart enough to see the problem without any diagrams at all.
Thanks. I thought so too, but I've been surprised before. :)
No, we codified that there are more things that affect sequencing (e.g.,
data flow dependencies) and rewrote the rule accordingly. The new rule
more clearly expresses the intended behavior.
--
Larry Jones
This sounds suspiciously like one of Dad's plots to build my character.
-- Calvin
No. The rule says that the *prior* value may only be read to determine
the value to be stored; that's reading the new value, not the prior one.
--
Larry Jones
I always have to help Dad establish the proper context. -- Calvin
> On 2010-01-15, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
>> Kaz Kylheku <kkyl...@gmail.com> writes:
>>
>>> On 2010-01-15, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
<snip>
>>>> The oft-quoted 6.5 paragraph 2 reads:
>>>>
>>>> "Between the previous and next sequence point an object shall have
>>>> its stored value modified at most once by the evaluation of an
>>>> expression. Furthermore, the prior value shall be read only to
>>>> determine the value to be stored." [footnote numbers removed]
>>>
>>> This text can be parsed as:
>>>
>>> [ Furthermore, the prior value shall be read only ] to
>>> determine the value to be stored."
>>
>> Are you being serious? It seems unlikely. In case you are, I have
>> these arguments:
>
> Yep.
OK. That being so, I think it best to leave this as it stands. You
understand my point of view and I am pretty sure I have a good grasp
of yours.
<snip>
--
Ben.
Is it? I've always believed that an assignment operator returns the value
that it also stores in the object, *without* reading it back from the
object, and that the Standard's description in terms of "the value of the
object after the assignment" was just a clumsy wording (no offense) that
wasn't meant to imply a read access. If that's not the intended
interpretation, then why doesn't the standard say that the new value is
stored before the assignment expression yields its value? Why does it talk
about the next sequence point, if it doesn't mean to allow the store to
happen *after* the value has been returned to any operator that the
assignment may be an operand of?
(What's even more curious is why it talks about the "previous" sequence
point. Does it mean to say that the store may happen before the two
operands of the assignment have been completely evaluated? In rare cases,
such as "x = y & 0", the value of an operand may be known before its
evaluation is finished -- does the standard mean to allow x to be written to
before y is read?)
And closer to the original topic, was the "only to determine the value to be
stored" part intended to simply say that any read accesses to the object
must belong to the right argument of the assignment (rather than to the left
argument or some expression outside of the assignment expression whose
evaluation may be interleaved with the assignment)? I thought it was *not*
meant to forbid things like
x = y = x+1;
where the old value of x is read not *only* to determine the value to be
stored in x, but also to determine the value to be stored in y.
Actually, off the top of my head, it better not mean that the store is
read to obtain the value just stored because that would mean that
assigning to a volatile variable has an unspecified value.
I don't think *that* would be much more of a problem than the fact that an
explicit read from a volatile is just as unspecified. But in my experience,
most code that assigns to a volatile doesn't use the result of the
assignment operator. What I think would be a much bigger problem is that
there would not be a way to write to a hardware register without reading
back from it. For instance, there's hardware around where a write sends a
command and a read from the same location confirms an interrupt -- it can be
a problem if you read the location without receiving an interrupt first.
Good. I certainly want most of my volatile hardware registers
to have unspecified values after I've assigned to them. Otherwise
I'd never bother reading from them after the first assignment,
would I?
Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
Oh it's worse than that. Whether or not the value is read
back, assigning to a volatile variable is undefined behavior
(unknown side-effects, right?). And so is reading from one.
In practice of course this doesn't matter because the behavior
is actually defined, just by the underlying hardware rather
than the implementation. But it's good to remember that any
use of volatile is inherently not completely portable.
The values of your hardware registers are unspecified by the C standard
anyway, but hopefully specified by the manufacturer of your hardware. But
this is not about the values of your registers -- it's about the values of
your assignment expressions. Does an assignment simply report the value
that it writes to the object, or does it have to read back from the object
and return whatever it finds there? In other words, when "volatile char
*ptr" points to a register that always returns zero when read, does
int val = *ptr = 1;
initialize "val" to zero or one?
Most code that deals with hardware registers doesn't use the value returned
by an assignment. If the C standard insisted that an assignment must read
from the register after writing to it anyway, that would be merely a waste
of time for most hardware. But in some hardware, reading from a register
actually consumes the data read -- wouldn't it be bad if a write to a serial
port's output register had the side-effect of consuming a byte from its
input queue?
That was certainly the intent, although actually doing a read would also
be an acceptable implementation. But a number of people have argued
that the wording in the Standard requires a read. If you think that (as
Kaz seems to), the value read would be the new value, not the prior
value.
> (What's even more curious is why it talks about the "previous" sequence
> point. Does it mean to say that the store may happen before the two
> operands of the assignment have been completely evaluated? In rare cases,
> such as "x = y & 0", the value of an operand may be known before its
> evaluation is finished -- does the standard mean to allow x to be written to
> before y is read?)
No.
> And closer to the original topic, was the "only to determine the value to be
> stored" part intended to simply say that any read accesses to the object
> must belong to the right argument of the assignment (rather than to the left
> argument or some expression outside of the assignment expression whose
> evaluation may be interleaved with the assignment)?
Yes.
--
Larry Jones
I don't like these stories with morals. -- Calvin
Ok. I'd heard that the committee wasn't even considering major
revisions. It's an issue not just for C but other syntaxes as well, to
wit, will they follow the C definitions or strike out on their own.
--
frank
For non-volatile objects, sure, because the as-if rule allows it. But for
volatile objects, doing a read where the abstract machine isn't doing one
would be wrong, wouldn't it?
Either an assignment is required to return the value it stores, or it's
allowed to return whatever value it finds in the object after writing to it,
even if the object is volatile and the value is different. I don't think
you can have it both ways, can you?
> But a number of people have argued
> that the wording in the Standard requires a read. If you think that (as
> Kaz seems to), the value read would be the new value, not the prior
> value.
And the store must happen before the assignment operator reads its value
back from the object, and the words about it happening before the next
sequence point are unnecessary and redundant.
>> (What's even more curious is why it talks about the "previous" sequence
>> point. Does it mean to say that the store may happen before the two
>> operands of the assignment have been completely evaluated? In rare
>> cases,
>> such as "x = y & 0", the value of an operand may be known before its
>> evaluation is finished -- does the standard mean to allow x to be written
>> to
>> before y is read?)
>
> No.
Then the words about the write happening after the previous sequence point
are unnecessary and redundant, aren't they? The write happens after the
operands of the assignment are evaluated, but before the next sequence
point. Or, according to Kaz, just after the operands are evaluated, before
the value is read back and then passed on to any surrounding operators.
Either way, the restriction on reading the old value really applies to all
reading from the object between the surrounding sequence points, doesn't it
(other than the read-back that Kaz believes is part of the assignment) --
since the store may be the last thing that happens before the next sequence
point, there's no way to reliably read the *new* value from the object
before the next sequence point, is there?
>> And closer to the original topic, was the "only to determine the value to
>> be
>> stored" part intended to simply say that any read accesses to the object
>> must belong to the right argument of the assignment (rather than to the
>> left
>> argument or some expression outside of the assignment expression whose
>> evaluation may be interleaved with the assignment)?
>
> Yes.
I vaguely remember that the formal model of sequence points that ended up
not making it into the standard a while ago seemed to allow the left argument
to an assignment to read the value, like the a[a[0]]=1 example does. Was
that one of the reasons why it was dropped? Or was my interpretation of it
incorrect?...
Yes, a circle is a loop but a loop is not necessarily a circle.
DES
--
Dag-Erling Smørgrav - d...@des.no
An object has just one value; prior and new refer to different times at
which that value is observed.
If an access unambiguously reads either the prior or new value,
that means it is well-defined.
So, if I understand it right, i = 1 is well-defined, because it accesses
and yields the new value of i, which we know it does unambiguously
because it is well-defined. :)
However the question is whether that value will be 1. I stick with my
view that requiring (or even allowing) a read after writing will be a
code breaker if i is volatile. I have always understood that the
returned value is the value (rvalue) assigned, cast to the type being
assigned to.
Also, the new value does not have to be read from 'i', and will almost
certainly not be. The standard merely requires that the assignment have
the same value as the object after assignment; a conforming
implementation does not need to retrieve that value from 'i' to meet
that requirement. All that it needs to do is keep the value in the
register after writing it to memory.
> An object has just one value; prior and new refer to different times at
> which that value is observed.
int i =1;
i = 3;
If you insist on saying that the object 'i' has just one value, which
can be observed to be '1' prior to the assignment statement, and can be
observed to be 3 after the assignment statement, the more conventional
way to describe the same events is by saying that the value of 'i' prior
to the assignment was 1, and that the new value after the assignment is
3. If you care to communicate clearly with people about the meaning of
the C standard, you should follow those conventions; otherwise
statements such as yours above will be correctly interpreted as
nonsense.
> If an access unambiguously reads either the prior or new value,
> that means it is well-defined.
In itself, yes. However, the behavior can become undefined during the
evaluation of a statement that can modify the value, if the access
unambiguously reads the prior value, and uses it for any purpose other
than computing the new value (at least under C99 rules - I gather that
the C1x rules will be a bit more flexible).
> So, if I understand it right, i = 1 is well-defined, because it accesses
> and yields the new value of i, which we know it does unambiguously
> because it is well-defined. :)
"i = 1;" is well-defined because the required semantics does not in any
way involve reading the prior value. There's nothing circular about the
logic.
When the object is a regular non-volatile variable, sure; but the
interesting case is an assignment to a volatile object. Imagine that you're
assigning 1 to a hardware register that always returns zero when read. What
is "the value of the object after the assignment" in that case -- is it one
or is it zero?
Yes. :-)
--
Larry Jones
Is it too much to ask for an occasional token gesture of appreciation?!
-- Calvin
No, that was generally considered to be a good thing.
The primary reason it was dropped was that there was insufficient time
and tutorial material for the committee to be sure that they understood
it and that it matched the intended behavior. There were also holes in
it involving floating-point exception flags (which are somewhat similar
to regular objects but with significant differences). A final objection
was that it formalized accesses in the abstract machine and required
those accesses to occur for volatile variables, and some committee members
were strongly opposed to that.
--
Larry Jones
Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin
It has to be more than that, because there's no way to cast to the type
of a bit field. As far as I'm concerned, anyone who uses the result of
an assignment to a volatile variable deserves whatever value they happen
to get. It's easy enough to write the code to explicitly do a read back
or not, whichever you want.
--
Larry Jones
Geez, I gotta have a REASON for everything? -- Calvin
That's essentially what the rewritten rule in C1X says. It's not what
the rule said previously, although it's what was intended all along.
--
Larry Jones
Good gravy, whose side are you on?! -- Calvin
Do you mean the Standard leaves the definition of "the value" unspecified
for volatile objects?
It's the value that gets stored in the object. What THAT value is is
specified somewhere, somehow, right?
> As far as I'm concerned, anyone who uses the result of
> an assignment to a volatile variable deserves whatever value they happen
> to get. It's easy enough to write the code to explicitly do a read back
> or not, whichever you want.
Well, no -- if the standard allows (requires?) the assignment to do the read
back for you, then it's impossible to write code that does a write without a
read back. As a minimum, the standard should make it clear whether the
implicit read back is required, forbidden, or unspecified.
Converted, then.
> As far as I'm concerned, anyone who uses the result of
> an assignment to a volatile variable deserves whatever value they happen
> to get. It's easy enough to write the code to explicitly do a read back
> or not, whichever you want.
As far as I'm concerned, anyone who does so should be able to
determine, by reading the standard, whether the behavior is defined,
unspecified, undefined, or whatever. (You said elsethread that this
is clarified in C201X.)
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
It seems that it's the last one. ;-)
I think it's safe to assume that any useful implementation of volatile
would not do a readback when the value of the assignment is not used.
But I can see reasonable implementors (and users) disagreeing on whether
a readback is desirable or not when the value *is* used.
> As a minimum, the standard should make it clear whether the
> implicit read back is required, forbidden, or unspecified.
Agreed.
--
Larry Jones
I think if Santa is going to judge my behavior over the last year,
I ought to be entitled to legal representation. -- Calvin
No, what is clarified in C1X is that the values of the left and right
operands have to be fully determined before the assignment can take
place, which makes a[a[0]] = 1 well-defined regardless of the value of
a[0]. That doesn't affect whether a readback occurs or not.
--
Larry Jones
Hmm... That might not be politic. -- Calvin
That's my position: it is unspecified whether the value of an assignment
to a volatile object is the value actually stored in the object or the
value obtained by reading the object after storing the value.
--
Larry Jones
The game's called on account of sudden death. -- Calvin
Agreed. If the standard indeed requires a readback, an implementation that
violates that requirement is more useful than a conforming one.
> But I can see reasonable implementors (and users) disagreeing on whether
> a readback is desirable or not when the value *is* used.
Sure; but that has nothing to do with whether it's actually required or
forbidden by the standard. But I can't find any evidence in the standard
suggesting that the presence or absence of the readback in the abstract
machine depends on the context surrounding the assignment expression, or
that it's meant to be unspecified or implementation-defined. I'd like to
believe that "the value after the assignment" was intended to mean the value
being stored in the object. I can accept that it was, instead, intended to
mean a value read from the object immediately after the store. But it's
really hard for me to believe that those words were chosen to reflect the
intent that it should be unspecified whether it's one or the other.
"Unspecified" means that the standard offers two or more possibilities and
allows the implementation to choose one of them. That's not the same as
offering an unclear description that can be interpreted in two different
ways. The description in the standard does *not* sound like it was meant to
offer a choice, does it? Doesn't "the value after the assignment" sound
like there's exactly one possibility? The only thing the standard
forgets to explain is how we're supposed to interpret those words when the
object refuses to be modified by the assignment.
The way I see it, the standard doesn't even consider the possibility of such
objects -- the only way to make them fit into the standard's model seems to
be by pretending that the write actually stores the new value in the object,
but then immediately something restores the old value in a way unknown to
the implementation. If you agree to that interpretation, I think "the value
after the assignment" should be taken as the value *immediately* after the
store, rather than the value at some point later, after the object may have
been modified in ways known or unknown to the implementation. But, of
course, that does not necessarily reflect the intended meaning of the
standard -- as we already know, the words of the standard do not seem to
clearly reflect the intent in this area.
Dragons be down that path.
Would a useful implementation of volatile also not read *x in the expressions:
*x;
(void) *x;
where x is a pointer to some volatile type?
If a read is required, it can't be elided just because it is thrown away,
when the object is volatile.
That's a separate issue from whether or not assignment requires a read-back,
surely.
> But I can see reasonable implementors (and users) disagreeing on whether
> a readback is desirable or not when the value *is* used.
I've been arguing that the text in fact requires the readback, but I perhaps
haven't made it clear that I don't agree that it's a good requirement.
For one thing, it leaves the programmer with no obvious way to express a pure
assignment (write only, no read).
That's very bad when a read may have side effects on the hardware.
6.7.3p6: What constitutes an access to an object that has
volatile-qualified type is implementation-defined.
That loophole was intended to be large enough to drive an arbitrary size
vehicle through.
But it has nothing to do with what kind of accesses assignment operators are
required or allowed to perform in the abstract machine. Only after that
question is answered, you can use your loophole to translate the read and/or
write accesses in the abstract machine to whatever the implementation says
constitutes those accesses for volatile objects. If the correct
interpretation of 6.5.16#3 is that an assignment operator returns the
converted value of its right operand that it also stores in the object, I
don't see how your loophole could possibly justify reading back and
returning a different value.
> The primary reason it was dropped was that there was insufficient time
> and tutorial material for the committee to be sure that they understood
> it and that it matched the intended behavior. There were also holes in
> it involving floating-point exception flags (which are somewhat similar
> to regular objects but with significant differences). A final objection
> was that it formalized accesses in the abstract machine and required
> those accesses to occur for volatile variables and some committee members
> were strongly opposed to that.
I presume you mean that they were opposed to the specific kind of
accesses that were formalised, rather than to the principle as such?
After all, the principle is present in the current Standard as well, and
besides, it is volatile's main purpose.
Richard
No, they were opposed to the principle. They liked the current
Standard's loophole (what constitutes an access to a volatile object is
implementation-defined) and didn't want any tighter requirements for
fear that it would prevent them from doing what they considered to be
"the right thing".
--
Larry Jones
Ever notice how tense grown-ups get when they're recreating? -- Calvin
I agree.
I think a "read" access to an object involves an lvalue-to-value
conversion (6.3.2.1-1), even though that's not clearly stated.
Moreover 6.5.16-3 indicates that assignment expressions aren't lvalues.
They have the value (rvalue) of the left operand after the assignment.
I think it implies that a volatile variable is not read twice.
Note that this is different from C++ where assignments are lvalues
which makes
int a,b,c=0;
a=b=c;
Well-defined in C but undefined in C++ (I think it's a C++ standard
defect but fortunately not a C defect).
I can think of a hypothetical implementation giving strange behavior
to a[0]=0; a[a[0]]=1;
#include <stdio.h>
int main(void) {
    int a[3];
    a[0]=0; /* line 1 */
    a[a[0]]=1; /* line 2 */
    printf("%d\n", a[0]); /* line 3 */
    return 0;
}
The C implementation may notice that a[0] is read on line 3 and
assigned a constant value on line 1, and so its
constant-value-optimization module may test whether the value of a[0] may
have been modified between line 1 and line 3. If it proves that it cannot
have been modified, it will be able to optimize the printf statement to:
printf("%d\n", 0);
Now, the C implementation analyzes line 2. It notices that the a array
is written, but, because a[0] is read for something that is not
needed to compute the value to be stored (the literal 1), it concludes that
a[0] cannot be modified before the next sequence point (at the end of this
statement).
Thus, the compiler may optimize the printf statement, and output zero.
As n1124 is worded, this optimization is legal.
--
André Gillibert
It's the other way around: an lvalue-to-value conversion involves a read
access, but that doesn't mean that every read access is part of an
lvalue-to-value conversion.
> Moreover 6.5.16-3 indicates that assignment expressions aren't lvalues.
> They have the value (rvalue) of the left operand after the assignment.
> I think it implies that a volatile variable is not read twice.
Twice? This is about whether an assignment to a volatile variable reads it
*once* or not at all. I don't think anybody has claimed that it reads it
twice.
> Note that this is different from C++ where assignments are lvalues
> which makes
>
> int a,b,c=0;
> a=b=c;
>
> Well-defined in C but undefined in C++ (I think it's a C++ standard
> defect but fortunately not a C defect).
It's undefined in C++? Why?
Indeed, you're right.
However, the standard doesn't specify where there is no read access...
e.g.
int y=0;
volatile int x;
x=y;
The standard doesn't specify if y is to be read 153 times or only once.
>
>> Moreover 6.5.16-3 indicates that assignment expressions aren't lvalues.
>> They have the value (rvalue) of the left operand after the assignment.
>> I think it implies that a volatile variable is not read twice.
>
> Twice? This is about whether an assignment to a volatile variable reads
> it *once* or not at all. I don't think anybody has claimed that it reads
> it twice.
Excuse me... I had in mind the case of i = i + 1 where i is a volatile
variable. :)
>
>> Note that this is different from C++ where assignments are lvalues
>> which makes
>>
>> int a,b,c=0;
>> a=b=c;
>>
>> Well-defined in C but undefined in C++ (I think it's a C++ standard
>> defect but fortunately not a C defect).
>
> It's undefined in C++? Why?
>
This has even been mentioned in a C++ DR (DR #222).
From http://std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#222:
> One could argue that as the C++ standard currently stands, the effect of x
> = y = 0; is undefined. The reason is that it both fetches and stores the
> value of y, and does not fetch the value of y in order to compute its new
> value.
This DR also deals with volatile variable in assignments... In the context
of C++.
The standard doesn't specify that all "reads" are due to lvalue-to-rvalue
conversions, but I think one can state that all lvalue-to-rvalue conversions
count as "reads".
In C++, x = y = 0; makes a lvalue-to-rvalue conversion of the expression (y
= 0), and thus, "reads" the y object. This read is not done to compute its
new value. This is UB.
In C, y=0 is an rvalue, and there's nothing stating that the value
is read after the assignment (so, a fetch is either "unspecified" or
forbidden).
n1124> An assignment expression has the value of the left operand after the
assignment, but is not an lvalue.
I was curious enough to test existing practice.
#include <stdio.h>

volatile int x,y;

int main(void) {
    printf("hello");
    x=y=0;
    printf("hello");
}
(The printf statements are useful as delimiters around the relevant compiled
code).
GCC on GNU/Linux/i386 fetches y, in both C99 and C++98 modes.
TCC on GNU/Linux/i386 doesn't fetch y.
Consequently, existing behavior differs between C99 implementations. I think
that's bad. It may be due to the ambiguous/defective standard wording, or a
GCC bug, perhaps because GCC uses the same back-end for C and C++.
It's a long way from "One could argue" in 1999 to "undefined in C++"
in 2010, given that there was a standard update in 2003 in which, had
it been seriously thought there was a problem, it could have been fixed.
(I'm referring to the non-volatile case.)
The standard doesn't say that y is read; it says that the lvalue expression
"y", in a context such as the above, is converted to the value stored in the
designated object. A read access is only implied, not stated explicitly;
but it seems to be clear enough that the conversion is done by a single read
access to the object.
The standard also says that the value of the expression "x=y" is the value
of the object after the assignment. Whether those words also imply a read
access to x or not is much less clear, at least to some of us, and that's
what we were debating here.
>>> Note that this is different from C++ where assignments are lvalues
>>> which makes
>>>
>>> int a,b,c=0;
>>> a=b=c;
>>>
>>> Well-defined in C but undefined in C++ (I think it's a C++ standard
>>> defect but fortunately not a C defect).
>>
>> It's undefined in C++? Why?
>>
>
> This has even been mentioned in a C++ DR (DR #222).
> From http://std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#222:
That's interesting reading. Thanks for the pointer.
>> One could argue that as the C++ standard currently stands, the effect of
>> x = y = 0; is undefined. The reason is that it both fetches and stores
>> the value of y, and does not fetch the value of y in order to compute its
>> new value.
But the C++ standard, just like the C standard, only forbids fetching the
*prior* value for purposes other than to compute the new value. I think
you'll agree that people who write things like x=y=0 expect the *new* value
of y to be stored in x, not the old value -- but indeed, it seems that the
C++ standard does not guarantee that clearly enough.
That's undefined in C++03, but, the new sequencing rules introduced in 2008
(Oxford) fix this issue. C++1x will include these new rules.
Ok.
:Between the previous and next sequence point an object shall have its stored value
:modified at most once by the evaluation of an expression. Furthermore, the prior value
:shall be read only to determine the value to be stored.
The glitch is that, the assignment y = 0 may take effect (side effect) after
the y value is fetched (in C++03) for the x= assignment.
: the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
> I think you'll agree that people who write things like x=y=0 expect the
> *new* value of y to be stored in x, not the old value --
But, a C++03 implementation might give the old value....
In C++03, x=y=0 is equivalent to (y=0,x=y), but, without the sequence point
of the comma...
If x and y are numeric, it's equivalent to (y=0)+(x=y).
> but indeed, it seems that the C++ standard does not guarantee that
> clearly enough.
>
Anyway, this is an old issue that C never had and C++1x fixes.
--
André Gillibert
I maintain that you deserve whatever behavior you get when you write
code like that. Instead of the compound assignment, you should be
explicit about the behavior you want. Either:
y = 0;
x = 0;
or:
y = 0;
x = y;
--
Larry Jones
What better way to spend one's freedom than eating chocolate
cereal and watching cartoons! -- Calvin
Clearly no one thought the undefinedness was more than a theoretical issue
for non-volatiles even in 2003. It's still a jump from "One could argue" to
your unqualified assertion that it is undefined.
Is this anything more than speculation? Is there any evidence of any C++
compiler actually not doing what everyone expects? (Reminder, this is
the non-volatile case.)
This is speculation.
The standard committee never intended to make x=y=0 undefined. It's
very widely used, and supported by all compilers I've ever seen.
--
André Gillibert
Why? Because the C standard is unclear on what that code does?
Yes, if x and y are volatile, the C standard is unclear and existing
implementations differ.
--
André Gillibert
And whose fault is that -- the programmer's or the standard's?
Besides, even if x and y are not volatile, there still is a problem.
The standard attempts to define the semantics of C by describing the
behaviour of the abstract machine, and then, separately, telling us what
aspects of that behaviour must be reflected by the implementation. Before
we start talking about how the behaviour of the abstract machine maps to the
real hardware, we need to know what it can possibly be. In the case of the
assignment operators, it's the standard's job to tell us whether the
abstract machine obtains the value by reading it back from the object being
assigned to, by returning a copy of the value being assigned, or whether
it's up to the implementation. And the standard fails to do that job.
At the first glance, it seems that the answer doesn't matter unless the
object is volatile; but are you certain that it's not possible to have
situations where the presence of a read access in the abstract machine may
trigger undefined behaviour by violating some seemingly unrelated rule --
maybe something about the effective type, or about the restrict qualifier,
or perhaps something else? If such cases are possible (for non-volatile
objects), would you say that the programmer deserves whatever behaviour he
gets there as well?
No, because the meaning of the code is inherently fuzzy. If you're
concerned about exactly what accesses get made to an object (which you
probably are if the object is volatile), then using it in a complex
expression is just asking for trouble. Writing simple code that clearly
expresses your intent is a much better strategy.
--
Larry Jones
I like maxims that don't encourage behavior modification. -- Calvin
Perhaps existing implementations, hence implementers?
Since they picked different interpretations, and both appear to make
sense, and furthermore different programmers might have learned the
different behaviours, you do not have an easy solution to that situation,
hence the "problem"; seeking "whose fault" won't help, by the way;
trying to define a consensus about how to make the Standard (or the
Standard's reading) clearer, on the other hand, might.
So it might be the fault of the comp.*.c gurus, who are failing to give
clear interpretations? ;-)
> Besides, even if x and y are not volatile, there still is a problem.
If there is a practical problem with int x,y; x=y=0; I fail to see what
it is. I believe this code is taken for granted if you read K&R1.
> In the case of the assignment operators, it's the standard's job to tell us
> whether the abstract machine obtains the value by reading it back from
> the object being assigned to, by returning a copy of the value being
> assigned, or whether it's up to the implementation. And the standard
> fails to do that job.
I fail to understand the result of your past discussions. Where does it
prevent the implementer from choosing?
If it does not prevent it, where does it fail to fall back in the last
case of your enumeration?
> but are you certain that it's not possible to have situations where
<snip>
There is a general answer to that rhetoric: the Standard explicitly
foresees that such cases exist, and even coins a name for them:
unspecified behaviour.
Antoine
The programmer's. It is the programmer's job to do things the way the
standard specifies. If a programmer attempts to do something that the
standard is unclear on, then it is the programmer's fault if it
doesn't work the way they expect it to.
--
Dan Giaimo
Both. The standard should not be unclear, and programmers should avoid
writing code that depends upon the precise interpretation of parts of
the standard that are not clear.
No, the meaning of the code wouldn't be fuzzy if the words in the standard
that define it weren't ambiguous. Nothing inherent about it.
> If you're
> concerned about exactly what accesses get made to an object (which you
> probably are if the object is volatile), then using it in a complex
> expression is just asking for trouble. Writing simple code that clearly
> expresses your intent is a much better strategy.
It seems to me that what constitutes "simple code" is relative -- to me,
something like
x = y = complicated_expression;
is simpler and more clearly expresses my intent than
tmp = complicated_expression;
x = tmp;
y = tmp;
But maybe that's just because I've spent many years under the apparently
mistaken impression that my interpretation of that paragraph in the standard
was the intended one, and that those two ways of writing code are
equivalent. If the real intention was to make volatile so "flexible" that a
conforming implementation is free to declare that the expression a+b
constitutes a write access to the volatile variable named c, then of course I
should not rely on what my instincts tell me about the simplicity or clarity
of C code.
It's the implementers' fault that the words of the standard allow different
conflicting interpretations that all seem reasonable? Or is it their fault
because they just picked their own interpretations without complaining to
the committee about the ambiguity in the standard? I have to admit that I
don't blame them for that -- to me, this particular ambiguity is of the sort
that makes it easy to think that the other interpretation is just plain
silly and therefore obviously wrong. Regardless of which of the two
interpretations you pick.
> Since they picked different interpretations, and both appear to make
> sense, and furthermore different programmers might have learned the
> different behaviours, you do not have an easy solution to that situation,
Sending a DR is fairly easy (for some people), and it would solve the
situation even if the answer was that it's unspecified.
> hence the "problem"; seeking "whose fault" won't help, by the way;
> trying to define a consensus about how to make the Standard (or the
> Standard's reading) clearer, on the other hand, might.
The first step would be for the committee to define a consensus about their
intent.
>> Besides, even if x and y are not volatile, there still is a problem.
>
> If there is a practical problem with int x,y; x=y=0; I fail to see what
> it is. I believe this code is taken for granted if you read K&R1.
Sure, but I was thinking about an assignment being a subexpression in a big
and complicated expression, in a context where a read access to x triggers
undefined behaviour by violating some rule in a subtle way. I don't know of
such situation and doubt it would be a practical problem if it existed; but
here in comp.std.c there's nothing wrong about being concerned about
theoretical problems too.
>> In the case of the assignment operators, it's the standard's job to tell
>> us
>> whether the abstract machine obtains the value by reading it back from
>> the object being assigned to, by returning a copy of the value being
>> assigned, or whether it's up to the implementation. And the standard
>> fails to do that job.
>
> I fail to understand the result of your past discussions. Where does it
> prevent the implementer from choosing?
But it's the purpose of the standard to prevent implementers from having too
much choice!!! In the case in question, my complaint is not that the
standard does or does not allow implementers to choose -- it's that it forces
them to choose between different possible interpretations of the text.
That's the kind of choice that the standard should avoid giving to people.
If the standard wants to give implementers choice, it should just say what
they can choose from, rather than say something unclear and make them guess
what it was supposed to mean.
> If it does not prevent it, where does it fail to fall back in the last
> case of your enumeration?
The last case sounds like the least reasonable interpretation of the words
to me -- the words just don't sound like they're meant to let the implementation
choose. The words sound like they say that a particular thing happens.
They just fail to clearly describe what that thing is.
Think about the other place that talks about obtaining the value of an
object -- 6.3.2.1#2. It says that lvalues are "converted to the value
stored in the designated object". Note that this one doesn't clearly say
that this "conversion" is done *by* accessing the object -- should we think
that the standard means to say that it's unspecified whether an access
occurs here as well?
>> but are you certain that it's not possible to have situations where
> <snip>
>
> There is a general answer to that rhetoric: the Standard explicitly
> foresees that such cases exist, and even coins a name for them:
> unspecified behaviour.
No, unspecified behaviour is when the standard PROVIDES two or more choices
and allows the implementation to choose. That's not the same as providing
some unclear words that can be interpreted in two or more different ways.