int i;
// ...
i = i++;
I know, that's illegal. But I am tired of answering questions like that,
so I decided to implement a warning for that kind of error.
Algorithm
---------
o Set up a global assignment counter that will be reset at the end
  of each sequence point.
o For each assignment to a scalar variable, store the expression
  tree in an array of expressions.
  If the counter shows that this array already contains at least one
  expression, go in a loop that:
    o tests if the left hand side of each stored assignment is
      identical to that of the assignment being done.
    o If it is, emits a warning
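The steps above can be sketched in a few lines of C. This is only a minimal illustration, not the actual compiler code: it assumes the left-hand side of each assignment can be reduced to a name string, whereas a real front end would compare expression trees. MAX_PENDING, sequence_point() and record_assignment() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PENDING 32          /* hypothetical limit for this sketch */

/* Left-hand sides assigned since the last sequence point. */
static const char *pending[MAX_PENDING];
static int pending_count = 0;   /* the "global assignment counter" */

/* Call at each sequence point: forget all recorded assignments. */
void sequence_point(void)
{
    pending_count = 0;
}

/* Record an assignment to `lhs`; return 1 (and print a warning) if the
   same name was already assigned since the last sequence point. */
int record_assignment(const char *lhs)
{
    int warned = 0;

    assert(lhs != NULL);
    for (int k = 0; k < pending_count; k++) {
        if (strcmp(pending[k], lhs) == 0) {
            printf("warning: '%s' is modified more than once "
                   "between sequence points\n", lhs);
            warned = 1;
            break;
        }
    }
    if (pending_count < MAX_PENDING)
        pending[pending_count++] = lhs;
    return warned;
}
```

For `i = i++;` the front end would call record_assignment("i") twice before the next sequence_point(), once for the `++` and once for the `=`, so the second call warns.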
It looks simple and took very little time to implement. This
makes me suspicious...
Did I forget something?
Why have I never seen a compiler that implements this
warning?
Thanks in advance for your attention.
jacob
At each sequence point, not "at the end of a sequence point".
Sorry for the stupid expression above.
Quibble: It's not "illegal", it's undefined behavior.
> But I am tired of answering questions like that,
> so i decided to implement a warning for that kind of error.
Good.
> Algorithm
> ---------
>
> o Setup a global assignment counter that will be reset at the end
> of each sequence point.
> o For each assignment to a scalar variable store the expression
> tree in an array of expressions.
> If this counter contains already at least one expression, go
> in a loop that:
> o tests if the left hand side of the assignment is
> identical to the assignment being done.
> o If it is, emit a warning
>
> It looks simple and took very little time to implement. This
> makes me suspicious...
>
> Did I forget something?
Well, you can't catch all possible cases of this kind of undefined
behavior (such as ``*p = (*q)++''), but warning about the easy cases
is better than not doing so.
I think you're also missing cases like:
a[i++] = i;
(that's an example from the standard).
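When the intent behind such a statement is known, it can be written so that the index is not modified and read in the same full expression. A minimal sketch of the two plausible readings, with hypothetical function names (the caller takes the returned value as the new index):

```c
#include <assert.h>
#include <stddef.h>

/* Store the old index value in a[i], then advance the index. */
size_t store_old_then_advance(int *a, size_t i)
{
    assert(a != NULL);
    a[i] = (int)i;
    return i + 1;
}

/* Store the incremented value in the slot selected by the old index. */
size_t store_new_then_advance(int *a, size_t i)
{
    assert(a != NULL);
    a[i] = (int)(i + 1);
    return i + 1;
}
```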
> Why I have never seen a compiler that implements this
> warning???
gcc certainly warns about ``i = i++'' if you give it the right
options:
warning: operation on `i' may be undefined
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
<> I know, that's illegal.
< Quibble: It's not "illegal", it's undefined behavior.
That is what they always say. Does that mean that a
compiler couldn't refuse to compile it?
There are some things that are undefined or implementation
defined due to the standard allowing for ones' complement
and sign-magnitude integers. Sometimes that is fine, other
times not.
I was trying to think if there were cases where the
undefined behavior might not be a problem.
One that doesn't work, I believe, is:
j=f(i++)+f(i++);
Where, for example, one might not care which order the
f() were called.
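A well-defined version has to put each increment in its own full expression, which also fixes the order of the two calls. A minimal sketch, assuming a stand-in f() and a hypothetical wrapper name:

```c
#include <assert.h>

/* Hypothetical stand-in for f(); any function of one int would do. */
static int f(int x)
{
    return 10 * x;
}

/* Well-defined version of  j = f(i++) + f(i++);  each increment of *i
   is in its own full expression, so a sequence point separates them. */
int sum_of_two_calls(int *i)
{
    assert(i != 0);
    int first = f((*i)++);
    int second = f((*i)++);
    return first + second;
}
```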
(snip)
<> Did I forget something?
< Well, you can't catch all possible cases of this kind of undefined
< behavior (such as ``*p = (*q)++''), but warning about the easy cases
< is better than not doing so.
Not so hard to test at run time, especially if you already
have p and q in registers.
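Such a run-time test could look roughly like the sketch below. It assumes both pointers refer to properly aligned int objects, so that comparing the pointers is enough to detect aliasing; the function name is hypothetical.

```c
#include <assert.h>

/* Performs *p = (*q)++ only when p and q refer to different objects.
   Returns 1 on success, 0 if the two pointers alias. */
int assign_post_increment(int *p, int *q)
{
    assert(p != 0 && q != 0);
    if (p == q)
        return 0;      /* same object: the statement would be undefined */
    *p = (*q)++;
    return 1;
}
```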
< I think you're also missing cases like:
< a[i++] = i;
< (that's an example from the standard).
For the case with multiple i++, I understand that they
can be done in any order, and assigned back to i in
any order. In this case, I could see the assigned value
being either before or after the increment, but it is hard
to see any other value. I can imagine cases where either
would work.
-- glen
"undefined behavior" means that the rules of the language
place no limitations on what the compiler may do.
--
pete
No, what it really means is that the standard doesn't use the terms
"legal" or "illegal", at least not in that sense.
A compiler may refuse to compile it, but it's not obligated to do so,
or even to issue a diagnostic. In my experience, the term "illegal"
refers to constructs that the compiler is required to diagnose, or
even to reject. In C, it's not entirely unreasonable to refer to
syntax errors and constraint violations as "illegal", but I prefer to
stick to the standard's terminology to avoid confusion.
> There are some things that are undefined or implementation
> defined due to the standard allowing for ones complement
> and sign magnitude integers. Sometime that is fine, other
> times not.
>
> I was trying to think if there were cases where the
> undefined behavior might not be a problem.
>
> One that doesn't work, I believe, is:
>
> j=f(i++)+f(i++);
>
> Where, for example, one might not care which order the
> f() were called.
>
> (snip)
> <> Did I forget something?
>
> < Well, you can't catch all possible cases of this kind of undefined
> < behavior (such as ``*p = (*q)++''), but warning about the easy cases
> < is better than not doing so.
>
> Not so hard to test at run time, especially if you already
> have p and q in registers.
Sure, but the whole point of leaving such things undefined is to
permit optimizations.
> < I think you're also missing cases like:
>
> < a[i++] = i;
>
> < (that's an example from the standard).
>
> For the case with multiple i++, I understand that they
> can be done in any order, and assigned back to i in
> any order. In this case, I could see the assigned value
> being either before or after the increment, but it is hard
> to see any other value. I can imagine cases where either
> would work.
Very odd things can happen in the presence of optimization, especially
given that the optimizer is permitted to assume that no undefined
behavior occurs.
I think some cases will be easy to diagnose and others not so easy.
Lint and splint can spot this problem easily:
C:\tmp>type tt.c
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int i = rand();

    i = i++;
    printf("i=%d\n", i);
    return 0;
}
C:\tmp>"C:\Lint\Lint-nt" +v -i"C:\Lint" std.lnt -os(_LINT.TMP) tt.c
PC-lint for C/C++ (NT) Vers. 8.00u, Copyright Gimpel Software 1985-2006
--- Module: tt.c (C)
C:\tmp>type _LINT.TMP
--- Module: tt.c (C)
_
i = i++;
tt.c(7) : Warning 564: variable 'i' depends on order of evaluation
C:\tmp>splint tt.c
Splint 3.1.1 --- 12 Mar 2007
tt.c: (in function main)
tt.c(7,10): Expression has undefined behavior (value of left operand i is
    modified by right operand i++): i = i++
  Code has unspecified behavior. Order of evaluation of function parameters or
  subexpressions is not defined, so if a value is used and modified in
  different places not separated by a sequence point constraining evaluation
  order, then the result of the expression is unspecified. (Use -evalorder to
  inhibit warning)
Finished checking --- 1 code warning
But others are more subtle.
Lint gets fooled but Splint sees the problem:
C:\tmp>type _LINT.TMP
--- Module: tt.c (C)
_
}
tt.c(14) : Note 953: Variable 'c' (line 6) could be declared as const --- Eff. C++ 3rd Ed. item 3
tt.c(6) : Info 830: Location cited in prior message
_
}
tt.c(14) : Note 953: Variable 'd' (line 7) could be declared as const --- Eff. C++ 3rd Ed. item 3
tt.c(7) : Info 830: Location cited in prior message
C:\tmp>splint tt.c
Splint 3.1.1 --- 12 Mar 2007
tt.c: (in function main)
tt.c(10,14): Expression has undefined behavior (value of left operand *c is
    modified by right operand c[0]++ + d[0]++): *c = c[0]++ + d[0]++
  Code has unspecified behavior. Order of evaluation of function parameters or
  subexpressions is not defined, so if a value is used and modified in
  different places not separated by a sequence point constraining evaluation
  order, then the result of the expression is unspecified. (Use -evalorder to
  inhibit warning)
Finished checking --- 1 code warning
C:\tmp>type tt.c
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    int a = rand();
    int *c = &a;
    int *d = &a;

    printf("*c=%d, *d=%d\n", *c, *d);
    *c = c[0]++ + d[0]++;
    printf("*c=%d, *d=%d\n", *c, *d);

    return 0;
}
I think it would be entirely reasonable to define as illegal any code
that a conforming compiler is required to reject. Unfortunately, this
definition is a bit problematic for C, since the only code that a C
compiler is required to reject is a #error directive which survives
conditional compilation.
Not unless it can prove that the undefined behavior must actually occur
at run time. Problematic code only produces undefined behavior if it's
executed and there's no license for an implementation to reject the code
just because it *might* produce undefined behavior.
--
Larry Jones
Philistines. -- Calvin
Or, the compiler could refuse to compile it if it's invoked in a
non-ANSI mode, such as gcc with -Werror combined with -Wsequence-point.
Is it forbidden for something that causes undefined behavior to
cause it at compile time? I have heard of some compilers doing
exactly that, although with stuff that must be evaluated at compile
time, like:
#if 1/0
Good point.
Well obviously the standard can impose no requirements on
non-conforming compilers, including compilers in non-conforming mode.
> Is it forbidden for something that causes undefined behavior to
> cause it at compile time? I have heard of some compilers doing
> exactly that, although with stuff that must be evaluated at compile
> time, like:
> #if 1/0
Actually that's a constraint violation.
C99 6.10.1p1 (under Constraints):
The expression that controls conditional inclusion shall be an
integer constant expression except that:
[...]
C99 6.6p4 (under Constraints):
Each constant expression shall evaluate to a constant that is in
the range of representable values for its type.
(Permission to quote my words without proper attribution is denied.)
> "undefined behavior" means that the rules of the language
> place no limitations on what the compiler may do.
The rules of the language don't place any limitations on what the
compiler may do, actually. They cover what the compiled code will do.
In the case of compiling undefined behaviour: Undefined behaviour is
only undefined behaviour if the code is actually executed.
if (0) i = i++;
is perfectly fine. So if the compiler wants to refuse to compile some
code because of undefined behaviour, it would at least have to prove
that the offending code will necessarily be executed.
The standard is silent on the behavior of compilers, but quite vocal on
the behavior of implementations, of which compilers are typically an
important part. What the standard says about the behavior that occurs
when executing translated code imposes some strict requirements on the
behavior of the implementation that performed the translation.
Implementations are, under certain circumstances, required to generate
diagnostics. They are required to accept and translate correctly at
least one program which must meet certain requirements. Unless the
behavior of a program is undefined, if an implementation accepts and
translates the source code, it is required to translate it into a
program whose behavior is consistent with that defined by the standard,
and with the implementation's own documentation of how it handles
implementation-defined behavior.
In comp.std.c, article <lniqhyw...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:
> gordon...@burditt.org (Gordon Burditt) writes:
> > Is it forbidden for something that causes undefined behavior to
> > cause it at compile time? I have heard of some compilers doing
> > exactly that, although with stuff that must be evaluated at compile
> > time, like:
> > #if 1/0
> Actually that's a constraint violation.
> C99 6.10.1p1 (under Constraints):
> The expression that controls conditional inclusion shall be an
> integer constant expression except that:
> [...]
> C99 6.6p4 (under Constraints):
> Each constant expression shall evaluate to a constant that is in
> the range of representable values for its type.
> (Permission to quote my words without proper attribution is denied.)
And what if an implementation decides to define 1/0 as some
representable integer constant (just like it usually does for
1.0 / 0.0, as a constant representable in a double)?
BTW, the standard says:
An implementation may accept other forms of constant expressions.
For instance, the following program is successfully translated
by gcc:
static double x = 1.0 / 0.0;
int main (void)
{
    return x == 0.0;
}
--
Vincent Lefèvre <vin...@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
Quibble - unelide 'return;' from the ellipsis.
Phil
--
Marijuana is indeed a dangerous drug.
It causes governments to wage war against their own people.
-- Dave Seaman (sci.math, 19 Mar 2009)