The arguments seem to be as follows:
One one hand: Non-terminating executions are not mentioned in the list
of "least requirements" in 5.1.2.3 of N1124, so the compiler is free
to terminate an infinite loop.
On the other hand: Termination of infinite loops will change the side
effecting operations performed by a program, so a conforming
implementation must not do this.
Can anyone comment on the intent of the standards committee on this
point? Is there any chance this can be explicitly cleared up in a
future standard?
Consider this example, where "icc" is the current version of Intel's
compiler for x86:
regehr@john-home:~$ icc fermat2.c -o fermat2
regehr@john-home:~$ ./fermat2
Fermat's Last Theorem has been disproved.
Where fermat2.c is this code:
int fermat (void)
{
const int MAX = 1000;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}
#include <stdio.h>
int main (void)
{
if (fermat()) {
printf ("Fermat's Last Theorem has been disproved.\n");
} else {
printf ("Fermat's Last Theorem has not been disproved.\n");
}
return 0;
}
A number of other C implementations have this behavior, whereas others
preserve the termination behavior of their inputs.
The other day I wrote a slightly lighthearted blog entry about this
issue and it generated quite a bit of discussion:
http://blog.regehr.org/archives/140
Thanks.
John Regehr
If by "terminating" you mean something along the lines of "letting the next
statement after the loop execute", then I think it's a question of whether
5.1.2.3 should be interpreted as allowing implementations to produce side
effects of expressions that would never be evaluated in the abstrace machine
(and I believe the answer is no). If, on the other hand, you mean just
terminating the program instead of letting it loop forever, then I think
it's much more subtle.
There was a discussion here about this a while ago:
Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'. (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
Is today the First of April in the Orthodox calendar or in
hexadecimal or in metric or something like that?
--
Eric Sosman
eso...@ieee-dot-org.invalid
> Conjecture: The version that terminated was built with compiler
> options resulting in a 16-bit `int'. (When I simulate this with a
> whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
> where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
No-- the Intel compiler on x86 has sizeof(int)==4. My code is
designed to not let any int exceed 2^31-1.
John
Ah thanks. I'd done a cursory scan of the archives and missed this.
For the curious, here's the list of x86 compilers I've tried.
Will terminate an infinite loop in such a way that the observable
behavior of a program changes: Microsoft (Visual Studio 2008 and
2010), LLVM/Clang 2.7, Open64-x86 4.2.3, Sun CC 5.10, and Intel CC
11.1.072.
Has not been observed to change the observable behavior of a program
containing an infinite loop: various versions of gcc 3.x and 4.x, Wind
River Diab C compiler.
John Regehr
It still seems an "astonishing coincidence," and worth a
closer look. Try printing the value of INT_MAX, for example.
> My code is
> designed to not let any int exceed 2^31-1.
... whereas what's actually needed is code that doesn't
try to exceed INT_MAX. Initializing MAX to cbrt(INT_MAX/2)
instead of to 1000 might be prudent.
--
Eric Sosman
eso...@ieee-dot-org.invalid
John Regehr <reg...@cs.utah.edu> writes:
> No-- the Intel compiler on x86 has sizeof(int)==4. My code is
> designed to not let any int exceed 2^31-1.
You could change to "unsigned long" to finish all comments like that.
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--
regehr@john-home:~$ icc fermat2.c -o fermat2
regehr@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647
Fermat's Last Theorem has been disproved.
regehr@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}
#include <stdio.h>
#include <limits.h>
int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);
I wish. Eric I see that you're at Sun, so perhaps you'll prefer this
version:
regehr@john-home:~$ suncc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files. Use 'cc -flags' for details
regehr@john-home:~$ suncc -O fermat2.c -o fermat2
"fermat2.c", line 20: warning: statement not reached
regehr@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647
Fermat's Last Theorem has been disproved.
regehr@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}
#include <stdio.h>
#include <limits.h>
int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);
if (fermat()) {
Old news, I'm afraid. On the bright side, I had an enviable
87.5% success rate in surviving layoffs there ...
> so perhaps you'll prefer this
> version:
>
> regehr@john-home:~$ suncc -V
> cc: Sun C 5.10 Linux_i386 2009/06/03
> usage: cc [ options] files. Use 'cc -flags' for details
> regehr@john-home:~$ suncc -O fermat2.c -o fermat2
> "fermat2.c", line 20: warning: statement not reached
> regehr@john-home:~$ ./fermat2
> sizeof(int)=4, INT_MAX=2147483647
> Fermat's Last Theorem has been disproved.
> regehr@john-home:~$ cat fermat2.c
> [... code snipped, see up-thread ...]
Okay, the "closer look" proves my conjecture wrong. Also,
by limiting all arithmetic results to no more than 2000 you've
ruled out overflow even on the narrowest possible int. There's
undefined behavior from printing a size_t with "%d", but I'm just
not going to believe that makes a difference.
At a guess -- and it's only a guess, the compiler "sees"
- That the `return 0;' cannot execute (it says as much)
- Hence, if any `return' executes it must be the `return 1;'
inside the loop
- That the only side-effects are to a,b,c, all unobservable
outside the function
- Hence, those side-effects are unneeded once the function
returns
... hence, the whole function body can be replaced with `return 1;'
with no change in externally observable behavior.
Ugh.
If (*if*) that's indeed what's going on. A couple tests you
might try:
- Make any or all of a,b,c static
- Make any or all of a,b,c file-scope static
- Return the value of a,b, or c (or an expression using them)
But still: My reaction remains "Ugh." I think the reasoning
behind the final "hence" above is simply putrid. 6.8.5p4 says
that "An iteration statement causes a statement called the loop
body to be executed repeatedly until the controlling expression
compares equal to 0," which never happens with your constant
controlling expression `1'. 6.8.4.1p2 says that "[...] the first
substatement [of an `if'] is executed if the expression compares
unequal to 0," which never happens in your code. Therefore the
`return 1;' should not be executed (because the `if' never fires),
and the compilers that execute it have oatmeal between their ears,
if not something worse.
Ugh.
--
Eric Sosman
eso...@ieee-dot-org.invalid
Oops, there I go believing what I read on the web...
Your assessment about what's going on in the compilers sounds right.
> and the compilers that execute it have oatmeal between their ears,
> if not something worse.
Yeah I'd certainly agree with this.
John Regehr
What happens if you move a, b, c out of fermat() and inspect them in
main() if / when fermat() returns? Does the program behave differently?
I still have a very hard time believing any compiler would do this; my
money is on some sort of overflow / wraparound.
DES
--
Dag-Erling Smørgrav - d...@des.no
For icc and suncc, moving any of a, b, or c out of function scope
causes the emitted program to hang.
> I still have a very hard time believing any compiler would do this; my
> money is on some sort of overflow / wraparound.
I don't think so, but am happy to be corrected. All of these
compilers are easy to obtain.
Everything I've seen is consistent with C implementations that prefer
performance over correctness.
John Regehr
That kind of optimization is apparently becomming more common. The
current C1X draft explicitly allows it in 6.8.5:
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>
<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
--
Larry Jones
At times like these, all Mom can think of is how long she was in
labor with me. -- Calvin
Let me say again:
Ugh.
If I write:
while (1) {
/* statements */
}
fputs("ERROR: Shouldn't get here\n", stderr);
I should never see that message. If I do see it, the compiler is
broken. If the compiler generates code that produces the message
because the language allows it to do so, the language is broken.
If the body of the loop has no effect, I'd welcome a warning, and I'd
even accept treating it as an error and rejecting the program. Silently
removing an infinite loop is, IMNSHO, unacceptable.
Can you think of an example where the permission to remove an infinite
loop is actually beneficial?
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
I've only two things to say: (1) "Ugh," which I've said
before, and (2) "Time for the torchlight parade; bring your
own pitchfork."
--
Eric Sosman
eso...@ieee-dot-org.invalid
I appreciate the sentiment - loops are notoriously hard to prove even
for humans, and provably unprovable in the general case - but this is a
radical departure from the as-if rule. I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.
In a free-standing setting, perhaps something like
void kernel_panic(int code) {
stop_devices_abruptly();
display_panic_code(code);
for (;;) {
/* This would be a good place for a HALT, if the
* CPU's instruction set included such a thing.
*/
}
}
... would count as "useful." I for one would be unhappy if a
call to kernel_panic() returned.
--
Eric Sosman
eso...@ieee-dot-org.invalid
It's a corner case for sure. However...
In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.
Deleting infinite loops also seems like an impediment to debugging.
If I accidentally write an infinite loop, I'd prefer my program to
hang so I can use a debugger to find the problem. If the compiler
deletes the loop and also computes a nonsensical result, as in the
Fermat example, I have no way to find the latent error in my system.
John Regehr
Even if the controllong expression is a constant "1" and there's no "break"
or "return" inside?
Ugh.
I hope there's no plan to also remove goto from C1X?
Or to extend the above to also apply to loops made with goto?
Part of Linux 2.4.0:
NORET_TYPE void panic(const char * fmt, ...)
{
/* ... */
for(;;) {
CHECK_EMERGENCY_SYNC
/* the above macro may be empty */
}
}
I admit, pretty old, but hey! Besides, didn't check versions between
2.4.0 and 2.6.12-rc2 so this code may be present even in newer
versions.
Suffice to say, the wording of the draft would allow a compiler to break
a rather widely used application (and that's just a single example).
John Regehr <reg...@cs.utah.edu> writes:
> In embedded software one occasionally writes a deliberate infinite
> loop. For example to hang up the CPU if main() returns, or to hang
> the processor waiting for the watchdog timer to reboot the system. To
> stop the compiler from deleting these loops, people are going to have
> to put in a side-effecting operation such as an access to a volatile-
> qualified object.
Either that or people will include some inline assembler. In fact,
I think there was this comment that for gcc the following is enough:
for(;;) asm volatile("":::"memory");
Keith Thompson <ks...@mib.org> writes:
> while (1) {
> /* statements */
> }
> fputs("ERROR: Shouldn't get here\n", stderr);
It goes better:
for(;;);
start_a_termonuclear_war();
;)
Nevertheless, I think it's safe to assume that the members of the
C standard committee aren't actually batcrackers insane. There's
probably some good rational reason for adding this permission to
the standard. I look forward to reading about this reason.
Just to be clear: the C standard is not explicit on this matter,
whereas the C++0x draft standard says that a C++ implementation may
assume that loops free of side effects are terminating, meaning that
non-terminating, side-effect free loops effectively have undefined
behavior.
Hans Boehm explained the C++ committee's rationale to me as follows
(I'm paraphrasing -- any mistakes/mischaracterizations are mine):
- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.
- It's hard to find real examples where this matters.
- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.
- Many compilers already have this behavior anyway.
Hope this is helpful.
John Regehr
As Larry Jones mentioned upthread, the C201X draft (not yet a standard)
is explicit. See
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1425.pdf>, 6.8.5p6:
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
with a footnote:
This is intended to allow compiler transformations such as removal
of empty loops even when termination cannot be proven.
I would argue that the C99 standard implicitly but unambiguously
disallows such removal by its description of the semantics of loops:
An iteration statement causes a statement called the loop body to be
executed repeatedly until the controlling expression compares
equal to 0.
I just don't see any wiggle room there.
> Hans Boehm explained the C++ committee's rationale to me as follows
> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):
>
> - A variety of advanced optimizations are handicapped or broken by the
> requirement to match the termination behavior of the abstract machine.
I'm sure they're also handicapped by the requirement to produce correct
results. 8-)}
> - It's hard to find real examples where this matters.
>
> - It's better for the standard to take a position than to leave this
> up in the air, as it seems to be in C.
>
> - Many compilers already have this behavior anyway.
I don't think it's currently up in the air. But I'm sure Hans
Boehm knows this stuff better than I do. It's likely I'm missing
something; I just don't know what.
Any paper on that as it sounds interesting and I don't really see it?
> - A variety of advanced optimizations are handicapped or broken by the
> requirement to match the termination behavior of the abstract machine.
>
> - It's hard to find real examples where this matters.
I just gave one (Linux 2.4.0) in mere minutes. I'm sure people who are
in embeded and operating systems are likely to give more.
> - It's better for the standard to take a position than to leave this
> up in the air, as it seems to be in C.
I think we all agree on that.
Personally I'd rather standard said compiler can reject such programs
or produce diagnostic then modify it in an undefined way.
Michal Nazarewicz <min...@tlen.pl> writes:
> Personally I'd rather standard said compiler can reject such programs
> or produce diagnostic then modify it in an undefined way.
On second thought, implementations can already produce diagnostics
whenever they want so let the standard just say that compiler cannot
assume loops terminate.
If it's such a burden for optimisation (which I still don't see)
implementations can simply provide a standard-incompatible option which
would turn such optimisations of.
I could not imagine code qualifying for such removal as being correct
(which, given the examples other people have provided from the kernal
and embedded programming world, simply proves my lack of imagination).
Therefore, I would always want it to execute in accordance to the way it
was actually written, which would at least allow it's unintended
behavior to be debugged.
If an implementation can determine that a particular loop can never
terminate (which, of course, it cannot do in the general case), it could
generate a warning message, and I'd be happy to see that, because in my
code it would almost always be the result of a coding error. However,
given the examples other people have provided, rejecting such code or
specifying that it's behavior is undefined would not be acceptable.
I wouldn't mind if the compiler also gave a warning even when it
couldn't prove non-termination, so long as it was also having an unusual
amount of trouble proving that the loop can terminate. This warning,
however, should definitely be one that can be turned off.
Since these warnings cannot be made mandatory, and must not prevent
compilation of the code according to it's specified semantics, saying
anything about such warnings falls outside the scope of the standard. It
should simply remain silent on this issue.
That is my opinion, at least until I see a more detailed and compelling
argument that the corresponding optimizations are sufficiently useful
and beneficial to make up for these disadvantages.
The C committee's rationale is basically, "what they said". :-)
--
Larry Jones
It COULD'VE happened by accident! -- Calvin
An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.
lawrenc...@siemens.com writes:
> John Regehr <reg...@cs.utah.edu> wrote:
>> Hans Boehm explained the C++ committee's rationale to me as follows
>> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):
>>
>> - A variety of advanced optimizations are handicapped or broken by the
>> requirement to match the termination behavior of the abstract machine.
Are there concrete examples of this?
>> - It's hard to find real examples where this matters.
At least one has been posted in this thread. Here's another
hypothetical but entirely plausible one:
while (1) {
#ifdef FEATURE_ONE
do_feature_one();
#endif
#ifdef FEATURE_TWO
do_feature_two();
#endif
#ifdef FEATURE_THREE
do_feature_three();
#endif
}
fputs("Internal error\n", stderr);
>> - It's better for the standard to take a position than to leave this
>> up in the air, as it seems to be in C.
In my opinion, it's not up in the air at all. The C standard
defines the semantics of iteration statements. It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".
I ask again, am I missing something? Can the current wording of
the C99 standard could be read in such a way as to permit this kind
of transformation?
>> - Many compilers already have this behavior anyway.
Examples?
> The C committee's rationale is basically, "what they said". :-)
My own rationale is this: If the standard permits the compiler to
violate the explicit semantics of a program in one special case,
it damages the credibility of the standard and of the language.
Just to make sure we all understand this correctly, does the new
rule actually permit this program:
#include <stdio.h>
int main(void)
{
while (1)
;
puts("Hello");
return 0;
}
to print "Hello"? As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate. (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)
[snip]
> The C standard defines the semantics of iteration statements. It
> doesn't allow non-terminating loops to terminate, any more than it
> allows ``if (0) puts("oops")'' to print "oops".
If you look back at the original code, the thing that is actually violated
is not the while statement's semantics, but that of "if".
static int
f(void)
{
while (1) {
if (impossible) return 0;
}
return -1;
}
Quoting myself (sorry!) from
<http://blog.regehr.org/archives/140#comment-323>,
"The standard doesn't say anything about the selection statement *having*
to evaluate to nonzero, and the compiler errs when it still assumes so."
(Obviously badly worded, it should have been "the controlling expression
of the selection statement".)
Cheers,
lacos
> In my opinion, it's not up in the air at all. The C standard
> defines the semantics of iteration statements. It doesn't allow
> non-terminating loops to terminate, any more than it allows
> ``if (0) puts("oops")'' to print "oops".
>
> I ask again, am I missing something? Can the current wording of
> the C99 standard could be read in such a way as to permit this kind
> of transformation?
N1124 says this:
"The least requirements on a conforming implementation are:
— At sequence points, volatile objects are stable in the sense that
previous accesses are complete and subsequent accesses have not yet
occurred.
— At program termination, all data written into files shall be
identical to the result that execution of the program according to the
abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take
place as specified in 7.19.3. The intent of these requirements is that
unbuffered or line-buffered output appear as soon as possible, to
ensure that prompting messages actually appear prior to a program
waiting for input."
The question is: Is the middle bullet talking about termination in the
abstract or actual semantics? If the former, then the compiler seems
to be permitted to terminate infinite loops. If the latter, then the
compiler is not permitted to do so. That's my reading at least.
> >> - Many compilers already have this behavior anyway.
>
> Examples?
C compilers from Intel, Sun, Open64, LLVM, and Microsoft.
> #include <stdio.h>
> int main(void)
> {
> while (1)
> ;
> puts("Hello");
> return 0;
>
> }
>
> to print "Hello"? As far as I can tell, it does; the proposed
> wording doesn't exclude cases where the loop can trivially be shown
> not to terminate. (I can imagine an argument that no real compiler
> would perform the transformation in this particular case, but that's
> not my question.)
The behavior of this program is undefined in C++0x. In C we don't
know...
John Regehr
I can think of several possible interpretations of the above -- it would be
good if the final standard made it clearer which of them, if any, is the
intended one:
#1 If, for a certain combination of program inputs, an iteration statement
described above loops forever (in the abstract machine), the behaviour is
undefined.
#2 If such an iteration statement loops forever, the implementation is free
to make it behave as if the controlling expression returned zero after an
arbitrary number of operations (even if it is a non-zero constant or some
other expression that cannot possibly ever return zero).
#3 If such an iteration statement loops forever, the implementation is free
to make it behave as if the controlling expression returned zero, or to
choose an arbitrary combination of "if" and "switch" statements containing a
"break" or "return" statement or a "goto" statement leading outside the
loop, and make them behave as if their controlling expressions returned
values that led to the execution of the "break" or "return" or "goto"
statement, even if some of them are constant expressions inconsistent with
such behaviour.
#4 Same as #3, but also add impossible "goto" statements leading to places
in the loop that make termination possible.
#5 Same as #4, but also allows executing "dead" statements such as the one
after a "break" or "return" statement.
#5 Same as #4, but also allow termination be reaching an unreachable "exit",
"raise", or "abort" call.
Example:
int fun( void ) {
int x = 0;
while ( 1 ) {
if ( 0 ) {
x = 1;
break;
ret2: return 2;
}
if ( 0 ) {
goto ret2;
return 3;
}
}
return x;
}
Presumably the function is allowed to return 0; but can it return 1? 2? 3?
Forgot about an important one:
#0 If, for a certain combination of program inputs, an iteration statement
described above loops forever (in the abstract machine), the implementation
is free to make it behave as if it called exit().
Um, *iterations*...
It probably doesn't matter in this case, but N1256 is newer.
N1124 includes the C99 standard plus TC1 and TC2; N1256 adds TC3.
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf>
> "The least requirements on a conforming implementation are:
> — At sequence points, volatile objects are stable in the sense that
> previous accesses are complete and subsequent accesses have not yet
> occurred.
> — At program termination, all data written into files shall be
> identical to the result that execution of the program according to the
> abstract semantics would have produced.
> — The input and output dynamics of interactive devices shall take
> place as specified in 7.19.3. The intent of these requirements is that
> unbuffered or line-buffered output appear as soon as possible, to
> ensure that prompting messages actually appear prior to a program
> waiting for input."
>
> The question is: Is the middle bullet talking about termination in the
> abstract or actual semantics? If the former, then the compiler seems
> to be permitted to terminate infinite loops. If the latter, then the
> compiler is not permitted to do so. That's my reading at least.
Are you saying that since the second bullet applies only to
file contents *at program termination*, all bets are off for
non-terminating programs?
If so, that's a much stronger statement than the proposal to allow
possibly infinite loops to be removed. And adding permission
to remove such loops really changes nothing; the behavior is
undefined anyway, so the compiler could replace the loop with a
call to format_disk().
Hmm. I suppose the requirements regarding volatile objects and
interactive devices are still in place.
>> >> - Many compilers already have this behavior anyway.
>>
>> Examples?
>
> C compilers from Intel, Sun, Open64, LLVM, and Microsoft.
According to <http://blog.regehr.org/archives/140>, LLVM no longer
does this; apparently its maintainers considered the behavior to
be a bug.
>> #include <stdio.h>
>> int main(void)
>> {
>> while (1)
>> ;
>> puts("Hello");
>> return 0;
>>
>> }
>>
>> to print "Hello"? As far as I can tell, it does; the proposed
>> wording doesn't exclude cases where the loop can trivially be shown
>> not to terminate. (I can imagine an argument that no real compiler
>> would perform the transformation in this particular case, but that's
>> not my question.)
>
> The behavior of this program is undefined in C++0x. In C we don't
> know...
The latest C201X draft contains wording similar (possibly identical, I
haven't checked) to the wording in the latest C++ draft.
I meant to ask about the behavior under the proposed C201X standard.
I don't see how it's undefined. The compiler has permission to
remove the loop. If it does so, the program prints "Hello"; if
it doesn't, the program never terminates and produces no output.
I'd say it's unspecified, not undefined.
Sorry, I didn't read carefully enough. The proposed language
doesn't just give the compiler permission to remove the loop; it
says that it "may be assumed by the implementation to terminate".
I suppose that means that if the program violates this assumption
(by not terminating), then the behavior is undefined.
Another way to state this is that it's a change to the meaning of
a loop.
By writing a loop, the programmer is asserting *either* that the
loop does something with visible behavior (performs I/O, accesses
volatile objects, or performs synchronization or atomic operations),
*or* that it eventually terminates. If I violate that assumption
by writing an infinite loop with no side effects, I am lying to the
compiler, and my program's behavior is undefined. (If I'm lucky,
the compiler might warn me about it.) The language doesn't provide
(no longer provides?) a way to express an infinite loop without
visible behavior.
I suggest that something similar to the previous paragraph is
a clearer way to express the idea than what's currently in the
draft. It expresses what the code means rather than giving the
implementation permission to do something to the code.
I'm still not at all convinced this is a good idea, but it's probably
not as obviously insane as I've been thinking it is.
This is, I believe, exactly right.
John Regehr
As far as the file contents go, yes. C permits buffered I/O. If your
program gets stuck in an infinite loop, buffered writes may never make it to
the disk.
> If so, that's a much stronger statement than the proposal to allow
> possibly infinite loops to be removed. And adding permission
> to remove such loops really changes nothing; the behavior is
> undefined anyway, so the compiler could replace the loop with a
> call to format_disk().
> Hmm. I suppose the requirements regarding volatile objects and
> interactive devices are still in place.
Right. It's not undefined behaviour -- only the contents of files are
unspecified.
Perhaps, although if that's the case, then I think it would be nice if the
standard made that clearer by either saying "shall terminate" or explicitly
mentioning undefined behaviour.
> Another way to state this is that it's a change to the meaning of
> a loop.
>
> By writing a loop, the programmer is asserting *either* that the
> loop does something with visible behavior (performs I/O, accesses
> volatile objects, or performs synchronization or atomic operations),
> *or* that it eventually terminates. If I violate that assumption
> by writing an infinite loop with no side effects, I am lying to the
> compiler, and my program's behavior is undefined. (If I'm lucky,
> the compiler might warn me about it.) The language doesn't provide
> (no longer provides?) a way to express an infinite loop without
> visible behavior.
Actually, "back: goto back;" is such a loop, and since it's not an iteration
statement, the text in question does not apply to it.
> I suggest that something similar to the previous paragraph is
> a clearer way to express the idea than what's currently in the
> draft. It expresses what the code means rather than giving the
> implementation permission to do something to the code.
I wouldn't even think that making an assumption about the code counts as
doing something *to* the code. :)
My main concern would be that it may make it harder to debug loops that are
infinite by mistake.
The standard allows a seeming infinite loop to terminate the program. For
instance, the standard allows a divide-by-zero to "trap-and-terminate" via
SIGFPE. Thus:
for (unsigned char i = 1; i > -1; i++) {
int n = 1/i;
}
Of course, divide-by-zero causes undefined behavior, so it's a little
incongruous that the standard recognizes and condones SIGFPE and it's
termination behavior. Really divide-by-zero occupies a middle ground between
unspecified and undefined (i.e. one of the enumerated behaviors is
undefinedness), as in reality a lot of other things in the standard, such as
perhaps type-punning.
Developers rely on, and C derives strength from, this structural ambiguity.
But the proposed change now gives the compiler license to shoot you in the
head for doing things that have always been condoned, if disfavored.
Condoned yet disfavored because in choosing not to incorporate more
sophisticated typing and control flow mechanisms, it's understood that the
trade-off includes providing some guarantees for unorthodox usage,
epitomized by the "as-if" rule. By abrogating this rule the standard is
shredding it's foundational contract. It's giving license to the compiler to
make assumptions, leaving the developer in the lurch without the ability to
refute those assumptions with new control flow constructs--e.g. not reached
or "may reach" statement qualifiers.
Without the complementary constructs the committee is shirking its duty. The
end result will be less confidence in the standard by developers, making
dependence on extensions comparatively more appealing, and so while
intending to bring implementors into the fold paradoxically creating the
conditions for the opposite to happen.
My main concern is that it's surprising. Programmers who aren't aware
of the new rule (even if it's really a consequence of existing rules)
will naturally assume that this:
while (1)
;
is an infinite loop, and nothing following it can ever execute.
Whether this is a good idea or not, it's bound to cause confusion.
And sometimes (rarely) you really need a way to execute forever without
doing anything. With this new rule, they'll either need to add an
otherwise unnecessary side effect, or use some other construct such as
``loop: goto loop;'' (with at least a paragraph of comments explaining
why it's necessary), or compile without optimization.
By "loop", I really meant "iteration statement".
In other words, I was wrong. Thanks for the correction.
I also have a sneaking suspicion that this new permission to assume
that certain loops terminate is potentially just one special case
of a whole class of such things.
The new rule covers "while (1) {}" but not "loop: goto loop;". Why?
And what other cases might there, where breaking the naive semantics
of certain code can make optimization easier?
Is this a slippery slope that logically leads to making the behavior
of even the most obvious code undefined, producing wrong answers
*really* quickly?
Or is there a more general rule hiding behind this special case,
something that can be expressed clearly enough that you don't have
to be a compiler expert to understand it?
(By asking these questions, I don't mean to suggest that I have a
clue what the answers might be.)
> In my opinion, it's not up in the air at all. The C standard
> defines the semantics of iteration statements. It doesn't allow
> non-terminating loops to terminate, any more than it allows
> ``if (0) puts("oops")'' to print "oops".
But does the time of the abstract machine need to correspond to the
physical time? What I mean is that an infinite time (in the abstract
machine) can mathematically be compressed into a finite time (in the
real world), and what is beyond this finite time is out of the scope
of the C standard. So,
while (1) { }
printf ("foo\n");
could actually be transformed into an empty statement with an
implementation-defined termination (at least in a freestanding
environment).
--
Vincent Lef�vre <vin...@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Ar�naire project (LIP, ENS-Lyon)