is C implementation allowed to terminate an infinite loop?

642 views
Skip to first unread message

John Regehr

unread,
May 3, 2010, 12:06:34 PM5/3/10
to
The C99 standard, as I read it, is not completely clear about whether
a conforming implementation may change the termination behavior of a
conforming program. In contrast, the Java language definition makes
it clear that the JVM must not terminate an infinite loop. The C++0x
draft standard is explicit that an implementation *can* terminate an
infinite loop.

The arguments seem to be as follows:

One one hand: Non-terminating executions are not mentioned in the list
of "least requirements" in 5.1.2.3 of N1124, so the compiler is free
to terminate an infinite loop.

On the other hand: Termination of infinite loops will change the side
effecting operations performed by a program, so a conforming
implementation must not do this.

Can anyone comment on the intent of the standards committee on this
point? Is there any chance this can be explicitly cleared up in a
future standard?

Consider this example, where "icc" is the current version of Intel's
compiler for x86:

regehr@john-home:~$ icc fermat2.c -o fermat2
regehr@john-home:~$ ./fermat2
Fermat's Last Theorem has been disproved.

Where fermat2.c is this code:

int fermat (void)
{
const int MAX = 1000;
int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>

int main (void)
{
if (fermat()) {
printf ("Fermat's Last Theorem has been disproved.\n");
} else {
printf ("Fermat's Last Theorem has not been disproved.\n");
}
return 0;
}

A number of other C implementations have this behavior, whereas others
preserve the termination behavior of their inputs.

The other day I wrote a slightly lighthearted blog entry about this
issue and it generated quite a bit of discussion:

http://blog.regehr.org/archives/140

Thanks.

John Regehr

Wojtek Lerch

unread,
May 3, 2010, 1:13:21 PM5/3/10
to
"John Regehr" <reg...@cs.utah.edu> wrote in message
news:d1525ae0-9759-4332...@q36g2000prg.googlegroups.com...

> The C99 standard, as I read it, is not completely clear about whether
> a conforming implementation may change the termination behavior of a
> conforming program. In contrast, the Java language definition makes
> it clear that the JVM must not terminate an infinite loop. The C++0x
> draft standard is explicit that an implementation *can* terminate an
> infinite loop.
>
> The arguments seem to be as follows:
>
> One one hand: Non-terminating executions are not mentioned in the list
> of "least requirements" in 5.1.2.3 of N1124, so the compiler is free
> to terminate an infinite loop.

If by "terminating" you mean something along the lines of "letting the next
statement after the loop execute", then I think it's a question of whether
5.1.2.3 should be interpreted as allowing implementations to produce side
effects of expressions that would never be evaluated in the abstrace machine
(and I believe the answer is no). If, on the other hand, you mean just
terminating the program instead of letting it loop forever, then I think
it's much more subtle.

There was a discussion here about this a while ago:

http://groups.google.ca/group/comp.std.c/browse_thread/thread/bbebffa8a5d5fecf/cae898722d689f49?hl=en&lnk=gst

Eric Sosman

unread,
May 3, 2010, 1:17:16 PM5/3/10
to
On 5/3/2010 12:06 PM, John Regehr wrote:
> [...]

> Consider this example, where "icc" is the current version of Intel's
> compiler for x86:
>
> regehr@john-home:~$ icc fermat2.c -o fermat2
> regehr@john-home:~$ ./fermat2
> Fermat's Last Theorem has been disproved.
>
> Where fermat2.c is this code:
> [... see up-thread ...]

Conjecture: The version that terminated was built with compiler
options resulting in a 16-bit `int'. (When I simulate this with a
whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)

Is today the First of April in the Orthodox calendar or in
hexadecimal or in metric or something like that?

--
Eric Sosman
eso...@ieee-dot-org.invalid

John Regehr

unread,
May 3, 2010, 1:34:39 PM5/3/10
to
On May 3, 11:17 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:

>      Conjecture: The version that terminated was built with compiler
> options resulting in a 16-bit `int'.  (When I simulate this with a
> whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
> where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)

No-- the Intel compiler on x86 has sizeof(int)==4. My code is
designed to not let any int exceed 2^31-1.

John

John Regehr

unread,
May 3, 2010, 1:46:42 PM5/3/10
to
On May 3, 11:13 am, "Wojtek Lerch" <wojte...@yahoo.ca> wrote:
> There was a discussion here about this a while ago:
> http://groups.google.ca/group/comp.std.c/browse_thread/thread/bbebffa...

Ah thanks. I'd done a cursory scan of the archives and missed this.

For the curious, here's the list of x86 compilers I've tried.

Will terminate an infinite loop in such a way that the observable
behavior of a program changes: Microsoft (Visual Studio 2008 and
2010), LLVM/Clang 2.7, Open64-x86 4.2.3, Sun CC 5.10, and Intel CC
11.1.072.

Has not been observed to change the observable behavior of a program
containing an infinite loop: various versions of gcc 3.x and 4.x, Wind
River Diab C compiler.

John Regehr

Eric Sosman

unread,
May 3, 2010, 1:48:34 PM5/3/10
to
On 5/3/2010 1:34 PM, John Regehr wrote:
> On May 3, 11:17 am, Eric Sosman<esos...@ieee-dot-org.invalid> wrote:
>
>> Conjecture: The version that terminated was built with compiler
>> options resulting in a 16-bit `int'. (When I simulate this with a
>> whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
>> where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)
>
> No-- the Intel compiler on x86 has sizeof(int)==4.

It still seems an "astonishing coincidence," and worth a
closer look. Try printing the value of INT_MAX, for example.

> My code is
> designed to not let any int exceed 2^31-1.

... whereas what's actually needed is code that doesn't
try to exceed INT_MAX. Initializing MAX to cbrt(INT_MAX/2)
instead of to 1000 might be prudent.

--
Eric Sosman
eso...@ieee-dot-org.invalid

Michal Nazarewicz

unread,
May 3, 2010, 1:59:03 PM5/3/10
to
> On May 3, 11:17 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>>      Conjecture: The version that terminated was built with compiler
>> options resulting in a 16-bit `int'.  (When I simulate this with a
>> whole lot of (short) casts, I get a "disproof" at a=745, b=30, c=1,
>> where a*a*a = 413493625 = 0x18A56979 is calculated as 0x6979 = 27001.)

John Regehr <reg...@cs.utah.edu> writes:
> No-- the Intel compiler on x86 has sizeof(int)==4. My code is
> designed to not let any int exceed 2^31-1.

You could change to "unsigned long" to finish all comments like that.

--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86-tlen.pl>--<jid:mina86-jabber.org>--ooO--(_)--Ooo--

John Regehr

unread,
May 3, 2010, 5:45:41 PM5/3/10
to
On May 3, 11:48 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
> closer look.  Try printing the value of INT_MAX, for example.

regehr@john-home:~$ icc fermat2.c -o fermat2
regehr@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647


Fermat's Last Theorem has been disproved.

regehr@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;


int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>
#include <limits.h>

int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);

John Regehr

unread,
May 3, 2010, 5:52:03 PM5/3/10
to
On May 3, 11:17 am, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>      Is today the First of April in the Orthodox calendar or in
> hexadecimal or in metric or something like that?

I wish. Eric I see that you're at Sun, so perhaps you'll prefer this
version:

regehr@john-home:~$ suncc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files. Use 'cc -flags' for details
regehr@john-home:~$ suncc -O fermat2.c -o fermat2
"fermat2.c", line 20: warning: statement not reached


regehr@john-home:~$ ./fermat2
sizeof(int)=4, INT_MAX=2147483647

Fermat's Last Theorem has been disproved.

regehr@john-home:~$ cat fermat2.c
int fermat (void)
{
const int MAX = 10;


int a=1,b=1,c=1;
while (1) {
if (((a*a*a) == ((b*b*b)+(c*c*c)))) return 1;
a++;
if (a>MAX) {
a=1;
b++;
}
if (b>MAX) {
b=1;
c++;
}
if (c>MAX) {
c=1;
}
}
return 0;
}

#include <stdio.h>
#include <limits.h>

int main (void)
{
printf ("sizeof(int)=%d, INT_MAX=%d\n", sizeof(int), INT_MAX);
if (fermat()) {

Eric Sosman

unread,
May 3, 2010, 6:31:19 PM5/3/10
to
On 5/3/2010 5:52 PM, John Regehr wrote:
> On May 3, 11:17 am, Eric Sosman<esos...@ieee-dot-org.invalid> wrote:
>> Is today the First of April in the Orthodox calendar or in
>> hexadecimal or in metric or something like that?
>
> I wish. Eric I see that you're at Sun,

Old news, I'm afraid. On the bright side, I had an enviable
87.5% success rate in surviving layoffs there ...

> so perhaps you'll prefer this
> version:
>
> regehr@john-home:~$ suncc -V
> cc: Sun C 5.10 Linux_i386 2009/06/03
> usage: cc [ options] files. Use 'cc -flags' for details
> regehr@john-home:~$ suncc -O fermat2.c -o fermat2
> "fermat2.c", line 20: warning: statement not reached
> regehr@john-home:~$ ./fermat2
> sizeof(int)=4, INT_MAX=2147483647
> Fermat's Last Theorem has been disproved.
> regehr@john-home:~$ cat fermat2.c

> [... code snipped, see up-thread ...]

Okay, the "closer look" proves my conjecture wrong. Also,
by limiting all arithmetic results to no more than 2000 you've
ruled out overflow even on the narrowest possible int. There's
undefined behavior from printing a size_t with "%d", but I'm just
not going to believe that makes a difference.

At a guess -- and it's only a guess, the compiler "sees"

- That the `return 0;' cannot execute (it says as much)

- Hence, if any `return' executes it must be the `return 1;'
inside the loop

- That the only side-effects are to a,b,c, all unobservable
outside the function

- Hence, those side-effects are unneeded once the function
returns

... hence, the whole function body can be replaced with `return 1;'
with no change in externally observable behavior.

Ugh.

If (*if*) that's indeed what's going on. A couple tests you
might try:

- Make any or all of a,b,c static

- Make any or all of a,b,c file-scope static

- Return the value of a,b, or c (or an expression using them)

But still: My reaction remains "Ugh." I think the reasoning
behind the final "hence" above is simply putrid. 6.8.5p4 says
that "An iteration statement causes a statement called the loop
body to be executed repeatedly until the controlling expression
compares equal to 0," which never happens with your constant
controlling expression `1'. 6.8.4.1p2 says that "[...] the first
substatement [of an `if'] is executed if the expression compares
unequal to 0," which never happens in your code. Therefore the
`return 1;' should not be executed (because the `if' never fires),
and the compilers that execute it have oatmeal between their ears,
if not something worse.

Ugh.

--
Eric Sosman
eso...@ieee-dot-org.invalid

John Regehr

unread,
May 3, 2010, 7:01:20 PM5/3/10
to
On May 3, 4:31 pm, Eric Sosman <esos...@ieee-dot-org.invalid> wrote:
>      Old news, I'm afraid.  On the bright side, I had an enviable
> 87.5% success rate in surviving layoffs there ...

Oops, there I go believing what I read on the web...

Your assessment about what's going on in the compilers sounds right.

> and the compilers that execute it have oatmeal between their ears,
> if not something worse.

Yeah I'd certainly agree with this.

John Regehr

Dag-Erling Smørgrav

unread,
May 4, 2010, 4:33:22 AM5/4/10
to
John Regehr <reg...@cs.utah.edu> writes:
> Consider this example, where "icc" is the current version of Intel's
> compiler for x86:
>
> regehr@john-home:~$ icc fermat2.c -o fermat2
> regehr@john-home:~$ ./fermat2
> Fermat's Last Theorem has been disproved.

What happens if you move a, b, c out of fermat() and inspect them in
main() if / when fermat() returns? Does the program behave differently?
I still have a very hard time believing any compiler would do this; my
money is on some sort of overflow / wraparound.

DES
--
Dag-Erling Smørgrav - d...@des.no

John Regehr

unread,
May 4, 2010, 12:08:57 PM5/4/10
to
On May 4, 2:33 am, Dag-Erling Smørgrav <d...@des.no> wrote:
> What happens if you move a, b, c out of fermat() and inspect them in
> main() if / when fermat() returns?  Does the program behave differently?

For icc and suncc, moving any of a, b, or c out of function scope
causes the emitted program to hang.

> I still have a very hard time believing any compiler would do this; my
> money is on some sort of overflow / wraparound.

I don't think so, but am happy to be corrected. All of these
compilers are easy to obtain.

Everything I've seen is consistent with C implementations that prefer
performance over correctness.

John Regehr

lawrenc...@siemens.com

unread,
May 4, 2010, 11:57:32 AM5/4/10
to
Eric Sosman <eso...@ieee-dot-org.invalid> wrote:
>
> At a guess -- and it's only a guess, the compiler "sees"
>
> - That the `return 0;' cannot execute (it says as much)
>
> - Hence, if any `return' executes it must be the `return 1;'
> inside the loop
>
> - That the only side-effects are to a,b,c, all unobservable
> outside the function
>
> - Hence, those side-effects are unneeded once the function
> returns
>
> ... hence, the whole function body can be replaced with `return 1;'
> with no change in externally observable behavior.
>
> Ugh.

That kind of optimization is apparently becomming more common. The
current C1X draft explicitly allows it in 6.8.5:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling
expression, or (in the case of a for statement) its expression-3,
may be assumed by the implementation to terminate.<footnote>

<footnote> This is intended to allow compiler transformations
such as removal of empty loops even when termination cannot be
proven.
--
Larry Jones

At times like these, all Mom can think of is how long she was in
labor with me. -- Calvin

Keith Thompson

unread,
May 4, 2010, 12:41:59 PM5/4/10
to
lawrenc...@siemens.com writes:
> Eric Sosman <eso...@ieee-dot-org.invalid> wrote:
[...]

>> ... hence, the whole function body can be replaced with `return 1;'
>> with no change in externally observable behavior.
>>
>> Ugh.
>
> That kind of optimization is apparently becomming more common. The
> current C1X draft explicitly allows it in 6.8.5:
>
> An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no
> synchronization or atomic operations in its body, controlling
> expression, or (in the case of a for statement) its expression-3,
> may be assumed by the implementation to terminate.<footnote>
>
> <footnote> This is intended to allow compiler transformations
> such as removal of empty loops even when termination cannot be
> proven.

Let me say again:

Ugh.

If I write:

while (1) {
/* statements */
}
fputs("ERROR: Shouldn't get here\n", stderr);

I should never see that message. If I do see it, the compiler is
broken. If the compiler generates code that produces the message
because the language allows it to do so, the language is broken.

If the body of the loop has no effect, I'd welcome a warning, and I'd
even accept treating it as an error and rejecting the program. Silently
removing an infinite loop is, IMNSHO, unacceptable.

Can you think of an example where the permission to remove an infinite
loop is actually beneficial?

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Eric Sosman

unread,
May 4, 2010, 1:28:57 PM5/4/10
to
On 5/4/2010 11:57 AM, lawrenc...@siemens.com wrote:
> [... "optimization" finitizes an infinite loop ...]

> That kind of optimization is apparently becomming more common. The
> current C1X draft explicitly allows it in 6.8.5:
>
> An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no
> synchronization or atomic operations in its body, controlling
> expression, or (in the case of a for statement) its expression-3,
> may be assumed by the implementation to terminate.<footnote>
>
> <footnote> This is intended to allow compiler transformations
> such as removal of empty loops even when termination cannot be
> proven.

I've only two things to say: (1) "Ugh," which I've said
before, and (2) "Time for the torchlight parade; bring your
own pitchfork."

--
Eric Sosman
eso...@ieee-dot-org.invalid

Dag-Erling Smørgrav

unread,
May 4, 2010, 1:33:01 PM5/4/10
to
lawrenc...@siemens.com writes:
> That kind of optimization is apparently becomming more common. The
> current C1X draft explicitly allows it in 6.8.5:
>
> An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no
> synchronization or atomic operations in its body, controlling
> expression, or (in the case of a for statement) its expression-3,
> may be assumed by the implementation to terminate.<footnote>
>
> <footnote> This is intended to allow compiler transformations
> such as removal of empty loops even when termination cannot be
> proven.

I appreciate the sentiment - loops are notoriously hard to prove even
for humans, and provably unprovable in the general case - but this is a
radical departure from the as-if rule. I can't think of a case where
applying this rule would break *useful* code, but it still feels wrong.

Eric Sosman

unread,
May 4, 2010, 2:03:00 PM5/4/10
to

In a free-standing setting, perhaps something like

void kernel_panic(int code) {
stop_devices_abruptly();
display_panic_code(code);
for (;;) {
/* This would be a good place for a HALT, if the
* CPU's instruction set included such a thing.
*/
}
}

... would count as "useful." I for one would be unhappy if a
call to kernel_panic() returned.

--
Eric Sosman
eso...@ieee-dot-org.invalid

John Regehr

unread,
May 4, 2010, 2:05:25 PM5/4/10
to
On May 4, 11:33 am, Dag-Erling Smørgrav <d...@des.no> wrote:
> radical departure from the as-if rule.  I can't think of a case where
> applying this rule would break *useful* code, but it still feels wrong.

It's a corner case for sure. However...

In embedded software one occasionally writes a deliberate infinite
loop. For example to hang up the CPU if main() returns, or to hang
the processor waiting for the watchdog timer to reboot the system. To
stop the compiler from deleting these loops, people are going to have
to put in a side-effecting operation such as an access to a volatile-
qualified object.

Deleting infinite loops also seems like an impediment to debugging.
If I accidentally write an infinite loop, I'd prefer my program to
hang so I can use a debugger to find the problem. If the compiler
deletes the loop and also computes a nonsensical result, as in the
Fermat example, I have no way to find the latent error in my system.

John Regehr

Message has been deleted
Message has been deleted

Wojtek Lerch

unread,
May 4, 2010, 2:44:31 PM5/4/10
to
<lawrenc...@siemens.com> wrote in message
news:cos5b7...@jones.homeip.net...

> That kind of optimization is apparently becomming more common. The
> current C1X draft explicitly allows it in 6.8.5:
>
> An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no
> synchronization or atomic operations in its body, controlling
> expression, or (in the case of a for statement) its expression-3,
> may be assumed by the implementation to terminate.<footnote>

Even if the controllong expression is a constant "1" and there's no "break"
or "return" inside?

Ugh.

I hope there's no plan to also remove goto from C1X?

Or to extend the above to also apply to loops made with goto?

Michal Nazarewicz

unread,
May 4, 2010, 4:32:25 PM5/4/10
to
> On May 4, 11:33 am, Dag-Erling Smørgrav <d...@des.no> wrote:
>> radical departure from the as-if rule.  I can't think of a case where
>> applying this rule would break *useful* code, but it still feels wrong.

Part of Linux 2.4.0:

NORET_TYPE void panic(const char * fmt, ...)
{
/* ... */
for(;;) {
CHECK_EMERGENCY_SYNC
/* the above macro may be empty */
}
}

I admit, pretty old, but hey! Besides, didn't check versions between
2.4.0 and 2.6.12-rc2 so this code may be present even in newer
versions.

Suffice to say, the wording of the draft would allow a compiler to break
a rather widely used application (and that's just a single example).

John Regehr <reg...@cs.utah.edu> writes:
> In embedded software one occasionally writes a deliberate infinite
> loop. For example to hang up the CPU if main() returns, or to hang
> the processor waiting for the watchdog timer to reboot the system. To
> stop the compiler from deleting these loops, people are going to have
> to put in a side-effecting operation such as an access to a volatile-
> qualified object.

Either that or people will include some inline assembler. In fact,
I think there was this comment that for gcc the following is enough:

for(;;) asm volatile("":::"memory");

Michal Nazarewicz

unread,
May 4, 2010, 4:33:41 PM5/4/10
to
> lawrenc...@siemens.com writes:
>> That kind of optimization is apparently becomming more common. The
>> current C1X draft explicitly allows it in 6.8.5:
>>
>> An iteration statement that performs no input/output operations,
>> does not access volatile objects, and performs no
>> synchronization or atomic operations in its body, controlling
>> expression, or (in the case of a for statement) its expression-3,
>> may be assumed by the implementation to terminate.<footnote>
>>
>> <footnote> This is intended to allow compiler transformations
>> such as removal of empty loops even when termination cannot be
>> proven.

Keith Thompson <ks...@mib.org> writes:
> while (1) {
> /* statements */
> }
> fputs("ERROR: Shouldn't get here\n", stderr);

It goes better:

for(;;);
start_a_termonuclear_war();

;)

Keith Thompson

unread,
May 4, 2010, 6:06:32 PM5/4/10
to
Michal Nazarewicz <min...@tlen.pl> writes:
>> lawrenc...@siemens.com writes:
>>> That kind of optimization is apparently becomming more common. The
>>> current C1X draft explicitly allows it in 6.8.5:
>>>
>>> An iteration statement that performs no input/output operations,
>>> does not access volatile objects, and performs no
>>> synchronization or atomic operations in its body, controlling
>>> expression, or (in the case of a for statement) its expression-3,
>>> may be assumed by the implementation to terminate.<footnote>
>>>
>>> <footnote> This is intended to allow compiler transformations
>>> such as removal of empty loops even when termination cannot be
>>> proven.
>
> Keith Thompson <ks...@mib.org> writes:
>> while (1) {
>> /* statements */
>> }
>> fputs("ERROR: Shouldn't get here\n", stderr);
>
> It goes better:
>
> for(;;);
> start_a_termonuclear_war();
>
> ;)

Nevertheless, I think it's safe to assume that the members of the
C standard committee aren't actually batcrackers insane. There's
probably some good rational reason for adding this permission to
the standard. I look forward to reading about this reason.

John Regehr

unread,
May 4, 2010, 6:56:45 PM5/4/10
to
On May 4, 4:06 pm, Keith Thompson <ks...@mib.org> wrote:
> Nevertheless, I think it's safe to assume that the members of the
> C standard committee aren't actually batcrackers insane.  There's
> probably some good rational reason for adding this permission to
> the standard.  I look forward to reading about this reason.

Just to be clear: the C standard is not explicit on this matter,
whereas the C++0x draft standard says that a C++ implementation may
assume that loops free of side effects are terminating, meaning that
non-terminating, side-effect free loops effectively have undefined
behavior.

Hans Boehm explained the C++ committee's rationale to me as follows
(I'm paraphrasing -- any mistakes/mischaracterizations are mine):

- A variety of advanced optimizations are handicapped or broken by the
requirement to match the termination behavior of the abstract machine.

- It's hard to find real examples where this matters.

- It's better for the standard to take a position than to leave this
up in the air, as it seems to be in C.

- Many compilers already have this behavior anyway.

Hope this is helpful.

John Regehr

Keith Thompson

unread,
May 4, 2010, 7:37:43 PM5/4/10
to
John Regehr <reg...@cs.utah.edu> writes:
> On May 4, 4:06 pm, Keith Thompson <ks...@mib.org> wrote:
>> Nevertheless, I think it's safe to assume that the members of the
>> C standard committee aren't actually batcrackers insane.  There's
>> probably some good rational reason for adding this permission to
>> the standard.  I look forward to reading about this reason.
>
> Just to be clear: the C standard is not explicit on this matter,
> whereas the C++0x draft standard says that a C++ implementation may
> assume that loops free of side effects are terminating, meaning that
> non-terminating, side-effect free loops effectively have undefined
> behavior.

As Larry Jones mentioned upthread, the C201X draft (not yet a standard)
is explicit. See
<http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1425.pdf>, 6.8.5p6:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.

with a footnote:

This is intended to allow compiler transformations such as removal
of empty loops even when termination cannot be proven.

I would argue that the C99 standard implicitly but unambiguously
disallows such removal by its description of the semantics of loops:

An iteration statement causes a statement called the loop body to be
executed repeatedly until the controlling expression compares

equal to 0.

I just don't see any wiggle room there.

> Hans Boehm explained the C++ committee's rationale to me as follows
> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):
>
> - A variety of advanced optimizations are handicapped or broken by the
> requirement to match the termination behavior of the abstract machine.

I'm sure they're also handicapped by the requirement to produce correct
results. 8-)}

> - It's hard to find real examples where this matters.
>
> - It's better for the standard to take a position than to leave this
> up in the air, as it seems to be in C.
>
> - Many compilers already have this behavior anyway.

I don't think it's currently up in the air. But I'm sure Hans
Boehm knows this stuff better than I do. It's likely I'm missing
something; I just don't know what.

Michal Nazarewicz

unread,
May 5, 2010, 3:13:40 AM5/5/10
to
John Regehr <reg...@cs.utah.edu> writes:
> Hans Boehm explained the C++ committee's rationale to me as follows
> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):

Any paper on that as it sounds interesting and I don't really see it?

> - A variety of advanced optimizations are handicapped or broken by the
> requirement to match the termination behavior of the abstract machine.
>
> - It's hard to find real examples where this matters.

I just gave one (Linux 2.4.0) in mere minutes. I'm sure people who are
in embeded and operating systems are likely to give more.

> - It's better for the standard to take a position than to leave this
> up in the air, as it seems to be in C.

I think we all agree on that.

Personally I'd rather standard said compiler can reject such programs
or produce diagnostic then modify it in an undefined way.

Michal Nazarewicz

unread,
May 5, 2010, 3:21:14 AM5/5/10
to
> John Regehr <reg...@cs.utah.edu> writes:
>> - It's better for the standard to take a position than to leave this
>> up in the air, as it seems to be in C.

Michal Nazarewicz <min...@tlen.pl> writes:
> Personally I'd rather standard said compiler can reject such programs
> or produce diagnostic then modify it in an undefined way.

On second thought, implementations can already produce diagnostics
whenever they want so let the standard just say that compiler cannot
assume loops terminate.

If it's such a burden for optimisation (which I still don't see)
implementations can simply provide a standard-incompatible option which
would turn such optimisations of.

James Kuyper

unread,
May 5, 2010, 8:33:04 AM5/5/10
to
lawrenc...@siemens.com wrote:
...

> That kind of optimization is apparently becomming more common. The
> current C1X draft explicitly allows it in 6.8.5:
>
> An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no
> synchronization or atomic operations in its body, controlling
> expression, or (in the case of a for statement) its expression-3,
> may be assumed by the implementation to terminate.<footnote>
>
> <footnote> This is intended to allow compiler transformations
> such as removal of empty loops even when termination cannot be
> proven.

I could not imagine code qualifying for such removal as being correct
(which, given the examples other people have provided from the kernal
and embedded programming world, simply proves my lack of imagination).
Therefore, I would always want it to execute in accordance to the way it
was actually written, which would at least allow it's unintended
behavior to be debugged.

If an implementation can determine that a particular loop can never
terminate (which, of course, it cannot do in the general case), it could
generate a warning message, and I'd be happy to see that, because in my
code it would almost always be the result of a coding error. However,
given the examples other people have provided, rejecting such code or
specifying that it's behavior is undefined would not be acceptable.

I wouldn't mind if the compiler also gave a warning even when it
couldn't prove non-termination, so long as it was also having an unusual
amount of trouble proving that the loop can terminate. This warning,
however, should definitely be one that can be turned off.

Since these warnings cannot be made mandatory, and must not prevent
compilation of the code according to it's specified semantics, saying
anything about such warnings falls outside the scope of the standard. It
should simply remain silent on this issue.

That is my opinion, at least until I see a more detailed and compelling
argument that the corresponding optimizations are sufficiently useful
and beneficial to make up for these disadvantages.

lawrenc...@siemens.com

unread,
May 5, 2010, 11:43:10 AM5/5/10
to
John Regehr <reg...@cs.utah.edu> wrote:
>
> Hans Boehm explained the C++ committee's rationale to me as follows
> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):
>
> - A variety of advanced optimizations are handicapped or broken by the
> requirement to match the termination behavior of the abstract machine.
>
> - It's hard to find real examples where this matters.
>
> - It's better for the standard to take a position than to leave this
> up in the air, as it seems to be in C.
>
> - Many compilers already have this behavior anyway.

The C committee's rationale is basically, "what they said". :-)
--
Larry Jones

It COULD'VE happened by accident! -- Calvin

Keith Thompson

unread,
May 5, 2010, 12:47:06 PM5/5/10
to
Reference: N1425 6.8.5p6:

An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization
or atomic operations in its body, controlling expression,
or (in the case of a for statement) its expression-3, may be
assumed by the implementation to terminate.

lawrenc...@siemens.com writes:
> John Regehr <reg...@cs.utah.edu> wrote:
>> Hans Boehm explained the C++ committee's rationale to me as follows
>> (I'm paraphrasing -- any mistakes/mischaracterizations are mine):
>>
>> - A variety of advanced optimizations are handicapped or broken by the
>> requirement to match the termination behavior of the abstract machine.

Are there concrete examples of this?

>> - It's hard to find real examples where this matters.

At least one has been posted in this thread. Here's another
hypothetical but entirely plausible one:

while (1) {
#ifdef FEATURE_ONE
do_feature_one();
#endif
#ifdef FEATURE_TWO
do_feature_two();
#endif
#ifdef FEATURE_THREE
do_feature_three();
#endif
}
fputs("Internal error\n", stderr);

>> - It's better for the standard to take a position than to leave this
>> up in the air, as it seems to be in C.

In my opinion, it's not up in the air at all. The C standard
defines the semantics of iteration statements. It doesn't allow
non-terminating loops to terminate, any more than it allows
``if (0) puts("oops")'' to print "oops".

I ask again, am I missing something? Can the current wording of
the C99 standard could be read in such a way as to permit this kind
of transformation?

>> - Many compilers already have this behavior anyway.

Examples?

> The C committee's rationale is basically, "what they said". :-)

My own rationale is this: If the standard permits the compiler to
violate the explicit semantics of a program in one special case,
it damages the credibility of the standard and of the language.

Just to make sure we all understand this correctly, does the new
rule actually permit this program:

#include <stdio.h>
int main(void)
{
while (1)
;
puts("Hello");
return 0;
}

to print "Hello"? As far as I can tell, it does; the proposed
wording doesn't exclude cases where the loop can trivially be shown
not to terminate. (I can imagine an argument that no real compiler
would perform the transformation in this particular case, but that's
not my question.)

Ersek, Laszlo

unread,
May 5, 2010, 2:18:11 PM5/5/10
to
On Wed, 5 May 2010, Keith Thompson wrote:

[snip]

> The C standard defines the semantics of iteration statements. It
> doesn't allow non-terminating loops to terminate, any more than it
> allows ``if (0) puts("oops")'' to print "oops".

If you look back at the original code, the thing that is actually violated
is not the while statement's semantics, but that of "if".

static int
f(void)
{
while (1) {
if (impossible) return 0;
}
return -1;
}

Quoting myself (sorry!) from
<http://blog.regehr.org/archives/140#comment-323>,

"The standard doesn't say anything about the selection statement *having*
to evaluate to nonzero, and the compiler errs when it still assumes so."

(Obviously badly worded, it should have been "the controlling expression
of the selection statement".)

Cheers,
lacos

John Regehr

unread,
May 5, 2010, 3:04:12 PM5/5/10
to
On May 5, 10:47 am, Keith Thompson <ks...@mib.org> wrote:

> In my opinion, it's not up in the air at all.  The C standard
> defines the semantics of iteration statements.  It doesn't allow
> non-terminating loops to terminate, any more than it allows
> ``if (0) puts("oops")'' to print "oops".
>
> I ask again, am I missing something?  Can the current wording of
> the C99 standard could be read in such a way as to permit this kind
> of transformation?

N1124 says this:

"The least requirements on a conforming implementation are:
— At sequence poin