
Don't trust your optimizer :-)


Martin B.

Nov 9, 2010, 2:52:04 PM
Hi all,

I found this[1] discussion on the MS VC++Lang forum very interesting and
informative.

[1] :
http://social.msdn.microsoft.com/Forums/en/vclanguage/thread/24807869-f717-4c66-993b-d14bb16fd428

It is about how you can't trust the compiler/optimizer even if you only have one
thread. (You can't trust it to do what you think it will do. What it did was
perfectly acceptable according to the std.)

The OP wanted to time some code but got his timing statements unexpectedly
reordered by the optimizer:

...
start = clock();
result = foo(ITERATIONS);
elapsed = clock() - start;
...

Due to foo being known to the compiler (by whole program optimization) the
optimizer decided that since foo was a pure function:

...
double foo(int iterations) {
    double acc = 4.0;
    int i;
    for (i = 1; i <= iterations; i++)
        acc += (i & 1) ? ((double)-4 / (2*i + 1)) : ((double)4 / (2*i + 1));
    return acc;
}
...

it could *reorder* the calls to calculate start and elapsed.
Boom! Timing gone.

cheers,
Martin


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

ZMZ

Nov 10, 2010, 12:27:25 PM
On Nov 10, 3:52 am, "Martin B." <0xCDCDC...@gmx.at> wrote:
> Hi all,
>
> I found this[1] discussion on the MS VC++Lang forum very interesting and
> informative.
>
> [1] :http://social.msdn.microsoft.com/Forums/en/vclanguage/thread/24807869...

Such optimization is really a very old problem inherited from C.

clock() to me is a hardware-related function.

Imagine the following pseudocode; you have to put volatile on the flag to
stop the optimization.


global int event = 0;

__interrupt void timer_interrupt()
{
    event = 1;
}

int main()
{
    init_timer();
    while (1)
    {
        // oops, the whole if statement is optimized away
        // because the compiler thinks that event is always 0
        if (event)
        {
            do_something();
            event = 0;
        }
    }
    return 0;
}

Vaclav Haisman

Nov 10, 2010, 12:29:12 PM
Martin B. wrote, On 9.11.2010 20:52:
> Hi all,
>
> I found this[1] discussion on the MS VC++Lang forum very interesting and
> informative.
>
> [1] :
> http://social.msdn.microsoft.com/Forums/en/vclanguage/thread/24807869-f717-4c66-993b-d14bb16fd428
>
>
> It is about how you can't trust the compiler/optimizer even if you only have
> one thread. (You can't trust it to do what you think it will do. What it did
> was perfectly acceptable according to the std.)
>
> The OP wanted to time some code but got his timing statements unexpectedly
> reordered by the optimizer:
>
> ...
>
> Due to foo being known to the compiler (by whole program optimization) the
> optimizer decided that since foo was a pure function:
>
> ....
>
> it could *reorder* the calls to calculate start and elapsed.
> Boom! Timing gone.
I think that it is just the expectations of the OP that went "Boom!" And
also, I think the timing is precise: there is simply nothing to be done in
between the two calls to clock(). The compiler was smarter than the OP
expected. The OP should be happy with it instead of complaining.

--
VH

Goran Pusic

Nov 10, 2010, 12:28:10 PM
On Nov 9, 8:52 pm, "Martin B." <0xCDCDC...@gmx.at> wrote:
> Hi all,
>
> I found this[1] discussion on the MS VC++Lang forum very interesting and
> informative.
>
> [1] :http://social.msdn.microsoft.com/Forums/en/vclanguage/thread/24807869...

>
> It is about how you can't trust the compiler/optimizer even if you only have one
> thread. (You can't trust it to do what you think it will do. What it did was
> perfectly acceptable according to the std.)
>
> The OP wanted to time some code but got his timing statements unexpectedly
> reordered by the optimizer:
>
> ...
> start = clock();
> result = foo(ITERATIONS);
> elapsed = clock() - start;
> ...
>
> Due to foo being known to the compiler (by whole program optimization) the
> optimizer decided that since foo was a pure function:
>
> ....
> double foo (int iterations) {
> double acc=4.0;
> int i;
> for (i = 1 ; i <= iterations ; i++)
> acc += (i&1) ?
> ((double) -4 / (2*i+1)) : ((double) 4/ (2*i+1));
> return acc;
> }
> ...
>
> it could *reorder* the calls to calculate start and elapsed.
> Boom! Timing gone.

That was... Informative :-(.

Being in C++, I typically use a class whose ctor takes the start time and
whose dtor traces out the elapsed time. I wonder now, does this help? To
be honest, I see no reason why it would; the compiler could be (or get)
smart enough to slap the constructor and destructor calls together.

But... Using the function you gave here and this code on VS 2008 Pro:

extern double foo(int iterations); // in another file

int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
    clock_t start = clock();
    foo(10000);
    printf("%d\n", clock() - start);
    return 0;
}

The above gives:

int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
00B61000 push esi
clock_t start=clock();
00B61001 mov esi,dword ptr [__imp__clock (0B6209Ch)]
00B61007 push edi
00B61008 call esi
00B6100A mov edi,eax
foo(10000);
printf("%d\n", clock()-start);
00B6100C call esi
00B6100E sub eax,edi
00B61010 push eax
00B61011 push 0B62198h
00B61016 call dword ptr [__imp__printf (0B62098h)]
00B6101C add esp,8
00B6101F pop edi
return 0;
00B61020 xor eax,eax
00B61022 pop esi
}
00B61023 ret

I reckon the two "call esi" instructions are the calls to clock(), there's
a call to printf etc., but the important thing is: ___there's no call to
foo at all___ (I checked the assembly output, too). And in fact, in the
assembly output in the VC thread, there's no call to foo either (and the
calls to clock() can't be reordered at all, since there's nothing to
reorder :-).

Did everybody on the VC thread over there miss the boat? That's what I am
seeing, although it's hard to believe; there are some people there I
consider smart (not that this is saying much about them or me ;-) ).

So I kinda find the whole thread moot, really (except for the poor soul
who does want to measure foo() without any side-effects ;-) ). I have
found myself fighting the optimizer before when trying to measure these
kinds of things, but I never bothered to find the exact reasons and
approaches to "get it". I put it down to optimization and was happy to
lower the optimization level or add a side-effect, just to get the
ballpark figure, which was my goal anyhow. Perhaps the OP should measure
e.g. like this:

start = clock();
printf("%f", foo());
printf("%d", clock()-start);

start = clock();
printf("%f", 123.456);
printf("%d", clock()-start);

and subtract the time of running printf alone, obtained from the second
snippet (he will need to run it multiple times, or reach for performance
counters). That's not as accurate, but it gets the job done.

By the way, these micro-tests are so widely influenced by the rest of
the code, data sizes or iteration counts that it's not even funny. That's
just not the way to test speed these days. One simply needs bigger code,
real code. Only ballpark figures can be found out this way. Or
Java-versus-C++ stuff, which is kinda not interesting, as approaches to
solving a problem vary wildly with the language, and simple arithmetic
like the above won't really kill your speed, not even in Java.

Goran.

Bo Persson

Nov 10, 2010, 5:19:07 PM

Why should it call foo, when all that call does is calculate a value that
is never used? Dead code elimination is certainly one pass of the
optimizer.


Bo Persson

Jens Schmidt

Nov 10, 2010, 5:20:35 PM
Goran Pusic wrote:

> I reckon, two call esi are calls to clock(), there's a call to printf
> etc., but the important thing is: ___there's no call to foo at all___
> (I checked the assembly output, too). And in fact, in the assembly
> output in VC thread, there's no call to foo either (and calls to
> clock() can't be reordered at all, since there's nothing to
> reorder :-).
>
> Did everybody on VC thread over there miss the boat? That's what I am
> seeing, although it's hard to believe, there's some people there I
> consider smart ( not that this is saying much about them or me ;-) ).

That is why there was a printf for the result of foo() in the original
code.

With advanced compiler technology, the compiler can also find out that
the result of foo() depends only on its known argument, and precompute it.
Then not even printing the result would ensure a call to foo().
--
Greetings,
Jens Schmidt

Jens Schmidt

Nov 10, 2010, 5:20:19 PM
ZMZ wrote:

> global int event = 0;
> __interrupt void timer_interrupt()
> {
> event = 1;
> }
>
>
> int main()
> {
> init_timer();
> while(1)
> {
> //oops, the whole if statement is optimized away
> //because compiler thinks that event is always 0
> if(event)
> {
> do_something();
> event = 0;
> }
> }
> return 0;
> }

The if-part can be optimized away only if the compiler is able to prove
that init_timer() can't change event. But then, after the first test, all
further tests are guaranteed to be false, so the code can be rewritten to:

int main()
{
    init_timer();

    if (event)
    {
        do_something();
        event = 0;
    }

    while (1)
    {
    }
    // not reached
}
--
Greetings,
Jens Schmidt

restor

Nov 11, 2010, 8:02:33 AM

> I found this[1] discussion on the MS VC++Lang forum very interesting and
> informative.
>
> [1]
:http://social.msdn.microsoft.com/Forums/en/vclanguage/thread/24807869...

> it could *reorder* the calls to calculate start and elapsed.
> Boom! Timing gone.

Hi, thanks for sharing this very interesting example with us. The
conclusion I draw from it is that although we often say "the optimizer
rearranges the code so that it still has the same behavior," that is
absolutely incorrect. In fact, what we expect of the optimizer is just the
opposite. We want the optimizer to change the behavior of our program: we
want the program to run faster - that is a change of behavior; we want it
to eat less RAM - a change of behavior; we want it to occupy less disk
space (is that possible?) - a change of behavior. Whereas in most cases
those changes in behavior are desirable, in some situations they just
spoil your program, as in the example you mentioned. I can think of more
such examples.
You want to write a program that tests RAM and want to use all of it.
Every piece of memory.
Or you want to control your executable's size, so that it is the same as
the previous version's.
Or, you might want to write every byte of your disk, in order to make sure
everything that was there was erased.

for(;;){}

even that can be optimized away. If one of these aspects, like the times
at which certain lines of code get executed, is crucial to your program,
the optimizer becomes your enemy: after all, its purpose is to change
(some aspects of) the meaning of the program.

Regards,
&rzej

Bart van Ingen Schenau

Nov 11, 2010, 8:01:17 AM

On Nov 10, 6:28 pm, Goran Pusic <gor...@cse-semaphore.com> wrote:
>
> That was... Informative :-(.
>
> Being in C++, I typically use a class, whose ctor takes the start, and
> dtor traces out the timing. I wonder now, does this help? To be
> honest, I see no reason why it would, compiler could be (or get) smart
> enough to slap constructor and destructor call together.
>
I sure as hell hope that no compiler is stupid enough to meddle with
the lifetime of class-type objects whose constructor and/or destructor
have side-effects.
Such a compiler would break all RAII-based scoped-locking constructs.

Bart v Ingen Schenau

Goran

Nov 11, 2010, 12:30:29 PM

On Nov 10, 11:20 pm, Jens Schmidt <Jens.Schmidt...@gmx.de> wrote:
> Goran Pusic wrote:
> > I reckon, two call esi are calls to clock(), there's a call to printf
> > etc., but the important thing is: ___there's no call to foo at all___
> > (I checked the assembly output, too). And in fact, in the assembly
> > output in VC thread, there's no call to foo either (and calls to
> > clock() can't be reordered at all, since there's nothing to
> > reorder :-).
>
> > Did everybody on VC thread over there miss the boat? That's what I am
> > seeing, although it's hard to believe, there's some people there I
> > consider smart ( not that this is saying much about them or me ;-) ).
>
> That is why there was a printf for the result of foo() in the original
> code.

Ugh. Yes, I missed the call to foo, further down in Spyker's assembly.
What a stupid oversight :-(.

Goran.



Falk Tannhäuser

Nov 12, 2010, 10:42:13 AM
On 11.11.2010 14:02, restor wrote:
> Hi, thanks for sharing this very interesting example with us. The
> conclusion I draw from it, is that although we often say "optimizer
> rearranges the code so that it still has the same behavior," it is
> absolutely incorrect. In fact what we expect of the optimizer is just
> the oposite. We want optimizer to change the behaviort of our program:
> we want program to run faster - this is a change of behavior; we want
> it to eat less RAM - it is a change of behavior; we want it to occupy
> less disk space (is it possible?) - a change in behaviour.

The Standard explicitly specifies only *observable* behaviour, which is access
to volatile objects and I/O operations. In particular, neither execution time
nor memory footprint nor executable file size constitute observable behaviour in
the sense of the Standard - these are rather QoI issues. (Well, one could argue
that the clock() function is a kind of input function, but this doesn't change
the fact that the absolute timing of what is executed between 2 calls to clock()
isn't part of the observable behaviour.)
For some algorithms of the Standard Library, the algorithmic complexity is
specified (in terms of number of operations), but nevertheless the Standard
remains silent about absolute timings.
There are only very few situations where the Standard leaves to the
implementation the freedom to alter observable behaviour for optimisation
purpose (eliminating calls to copy constructors, even if they have observable
side effects, is the only situation that comes to my mind right now).

> Whereas in
> most cases those changes in behaviour are desirable, in some
> situations they just spoil your program, as in the example you
> mentioned. I can think of more such examples.
> You want to write a program that tests RAM and wants to use all of it.
> Every piece of memory.
> Or you want to control your executable's size, so that it is the same
> as the previous version's.
> Or, you might want to write every byte of your disk, in order to make
> sure everything that was there was erased.

You're right - you can't do these things with the means of Standard C++
alone (be it with or without optimisation); you need to know the target
platform (CPU, HW and OS) to do them. How do you test your RAM without
knowing about caching, virtual memory, segmentation/paging? How do you
write "every byte of your disk" when the underlying OS performs caching or
transparent on-the-fly compression or encryption, or when it implements
disk quotas or RAID etc.? And even without these things there are zones on
your disk that will never get written to when you just write to a file on
said disk (which is all you can do as long as you stick with the Standard
I/O functions).

> Or, you might want to emulate a program being hung by writing:
>
> for(;;){}
>
> even that can be optimized away. If one of these aspects is crucial to
> your program, like times when certain lines of code get executed,
> optimizer becomes your enemy: after all its purpose is to change (some
> aspects of) the meaning of the program.

The endless loop can't be optimised away by a Standard conforming compiler, as
it would lead to (potentially observable) execution of the statement following
the loop.

However, a compiler has the right to optimise away the loop

for (unsigned long long i = 0;
     i < std::numeric_limits<unsigned long long>::max();
     ++i) {}

which otherwise is likely to run until long after we all are dead, even on
a modern desktop machine. FWIW, GCC 4.5.0 does this optimisation when given
the -O2 (or higher) option.

Falk

Joshua Maurice

Nov 12, 2010, 9:59:32 PM
On Nov 12, 7:42 am, Falk Tannhäuser <clcppm-pos...@this.is.invalid>
wrote:

Actually, I wish this were the case. AFAIK, it is the case for C++98
and C++03. However, the wording has been changed in some of the newer
drafts, much to the dismay of many on these boards, including myself.
Specifically, one of the new C++0x drafts says that the compiler /can/
optimize away endless loops, such as:
for (;;);
Here's a post I made on comp.std.c++ about the issue. More details can
be found there.

http://groups.google.com/group/comp.std.c++/browse_thread/thread/fb0f396824b9a801#

restor

Nov 14, 2010, 7:46:53 AM
On Nov 12, 4:42 pm, Falk Tannhäuser <clcppm-pos...@this.is.invalid>
wrote:

> Am 11.11.2010 14:02, schrieb restor:
>
> There are only very few situations where the Standard leaves to the
> implementation the freedom to alter observable behaviour for optimisation
> purpose (eliminating calls to copy constructors, even if they have observable
> side effects, is the only situation that comes to my mind right now).

Just as a side note, let me remark that when concept axioms are
introduced into the standard (not in C++0x, apparently), we will have a
tool for allowing the compiler to alter the side effects of a program.
The expression (x + 1 - 1) cannot currently be reduced to (x) because we
add ints left to right, and x + 1 may overflow, and this must be
signalled. With

axiom AvoidOverflow(T x, T y) { x + y - y <=> x }

the compiler can reduce the expression.

Regards,
&rzej

Marc

Nov 14, 2010, 7:26:33 PM
restor wrote:

> On Nov 12, 4:42 pm, Falk Tannhäuser <clcppm-pos...@this.is.invalid>
> wrote:
>> There are only very few situations where the Standard leaves to the
>> implementation the freedom to alter observable behaviour for optimisation
>> purpose (eliminating calls to copy constructors, even if they have observable
>> side effects, is the only situation that comes to my mind right now).

It is even too strict in the way it gives this permission (core issue
1049). (Assuming that loops finish is the only other case I can think of.)

> Just as a side note, let me remark that when concept axioms are
> introduced into the standard (not C++0x apparently), we will have a
> tool for allowing the compiler alterations to the side effects of a
> program.

Axioms seemed to be one of the least understood parts of concepts. It is
not clear how easily a compiler could make use of them.

> Expression (x + 1 - 1) cannot be currently reduced to (x) because we
> add ints left to right, and x + 1 may overflow, and this must be
> signalled.

Why and how must this be signalled? I thought that overflow of int was
undefined behavior and the compiler was perfectly allowed to do this
simplification (assuming x is an int of course).

restor

Nov 15, 2010, 3:12:34 PM
> > Expression (x + 1 - 1) cannot be currently reduced to (x) because we
> > add ints left to right, and x + 1 may overflow, and this must be
> > signalled.
>
> Why and how must this be signalled? I thought that overflow of int was
> undefined behavior and the compiler was perfectly allowed to do this
> simplification (assuming x is an int of course).

You are right. I am looking at one of the current standard drafts
(N3126) and it says in 5.4:
"If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable values for
its type, the behavior is undefined."

My claim is based on another section 1.9:9, which I quote below.
Perhaps I over-interpreted it.
Regards,
&rzej

operators can be regrouped according to the usual mathematical rules only
where the operators really are associative or commutative. For example, in
the following fragment

int a, b;
/* ... */
a = a + 32760 + b + 5;

the expression statement behaves exactly the same as

a = (((a + 32760) + b) + 5);

due to the associativity and precedence of these operators. Thus, the
result of the sum (a + 32760) is next added to b, and that result is then
added to 5, which results in the value assigned to a. On a machine in
which overflows produce an exception and in which the range of values
representable by an int is [-32768, +32767], the implementation cannot
rewrite this expression as

a = ((a + b) + 32765);

since if the values for a and b were, respectively, -32754 and -15, the
sum a + b would produce an exception while the original expression would
not; nor can the expression be rewritten either as

a = ((a + 32765) + b);

or

a = (a + (b + 32765));

since the values for a and b might have been, respectively, 4 and -8 or
-17 and 12. However, on a machine in which overflows do not produce an
exception and in which the results of overflows are reversible, the above
expression statement can be rewritten by the implementation in any of the
above ways because the same result will occur.

Bart van Ingen Schenau

Nov 15, 2010, 3:12:21 PM
On Nov 14, 1:46 pm, restor <akrze...@gmail.com> wrote:

> Expression (x + 1 - 1) cannot be currently reduced to (x) because we
> add ints left to right, and x + 1 may overflow, and this must be
> signalled.

This is not true. Any current compiler is allowed to change

int x = /*something*/;
int y = x + 1 - 1;

to

int x = /*something*/;
int y = x;

under the 'as-if' rule.
If x == INT_MAX, then the first version invokes undefined behaviour, so
all bets are off; in all other cases, there is no difference in observable
behaviour.
As a compiler is completely free to do what it wants if the program
invokes UB, it can perform the rewrite unconditionally. Note that the
compiler is not even required to behave consistently in the face of UB.

>
> Regards,
> &rzej
>
Bart v Ingen Schenau

Daniel Krügler

Nov 16, 2010, 9:03:21 AM

On 13.11.2010 03:59, Joshua Maurice wrote:
> On Nov 12, 7:42 am, Falk Tannhäuser <clcppm-pos...@this.is.invalid> wrote:
>> Am 11.11.2010 14:02, schrieb restor:
>>> Or, you might want to emulate a program being hung by writing:
>>>
>>> for(;;){}
>>>
>>> even that can be optimized away. If one of these aspects is crucial to
>>> your program, like times when certain lines of code get executed,
>>> optimizer becomes your enemy: after all its purpose is to change (some
>>> aspects of) the meaning of the program.
>>
>> The endless loop can't be optimised away by a Standard conforming
>> compiler, as it would lead to (potentially observable) execution of the
>> statement following the loop.
>>
>> However, a compiler has the right to optimise away the loop
>> for(unsigned long long i = 0;
>>     i < std::numeric_limits<unsigned long long>::max();
>>     ++i) {}
>> which otherwise is likely to run until long after we all are dead, even
>> on a modern desktop machine. FWIW GCC 4.5.0 does this optimisation when
>> given the -O2 (or higher) option.
>
> Actually, I wish this was the case.
>
> Actually, I wish this was the case.

I assume this refers to the endless loop case, right?

> AFAIK, this is the case for C++98
> and C++03. However, the wording has been changed in some of the newer
> drafts, much to the dismay of many on these boards, including myself.
> Specifically, one of the new C++0x drafts specifically says that the
> compiler /can/ optimize away endless loops, such as:
> for (;;);

Well, this is correct as written, because the loop alone does not induce
observable behavior. Can you give a more specific example program that
would demonstrate such a difference and that would also show that these
rules have negative aspects? For the discussion assume that the most
recently approved constraints would hold as addition to 1.10:

<quote>
The implementation is allowed to assume that any thread will eventually
do one of the following:

* terminate,
* make a call to a library I/O function,
* access or modify a volatile object, or
* perform a synchronization operation or an atomic operation.

[ Note: This is intended to allow compiler transformations, such as
removal of empty loops, even when termination cannot be proven. — end note ]
</quote>

and assume a strike of the current 6.5p5.

> Here's a post I made on comp.std.c++ about the issue. More details can
> be found there.
>
> http://groups.google.com/group/comp.std.c++/browse_thread/thread/fb0f396824b9a801#

I did look into that, but could not find an answer to my previous question.

Greetings from Bremen,

Daniel Krügler

Joshua Maurice

Nov 17, 2010, 9:08:16 PM

On Nov 16, 6:03 am, Daniel Krügler <daniel.krueg...@googlemail.com>
wrote:

Yes. Odd that it munged the link. Let me try again:

http://groups.google.com/group/comp.std.c++/browse_thread/thread/fb0f396824b9a801#

In case it munges it again, here's the header:

Newsgroups: comp.std.c++
From: Joshua Maurice <joshuamaur...@gmail.com>
Date: Tue, 21 Sep 2010 11:41:03 CST
Local: Tues, Sep 21 2010 9:41 am
Subject: Defect Report: non-terminating loops should not be undefined
behavior

> > AFAIK, this is the case for C++98
> > and C++03. However, the wording has been changed in some of the newer
> > drafts, much to the dismay of many on these boards, including myself.
> > Specifically, one of the new C++0x drafts specifically says that the
> > compiler /can/ optimize away endless loops, such as:
> > for (;;);
>
> Well, this is correct as written, because the loop alone does not induce
> observable behavior. Can you give a more specific example program that
> would demonstrate such a difference and that would also show that these
> rules have negative aspects? For the discussion assume that the most
> recently approved constraints would hold as addition to 1.10:
>
> <quote>
> The implementation is allowed to assume that any thread will eventually
> do one of the following:
>
> * terminate,
> * make a call to a library I/O function,
> * access or modify a volatile object, or
> * perform a synchronization operation or an atomic operation.
>
> [ Note: This is intended to allow compiler transformations, such as
> removal of empty loops, even when termination cannot be proven. — end note ]
> </quote>
>
> and assume a strike of the current 6.5p5.

Thanks for the most up to date wording.

I think my points in the other thread can be summed up as:

C++ inherits a lot of its design goals from C. C is a portable
assembly language, or at least C is meant to be portable, and to be
cost competitive with hand written assembly. C++ shares this design
goal: C++ should be portable, and when used correctly it should be
cost competitive with hand written assembly.

When I write a piece of code, I expect that piece of code to be
faithfully executed, or at least executed "as if" it were faithfully
executed. This new compiler allowance is intended to permit optimizations
that wouldn't otherwise be legal under the "as if" rule. AFAIK, the only
other allowance of this kind in the standard is copy constructor elision.
However, the copy constructor elision allowances better appeal to my
common sense, insofar as removing redundant copies seems more sensible
than removing an infinite loop. Thus, the onus should be on the backers of
this new allowance to demonstrate that it has a pretty big value, to
justify introducing completely non-obvious semantics into the language.

Moreover, the copy constructor elision rules for most code generally
result in well-formed programs with sensible results. If the new
"infinite loop" rule were applied by a compiler when the programmer did
not intend it, you could end up with a program with entirely
unpredictable behavior, aka undefined behavior. Arguably, such cases
would be rare, but that's no excuse for a rule that says "if the sum
of any addition is 237404, then the program has undefined behavior."
Thus, this new allowance could break programs by introducing undefined
behavior, probably without any diagnostic, error or warning message. Thus
again the onus is on the backers of this new allowance to demonstrate
that it has a very big value that greatly outweighs the completely
non-obvious semantics and silent undefined behavior.

Finally, it's a code level breaking change from the past standard.
This again reinforces the notion that the onus should be on the
backers to demonstrate that the benefits greatly outweigh the cons.

So, let's try to weigh the pros and cons as I see them. Admittedly, I
am not as well educated on the topic as I could be.

True infinite loops without "side effects" exist in a very small
subset of legitimate programs. An example given in some thread was a
signal-driven program. main was simply:
int main() { for (;;); }
Some signal handlers were installed, and all of the interesting work
was done in those signal handlers.

Infinite loops without "side effects" exist frequently in testing.
I've found myself resorting to them frequently enough to be troubled
by this new rule.

There is a whole class of infinite loops which might exist in code as
a bug. On one implementation, it might be entirely removed, but on
another it might remain as an actual infinite loop. This allowance
would hide the existence of the bug with some probably random
behavior, or perhaps sensible behavior. I don't like things which
further hurt the portability of C++ code.

So, what are the pros? Some hand-waved optimization opportunities, as
far as I can tell. Can anyone actually present a plausible example
which would occur in real code? It's easy enough to construct a fake
example, but then it's a fake example, so the performance gains are
minimal. A fake example does not demonstrate a great performance gain
which I claim is required to introduce these completely non-obvious
semantics to the otherwise (mostly) sane C++ programming language. The
catch goes both ways: either the loop is "important" enough that it
cannot be optimized away, or it's "trivial" enough that it can be. It
seems to me that such trivial loops are very rare, and thus very
little performance gains will occur in practice.

As I understand it, a supposed big gain of this approach is to allow
the compiler to reorder a read or write from before the loop to after
the loop, or vice versa. This could be used to then allow further
optimizations, like common sub-expression elimination, and so on. This
seems a tad promising, though I would greatly prefer a real life
example. At this moment, it's my strong inclination that any supposed
benefits would not make up for all of the cons mentioned above.

Again, if I'm wrong, please correct me, preferably with a non-
contrived counter-example.

Daniel Krügler

Nov 18, 2010, 4:44:27 AM

On 11/18/2010 03:08, Joshua Maurice wrote:
>
> On Nov 16, 6:03 am, Daniel Krügler<daniel.krueg...@googlemail.com>
> wrote:
>> I assume this refers to the endless loop case, right?
>
> Yes. Odd that it munged the link. Let me try again:
>
> http://groups.google.com/group/comp.std.c++/browse_thread/thread/fb0f396824b9a801#
>
> In case it munges it again, here's the header:
>
> Newsgroups: comp.std.c++
> From: Joshua Maurice<joshuamaur...@gmail.com>
> Date: Tue, 21 Sep 2010 11:41:03 CST
> Local: Tues, Sep 21 2010 9:41 am
> Subject: Defect Report: non-terminating loops should not be undefined
> behavior

<nod> I got that, but I was asking for a complete example that is as
simple as possible, preferably by code. In this discussion it is
essential that no hand-waving arguments are involved.

Let me insert here that C++ is not going this new way alone. It does so
hand-in-hand with C1x. If you check out the most recent C1x draft,
n1526, you will find similar wording in

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1516.pdf

6.8.5/6:

"An iteration statement that performs no input/output operations, does
not access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a for
statement) its expression-3, may be assumed by the implementation to
terminate.(Footnote 155: This is intended to allow compiler
transformations such as removal of empty loops even when
termination cannot be proven.)"

> If the new
> "infinite loop" rule was used by a compiler when the programmer did
> not intend so, then you could end up with a program with entirely
> unpredictable behavior aka undefined behavior. Arguably, such cases
> would be rare, but that's no excuse for a rule that says "If the sum
> of any addition is 237404, then the program has undefined behavior."

Can you please provide a specific code example that demonstrates this?

> Finally, it's a code level breaking change from the past standard.

I'm not yet convinced that this is true, given the definition of
observable behaviour and the lack of progress guarantees in C++03.

> True infinite loops without "side effects" exist in a very small
> subset of legitimate programs. An example given in some thread was a
> signal driven program. main was simply:
> int main() { for (;;); }
> Some signal handlers were installed, and all of the interesting work
> was done in those signal handlers.

Obviously this means that

int main() { for (;;); }

is not a complete example program that demonstrates the defect. Can you
please provide one? Note that a conforming signal handler is required to
perform modifications on either volatile
sig_atomic_t or on lock-free atomic objects, which are explicitly
excluded in above optimization allowance.
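Daniel's point about conforming handlers can be made concrete with a minimal sketch (the names and the use of raise() as a stand-in signal source are my own, purely illustrative, not from the thread): because the spin loop reads a volatile std::sig_atomic_t on every iteration, it performs a volatile access and so falls outside the "may be assumed to terminate" allowance quoted above.

```cpp
#include <csignal>

// Hedged sketch of the signal-driven pattern under discussion. The loop
// reads a volatile std::sig_atomic_t each iteration, which is one of the
// explicit exclusions from the termination assumption, so the compiler
// may not remove it.
volatile std::sig_atomic_t done = 0;

extern "C" void on_signal(int) { done = 1; }

// Returns the flag value seen once the loop has exited.
int spin_until_signal() {
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);   // stand-in for an asynchronous signal source
    while (!done) { }     // volatile read: not removable
    return done;
}
```

In a real program the signal would arrive asynchronously while the loop spins; raising it up front just makes the sketch self-contained.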

> Infinite loops without "side effects" exist frequently in testing.
> I've found myself resorting to them frequently enough to be troubled
> by this new rule.
>
> There is a whole class of infinite loops which might exist in code as
> a bug. On one implementation, it might be entirely removed, but on
> another it might remain as an actual infinite loop. This allowance
> would hide the existence of the bug with some probably random
> behavior, or perhaps sensible behavior. I don't like things which
> further hurt the portability of C++ code.

Again, hand-waving is not enough, we need something specific to discuss.
Note that above wording does not directly say in a normative form that
every empty loop can be optimized away.

> So, what are the pros? Some hand-waived optimization opportunities as
> far as I can tell. Can anyone actually present a plausible example
> which would occur in real code? It's easy enough to construct a fake
> example, but then it's a fake example, so the performance gains are
> minimal. A fake example does not demonstrate a great performance gain
> which I claim is required to introduce these completely non-obvious
> semantics to the otherwise (mostly) sane C++ programming language. The
> catch goes both ways: either the loop is "important" enough that it
> cannot be optimized away, or it's "trivial" enough that it can be. It
> seems to me that such trivial loops are very rare, and thus very
> little performance gains will occur in practice.
>
> As I understand it, a supposed big gain of this approach is to allow
> the compiler to reorder a read or write from before the loop to after
> the loop, or vice versa. This could be used to then allow further
> optimizations, like common sub-expression elimination, and so on. This
> seems a tad promising, though I would greatly prefer a real life
> example. At this moment, it's my strong inclination that any supposed
> benefits would not make up for all of the cons mentioned above.
>
> Again, if I'm wrong, please correct me, preferably with a non-
> contrived counter-example.

Well, counter examples are much easier to provide. A simple program that
you wrote as

int main() { for (;;); }

can optimize away the end-less loop. This is not what you want to hear,
I guess. To signal that the core language groups of both C and C++ are
going in the wrong direction when imposing this, we need a simple
counter example that can be forwarded to these groups.

No-one is intending to break reasonable programs: The standard
committees are not a group of "crazy scientists" that just invent
wording because they find these funny enough to print them in stone. The
compiler implementors involved try to *satisfy* the needs of their
customers, i.e. us, the programmers. Sometimes the selected wording does
have unwanted side-effects that could be interpreted in a way that would
not be helpful. By asking you for at least a single example I'm trying
to find such boo-boos such that it can be forwarded as early as possible
to the corresponding working groups.

Martin B.

unread,
Nov 18, 2010, 4:35:56 PM11/18/10
to

On 18.11.2010 03:08, Joshua Maurice wrote:
>
> [...]

> True infinite loops without "side effects" exist in a very small
> subset of legitimate programs. An example given in some thread was a
> signal driven program. main was simply:
> int main() { for (;;); }
> Some signal handlers were installed, and all of the interesting work
> was done in those signal handlers.
>

To me this is an interesting question even in the absence of any signal
handlers!

Namely: Is program termination observable behaviour??

cheers,
Martin

Bart van Ingen Schenau

unread,
Nov 18, 2010, 5:07:15 PM11/18/10
to

On Nov 18, 10:44 am, Daniel Krügler <daniel.krueg...@googlemail.com>
wrote:

> On 11/18/2010 03:08, Joshua Maurice wrote:
>
> > If the new
> > "infinite loop" rule was used by a compiler when the programmer did
> > not intend so, then you could end up with a program with entirely
> > unpredictable behavior aka undefined behavior. Arguably, such cases
> > would be rare, but that's no excuse for a rule that says "If the sum
> > of any addition is 237404, then the program has undefined behavior."
>
> Can you please provide a specific code example that demonstrates this?
>
> > Finally, it's a code level breaking change from the past standard.
>
> I'm not yet convinced that this is true, given the definition of
> observable behaviour and the lack of progress guarantees in C++03.

In the software I am currently working on there is an infinite loop
that would cause great trouble if the compiler decided to optimise it
away.
The software runs on an embedded device and the loop is used to
trigger a reset by the hardware watchdog in case some serious problems
are detected (most notably, when a trap from the processor is
intercepted).
I just did a double-check of the implementation of the hw_reset
function, and found that the loop modifies a volatile-qualified
(local) variable, but I can easily imagine such a loop being
implemented without any observable side-effects.

I don't care to think about what could happen if the compiler
optimised the loop away and the application continues to execute as if
nothing happened.

>
> > True infinite loops without "side effects" exist in a very small
> > subset of legitimate programs. An example given in some thread was a
> > signal driven program. main was simply:
> > int main() { for (;;); }
> > Some signal handlers were installed, and all of the interesting work
> > was done in those signal handlers.
>
> Obviously this means that
>
> int main() { for (;;); }
>
> is not a complete example program that demonstrates the defect. Can you
> please provide one? Note that a conforming signal handler is required to
> perform modifications on either volatile
> sig_atomic_t or on lock-free atomic objects, which are explicitly
> excluded in above optimization allowance.

How does it matter which observable side-effects occur in code that is
not part of the loop in question?

And no, I can't provide a complete program that has a legitimate
endless-loop. The embedded software I mentioned above is too large to
post and I don't have the right to publish anyway.

<snip>

>
> > So, what are the pros? Some hand-waived optimization opportunities as
> > far as I can tell. Can anyone actually present a plausible example
> > which would occur in real code? It's easy enough to construct a fake
> > example, but then it's a fake example, so the performance gains are
> > minimal. A fake example does not demonstrate a great performance gain
> > which I claim is required to introduce these completely non-obvious
> > semantics to the otherwise (mostly) sane C++ programming language. The
> > catch goes both ways: either the loop is "important" enough that it
> > cannot be optimized away, or it's "trivial" enough that it can be. It
> > seems to me that such trivial loops are very rare, and thus very
> > little performance gains will occur in practice.
>
> > As I understand it, a supposed big gain of this approach is to allow
> > the compiler to reorder a read or write from before the loop to after
> > the loop, or vice versa. This could be used to then allow further
> > optimizations, like common sub-expression elimination, and so on. This
> > seems a tad promising, though I would greatly prefer a real life
> > example. At this moment, it's my strong inclination that any supposed
> > benefits would not make up for all of the cons mentioned above.
>
> > Again, if I'm wrong, please correct me, preferably with a non-
> > contrived counter-example.
>
> Well, counter examples are much easier to provide. A simple program that
> you wrote as
>
> int main() { for (;;); }
>
> can optimize away the end-less loop.

And why is it so great that this loop can be optimised away? What is
the great benefit to the optimiser?
There is no benefit to the programmer, because
- either he intended the loop to finish and is not made aware of a bug
in the program,
- or he intended the loop to be endless and now has to introduce some
bogus side-effects to keep it from being optimised away.

Bart v Ingen Schenau

Daniel Krügler

unread,
Nov 19, 2010, 3:16:22 PM11/19/10
to

On 11/18/2010 23:07, Bart van Ingen Schenau wrote:
>
> On Nov 18, 10:44 am, Daniel Krügler<daniel.krueg...@googlemail.com>
> wrote:
[..]

>> I'm not yet convinced that this is true, given the definition of
>> observable behaviour and the lack of progress guarantees in C++03.
>
> In the software I am currently working on there is an infinite loop
> that would cause great trouble if the compiler decided to optimise it
> away.
> The software runs on an embedded device and the loop is used to
> trigger a reset by the hardware watchdog in case some serious problems
> are detected (most notably, when a trap from the processor is
> intercepted).
> I just did a double-check of the implementation of the hw_reset
> function, and found that the loop modifies a volatile-qualified
> (local) variable, but I can easily imagine such a loop being
> implemented without any observable side-effects.
>
> I don't care to think about what could happen if the compiler
> optimised the loop away and the application continues to execute as if
> nothing happened.

It would be very helpful, if you could just define the frame code of
main and the infinite loop. There is no need to reveal details, it would
be sufficient to declare some functions that are called during entry and
in the context of the loop or loops and to argue that this is part of an
existing software product.

>>> True infinite loops without "side effects" exist in a very small
>>> subset of legitimate programs. An example given in some thread was a
>>> signal driven program. main was simply:
>>> int main() { for (;;); }
>>> Some signal handlers were installed, and all of the interesting work
>>> was done in those signal handlers.
>>
>> Obviously this means that
>>
>> int main() { for (;;); }
>>
>> is not a complete example program that demonstrates the defect. Can you
>> please provide one? Note that a conforming signal handler is required to
>> perform modifications on either volatile
>> sig_atomic_t or on lock-free atomic objects, which are explicitly
>> excluded in above optimization allowance.
>
> How does it matter which observable side-effects occur in code that is
> not part of the loop in question?

I don't think that above example program is helpful in this discussion.
If it has any effects they are only relevant outside of the context of
C++ (for the layer that calls the program and waits for the result) and
since we are speaking about C++ here, this is more a form of
philosophization than anything else. Any program that has some form of
C++ observable behaviour can sensibly be argued about.

> And no, I can't provide a complete program that has a legitimate
> endless-loop. The embedded software I mentioned above is too large to
> post and I don't have the right to publish anyway.

I'm not asking for any form of detailed source code. A basic sketch
concentrated on some loop(s) within main would suffice. It is even more
helpful with your argument that this is part of an existing software
product.

>>> Again, if I'm wrong, please correct me, preferably with a non-
>>> contrived counter-example.
>>
>> Well, counter examples are much easier to provide. A simple program that
>> you wrote as
>>
>> int main() { for (;;); }
>>
>> can optimize away the end-less loop.
>
> And why is it so great that this loop can be optimised away? What is
> the great benefit to the optimiser?
> There is no benefit to the programmer, because
> - either he intended the loop to finish and is not made aware of a bug
> in the program,
> - or he intended the loop to be endless and now has to introduce some
> bogus side-effects to keep it from being optimised away.

Both committees have *not* specifically provided wording to make exactly
this program transformable in the indicated way (as response to the "why
is it so great" question). But the currently discussed wording *can* be
interpreted to allow this program transformation. This is a gray zone,
if you want. No-one can currently safely prove that the new C/C++
wording would allow optimizing away the empty loop in the empty program
shown above, or in similar ones.

In fact, it has been explained to me that one of the interesting cases
to support are along the line of the following: Given count and count2
are global variables or variables which are potentially referenced and p
a local variable which is not referenced elsewhere - to be allowed to
transform a local loop

for (p = q; p != 0; p = p->next) {
    ++count;
}
for (p = q; p != 0; p = p->next) {
    ++count2;
}

into

for (p = q; p != 0; p = p->next) {
    ++count;
    ++count2;
}

Most compilers are not able to prove whether the first corresponding
loop is indeed infinite (e.g. via a circular list). If they were
required to prove it, the compiler would not be allowed to merge these
into a single loop, because a legal program could access or modify
count2 from another thread in parallel to the first loop. In that case
the merge done by the compiler would introduce a data race.

Both standards are not standardizing a new idea. They try to standardize
existing practice. If we don't agree with that direction, we should make
that clear. Objective arguments do help in this discussion.

Last but not least a removal of the whole part 5.1.2.3 p.6 in the
current C draft (n1516) as suggested by a recent proposal

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1509.pdf

would do more harm than good, because this would have the effect that we
decrease the number of constraints to compiler vendors and they could
still interpret all what is currently said in this paragraph (and more).
I don't think that we want this. But if we think that more explicit
wording needs to be added that would forbid the removal of infinite
loops, we should collect some arguments to present to the committees.
This is an invitation to provide any form of data to support or reject
the potential freedom to optimize away some forms of infinite loops. It
is very safe to assume that both C and C++ will follow the same route in
*either* direction, so I think that even pure C programs should be
allowed as reply to this message even in this newsgroup. I hope this is
fine for the moderators as well.

Thanks & Greetings from Bremen,

Daniel Krügler


Martin B.

unread,
Nov 22, 2010, 11:02:43 AM11/22/10
to

On 19.11.2010 21:16, Daniel Krügler wrote:
>
> On 11/18/2010 23:07, Bart van Ingen Schenau wrote:
>>
>> On Nov 18, 10:44 am, Daniel Krügler<daniel.krueg...@googlemail.com>
>> wrote:
>>> [...]

>>> Obviously this means that
>>>
>>> int main() { for (;;); }
>>>
>>> is not a complete example program that demonstrates the defect. Can you
>>> [...]

>
> Both committees have *not* specifically provided wording to make exactly
> this program transformable in the indicated way (as response to the "why
> is it so great" question). But the currently discussed wording *can* be
> interpreted to allow this program transformation. This is a gray zone,
> if you want. No-one can currently safely prove that the new C/C++
> wording would allow optimizing away the empty loop in the empty program
> shown above, or in similar ones.
> [...]
>

Excuse me, but this is insane. We have a current std (C++03) that leaves
the behaviour unspecified - and apparently already some compilers *will*
optimize away endless loops (see: http://blog.regehr.org/archives/161 ).

Now a new paragraph is added, leaving what you describe as "This is a
gray zone, if you want." How then does this new paragraph help anyone?

> Last but not least a removal of the whole part 5.1.2.3 p.6 in the
> current C draft (n1516) as suggested by a recent proposal
>
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1509.pdf
>
> would do more harm than good, because this would have the effect that we
> decrease the number of constraints to compiler vendors and they could
> still interpret all what is currently said in this paragraph (and more).
> I don't think that we want this. But if we think that more explicit
> wording needs to be added that would forbid the removal of infinite
> loops, we should collect some arguments to present to the committees.
> This is an invitation to provide any form of data to support or reject
> the potential freedom to optimize away some forms of infinite loops. It

> [...]

One question that I do not yet understand is whether program termination
(or, rather, program non-termination) constitutes valid observable
behaviour in C(++).

The best example I have seen so far is the program provided by Mr.
Regehr (http://blog.regehr.org/archives/161):

bool fermat() {
    for (;;) { // algorithm snipped
    }
}

int main () {
    if (fermat()) {
        printf ("Fermat's Last Theorem has been disproved.\n");
    } else {
        printf ("Fermat's Last Theorem has not been disproved.\n");
    }
    return 0;
}

Given *any* algorithm that is *assumed to be* non-terminating, can a
C(++) program be used to "test" this assumption by letting it run forever?

If a C++ program is allowed to not terminate, then the paragraph
[quote N3126 6.5/5]
A loop that, outside of the for-init-statement in the case of a for
statement,
— makes no calls to library I/O functions, and
— does not access or modify volatile objects, and
— performs no synchronization operations (1.10) or
atomic operations (Clause 29)
may be assumed by the implementation to terminate.
[/quote]

is *wrong*, because this disallows non-termination as a valid "answer"
of a C++ program.

Is it (yet) specified if a C++ program is allowed not to terminate
without side-effects?

cheers,
Martin

Joshua Maurice

unread,
Nov 23, 2010, 8:37:44 AM11/23/10
to

On Nov 18, 1:44 am, Daniel Krügler <daniel.krueg...@googlemail.com>
wrote:

> int main() { for (;;); }
>
> can optimize away the end-less loop. This is not what you want to hear,
> I guess. To signal that the core language groups of both C and C++ are
> going in the wrong direction when imposing this, we need a simple
> counter example that can be forwarded to these groups.
>
> No-one is intending to break reasonable programs: The standard
> committees are not a group of "crazy scientists" that just invent
> wording because they find these funny enough to print them in stone. The
> compiler implementors involved try to *satisfy* the needs of their
> customers, i.e. us, the programmers. Sometimes the selected wording does
> have unwanted side-effects that could be interpreted in a way that would
> not be helpful. By asking you for at least a single example I'm trying
> to find such boo-boos such that it can be forwarded as early as possible
> to the corresponding working groups.

I was wondering the reverse. Do you know of any real case studies on
real compilers and real code which shows a measurable performance gain
when allowing this transformation? That would go a long way to
convince me this isn't insane, at the very least.

Bart van Ingen Schenau

unread,
Nov 23, 2010, 8:36:02 AM11/23/10
to

On Nov 19, 9:16 pm, Daniel Krügler <daniel.krueg...@googlemail.com>
wrote:

First, the software is written in C for a freestanding implementation
using an ARM processor. The software has a built-in cooperative
multitasking OS.
The loop in question is also NOT located in main or a function called
from there.
The code in main goes roughly like this:

void main()
{
    // initialise system clock

    // initialise EEPROM

    // initialise HW interrupt handlers
    // initialise HW watchdog

    start_os(); /* never returns; the OS performs task switches, but
                   does not explicitly loop */
}

The endless loop is located in some of the interrupt/trap handlers:
__interrupt void hw_int_trap12_isr( void )
{
    generate_hw_reset( );
}

void generate_hw_reset( void )
{
    // -----------------------------
    // Disable all interrupts
    // -----------------------------
    __disable_interrupt();

    /* Generate HW Reset: Write to SWRESET register */

    // if that fails, the watchdog does the job

    /* Generate HW Reset: Let watchdog expire */

    /* enable watchdog */

    // -------------------------------------------
    // And loop until reset occurs
    // -------------------------------------------
    bool8 boFlag = FALSE;
    word wI = 0;
    do
    {
        if( ++wI == 60000 )
        {
            wI = 0;
            if( boFlag ^= 1 )
            {
                /* HW_FEA_TWM_LED_ON // macro defined as empty */
            }
            else
            {
                /* HW_FEA_TWM_LED_OFF // macro defined as empty */
            }
        }
    }
    while( 1 );
}

The hw_int_trap12_isr function is invoked when the processor indicates
a TRAP-12 interrupt (invalid memory access).
Similar interrupt handlers are installed for other faults indicated by
the processor.

<snip>


>
> > And no, I can't provide a complete program that has a legitimate
> > endless-loop. The embedded software I mentioned above is too large to
> > post and I don't have the right to publish anyway.
>
> I'm not asking for any form of detailed source code. A basic sketch
> concentrated on some loop(s) within main would suffice. It is even more
> helpful with your argument that this is part of an existing software
> product.

I have given an illustration of the code above.
And although each task in the embedded OS contains an endless loop,
those loops are not interesting for the current discussion, as they
all contain observable side-effects. Even the idle loop accesses a
volatile variable to pet the SW watchdog.

<snip>

> In fact, it has been explained to me that one of the interesting cases
> to support are along the line of the following: Given count and count2
> are global variables or variables which are potentially referenced and p
> a local variable which is not referenced elsewhere - to be allowed to
> transform a local loop
>
> for (p = q; p != 0; p = p->next) {
>     ++count;
> }
> for (p = q; p != 0; p = p->next) {
>     ++count2;
> }
>
> into
>
> for (p = q; p != 0; p = p->next) {
>     ++count;
>     ++count2;
> }
>
> Most compilers are not able to prove whether the first corresponding
> loop is indeed infinite (e.g. via a circular list). If they were
> required to prove it, the compiler would not be allowed to merge these
> into a single loop, because a legal program could access or modify
> count2 from another thread in parallel to the first loop. In that case
> the merge done by the compiler would introduce a data race.

I can understand that it would be nice to be able to merge those two
loops.
The problem is just that the proposed text also affects loops that can
easily be proven to not terminate.

I think most objections could be met if the license to assume a loop
terminates is only granted if the controlling expression is not a
constexpr (or empty in case of a for loop).
The clause could be written as:

6 An iteration statement that performs no input/output operations,
does not access volatile objects, and performs no synchronization or
atomic operations in its body, controlling expression, or (in the case
of a for statement) its expression-3, and whose controlling expression
is not a constexpr, may be assumed by the implementation to terminate.
154)

This means that, if the termination can be determined at compile time,
then the compiler may not make any assumptions. It has to execute the
loop as the abstract machine would do it.
If the controlling expression can not be evaluated at compile time,
the compiler may assume the loop will eventually terminate and perform
optimisations based on that assumption.

To my knowledge, if an infinite loop is intended, then the controlling
expression will invariably be a constant expression.
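Bart's criterion can be sketched as follows (an illustrative mock-up of mine, not committee wording; the break statements exist only so the demo itself finishes): the form of the controlling expression decides whether the implementation may assume termination.

```cpp
// Illustrative classification under the proposed criterion.
unsigned classify_demo(unsigned n) {
    while (1) { break; }   // constant controlling expression: no assumption
                           // allowed, an intended infinite loop is kept
    for (;;) { break; }    // empty condition: likewise no assumption

    unsigned i = 0;
    while (i < n) ++i;     // runtime condition: the implementation may
                           // assume this loop terminates and optimize
                           // across it
    return i;
}
```

Under this rule the deliberately infinite `while (1)` and `for (;;)` forms stay untouched, while loops whose bounds are only known at run time remain fair game for the optimizer.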

<snip>


> Thanks & Greetings from Bremen,
>
> Daniel Krügler
>

Bart v Ingen Schenau

Daniel Krügler

unread,
Nov 23, 2010, 7:46:10 PM11/23/10
to

Am 23.11.2010 14:36, schrieb Bart van Ingen Schenau:
>
> On Nov 19, 9:16 pm, Daniel Krügler<daniel.krueg...@googlemail.com>
> wrote:
>> It would be very helpful, if you could just define the frame code of
>> main and the infinite loop. There is no need to reveal details, it would
>> be sufficient to declare some functions that are called during entry and
>> in the context of the loop or loops and to argue that this is part of an
>> existing software product.
>
> First, the software is written in C for a freestanding implementation
> using an ARM processor. The software has a built-in cooperative
> multitasking OS.
> The loop in question is also NOT located in main or a function called
> from there.
> The code in main goes roughly like this:

[..]

Thanks very much Bart. I have forwarded this example to the core language group as a real world example to consider in the discussions.

> I can understand that it would be nice to be able to merge those two
> loops.
> The problem is just that the proposed text also affects loops that can
> easily be proven to not terminate.
>
> I think most objections could be met if the license to assume a loop
> terminates is only granted if the controlling expression is not a
> constexpr (or empty in case of a for loop).
> The clause could be written as:
>
> 6 An iteration statement that performs no input/output operations,
> does not access volatile objects, and performs no synchronization or
> atomic operations in its body, controlling expression, or (in the case
> of a for statement) its expression-3, and whose controlling expression
> is not a constexpr, may be assumed by the implementation to terminate.
> 154)
>
> This means that, if the termination can be determined at compile time,
> then the compiler may not make any assumptions. It has to execute the
> loop as the abstract machine would do it.
> If the controlling expression can not be evaluated at compile time,
> the compiler may assume the loop will eventually terminate and perform
> optimisations based on that assumption.
>
> To my knowledge, if an infinite loop is intended, then the controlling
> expression will invariably be a constant expression.

I agree that this looks like a very reasonable criterion. I have been told that C currently discusses some infinite-loop patterns, but AFAIK they have not yet found good general rules - the simple cases are covered, though, e.g.

while (c) { /*..*/ }

do { /*..*/ } while (c);

for (/*..*/; ; /*..*/) { /*..*/ }

where c is a constant expression != 0.

Greetings from Bremen,

Daniel Krügler


Joshua Maurice

unread,
Dec 11, 2010, 5:29:34 PM12/11/10
to

I've noticed no reply, so let me ask this just once more to anyone.
Has anyone done a study on "real" production code which shows that
this allowance can result in measurable performance gains? Links
please. Again, this would go a long way to convincing me that this
allowance has actual value and isn't insane.

Öö Tiib

unread,
Dec 11, 2010, 8:50:48 PM12/11/10
to

On Dec 12, 12:29 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> On Nov 23, 5:37 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
>> On Nov 18, 1:44 am, Daniel Krügler <daniel.krueg...@googlemail.com>
>> wrote:
>
>>> int main() { for (;;); }
>
>>> can optimize away the end-less loop. This is not what you want to hear,
>>> I guess. To signal that the core language groups of both C and C++ are
>>> going in the wrong direction when imposing this, we need a simple
>>> counter example that can be forwarded to these groups.
>
>>> No-one is intending to break reasonable programs: The standard
>>> committees are not a group of "crazy scientists" that just invent
>>> wording because they find these funny enough to print them in stone. The
>>> compiler implementors involved try to *satisfy* the needs of their
>>> customers, i.e. us, the programmers. Sometimes the selected wording does
>>> have unwanted side-effects that could be interpreted in a way that would
>>> not be helpful. By asking you for at least a single example I'm trying
>>> to find such boo-boos such that it can be forwarded as early as possible
>>> to the corresponding working groups.
>
>> I was wondering the reverse. Do you know of any real case studies on
>> real compilers and real code which shows a measurable performance gain
>> when allowing this transformation? That would go a long way to
>> convince me this isn't insane, at the very least.
>
> I've noticed no reply, so let me ask this just once more to anyone.
> Has anyone done a study on "real" production code which shows that
> this allowance can result in measurable performance gains? Links
> please. Again, this would go a long way to convincing me that this
> allowance has actual value and isn't insane.

I do not understand. You really need a deep study of whether optimizing
'for (;;);' away will result in a considerable performance gain or not?

Most people would perhaps like a diagnostic regardless of whether the
compiler optimizes it away or not. The likelihood that the author wanted
to write exactly that is low. Busy sleeping feels like quite an
inefficient way of doing nothing ... so I don't think such a study is needed.

Miles Bader

unread,
Dec 12, 2010, 6:51:18 AM12/12/10
to

Öö Tiib <oot...@hot.ee> writes:
>> I've noticed no reply, so let me ask this just once more to anyone.
>> Has anyone done a study on "real" production code which shows that
>> this allowance can result in measurable performance gains? Links
>> please. Again, this would go a long way to convincing me that this
>> allowance has actual value and isn't insane.
>
> I do not understand. You really need a deep study of whether optimizing
> 'for (;;);' away will result in a considerable performance gain or not?
>
> Most people would perhaps like a diagnostic regardless of whether the
> compiler optimizes it away or not. The likelihood that the author wanted
> to write exactly that is low. Busy sleeping feels like quite an
> inefficient way of doing nothing ... so I don't think such a study is needed.

I thought the goal of this "allowance" wasn't really to allow
elimination of empty infinite loops, but to allow optimizations that
move calculations across loops that would be invalid only in the case
where the loop doesn't terminate, and where the optimizer can't prove
that it will eventually terminate (even though in almost every such
case, the loop _will_ eventually terminate).
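The kind of motion Miles describes can be sketched like this (the function and its use of the Collatz iteration are my own invention, purely illustrative): sinking the store through `out` below the loop is only valid under the assumption that the loop terminates, which no compiler can prove for this loop in general.

```cpp
// If the implementation may assume the while loop terminates, the store
// through 'out' can be sunk past the loop, enabling further optimization
// (e.g. keeping 'n' purely in a register). Without that assumption the
// store must stay where it is, because if the loop never terminated,
// code placed after it -- including the sunk store -- would never run.
// Termination of the Collatz iteration is famously unproven in general,
// though it holds for every value anyone has ever tried.
unsigned collatz_steps(unsigned n, unsigned* out) {
    *out = n;                 // candidate for sinking past the loop
    unsigned steps = 0;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        ++steps;
    }
    return steps;
}
```

In almost every real execution the loop does finish, so the reordering is profitable; the allowance simply relieves the compiler of proving it first.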

-Miles

--
Cat is power. Cat is peace.

Joshua Maurice

unread,
Dec 12, 2010, 6:50:30 AM12/12/10
to

It's my assertion that those who want to introduce this C++ source
code level breaking change into the language have the onus on them to
demonstrate that it has tangible benefits. If they are unable to
demonstrate tangible benefits, then the whole thing is a sham.

I would take a shot from the hip - a guess - that you are correct that
this would not provide much benefit. Still, I am unsure, so that is
why I ask.

PS: We're not really talking about just removing "for(;;);" loops.
What we are talking about is moving loads and stores across a loop
without requiring that the compiler prove that the loop terminates.
This would then facilitate other basic compiler optimizations.

Öö Tiib

unread,
Dec 13, 2010, 6:32:11 AM12/13/10
to

On Dec 12, 1:50 pm, Joshua Maurice <joshuamaur...@gmail.com> wrote:

> On Dec 11, 5:50 pm, Tiib <oot...@hot.ee> wrote:
> > On Dec 12, 12:29 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
> > > On Nov 23, 5:37 am, Joshua Maurice <joshuamaur...@gmail.com> wrote:
>
> > > I've noticed no reply, so let me ask this just once more to anyone.
> > > Has anyone done a study on "real" production code which shows that
> > > this allowance can result in measurable performance gains? Links
> > > please. Again, this would go a long way to convincing me that this
> > > allowance has actual value and isn't insane.
>
> > I do not understand. Do you really need a deep study to decide whether
> > optimizing 'for (;;);' away will yield a considerable performance gain
> > or not?
>
> > Most people would probably prefer a diagnostic regardless of whether
> > the compiler optimizes it away. The likelihood that the author really
> > wanted to write exactly that is low. Busy-waiting feels like quite an
> > inefficient way of doing nothing ... so I don't think such a study is
> > needed.
>
> It's my assertion that those who want to introduce this C++ source
> code level breaking change into the language have the onus on them to
> demonstrate that it has tangible benefits. If they are unable to
> demonstrate tangible benefits, then the whole thing is a sham.
>
> I would take a shot from the hip - a guess - that you are correct that
> this would not provide much benefit. Still, I am unsure, so that is
> why I ask.

The optimization is perhaps useful in combination with meta-
programming using templates, the preprocessor, and/or code generators.
Clumsy meta-programming sometimes produces code that contains
pointless loops: pointless in the sense that they end sooner or later
without side effects. It certainly improves performance when such
loops are optimized away.

A considerable amount of processing may go into the compiler proving
whether such a loop always ends (or may sometimes hang at runtime).
Compilation time is something I am very sure users around us complain
about with C++ compilers, and that slowness is usually related to
templates or the preprocessor.

I would still prefer compilers to attempt the analysis and to issue a
diagnostic in cases where they cannot prove that the loop is not a
hang but are still allowed to optimize it away.

> PS: We're not really talking about just removing "for(;;);" loops.
> What we are talking about is moving loads and stores across a loop
> without requiring that the compiler prove that the loop terminates.
> This would then facilitate other basic compiler optimizations.

Yes, 'for(;;);' is clear; most cases are not that clear. Good case
examples (where a busy hang is what the developer really desired)
would be handy. Without examples it is sort of an emotional issue, and
it is hard to suggest anything rational based on such emotions.

Martin B.

unread,
Dec 13, 2010, 12:03:46 PM12/13/10
to

Since you "reopened" this :-) ... my posting/question in another sub-thread,
whether "non-termination" without side effects is an allowed state for a C++
program, is also still unanswered:
On 22.11.2010 17:02, Martin B. wrote:
> [...]


> Given *any* algorithm that is *assumed to be* non-terminating, can a
> C(++) program be used to "test" this assumption by letting
> it run forever?
>
> If a C++ program is allowed to not terminate, then the paragraph
> [quote N3126 6.5/5]

> [...]


> is *wrong*, because this disallows non-termination as a
> valid "answer" of a C++ program.
>
> Is it (yet) specified if a C++ program is allowed not to terminate
> without side-effects?
>

Note also: when I did some digging around in November, I thought I had found
a reference saying it is being considered to change the wording so that only
loops that do not depend entirely on a constant expression may be assumed to
terminate. But I can't find that link/posting atm ...

cheers,
Martin

Bart van Ingen Schenau

unread,
Dec 14, 2010, 2:57:03 PM12/14/10
to

I have presented a real-world example else-thread (in response to
Daniel Krügler) and that example has been forwarded to the C++
standards committee.
Additionally, I proposed an alternative wording to allow the
'assumption of termination' only when the compiler can not easily
prove termination or non-termination (based on compile-time evaluation
of constant expressions).

Bart v Ingen Schenau

Joshua Maurice

unread,
Dec 15, 2010, 7:40:27 PM12/15/10
to

On Dec 13, 9:03 am, "Martin B." <0xCDCDC...@gmx.at> wrote:
> On 23.11.2010 14:37, Joshua Maurice wrote:
> > I was wondering the reverse. Do you know of any real case studies on
> > real compilers and real code which shows a measurable performance gain
> > when allowing this transformation? That would go a long way to
> > convince me this isn't insane, at the very least.
>
> Since you "reopened" this :-) ... my posting/question in another sub-thread,
> whether "non-termination" without side-effects is an allowed state for a C++
> program is also still unanswered:
> On 22.11.2010 17:02, Martin B. wrote:
>
> > [...]
> > Given *any* algorithm that is *assumed to be* non-terminating, can a
> > C(++) program be used to "test" this assumption by letting
> > it run forever?
>
> > If a C++ program is allowed to not terminate, then the paragraph
> > [quote N3126 6.5/5]
> > [...]
> > is *wrong*, because this disallows non-termination as a
> > valid "answer" of a C++ program.
>
> > Is it (yet) specified if a C++ program is allowed not to terminate
> > without side-effects?

Indeed, in that same thread in comp.std.c++, I talked about this as
well. It's my conclusion that the C++03 standard, through the
interactive I/O rules in "1.9 Program execution / 11", clearly gives
defined behavior to programs which do not terminate. I go into this
more in that comp.std.c++ thread, found here:

Newsgroups: comp.std.c++
From: Joshua Maurice <joshuamaur...@gmail.com>
Date: Tue, 21 Sep 2010 11:41:03 CST
Local: Tues, Sep 21 2010 9:41 am
Subject: Defect Report: non-terminating loops should not be undefined
behavior

http://groups.google.com/group/comp.std.c++/browse_thread/thread/fb0f396824b9a801#

Either way, it's asinine to standardize that programs which do not
terminate do not have defined behavior, and it's asinine to argue with
that as a starting premise. This should hopefully be self evident to
any competent programmer. In case it's not, in the real world we need
to code programs which should be capable of running indefinitely. This
is a common, very important, real world business use case, and it has
been for quite some time.

Unfortunately, I have found a paper on the standards committee website
(a paper in support of this new allowance), and it argues exactly
that:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2052.htm
> Moreover, as the "least requirements" refer to data written to files "at program termination", the presence of a non-terminating loop may even nullify observable behavior preceding entry to the loop (for example, because of buffered output). For these reasons, there are problems with concluding that a strictly-conforming program can contain any non-terminating loop. We therefore conclude that a compiler is free to assume that a simple loop will terminate, and to optimize based on that assumption.

/At best/, the quote from the paper is pedantic rules lawyering:
"Because the past standard can be read to give X, I will argue for X,
and a possible extension to X." Such reasoning is a distortion of the
overall goal of a standards committee, which is to standardize useful
and existing practice. Disallowing programs which run indefinitely
long is asinine in this regard.

PS: I might still be converted to be for this source level breaking
change if someone can show me an actual case study on real production
code which has measurable runtime performance gains from this
allowance. Anyone?
