
Weird release build bug


Patricia Anaka

Dec 17, 2015, 2:08:10 PM
Hello,
I am seeing a weird bug in my C++ program -- it only happens in my release build, not in the debug build. I've tracked it down to a simple function -- a test to determine whether a point is inside a polygon.

I've used this function for years without a problem.


int point_in_polygon(float testx, float testy, Fpoint *points, int nvert)
{
    int i, j, c = 0;
    // printf("test xy = %f %f\n", testx, testy);

    for (i = 0, j = nvert - 1; i < nvert; j = i++) {
        // printf("i %d j %d\n", i, j);
        // printf("points[i] = %f %f\n", points[i].x, points[i].y);
        // printf("points[j] = %f %f\n", points[j].x, points[j].y);

        if (((points[i].y > testy) != (points[j].y > testy)) &&
            (testx < (points[j].x - points[i].x) * (testy - points[i].y) / (points[j].y - points[i].y) + points[i].x))
            c = !c;
    }
    return c;
}

Here's some sample data I pass to the function. It fails the test in the release build -- it returns without ever triggering the c = !c line.

fp 337.333313 124.000000
point 0: 221.000000 50.000000
point 1: 421.000000 50.000000
point 2: 421.000000 190.000000
point 3: 221.000000 190.000000
point 4: 221.000000 50.000000

The weird thing is that if I uncomment those print statements, the function starts working! So it seems like some kind of compiler issue, but I don't know. Does anyone know what might cause a problem like this?

Thanks!

Victor Bazarov

Dec 17, 2015, 2:27:46 PM
Bugs in compilers are not frequent, but like any other software,
compilers are written by people and people do make mistakes.

If adding output seems to fix the problem, the issue is likely in the
*optimizer*, which creates bad machine code in some cases. I can only
recommend that you _turn optimization off_, either for the entire module or
for this function only (if possible via a pragma directive; see your
compiler documentation for that). You can of course extract this function
into a separate module and disable optimization on that module only.
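
For example, with MSVC the per-function form looks like this (GCC has the
__attribute__((optimize("O0"))) function attribute and Clang has
__attribute__((optnone)) instead; which mechanism applies depends entirely on
your compiler, so treat this as a sketch):

// Sketch only: MSVC-style pragmas to compile just this one function without
// optimization. Other compilers use different pragmas or attributes.
// Fpoint is assumed here to be a plain struct of two floats.
struct Fpoint { float x, y; };

#pragma optimize("", off)   // MSVC: disable optimization for what follows
int point_in_polygon(float testx, float testy, Fpoint *points, int nvert)
{
    int i, j, c = 0;
    for (i = 0, j = nvert - 1; i < nvert; j = i++) {
        if (((points[i].y > testy) != (points[j].y > testy)) &&
            (testx < (points[j].x - points[i].x) * (testy - points[i].y) /
                     (points[j].y - points[i].y) + points[i].x))
            c = !c;
    }
    return c;
}
#pragma optimize("", on)    // restore the command-line optimization settings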

Experiment with different optimization levels, as well.

Best of luck!

V
--
I do not respond to top-posted replies, please don't ask

Barry Schwarz

Dec 17, 2015, 3:00:28 PM
One of the differences between debug and release is the action of the
optimizer.

Since uncommenting the print statements seems to eliminate the
problem, this appears to be a case of the optimizer getting something
wrong. Some possible workarounds that come to mind are:

Place the function in a source file by itself and reduce the
optimization level (possibly even down to no optimization).

Restructure the right-hand operand of <. You could move the
addition of points[i].x to the front of the expression. You could
also swap the order of the two multiplicands.

At the top of the for loop, extract the four x and y values from
points and place them in local variables. Change the if statement to
use these values instead.
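
For instance, a sketch of that last idea (assuming Fpoint is a plain struct
with float members x and y) would look like:

/* Sketch of the third suggestion: pull the coordinates into locals so the
   optimizer works on plain floats rather than repeated indexed loads. */
int point_in_polygon(float testx, float testy, Fpoint *points, int nvert)
{
    int i, j, c = 0;

    for (i = 0, j = nvert - 1; i < nvert; j = i++) {
        float xi = points[i].x, yi = points[i].y;
        float xj = points[j].x, yj = points[j].y;

        if (((yi > testy) != (yj > testy)) &&
            (testx < (xj - xi) * (testy - yi) / (yj - yi) + xi))
            c = !c;
    }
    return c;
}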
--
Remove del for email

Paavo Helde

Dec 17, 2015, 4:05:51 PM
Patricia Anaka <panak...@gmail.com> wrote in
news:28a3af3b-c793-4ac3...@googlegroups.com:

> Hello,
> I am seeing a weird bug occur in my C++ program -- the bug only
> happens in my release build and not in the debug.
>
> Here's some sample data I pass the function. It fails the test in the
> release build -- returns without ever triggering the c = !c line.

How do you know that c=!c line is never triggered? Relying on breakpoints
is not very trustworthy in optimized builds. Maybe the line is triggered
twice?
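
One way to check without relying on breakpoints (a sketch only; it assumes you
can instrument the function temporarily) is to count the toggles in a volatile
variable and inspect or print it after the call:

/* Temporary instrumentation sketch: count how many times the toggle fires.
   'volatile' keeps the optimizer from discarding or folding the counter,
   so the count is meaningful even in the release build. */
static volatile int toggle_count = 0;

int point_in_polygon(float testx, float testy, Fpoint *points, int nvert)
{
    int i, j, c = 0;
    toggle_count = 0;

    for (i = 0, j = nvert - 1; i < nvert; j = i++) {
        if (((points[i].y > testy) != (points[j].y > testy)) &&
            (testx < (points[j].x - points[i].x) * (testy - points[i].y) /
                     (points[j].y - points[i].y) + points[i].x)) {
            ++toggle_count;   /* check after the call: 0, 1 or 2? */
            c = !c;
        }
    }
    return c;
}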

> if (((points[i].y>testy) != (points[j].y>testy)) &&

I once had a compiler which had trouble comparing boolean values for
equality or inequality, but that was over 20 years ago and in a different
language, IIRC.

Cheers
Paavo

Patricia Anaka

Dec 17, 2015, 4:53:54 PM
On Thursday, December 17, 2015 at 2:00:28 PM UTC-6, Barry Schwarz wrote:
> One of the differences between debug and release is the action of the
> optimizer.
>
> Since uncommenting the print statements seems to eliminate the
> problem, this appears to be a case of the optimizer getting something
> wrong. Some possible workarounds that come to mind are:
>
> Place the function in a source file by itself and reduce the
> optimization level (possibly even down to no optimization).
>
> Restructure the right hand operand of <. You could move the
> addition of points[i].x to the front of the expression. You could
> also swap the order of the two multiplicands.
>
> At the top of the for loop, extract the four x and y values from
> points and place them in local variables. Change the if statement to
> use these values instead.
>

Thanks for the replies! I spent some time experimenting with restructuring the loop. It turns out that it was the little "j = i++" thing in the loop that caused it.

So in other words, this works.

int point_in_polygon(float testx, float testy, Fpoint *points, int nvert)
{
    int i, j, c = 0;

    for (i = 0, j = nvert - 1; i < nvert; i++) {
        if (((points[i].y > testy) != (points[j].y > testy)) &&
            (testx < (points[j].x - points[i].x) * (testy - points[i].y) / (points[j].y - points[i].y) + points[i].x))
            c = !c;

        j = i;
    }

    return c;
}


So weird!

David Brown

Dec 17, 2015, 5:29:18 PM
I have often heard people claim their code is correct, and justify it by
"it works in debug mode" or "it works when the optimiser is disabled".
Almost invariably, the problem is the code - it is extraordinarily rare
that a compiler bug is at fault. It is not impossible - but it is far
down in the list of suspects.

In this case, as far as I can see you get division by zero in some cases -
some of your adjacent test points share the same y value, so the divisor
(points[j].y - points[i].y) is zero there.

When you subtract almost identical floating point values, you are likely
to have all sorts of numerical stability problems - the subtraction
could result in 0, or it could result in almost 0, and the results can
vary according to different details of the optimisation and the actual
instructions used for computation. In particular, on x86, if the
arithmetic is done using x87 instructions, those instructions and the
registers they use are 80-bit - but anything that gets moved into memory or
onto the stack is rounded to 64 bits.
calling functions like printf, will push other data out of registers and
onto the stack - hence the difference in the runs.

You are going to have to do a good deal of work on your algorithm to
make it numerically stable with the sort of points you have been giving
it here. Take extreme care when dealing with equal or almost-equal
floating point values - and in your re-designed algorithm, remember not to
compare floating point values for equality (i.e., don't add a
test for "if (points[j].y == points[i].y)" as a special case).

David Brown

Dec 17, 2015, 5:30:22 PM
On 17/12/15 20:27, Victor Bazarov wrote:

> Bugs in compilers are not frequent, but like any other software,
> compilers are written by people and people do make mistakes.
>
> If adding output seems to fix the problem, it's likely the issue in the
> *optimizer*, which creates bad machine code in some cases. I can only
> recommend you to _turn optimization off_ either for the entire module or
> for this function only (if possible via a pragma directive, see your
> compiler documentation for that). You can extract this function in a
> separate module of course, and disable optimizations on that module only.
>
> Experiment with different optimization levels, as well.
>

Think horses, not zebras. It is always much more likely to be a bug in
the users code - compiler bugs are not unknown, but they are extremely
rare in comparison to user bugs.


Patricia Anaka

Dec 17, 2015, 5:35:48 PM

> I have often heard people claim their code is correct, and justify it by
> "it works in debug mode" or "it works when the optimiser is disabled".
> Almost invariably, the problem is the code - it is extraordinarily rare
> that a compiler bug is at fault. It is not impossible - but it is far
> down in the list of suspects.
>
> In this case, as far as I can see you get division by zero in some cases
> - some of your test points are vertically aligned.
>
> When you subtract almost identical floating point values, you are likely
> to have all sorts of numerical stability problems - the subtraction
> could result in 0, or it could result in almost 0, and the results can
> vary according to different details of the optimisation and the actual
> instructions used for computation. In particular, on the x86 if the
> arithmetic is done using x87 instructions, those instructions and the
> registers used are 80 bit - but anything that gets moved into memory or
> the stack will get translated to 64 bits. Disabling optimisation, or
> calling functions like printf, will push other data out of registers and
> onto the stack - hence the difference in the runs.
>
> You are going to have to do a good deal of work on your algorithm to
> make it numerically stable with the sort of points you have been giving
> it here. Take extreme care when dealing with equal or almost-equal
> floating point values - and in your re-designed algorithm, remember not
> try to compare floating point values for equality (i.e., don't add a
> test for "if (points[j].y == points[i].y)" as a special case).

I take your point but it has little to do with the reason I posted. The algorithm was failing in every case in the release build, with the simplest of coordinates, if you look at my test data. The people who posted about optimizations appear to have been right.

Marcel Mueller

Dec 17, 2015, 6:23:46 PM
On 17.12.15 23.29, David Brown wrote:
> I have often heard people claim their code is correct, and justify it by
> "it works in debug mode" or "it works when the optimiser is disabled".
> Almost invariably, the problem is the code - it is extraordinarily rare
> that a compiler bug is at fault. It is not impossible - but it is far
> down in the list of suspects.

You didn't work much with Borland C++ compilers, did you?

> When you subtract almost identical floating point values, you are likely
> to have all sorts of numerical stability problems - the subtraction

This might be part of the problem indeed.


Marcel

Marcel Mueller

Dec 17, 2015, 6:29:12 PM
On 17.12.15 22.53, Patricia Anaka wrote:
> Thanks for the replies! I spent some time experimenting with restructuring the loop. It turns out that it was the little "j = i++" thing in the loop that caused it.

I would not bet that this is the full truth. As you have seen, other
changes make the code work too.

> So in other words, this works.

Which compiler did you use?


> So weird!

Your first optimizer bug, isn't it?


Marcel

Louis Krupp

Dec 18, 2015, 3:21:35 AM
If your compiler is up to date, and if you want to go to the trouble
of creating a really simple test case, you can report this to the
vendor. This can be trickier than it sounds, since you don't really
know what triggers the bug.

Instead of using i and j to index an array, you might want to try
saving their values in another array which you would then print at the
end of the function. You want to make the bad output obvious.

As a wild guess, I would give something like this a shot:

struct {
    int i;
    int j;
} ij[1000]; // pick an upper bound
int k = 0;

for (i = 0, j = nvert - 1; i < nvert; j = i++) {
    // printf("i %d j %d\n", i, j);
    ij[k].i = i;
    ij[k].j = j;
    k++;
}

for (i = 0; i < k; i++)
    printf("i %d j %d\n", ij[i].i, ij[i].j);

It can be a fair amount of work, but if it results in a better
compiler, it could save you and other users a lot of time and trouble
down the road.

Louis

David Brown

Dec 18, 2015, 3:25:33 AM
On 18/12/15 00:23, Marcel Mueller wrote:
> On 17.12.15 23.29, David Brown wrote:
>> I have often heard people claim their code is correct, and justify it by
>> "it works in debug mode" or "it works when the optimiser is disabled".
>> Almost invariably, the problem is the code - it is extraordinarily rare
>> that a compiler bug is at fault. It is not impossible - but it is far
>> down in the list of suspects.
>
> You didn't work much with Borland C++ compilers, didn't you?

Not much, though I have heard their reputation. None of the major PC
compilers (gcc, clang, MSVC, icc) these days are at all likely to have
such issues. (I have seen a number of compiler bugs over the years in
the dozens of different tools I have used.)

>
>> When you subtract almost identical floating point values, you are likely
>> to have all sorts of numerical stability problems - the subtraction
>
> This might be part of the problem indeed.
>

This is certainly a problem in the code - but it might be an additional
problem rather than the only one. As I said, compiler bugs are
/possible/, but they are not the first place to look.


Message has been deleted

Paavo Helde

Dec 18, 2015, 12:38:05 PM
Patricia Anaka <panak...@gmail.com> wrote in
news:2c0148b3-bf8f-4fc7...@googlegroups.com:

>
> Thanks for the replies! I spent some time experimenting with
> restructuring the loop. It turns out that it was the little "j = i++"
> thing in the loop that caused it.

It looks like the optimizer recognized a regular loop pattern and optimized
the loop somehow, but failed to notice the side effect of assigning j, and
thus the assignment got lost.

What happens if you replace the "j = i++" with "j=i, i++"?

Care to tell us the compiler name and version so we would know to take care
when dealing with it?

Patricia Anaka

Dec 18, 2015, 1:27:30 PM
> It looks like the optimizer recognized a regular loop pattern and optimized
> the loop somehow, but failed to notice there is a side-effect of assigning
> j, and thus it got lost.
>
> What happens if you replace the "j = i++" with "j=i, i++"?

If I do that, then it works.

>
> Care to tell us the compiler name and version so we would know to take care
> when dealing with it?

It's the ARM CC 5.04 compiler, a version which I think is specifically for Nintendo (3DS).

Paavo Helde

Dec 18, 2015, 2:16:29 PM
Patricia Anaka <panak...@gmail.com> wrote in
news:0d0ca774-b79e-4083...@googlegroups.com:

>> It looks like the optimizer recognized a regular loop pattern and
>> optimized the loop somehow, but failed to notice there is a
>> side-effect of assigning j, and thus it got lost.
>>
>> What happens if you replace the "j = i++" with "j=i, i++"?
>
> If I do that, then it works.

Yes, it looks like the optimizer now noticed that it cannot optimize so
heavily.

>>
>> Care to tell us the compiler name and version so we would know to
>> take care when dealing with it?
>
> It's the ARM CC 5.04 compiler, a version which I think is specifically
> for Nintendo (3DS).

Thanks. As suggested by others you might consider filing a bug report if
this is the latest released version and you can come up with a small
complete example program whose behavior depends on that change (looks like
you have already done that).


Louis Krupp

Dec 18, 2015, 3:49:30 PM
It looks like 5.04 is at least a year old; 5.05 and 5.06 have been
released:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0472j/index.html

About 20 years ago, I earned a living as a consultant porting C++ code
from one buggy compiler to another -- cfront, DEC C++, Solaris C++,
etc. I had one client who preferred to pay me to pull all-nighters
working around compiler problems rather than pop for an upgrade. I
had one gentleman at another site tell me I hadn't found a compiler
bug; once he found out the upgrade was free, it was OK to say it was a
compiler bug.

When I found bugs in up-to-date compilers, I'd report them to the
vendor.

Louis

Patricia Anaka

Dec 18, 2015, 7:27:42 PM

> It looks like 5.04 is at least a year old; 5.05 and 5.06 have been
> released:
>

5.0.4 is still the newest one for Nintendo. I will see if I can make a simple demo of the bug and then email them about it.

Nobody

Dec 19, 2015, 2:38:24 AM
On Thu, 17 Dec 2015 11:07:43 -0800, Patricia Anaka wrote:

> Here's some sample data I pass the function.

> point 0: 221.000000 50.000000
> point 4: 221.000000 50.000000

The first point is coincident with the last. That's going to result in the
first iteration (i=0, j=nvert-1) calculating 0.0/0.0 in the test.

Printing some of the values may coincidentally change the way in which
calculations are performed, meaning that the function coincidentally
produces the desired result.

Paavo Helde

Dec 19, 2015, 4:10:51 AM
Nobody <nob...@nowhere.invalid> wrote in
news:pan.2015.12.19....@nowhere.invalid:

> On Thu, 17 Dec 2015 11:07:43 -0800, Patricia Anaka wrote:
>
>> Here's some sample data I pass the function.
>
>> point 0: 221.000000 50.000000
>> point 4: 221.000000 50.000000
>
> The first point is coincident with the last. That's going to result in
> the first iteration (i=0, j=nvert-1) calculating 0.0/0.0 in the test.

>> if (((points[i].y>testy) != (points[j].y>testy)) &&
>> (testx < (points[j].x - points[i].x) * (testy - points[i].y) /
>> (points[j].y - points[i].y) + points[i].x))
>> c = !c;


The first part of the condition (points[i].y>testy) != (points[j].y>
testy) should avoid division by zero.

If the compiler finds a couple of y values unequal in the first line and
yet gets zero when subtracting them on the third line, then I would think
it would be another compiler bug. But I am not a true floating-point
expert.

Anyway, it is easy for the OP to test what is actually happening. If the
bug is in optimizing away the j update as I suspect, then printing out j
after the loop should show it is still at the initial value nvert-1. And
any floating-point shenanigans can be avoided by varying the y values a
bit in the sample data.

Cheers
Paavo

David Brown

Dec 19, 2015, 8:43:40 AM
On 18/12/15 16:27, bsab...@gmail.com wrote:
> The code is certainly not bullet-proof, but there's only a problem
> with it if the point values are ever going to be very close to each
> other. Which in my case, is not the case. And as I said before,
> that issue is certainly not causing the problem for which I was
> requesting help.
>

I understand now that the problem really was a compiler bug. It is a
rare thing, but it happens. The ARM CC compiler you have here is, if I
understand it correctly, Keil's compiler that ARM took over a number of
years ago. The only time in recent years that I have seen a compiler
that generated clearly incorrect code (rather than just sub-optimal
code) for perfectly good source code was also with a Keil compiler -
this time for the 8051. I believe ARM is in the process of switching to
clang/llvm for their official compiler, but it is unlikely that the
Nintendo platform compiler will change soon.


But I stand by my assertion that the algorithm needs some work to be
suitable. You can well say that it won't be called with points close
together - but that is /now/, while you know of that issue. The issue
was clearly unknown during testing, or else it would not have been
tested with points that lie on vertical and horizontal lines. And you
can expect that the issue will also be unknown in the future when other
people are using the function.

So the code should be fixed, and tested on all sorts of unrealistic
cases as well as realistic ones - because what is "realistic" will
change. Failing that, it should be documented extraordinarily well,
with lots of assertions, and preferably a name change to something like
"point_in_polygon_restricted_cases".

Working around the compiler bug may be the first priority, but don't
leave bombs in the code for future users.

Juha Nieminen

Dec 21, 2015, 4:17:09 AM
Patricia Anaka <panak...@gmail.com> wrote:
> The weird thing is that if I uncomment those print statements,
> then the function starts working!

If adding debug printing seems to "fix" the problem (or in some cases
introduces a problem), then in 99.9% of cases you are accessing memory
you shouldn't be (i.e. you are accessing an array out of bounds, or
using a dangling pointer.)

In complex code, rather than use raw pointers to arrays, you should
wrap the pointer in a class (a bit as if it were an iterator or smart
pointer) and pass around copies of that class, and override operator[]
and such. This introduces zero overhead to your code (at least if all
the member functions are inlined), but has the advantage that you can
temporarily add bounds checking to it, for debugging purposes.
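
A minimal sketch of that idea (the class name is made up for illustration):

#include <cassert>
#include <cstddef>

// Sketch of a thin wrapper around a raw pointer plus a length. With NDEBUG
// defined the assert compiles away, so release builds pay nothing; in debug
// builds every access is range-checked.
template <typename T>
class checked_span {
public:
    checked_span(T* data, std::size_t size) : data_(data), size_(size) {}

    T& operator[](std::size_t idx) const {
        assert(idx < size_ && "index out of bounds");
        return data_[idx];
    }

    std::size_t size() const { return size_; }

private:
    T* data_;
    std::size_t size_;
};

// Usage: pass checked_span<Fpoint>(points, nvert) instead of the raw pointer
// and index it exactly as before.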

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---