Final? floating-point article

Bruce Dawson

Apr 16, 2012, 3:05:01 AM

to altdev...@googlegroups.com

After 10.000001 articles in the floating-point series this may be the last one. A draft of “Exceptional Floating Point” has been posted at:

http://www.altdevblogaday.com/?p=25527&shareadraft=baba25527_4f8bc2320f5b0

It’s about the value that you can get from enabling a few types of floating-point exceptions in order to find bugs. And it comes with sample code. I haven’t prettified the code in the post, but I’ve attached a .zip file containing the sample code, and it is linked from the post as well.

Feedback? Experience with similar techniques? Typo reports? (I intentionally left two (2) typos in the post so that typo hunters will not be bored.)

FloatExceptions.zip

Jonathan Blow

Apr 16, 2012, 10:43:06 AM

to altdev...@googlegroups.com

I used to be big on fp exceptions but they are very hard to use because of everyone else's shitty code. As you allude to in the article, D3D can't handle having them enabled; I've also had Windows mysteriously bluescreen pretty often (that was back in the 2004 timeframe, dunno about now).

So the question is how to reset the flags robustly when leaving your code. The problem is that if you need to worry about every single system call or whatever, that becomes crazy and error prone. So you can only really debug things where it's all your code all the way out to the leaf with no API calls, which is pretty rare and requires a lot of thought to implement. Not nearly as useful as I'd like, which is just to turn it on globally and be done with it.

Bruce Dawson

Apr 16, 2012, 11:34:50 AM

to altdev...@googlegroups.com

Maybe the more practical usage pattern these days is spot-enabling of FPU exceptions rather than global enable with spot-disable. The effectiveness of using them will vary a lot between products. I’ll be giving them a try this week on my current product in order to track down an errant NaN.
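For anyone following along, here is a minimal sketch of what spot-enabling could look like. It assumes either MSVC (`_controlfp_s`, `_clearfp`) or glibc (`feenableexcept`, `fedisableexcept`, `fegetexcept`); on other platforms it compiles to a no-op. `ScopedFpTraps`, `invalid_trap_enabled` and `example_usage` are hypothetical names for illustration, not the code from the article's .zip.

```cpp
#include <cassert>
#include <cfenv>
#if defined(_MSC_VER)
#include <float.h>
#endif

// Reports whether the invalid-operation trap is currently unmasked.
inline bool invalid_trap_enabled() {
#if defined(_MSC_VER)
    unsigned int cw = 0;
    _controlfp_s(&cw, 0, 0);               // a mask of 0 only reads the control word
    return (cw & _EM_INVALID) == 0;        // bit clear means the trap is enabled
#elif defined(__GLIBC__)
    return (fegetexcept() & FE_INVALID) != 0;
#else
    return false;                          // no trap control on this platform
#endif
}

// Hypothetical RAII guard: traps on invalid, divide-by-zero and overflow
// for the lifetime of the object, then restores the previous trap mask.
class ScopedFpTraps {
public:
    ScopedFpTraps() {
#if defined(_MSC_VER)
        _controlfp_s(&saved_, 0, 0);       // remember the current control word
        _clearfp();                        // drop stale flags before unmasking
        unsigned int unused = 0;
        active_ = _controlfp_s(&unused,
                               saved_ & ~(_EM_INVALID | _EM_ZERODIVIDE | _EM_OVERFLOW),
                               _MCW_EM) == 0;
#elif defined(__GLIBC__)
        saved_ = fegetexcept();            // remember which traps were enabled
        std::feclearexcept(FE_ALL_EXCEPT);
        active_ = feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW) != -1;
#endif
    }
    ~ScopedFpTraps() {
#if defined(_MSC_VER)
        _clearfp();
        unsigned int unused = 0;
        _controlfp_s(&unused, saved_, _MCW_EM);
#elif defined(__GLIBC__)
        std::feclearexcept(FE_ALL_EXCEPT);
        fedisableexcept(FE_ALL_EXCEPT);
        feenableexcept(saved_);
#endif
    }
    ScopedFpTraps(const ScopedFpTraps&) = delete;
    ScopedFpTraps& operator=(const ScopedFpTraps&) = delete;

    // True if the platform accepted the request to enable traps.
    bool active() const { return active_; }

private:
#if defined(_MSC_VER)
    unsigned int saved_ = 0;
#else
    int saved_ = 0;
#endif
    bool active_ = false;
};

inline void example_usage() {
    const bool before = invalid_trap_enabled();
    {
        ScopedFpTraps traps;               // traps are on from here...
        assert(invalid_trap_enabled() == traps.active());
        // ... run the suspect floating-point code here ...
    }                                      // ...and the old mask is back here
    assert(invalid_trap_enabled() == before);
}
```

Because the destructor restores the old mask, an early return or a C++ exception inside the scope should not leave traps enabled for the rest of the program.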

rskorski

Apr 16, 2012, 10:38:25 PM

to AltDevAuthors

An Invalid Operation can also happen because of an FP stack fault. I
don't know of a way to do that without explicitly loading SNAN (and
the way the exception happens is pretty mind boggling) or writing poor
assembly, but it may be worth mentioning.

I really like the way you explain the reasons why these exceptions
aren't always bad. Another example that I've seen is when using SIMD
instructions on a 3 float vector. That last float is ignored, but the
operations are still performed on it.
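A rough illustration of the 3-float case, assuming nothing about any particular SIMD library: the loop below is a scalar stand-in for a 4-wide divide, and `div4` and `divide_raises` are made-up names. It only inspects the sticky exception flags through standard `<cfenv>` and does not enable traps.

```cpp
#include <cassert>
#include <cfenv>

// Scalar stand-in for a 4-wide SIMD divide: every lane is computed,
// including the fourth one that a 3-float vector never looks at.
inline void div4(const float a[4], const float b[4], float out[4]) {
    for (int i = 0; i < 4; ++i) {
        volatile float n = a[i];
        volatile float d = b[i];
        volatile float q = n / d;   // volatile keeps the divide at run time
        out[i] = q;
    }
}

// Returns true if dividing two xyz vectors raised invalid or divide-by-zero.
// `pad` is the value sitting in the unused fourth lane of both operands.
inline bool divide_raises(float pad) {
    const float a[4] = {1.0f, 2.0f, 3.0f, pad};
    const float b[4] = {2.0f, 4.0f, 8.0f, pad};
    float out[4];
    std::feclearexcept(FE_ALL_EXCEPT);
    div4(a, b, out);
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO) != 0;
}

inline void example_usage() {
    assert(divide_raises(0.0f));    // 0/0 in the ignored lane raises invalid
    assert(!divide_raises(1.0f));   // padding with 1 keeps the lane quiet
}
```

With traps enabled, the first case would stop in the debugger even though the xyz results are fine, which is why padding the unused lane with a harmless value can be worth the trouble.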

Bruce Dawson

Apr 17, 2012, 1:35:31 AM

to altdev...@googlegroups.com

I glossed over the floating-point stack over/underflow. It makes my brain
hurt.

Good point about the SIMD instructions. I may have to mention that. Thanks.


Bruce Dawson

Apr 20, 2012, 9:12:27 PM

to altdev...@googlegroups.com

FWIW, I just had need of floating-point exceptions today. I needed to track down an errant NaN that has been bugging me for weeks. I was able to spot-enable them in just the particle system, spot-disable them in one function that couldn’t handle them, and then start fixing bugs. A few of the bugs were probably not critical, but one was the source of the NaNs and I don’t know how I would have tracked it down otherwise, given that I was working on ancient code written by somebody else.

I just updated the post to include this last-minute update.
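The spot-disable half can be sketched with standard `<cfenv>` alone, assuming an implementation where `std::feholdexcept` is able to switch to non-stop mode. `ScopedFpQuiet` and `legacy_ratio` are hypothetical names; `legacy_ratio` simply stands in for "one function that couldn't handle them".

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Hypothetical RAII guard: saves the floating-point environment, clears the
// flags and switches to non-stop mode (no traps), then restores the saved
// environment on scope exit. Flags raised inside the scope are discarded.
class ScopedFpQuiet {
public:
    ScopedFpQuiet() { std::feholdexcept(&saved_); }
    ~ScopedFpQuiet() { std::fesetenv(&saved_); }
    ScopedFpQuiet(const ScopedFpQuiet&) = delete;
    ScopedFpQuiet& operator=(const ScopedFpQuiet&) = delete;

private:
    std::fenv_t saved_;
};

// Stand-in for a function that is known to produce NaNs on purpose.
inline float legacy_ratio(float a, float b) {
    ScopedFpQuiet quiet;            // no traps inside this function
    volatile float num = a;
    volatile float den = b;
    volatile float q = num / den;   // may be 0/0; that is tolerated here
    return q;
}

inline void example_usage() {
    std::feclearexcept(FE_ALL_EXCEPT);
    const float r = legacy_ratio(0.0f, 0.0f);
    assert(std::isnan(r));                        // the NaN still comes back
    assert(std::fetestexcept(FE_INVALID) == 0);   // but the flag does not leak
}
```

Restoring the whole saved environment, rather than re-enabling a fixed set of traps, means the guard behaves the same whether or not the caller had traps on.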


Bruce Dawson

Apr 26, 2012, 9:23:14 PM

to altdev...@googlegroups.com

I just saw this e-mail again. An FP stack fault happens if you fstp more
times than you fld (underflow), or if you fld nine more times than you
fstp (overflow, since the x87 register stack has just eight entries). This
has happened to me twice. Once was when I had a function prototype
mismatch. A function was returning a float (on the x87 register stack) but
the calling code didn't realize this, so it didn't remove it. After a few
calls the FP stack fills up and a crash happens.

The other time was when an audio interrupt handler was doing floating-point,
without properly preserving/restoring the state. Nasty.
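The prototype-mismatch case can be pictured with a toy model. This is only a depth counter with made-up names (`X87StackModel`, `mismatched_call`, `calls_until_fault`); it does not touch the real FPU and ignores everything about x87 except the eight-entry limit described above.

```cpp
#include <cassert>

// Toy model of the x87 register stack depth. fld pushes, fstp pops;
// each returns false where the real hardware would report a stack fault.
class X87StackModel {
public:
    static constexpr int kEntries = 8;

    bool fld() {
        if (depth_ == kEntries) return false;   // overflow: ninth push
        ++depth_;
        return true;
    }
    bool fstp() {
        if (depth_ == 0) return false;          // underflow: pop when empty
        --depth_;
        return true;
    }
    int depth() const { return depth_; }

private:
    int depth_ = 0;
};

// The callee leaves its float result on the stack; the caller, built against
// the wrong prototype, never pops it. Returns false once the push faults.
inline bool mismatched_call(X87StackModel& fpu) {
    return fpu.fld();
}

// Counts how many mismatched calls succeed before the stack overflows.
inline int calls_until_fault() {
    X87StackModel fpu;
    int calls = 0;
    while (mismatched_call(fpu)) ++calls;
    return calls;
}

inline void example_usage() {
    assert(calls_until_fault() == 8);   // the ninth call is the one that faults
    X87StackModel fpu;
    assert(!fpu.fstp());                // popping an empty stack underflows
}
```

In the model the leak is one entry per call, so the fault shows up on the ninth call, far from the function that actually has the wrong prototype.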


Rich Skorski

Apr 29, 2012, 3:06:03 PM

to altdev...@googlegroups.com

That is nasty. How could the calling code not realize it? Why didn't you get a linker error for a missing symbol? Do I smell another FP post?

The only way I knew to trigger an FP stack fault (until now!) was to load SNAN. If you load SNAN, and you have exceptions turned on, then the fld instruction does not alter the FP stack. So the rest of the code keeps on trucking, expecting fld to have pushed onto the stack, and will surely fstp too many times. I don't know of a way to get SNAN other than using numeric_limits<float>::signaling_NaN(), so I had figured that it could only happen deliberately.
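A small sketch of the difference between a signaling and a quiet NaN, assuming scalar float math that does not pass through the x87 stack (for example SSE on x86-64). It only checks the sticky invalid flag with exceptions masked, so it says nothing about what fld does to the x87 stack when traps are on. `add_raises_invalid` is a made-up helper name.

```cpp
#include <cassert>
#include <cfenv>
#include <limits>

// Returns true if adding 1.0f to `value` raises the invalid-operation flag.
inline bool add_raises_invalid(float value) {
    volatile float in = value;          // volatile keeps the add at run time
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float out = in + 1.0f;     // arithmetic on an SNaN signals invalid
    return std::fetestexcept(FE_INVALID) != 0;
}

inline void example_usage() {
    static_assert(std::numeric_limits<float>::has_signaling_NaN,
                  "this sketch needs a signaling NaN");
    assert(add_raises_invalid(std::numeric_limits<float>::signaling_NaN()));
    assert(!add_raises_invalid(std::numeric_limits<float>::quiet_NaN()));
    assert(!add_raises_invalid(1.0f));
}
```

On an x87 build the result may differ, since moving the value into the function can itself load it through the FPU and quiet it before the flags are cleared.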

Bruce Dawson

Apr 29, 2012, 6:23:19 PM

to altdev...@googlegroups.com

Wait, I’m confused. First of all, whose paragraph is that second one?

Second of all, it seems to contradict itself:

> If you load SNAN, and you have exceptions turned on, then the fld instruction does not alter the FP stack.

If you load SNAN and you have exceptions turned on, doesn’t that trigger an exception, thus crashing the program?

> So the rest of the code keeps on trucking

That’s the part that confuses me. Shouldn’t it have crashed?

It’s easy enough to write test code to verify this, but given the move away from x87 I don’t think I will.

> Why didn't you get a linker error for a missing symbol?

What missing symbol?


Bruce Dawson

Apr 29, 2012, 6:24:32 PM

to altdev...@googlegroups.com

Oh wait – I thought you were top-posting (top paragraph replying to the second one), but you’re not. Unfortunately I can’t be certain what message you are replying to. I guess it’s my last one, but I’m not sure which part goes with which.

