Evidence of an insidious Pentium Bug?

David McClain

unread,

Sep 2, 1999, 7:00:00 AM9/2/99

to

The code snippet below compiles to a harmless and correct sequence of
Pentium instructions, yet the version that negates by subtraction from zero
works correctly while the direct negation with "FCHS" fails, in the routines
"catanh" and "catanhx".

But direct negation does not always fail, as evidenced by the code in
routines "mip" and "mipx".

The error is not the fault of the CAML compilers, but rather to a sometimes
unfortunate timing clash in the Pentium instruction scheduler. I have
verified, by modifying the code generator in OCAMLOPT, that inserting a code
sequence "FLDZ; FADD" after the generated "FCHG" fixes the problem, but what
else does the Pentium do incorrectly?

This error was first noticed on a 350 MHz Pentium II (P5/MMX) processor
running on WinNT 4.0 (1381 pl 3). But I also verified that the problem
exists on an older 100 MHz Pentium (P5) running RedHat Linux 6.0. That
machine is nearly 5 years old, so the problem has apparently been around for
some time.

My confidence in the Pentium architecture has now been quite badly shaken!

David McClain
Sr. Scientist
Raytheon Systems Co.
Tucson, AZ

(* ----------------------------------------------------------------- *)
(* tstfp.ml -- test for FP code sequence failure *)
(* DM/MCFA 09/99 *)

let cnormsq re im =
re *. re +. im *. im

let cabs re im =
match (re,im) with
(_,0.0) -> abs_float re
| (0.0,_) -> abs_float im
| _ -> sqrt (cnormsq re im)

let carg re im =
atan2 im re

let catan re im =
(* As per ANSI X3J13 Jan 1989: *)
(* atan(z) = (log(1+iz) - log(1-iz))/(2i) *)
let (re1,im1) = (1.0 -. im, re) in
let (re2,im2) = (1.0 +. im, 0.0 -. re) in
(0.5 *. (carg re1 im1 -. carg re2 im2),
0.5 *. (log (cabs re2 im2) -. log (cabs re1 im1)))

(* chaning the use of mic to micx does not affect the result *)
let mic(re,im) = (im, 0.0 -. re)
let micx(re,im) = (im, -. re)

let catanh re im = (* uses "(0.0 -. x)" form *)
(* atanh(z) = -i atan(iz) *)
mic(catan (0.0 -. im) re)

let catanhx re im = (* uses "(-. x)" form *)
(* atanh(z) = -i atan(iz) *)
mic(catan (-. im) re)

let main() =
let (re1,im1) = ( 1.2, 0.0) in
let (re2,im2) = (-. 1.2, 0.0) in
let ans1 = catanh re1 im1 in
let ans2 = catanh re2 im2 in
let ans3 = catanhx re1 im1 in
let ans4 = catanhx re2 im2 in
Printf.printf "Using (0.0 -. x)\n";
Printf.printf "%f %f --> %f %f\n" re1 im1 (fst ans1) (snd ans1);
Printf.printf "%f %f --> %f %f\n" re2 im2 (fst ans2) (snd ans2);
Printf.printf "Using (-. x)\n";
Printf.printf "%f %f --> %f %f\n" re1 im1 (fst ans3) (snd ans3);
Printf.printf "%f %f --> %f %f\n" re2 im2 (fst ans4) (snd ans4);
Printf.printf "Correct answers are:\n";
Printf.printf " ( 1.2,0) --> (1.198948, -1.570796)\n";
Printf.printf " (-1.2,0) --> (-1.198948, 1.570796)\n"

let _ = Printexc.catch main ()

Erik Naggum

unread,

Sep 3, 1999, 7:00:00 AM9/3/99

to

* "David McClain" <dmcc...@azstarnet.com>

| The code snippet below compiles to a harmless and correct sequence of
| Pentium instructions, yet the version that negates by subtraction from
| zero works correctly while the direct negation with "FCHS" fails, in the
| routines "catanh" and "catanhx".

why post this in ML when you're really trying to show a problem in the
Pentium? I'd appreciate the assembly listing and discussion about that.

| My confidence in the Pentium architecture has now been quite badly shaken!

wow, it took you this long? ;)

#:Erik
--
save the children: just say NO to sex with pro-lifers

David McClain

unread,

Sep 3, 1999, 7:00:00 AM9/3/99

to

Something Xavier showed got me thinking and I ran some more tests this
morning. It turns out that the problem previously reported actually has
nothing to do with FCHS vs FSUB. Rather, it has to do with how one adds a
real value to a complex value, especially when the complex value has a
IEEE(-0.0) imaginary component.

Adding a real to a complex number:

a + (bre, bim) != (a+bre, bim)

rather

a + (bre, bim) == (a + bre, (+0.0) + bim)

This matters only when bim is (-0.0) because,

FPADD (+0.0) (+0.0) = +0.0
FPADD (+0.0) (-0.0) = +0.0 (!!!)
FAPDD (+0.0) x = x (for all other x)

When this is properly handled, the routines all give the correct branch cut
behavior. It turns out that, indeed, I*(are, aim) = (- aim, are) where the
negative sign is FCHS, instead of (0.0 - aim, are). The second form is
incorrect.

While all very esoteric and nit-picking, it does show the importance in
compiler writing of being very careful during constant-folding and
strength-reduction phases of optimization. I am probably guilty, years ago,
of assuming that 0.0 + anything => anything, in my old C compilers. This
subtle fact simply isn't true in IEEE FP.

- DM

David McClain wrote in message ...

>The code snippet below compiles to a harmless and correct sequence of
>Pentium instructions, yet the version that negates by subtraction from zero
>works correctly while the direct negation with "FCHS" fails, in the
routines
>"catanh" and "catanhx".
>

>But direct negation does not always fail, as evidenced by the code in
>routines "mip" and "mipx".
>
>The error is not the fault of the CAML compilers, but rather to a sometimes
>unfortunate timing clash in the Pentium instruction scheduler. I have
>verified, by modifying the code generator in OCAMLOPT, that inserting a
code
>sequence "FLDZ; FADD" after the generated "FCHG" fixes the problem, but
what
>else does the Pentium do incorrectly?
>
>This error was first noticed on a 350 MHz Pentium II (P5/MMX) processor
>running on WinNT 4.0 (1381 pl 3). But I also verified that the problem
>exists on an older 100 MHz Pentium (P5) running RedHat Linux 6.0. That
>machine is nearly 5 years old, so the problem has apparently been around
for
>some time.
>

>My confidence in the Pentium architecture has now been quite badly shaken!
>

Barry Margolin

unread,

Sep 3, 1999, 7:00:00 AM9/3/99

to

In article <rt04ah...@corp.supernews.com>,

David McClain <dmcc...@azstarnet.com> wrote:
>This matters only when bim is (-0.0) because,
>
> FPADD (+0.0) (+0.0) = +0.0
> FPADD (+0.0) (-0.0) = +0.0 (!!!)
> FAPDD (+0.0) x = x (for all other x)

Presumably the following is also supposed to be true:

FPADD x (-0.0) = x

When adding +0.0 and -0.0 it's not possible for this to be true and for your
third equivalence to be true for all x.

Floating point differs from normal arithmetic in subtle ways. It often
fails associativity and commutativity, and as you've seen the identity
relationships cannot always be maintained when you have two identity
values.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

David McClain

unread,

Sep 3, 1999, 7:00:00 AM9/3/99

to

In light of my recent trip around the mulberry bush, I thought I would post
a preamble from some of my code for everyone's sake.

- DM

(* ------------------------------------------------------------ *)
(* Considerable care has been taken to be IEEE correct *)
(* in the face of distinct (+0.0) and (-0.0). *)

(* The rules of the game are: *)

(* Addition => commutative a + b = b + a *)
(* Subtraction => anticommutative a - b = -(b - a) *)
(* Multiplication => commutative a * b = b * a *)
(* Division => inverse-commutative a / b = 1.0 / (b / a) *)

(* Identity under addition: *)
(* x + (-0.0) = x but x + (+0.0) != x *)

(* Identity under subtraction: *)
(* x - (+0.0) = x but x - (-0.0) != x *)

(* Negation by subtraction: *)
(* (-0.0) - x = neg x (= FCHS x) but (+0.0) - x != neg x *)

(* Equivalence of subtraction by addition of negative: *)
(* a - b = a + neg b *)

(* Distribution of negation: *)
(* a - b = -(b - a) = -b - (-a) = -b + a *)
(* ;; remember that addition is commutative *)

(* Complex conjugation by negation: *)
(* conj(re,im) = (re, -im) != (re, 0 - im) *)

(* Since the constant 0.0 generally denotes (+0.0) and since it is *)
(* generally difficult to determine the sign of a given *)
(* zero constant value, one simply cannot assume that: *)
(* 0.0 + x = x *)
(* x - 0.0 = x *)
(* 0.0 - x = neg x *)
(* Rather, one must carry out what appears to be a null operation, *)
(* and be certain to negate when appropriate, instead of subtracting *)
(* from zero. *)

Barry Margolin wrote in message ...

David McClain

unread,

Sep 3, 1999, 7:00:00 AM9/3/99

to

Ohmygosh! I just ran a simple proof checker in Mathematica against the IEEE
FP spec with distinct (+0.0) and (-0.0). Do you realize that subtraction no
longer anti-commutes?

That is, in general there is no way to perform (a -. b) as a correction to
(b -. a). Subtraction almost anti-commutes, except when a = b. In that case
(a -. b) = +0.0, and (b -. a) = +0.0 also! Hence we cannot expect that
(a -. b) = -. (b -. a).

Also can't use (-0.0) -. (b -. a) since that's exactly the same as direct
negation with FCHS.

Can't use (+0.0) -. (b -. a) since a might be -0.0 and b might be +0.0 in
which case (a -. b) = -0.0 and (b -. a) = (+0.0), and hence (+0.0) -. (b -.
a) = (+0.0) -. (+0.0) = (+0.0).

We're stuck! We MUST evaluate (a -. b) in the order indicated because there
is in general no way to correct the reverse order. Yikes!

- DM

David McClain wrote in message ...

David McClain

unread,

Sep 5, 1999, 7:00:00 AM9/5/99

to

All the recent difficulties I've had with IEEE FP math can be traced to the
bipolar nature of zero. Direct use of this means that although the two
compare as equal, they cause a situation where subtraction is no longer
anitcommutative. They manifest this difficulty most in complex arithmetic
when the function atan2 is used to compute the phase angle of a complex
number.

I recommend, instead, that complex arithmetic use the function phase,
defined as,

phase re im = atan2 (im +. 0.0) (re +. 0.0)

Doing so preserves the affine infinities, and preserves the use of bipolar
zero for those who need it, but restores the sanity of our number system in
the complex plane.

The phase angle of (-1, 0-) and (-1, 0+) are thus on the same Riemann sheet,
addition retains its cummutative behavior, negation is still idempotent, and
subtraction is restored to anticommutativity.

The only way to detect which zero one might have is through the use of the
curious function atan2, which allows one to see ever so slightly beyond one
Riemann sheet by peeking under the principal sheet along the branch cut on
the negative real axis. But since this is a curious function, and since
complex arithmetic now uses phase to obtain its angles, we have a sensible
system.

Furthermore, the old identities (a + 0) = a are restored, as well as (0 -
a) = -a.
Hence the kinds of quick simplifications one is tempted to do are okay and
will not violate the arithmetic. Our faith can be restored in FP math, and
branch cuts are all properly preserved regardless of the number and kind of
quick simplifications we might be tempted to perform during translation from
algebra to computer code.

Those needing the bipolar zeros for interval arithmetic and for detecting
the sense of arithmetic rounding are still able to achieve their goals, by
means of the atan2 function.