
Non-deterministic numerical inaccuracies in Mathematica 7


Sjoerd C. de Vries

Dec 30, 2008, 5:53:12 AM
Hi all,

Last week Budgemano posted an issue which involved numerical
inaccuracies in Mathematica 7. Due to the xmas season it hasn't
received much attention, or so it seems.

Meanwhile, I've been able to drill the problem down to the following
code:

myMod := Module[{p},
  Dot[{0.5018839897852984567457648, 0.544293499215831476457645674576,
     0.67220312964800834576456745674567},
    {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`}] *
   {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`} -
   {0.5579131132707782`, 0.2715072333003815`, 0.7842300557273234`}]

For[ii = 1, ii <= 250, ii++,
 If[myMod =!= myMod,
  Print["Iteration " <> ToString[ii] <> " fails comparison "]];]

As you can see, the module myMod is called twice every iteration. Its
calculations only involve constants, so the results should always be
the same. However, on my system (Mathematica 7, WinXP) the results
differ from one call to the other at seemingly random times.

This phenomenon crucially depends on the presence of the local
variable p (which isn't even used!). With this variable the problem is
there; without it, it isn't. In M6 everything is OK, so this is a bug
specific to Mathematica 7.

Could you check on the various platforms whether you can reproduce
this?

Cheers -- Sjoerd

Jean-Marc Gulliet

Dec 31, 2008, 6:07:15 AM
In article <gjcuio$cnf$1...@smc.vnet.net>,
"Sjoerd C. de Vries" <sjoerd.c...@gmail.com> wrote:

<snip>

> Meanwhile, I've been able to drill the problem down to the following
> code:
>
> myMod := Module[
> {p},
> Dot[{0.5018839897852984567457648, 0.544293499215831476457645674576,
> 0.67220312964800834576456745674567}, {-0.5359693986690249`, \
> -0.41257352636613975`, -0.736559494563862`}]*{-0.5359693986690249`, \
> -0.41257352636613975`, -0.736559494563862`} - {0.5579131132707782`,
> 0.2715072333003815`, 0.7842300557273234`}]
>
> For[ii = 1, ii <= 250, ii++,
> If[myMod =!= myMod,
> Print["Iteration " <> ToString[ii] <> " fails comparison "]];]
>
> As you can see the module myMod is called twice every iteration. Its
> calculations only involve constants, so the results should always be
> the same. However, on my system (Mathematica 7.WinXP) the results differ
> from one call to the other at seemingly random times.
>
> This phenomenon crucially depends on the presence of the local
> variable p (that isn't used!). With this variable it's there, without
> it isn't. In M6 everything is OK, so this is a bug specific to Mathematica 7.
>
> Could you check on the various platforms whether you can reproduce
> this?

Hi Sjoerd,

I have tested the code on two platforms. The results are as follows:

1. Mathematica 6.0.3 64-bit Intel Mac OS X Leopard 10.5.6: neither the
first nor consecutive evaluations of the loop yield any numerical
difference.

2. Mathematica 7.0 32-bit Windows XP Pro SP3, however, generates
numerous *random* differences run after run.

Best regards,
-- Jean-Marc

In[1]:= {$Version, $ReleaseNumber}

Out[1]= {"6.0 for Mac OS X x86 (64-bit) (May 21, 2008)", 3}

myMod :=
 Module[{p},
  Dot[{0.5018839897852984567457648, 0.544293499215831476457645674576,
     0.67220312964800834576456745674567},
    {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`}] *
   {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`} -
   {0.5579131132707782`, 0.2715072333003815`, 0.7842300557273234`}]

For[ii = 1, ii <= 250, ii++,
 If[myMod =!= myMod,
  Print["Iteration " <> ToString[ii] <> " fails comparison "]];]

In[4]:= For[ii = 1, ii <= 250, ii++,
 If[myMod =!= myMod,
  Print["Iteration " <> ToString[ii] <> " fails comparison "]];]

Second session (Mathematica 7.0 on Windows):

{$Version, $ReleaseNumber}

myMod :=
 Module[{p},
  Dot[{0.5018839897852984567457648, 0.544293499215831476457645674576,
     0.67220312964800834576456745674567},
    {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`}] *
   {-0.5359693986690249`, -0.41257352636613975`, -0.736559494563862`} -
   {0.5579131132707782`, 0.2715072333003815`, 0.7842300557273234`}]

For[ii = 1, ii <= 250, ii++,
 If[myMod =!= myMod,
  Print["Iteration " <> ToString[ii] <> " fails comparison "]];]

{"7.0 for Microsoft Windows (32-bit) (November 10, 2008)", 0}

Iteration 2 fails comparison
Iteration 8 fails comparison
Iteration 19 fails comparison
Iteration 21 fails comparison
Iteration 61 fails comparison
Iteration 150 fails comparison
Iteration 154 fails comparison
Iteration 162 fails comparison
Iteration 189 fails comparison
Iteration 201 fails comparison
Iteration 220 fails comparison
Iteration 224 fails comparison
Iteration 229 fails comparison
Iteration 230 fails comparison
Iteration 237 fails comparison
Iteration 249 fails comparison

For[ii = 1, ii <= 250, ii++,
If[myMod =!= myMod,
Print["Iteration " <> ToString[ii] <> " fails comparison "]];]

Iteration 7 fails comparison
Iteration 20 fails comparison
Iteration 36 fails comparison
Iteration 41 fails comparison
Iteration 52 fails comparison
Iteration 60 fails comparison
Iteration 64 fails comparison
Iteration 111 fails comparison
Iteration 114 fails comparison
Iteration 119 fails comparison
Iteration 143 fails comparison
Iteration 146 fails comparison
Iteration 162 fails comparison
Iteration 163 fails comparison
Iteration 164 fails comparison
Iteration 185 fails comparison
Iteration 236 fails comparison
Iteration 238 fails comparison
Iteration 239 fails comparison
Iteration 244 fails comparison
Iteration 245 fails comparison
Iteration 249 fails comparison

ADL

Dec 31, 2008, 6:09:45 AM
On my platform, "7.0 for Microsoft Windows (32-bit) (November 10,
2008)", what you found is exactly reproduced.

I also noticed that Block and With never give rise to this problem, so
the error appears to be specific to Module; in particular, probably, to
the part of the internal code which creates local variables.

Cheers.



Jens-Peer Kuska

Dec 31, 2008, 6:07:47 AM
Hi,

Works fine on Windows Vista 64-bit.

Regards
Jens

Bob F

Jan 1, 2009, 7:26:47 AM
Works fine on a Mac Pro with dual Intel Xeons, Mathematica V7.0, Mac
OS X 10.5.6. I would guess this is specific to the Intel floating-point
routines on Windows XP, so maybe this is a bug in Intel's code and not
Wolfram's. But if that were true, why does Mathematica V6 work fine?
Then again, since Vista does not show it either, that makes me suspect
the Intel floating-point code even more.

Even tried bumping up the number of iterations to over 1 million and
no difference on the Mac at all.

-Bob

Luis Rademacher

Jan 1, 2009, 7:27:29 AM
What makes you think that this is a bug? Is there really a guarantee
that the execution of Mathematica code is deterministic (especially
where machine-precision numbers are involved)? I am no expert in
floating-point numbers, but I am aware that dealing with them can
easily become tricky.
Have a look at the following quotations and their sources:

> We shall discuss the following myths, among others: [...] Arithmetic
> operations are deterministic; that is, if I do z=x+y in two places in
> the same program and my program never touches x and y in the
> meantime, then the results should be the same.
from "The pitfalls of verifying floating-point computations"
http://hal.archives-ouvertes.fr/docs/00/28/14/29/PDF/floating-point-article.pdf
http://hal.archives-ouvertes.fr/hal-00128124

> Incidentally, some people think that the solution to such anomalies
> is never to compare floating-point numbers for equality, but instead
> to consider them equal if they are within some error bound E. This is
> hardly a cure-all because it raises as many questions as it answers.
> What should the value of E be? If x < 0 and y > 0 are within E,
> should they really be considered to be equal, even though they have
> different signs? Furthermore, the relation defined by this rule, a ~ b
> iff |a - b| < E, is not an equivalence relation because a ~ b and b ~ c
> does not imply that a ~ c.
from "What Every Computer Scientist Should Know About Floating-Point
Arithmetic"
http://docs.sun.com/source/806-3568/ncg_goldberg.html
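The non-transitivity in that last quotation is easy to see concretely. A minimal Python sketch, with an error bound E chosen purely for illustration:

```python
E = 0.75  # hypothetical error bound, chosen only to make the effect obvious

def approx_eq(a, b, eps=E):
    # "a ~ b iff |a - b| < E" from the Goldberg quote above
    return abs(a - b) < eps

a, b, c = 0.0, 0.5, 1.0
assert approx_eq(a, b) and approx_eq(b, c)  # a ~ b and b ~ c ...
assert not approx_eq(a, c)                  # ... but not a ~ c

# And two numbers of opposite sign can still compare "equal":
assert approx_eq(-0.1, 0.1)
```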

Luis.


David Bailey

Jan 1, 2009, 8:30:50 PM
Luis Rademacher wrote:
> What makes you think that this is a bug? Is there really a guarantee
> that the execution of Mathematica code is deterministic (especially,
> involving machine-precision)? I am no expert in floating point numbers,
> but I am aware that dealing with them can easily become tricky.
> Have a look at the following quotations and their sources:
>
>> We shall discuss the following myths, among others: [...] Arithmetic
>> operations are deterministic; that is, if I do z=x+y in two places in
>> the same program and my program never touches x and y in the
>> meantime, then the results should be the same.

First, the bug manifests itself as a variation at the same place in the
code - the loop goes round and the results vary - so I don't think this
caveat applies.

Furthermore, I think the reason that z=x+y could generate slightly
different results at two different points in a C or Fortran program
would be some sort of reordering of the code by an optimiser, which
would not apply here.

I certainly think this is a bug, and I can even guess what may have
happened - see my reply to a later post on this issue!

David Bailey
http://www.dbaileyconsultancy.co.uk

Sjoerd C. de Vries

Jan 1, 2009, 8:29:15 PM
I learned from WRI's Arnoud Buzing that they're currently thinking of
a compiler issue.

Cheers -- Sjoerd


fiami...@gmail.com

Jan 2, 2009, 6:56:20 AM
On Dec 30 2008, 4:53 am, "Sjoerd C. de Vries"
<sjoerd.c.devr...@gmail.com> wrote:
> Hi all,
>
> Last week Budgemano posted an issue which involved numerical
> inaccuracies in Mathematica 7. Due to the xmas season it hasn't
> received much attention, or so it seems.
>
> Meanwhile, I've been able to drill the problem down to the following
> code:
>
> myMod := Module[
> {p},
> Dot[{0.5018839897852984567457648, 0.544293499215831476457645674576,
> 0.67220312964800834576456745674567}, {-0.5359693986690249`,
> -0.41257352636613975`, -0.736559494563862`}]*{-0.5359693986690249`,
> -0.41257352636613975`, -0.736559494563862`} - {0.5579131132707782`,
> 0.2715072333003815`, 0.7842300557273234`}]
>
> For[ii = 1, ii <= 250, ii++,
> If[myMod =!= myMod,
> Print["Iteration " <> ToString[ii] <> " fails comparison "]];]
>
> As you can see the module myMod is called twice every iteration. Its
> calculations only involve constants, so the results should always be
> the same. However, on my system (Mathematica 7.WinXP) the results differ
> from one call to the other at seemingly random times.
>
> Cheers -- Sjoerd

I believe I can tell you exactly why this is happening. It is not
caused by Mathematica. Google says that Mathematica uses the Intel
Math Kernel Library (MKL). We use the MKL in our programs too, and we
see the same behaviour.

The bad news is that it is a consequence of the MKL and can't be
avoided. The good news is that you should not care about it: if the
difference seems important, then you are doing something wrong.

What I finally discovered is that Intel wrote two implementations of
many functions in the MKL's Basic Linear Algebra Subprograms (BLAS).
One version is used if the memory blocks that store the data begin at
a memory address that is a multiple of 16; this is the faster one. The
other is used if the data does not begin at a multiple of 16. (All
memory blocks begin at a multiple of 8.)

The algorithms for the multiple-of-16 case are the same as for the
non-multiple-of-16 case, but the code is different: the fast code uses
the SSE registers, the slow code uses the FPU registers.

Programs do not have any choice about where memory blocks begin. When
you use new to get an array in C++, the runtime picks some memory
block somewhere. Sometimes it begins at a memory address that is
divisible by 16; when that array is used by the BLAS in the MKL, the
fast code is used. Other times the slow code is used, because the
memory address is not divisible by 16. The addresses are effectively
chosen at random by the memory allocator.

The BLAS algorithm is always the same; the only difference is in the
lowest-order digits, because the FPU registers do not work exactly the
same way as the SSE registers. If you see big, important differences
between the two results, it means your results are garbage either way,
because of cancellation, an ill-conditioned algorithm, or a sensitive
matrix.

As long as you use a program that uses the MKL, you will see these
non-deterministic numerical inaccuracies whenever you use linear
algebra with floating-point numbers. This is not caused by
Mathematica; they are victims of Intel too!

On 64-bit Intel CPUs the FPU registers are not used, so there are not
two different implementations of these functions in the MKL, and the
issue does not appear on 64-bit. There is a separate MKL library from
Intel for 64-bit CPUs, and it is only produced for Linux and Windows,
not Mac.
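The selection rule described above is easy to state in code. A minimal Python sketch, assuming the address-mod-16 rule this post describes (is_16_aligned and align_up are illustrative helpers, not MKL functions):

```python
import ctypes

def is_16_aligned(addr: int) -> bool:
    # The described selection rule: the fast SSE path is taken iff the
    # data starts at an address that is a multiple of 16.
    return addr % 16 == 0

def align_up(addr: int, align: int = 16) -> int:
    # Round an address up to the next multiple of `align` (the usual
    # trick for forcing the fast path).
    return (addr + align - 1) & ~(align - 1)

buf = ctypes.create_string_buffer(64)  # the allocator picks the address
addr = ctypes.addressof(buf)
assert addr % 8 == 0                   # "all memory blocks begin at a multiple of 8"
assert is_16_aligned(align_up(addr))   # rounding up always lands on the fast path
```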

leh...@gmail.com

Jan 3, 2009, 5:55:45 AM
On 2 Jan, 14:56, fiamina...@gmail.com wrote:
> I believe I can tell you exactly why this is happening.

But why does this happen only with Mathematica 7, and not with
Mathematica 5.2 or Mathematica 6?

Sjoerd C. de Vries

Jan 3, 2009, 5:56:31 AM
Excellent post. This confirms WRI's suspicions.

Cheers -- Sjoerd

On Jan 2, 1:56 pm, fiamina...@gmail.com wrote:
> On Dec 30 2008, 4:53 am, "Sjoerd C. de Vries"
> <sjoerd.c.devr...@gmail.com> wrote:

<snip>
magma

Jan 4, 2009, 7:31:29 AM

Well, the explanation given by fiaminamor raises more questions than
it answers. Consider this:

The test fails only on Mathematica 7.0, not on 6.0.3.
The test also fails on AMD 64 X2 dual-core 5200+ machines (just
tested).
The test fails only if the localized variable p is present.

So it is hard to believe this is exclusively Intel's fault.
Mathematica 7.0 must be doing something different from 6.0.3 with
respect to the localized variables in Module.

It is good that WRI is investigating this.

Sjoerd C. de Vries

Jan 4, 2009, 7:34:09 AM
On Jan 3, 12:55 pm, lehi...@gmail.com wrote:
> On 2 Jan, 14:56, fiamina...@gmail.com wrote:
>
> > I believe I can tell you exactly why this is happening.
>
> But why this happens only with Mathematica 7 but not with Mathematica
> 5.2 and not with Mathematica 6?

A newer compiler was used for Mathematica 7.

Daniel Lichtblau

Jan 6, 2009, 4:09:23 AM
On Jan 2, 5:56 am, fiamina...@gmail.com wrote:
> On Dec 30 2008, 4:53 am, "Sjoerd C. de Vries"
> <sjoerd.c.devr...@gmail.com> wrote:

<snip>

I work on a Linux platform that happens to show both results. In a
debug kernel I was able to ascertain that this alignment behaviour
(even vs. odd multiples of 8, i.e. whether the data is 16-byte
aligned) is in fact the cause. Specifically, we get two possible
results for the following input.

myMod5 := Dot[{0.50188398978529847, 0.54429349921583148,
0.67220312964800832},
{-0.53596939866902493, -0.41257352636613975, -0.73655949456386205}]

As one can see below, myMod5 will give one of two results.

In[86]:= InputForm[myMod5]
Out[86]//InputForm= -0.988673145974262

In[87]:= InputForm[myMod5]
Out[87]//InputForm= -0.9886731459742619

Use of Module/Block, Table/Do, etc. is sort of a red herring. They do
in fact affect the outcome, but I believe that is because they affect
the memory layout in some subtle way. For example, if I do

Do[Print[InputForm[myMod5]],{20}]

I see the less precise result (the one ending ...262 ). If instead I
do

InputForm[Table[myMod5,{20}]]

I only see the more precise result (ending in ...2619).

But when I just enter

InputForm[myMod5]

(as above, In[86] and In[87]), I can get either result. Tracking in a
debugger revealed that the more precise result arises when one or both
addresses of the vectors are equal to 8 mod 16 (that is, aligned to 8
but not to 16 bytes). The less precise result appears only when both
vectors are aligned to 16 bytes. The code that produces these results
is ddot in the MKL.
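The two results above differ only in the final bit, which is exactly the sort of discrepancy that a change in accumulation strategy (x87 extended precision vs. SSE, or simply a different summation order) can produce. A small Python sketch of the general phenomenon, not of MKL's actual ddot, with a vector contrived to make the effect large:

```python
import math

xs = [1e16, 1.0, -1e16]

# Naive left-to-right accumulation: 1e16 + 1.0 rounds back to 1e16,
# so the 1.0 is lost entirely before the -1e16 cancels it.
naive = 0.0
for x in xs:
    naive += x

# math.fsum tracks the rounding errors and returns the exactly
# rounded sum of the same inputs.
exact = math.fsum(xs)

assert naive == 0.0
assert exact == 1.0
```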

My own theory, apparently incorrect, was that the registers were all
the same but that stores to memory happened in ways that depended on
vagaries of register availability. This theory might, however, account
for the following similar item, which showed up in some private email
I recently received regarding platform-dependent differences in the
logistic map.

x = 0.052903130124606165;

On a 64 bit Linux OS:

In[24]:= InputForm[x = 4*x*(1 - x)]
Out[24]//InputForm= 0.2004175557905006

In[25]:= InputForm[x = 4*x*(1 - x)]
Out[25]//InputForm= 0.6410014364858487

On a 32-bit Linux OS:

In[23]:= InputForm[x = 4*x*(1 - x)]
Out[23]//InputForm= 0.20041755579050063

In[24]:= InputForm[x = 4*x*(1 - x)]
Out[24]//InputForm= 0.6410014364858488

This platform-dependent behavior is by no means a bug; it just shows
that IEEE arithmetic can give rise to different results depending on
the usage of extended-precision registers and intermediate stores. I
do not know whether the different results on one platform, due to
alignment-dependent MKL behavior, are regarded as a bug by the
developers of that library.
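As an aside, the logistic map example shows why such last-digit differences are not always ignorable: at r = 4 the map is chaotic, and a one-ulp discrepancy roughly doubles each step. A Python sketch seeded with the two values printed above:

```python
x1 = 0.2004175557905006    # the 64-bit Linux result above
x2 = 0.20041755579050063   # the 32-bit result, one last digit different

max_gap = abs(x1 - x2)     # starts at about one ulp
for _ in range(200):
    x1 = 4 * x1 * (1 - x1)
    x2 = 4 * x2 * (1 - x2)
    max_gap = max(max_gap, abs(x1 - x2))

# After a couple of hundred iterations the two trajectories have fully
# decorrelated: the ~1e-17 initial gap has grown to order 1.
assert max_gap > 0.1
```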

Anyway, my thanks to several people in this thread for helpful
analysis of the issue. Even the things that turned out to be red
herrings seem interesting (and were by no means obviously off the
mark).

Daniel Lichtblau
Wolfram Research

David Bailey

Jan 25, 2009, 6:51:15 AM
Daniel Lichtblau wrote:
>>
>> The bad news is that it is a consequence of the M.K.L. which can't
>> be avoided. The good news is that you should not care about it.
>> If the difference seems important then you are doing something
>> wrong.
>>

I think we should care about it! Lots of programs behave differently
depending on a real-number threshold, and this behaviour means that a
test may flip different ways on different runs of the same program!

This could be a debugging nightmare, and will be in the back of people's
minds whenever anything 'strange' happens.

Would it not be possible to revert to the previous, slightly less
optimal, library until (presumably) the library is fixed?

David Bailey
http://www.dbaileyconsultancy.co.uk

da...@wolfram.com

Jan 25, 2009, 9:48:11 PM
> Daniel Lichtblau wrote:
>>>
>>> The bad news is that it is a consequence of the M.K.L. which can't
>>> be avoided. The good news is that you should not care about it.
>>> If the difference seems important then you are doing something
>>> wrong.

First I should point out that the above remark was not mine. It was in a
note I had at least in part quoted. That said, I neither fully agree nor
disagree with it. If small machine arithmetic fuzz is bedeviling your
code, you almost certainly are doing something a bit unsafe.


> I think we should care about it! Lots of programs have different
> behaviours depending on a real number threshold, and this behaviour
> means that there is the possibility of a test flipping different ways on
> different runs of a program!

One has to be careful about choosing the threshold, etc. A tolerance
of a few bits for BLAS machine arithmetic seems not unreasonable to me.
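In Python terms, a few-bits tolerance is what math.isclose with a small relative tolerance provides; a sketch using one of the ddot results quoted earlier in the thread (the neighbouring value is constructed with nextafter here, not taken from MKL):

```python
import math

a = -0.988673145974262       # one of the two ddot results seen earlier
b = math.nextafter(a, -1.0)  # its neighbouring double, exactly one ulp away

assert a != b                                      # exact comparison flips
assert math.isclose(a, b, rel_tol=1e-12)           # a few-bits tolerance absorbs it
assert not math.isclose(a, 2 * a, rel_tol=1e-12)   # real differences still show
```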


> This could be a debugging nightmare, and will be in the back of people's
> minds whenever anything 'strange' happens.
>
> Would it not be possible to revert to the previous, slightly less
> optimal, library until (presumably) the library is fixed?
>
> David Bailey
> http://www.dbaileyconsultancy.co.uk

My understanding is that the library vendors do not regard it as
broken, so I am not optimistic that the library itself will change. We
have made some effort to align machine floats to 16 bytes, so this
particular behavior should stabilize in a future release.

I am not really competent to discuss the issues involved in going to a
different library. Maybe others can address that. I can only voice my
suspicion that it would be difficult, and offer dubious gain.


Daniel Lichtblau
Wolfram Research

David Bailey

Jan 27, 2009, 6:56:31 AM
da...@wolfram.com wrote:
>> Daniel Lichtblau wrote:

>
> My understanding is that the library vendors do not regard it as broken,
> so I am not optimistic that the library itself will change. We have made
> some effort to align machine floats to 16 bytes so this particular
> behavior should stabilize in a future release.
>
> I am not really competent to discuss the issues involved in going to a
> different library. Maybe others can address that. I can only voice my
> suspicion that it would be difficult, and offer dubious gain.
>
>

Surely aligning floats on 16-byte boundaries would only be a partial
fix, because if an array A were packed and 16-byte aligned, A[[2;;]]
would only be 8-byte aligned!

It seems to me that it was a wretched decision of the C compiler writers
to plant code that delivers different results - however insignificant -
depending on data alignment! I wonder if anyone has reported to them
that this difference exists!

I was not suggesting using a different library - only a previous version
- such as that used inside 6.0.3, which was presumably compiled with an
earlier compiler.

The mere fact that several people have discovered this problem, suggests
to me that it will be an irritant.

David Bailey
http://www.dbaileyconsultancy.co.uk

Daniel Lichtblau

Jan 28, 2009, 6:28:00 AM
David Bailey wrote:
> da...@wolfram.com wrote:
>>> Daniel Lichtblau wrote:
>
>> My understanding is that the library vendors do not regard it as broken,
>> so I am not optimistic that the library itself will change. We have made
>> some effort to align machine floats to 16 bytes so this particular
>> behavior should stabilize in a future release.
>>
>> I am not really competent to discuss the issues involved in going to a
>> different library. Maybe others can address that. I can only voice my
>> suspicion that it would be difficult, and offer dubious gain.
>>
>>
> Surely aligning floats on 16-byte boundaries would only be a partial
> fix, because if an array A were packed and 16-byte aligned, A[[2;;]]
> would only be 8-byte aligned!

That's true, or at least I should hope it is. But it will not matter in
terms of getting reproducible results, because such a slice will always
go through the code that handles the non-16-byte-aligned case. The bad
situation is when we create packed tensors on the fly inside Dot (as we
repeat the same operation in a loop, say), and sometimes they are
16-byte aligned and other times not. We made a change to have them
always 16-byte aligned, so as to preclude this from happening in a
future release.
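The usual way to guarantee that alignment is to over-allocate and offset the start of the data. A hypothetical Python sketch of the bookkeeping (the kernel presumably does the equivalent in C; aligned_block is an illustrative name, not a real API):

```python
import ctypes

ALIGN = 16  # the MKL fast-path alignment discussed in this thread

def aligned_block(n_bytes: int):
    # Over-allocate by ALIGN - 1 bytes, then shift the usable start
    # up to the next 16-byte boundary.
    raw = ctypes.create_string_buffer(n_bytes + ALIGN - 1)
    base = ctypes.addressof(raw)
    offset = (-base) % ALIGN
    return raw, base + offset  # keep `raw` alive; use the shifted address

raw, start = aligned_block(10 * 8)  # room for ten 8-byte doubles
assert start % ALIGN == 0           # always hits the fast-path alignment
assert 0 <= start - ctypes.addressof(raw) < ALIGN
```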


> It seems to me that it was a wretched decision of the C compiler writers
> to plant code that delivers different results - however insignificant -
> depending on data alignment! I wonder if anyone has reported to them
> that this difference exists!

Not that it matters too much, but I don't think this is the compiler
writers' doing. It's in the BLAS library. I'm fairly certain they are
aware of the discrepancy.


> I was not suggesting using a different library - only a previous version
> - such as that used inside 6.0.3, which was presumably compiled with an
> earlier compiler.

I would surmise there might be speed losses associated with this. But
again, I'm just guessing. As for the software engineering aspect, it's
probably not so hard since it's the same library. This assumes the API
does not change in small ways between releases. I am not sure that this
is a safe assumption.


> The mere fact that several people have discovered this problem, suggests
> to me that it will be an irritant.
>
> David Bailey
> http://www.dbaileyconsultancy.co.uk

I imagine it is, but more at the level of a puzzlement than an
application killer.


Daniel Lichtblau
Wolfram Research
