[Numpy-discussion] Apparently non-deterministic behaviour of complex array multiplication

532 views
Skip to first unread message

Karl Kappler

unread,
Nov 30, 2011, 8:44:41 PM11/30/11
to numpy-di...@scipy.org
Hello,
I am somewhat new to scipy/numpy so please point me in the right direction if I am posting to an incorrect forum.

The experience which has prompted my post is the following:
I have a numpy array Y where the elements of Y are
type(Y[0,0])
Out[709]: <type 'numpy.complex128'>

The absolute values of the real and complex values do not far exceed say 1e-10.  The shape of Y is (24, 49218).
When I perform the operation: C = dot(Y,Y.conj().transpose), i.e. I form the covariance matrix by multiplying T by its conjugate transpose, I sometimes get NaN in the array C.

I can imagine some reasons why this may happen, but what is truly puzzling to me is that I will be working in ipython and will execute for example:
find(isnan(C)) and will be returned an list of elements of C which are NaN,
fine, but then I recalculate C, and repeat the find(isnan(C)) command and I get a different answer.

I type:
find(isnan(dot(Y,Y.conj().transpose)))
and an empty array is returned.  Repeated calls of the same command however result in a non-empty array.  In fact, the sequence of arrays returned from several consecutive calls varies. Sometimes there are tens of NaNs, sometimes none.

I have been performing a collection of experiments for some hours and cannot get to the bottom of this;
Some things I have tried:
1. Cast the array Y as a matrix X and calculate X*X.H --- in this case i get the same result in that sometimes I have NaN and sometimes I do not.
2. set A=X.H and calculate X*A --- same results*
3. set b=A.copy() and calc X*b --- same results*.
4. find(isnan(real(X*X.H))) --- same results*
5. find(isnan(real(X)*real(X.H))) - no NaN appear

*N.B. "Same results" does not mean that the same indices were going NaN, simply that I was getting back a different result if I ran the command say a dozen times.

So it would seem that it has something to do with the complex multiplication.   I am wondering if there is too much dynamic range being used in the calculation?  It absolutely amazes me that I can perform the same complex-arithmetic operation sitting at the command line and obtain different results each time.  In one case I ran a for loop where I performed the multiplication 1000 times and found that 694 trials had no NaN and 306 trials had NaN.

Saving X to file and then reloading it in a new ipython interpreter typically resulted in no NaN.

For a fixed interpretter and instance of X or Y, the indices which go NaN (when they do) sometimes repeat many times and sometimes they vary apparently at random.

Also note that I have had a similar problem with much smaller arrays, say 24 x 3076

I have also tried 'upping' the numpy array to complex256, I have like 12GB of RAM...

This happens both in ipython and when I call my function from the command line.

Does this sound familiar to anyone?  Is my machine possessed?

Olivier Delalleau

unread,
Nov 30, 2011, 8:57:11 PM11/30/11
to Discussion of Numerical Python
I guess it's just a typo on your part, but just to make sure, you are using .transpose(), not .transpose, correct?

-=- Olivier

2011/11/30 Karl Kappler <magnetot...@gmail.com>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Di...@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Pierre Haessig

unread,
Dec 1, 2011, 6:17:42 AM12/1/11
to numpy-di...@scipy.org
Le 01/12/2011 02:44, Karl Kappler a écrit :
> Also note that I have had a similar problem with much smaller arrays,
> say 24 x 3076
Hi Karl,
Could you post a self-contained code with such a "small" array (or even
smaller. the smaller, the better...) so that we can run it and play with
it ?
--
Pierre

kneil

unread,
Dec 1, 2011, 3:17:19 PM12/1/11
to numpy-di...@scipy.org

Hi Oliver, indeed that was a typo, I should have used cut and paste. I was
using .transpose()

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32898294.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.

kneil

unread,
Dec 1, 2011, 3:47:11 PM12/1/11
to numpy-di...@scipy.org

Hi Pierre,
I was thinking about uploading some examples but strangely, when I store the
array using for example: np.save('Y',Y)
and then reload it in a new workspace, I find that the problem does not
reproduce. It would seem somehow to be
associated with the 'overhead' of the workspace I am in...

The context here is that I read in 24 files, totaling about 7GB, and then
forming data matrices of size 24 x N, where N varies. I tried for example
this morning to run the same code, but working with only 12 of the files -
just to see if NaNs appeared. No NaN appeared however when the machine was
being less 'taxed'.

Strangely enough, I also seterr(all='raise') in the workspace before
executing this (in the case where I read all 24 files and do net NaN) and I
do not get any messages about the NaN while the calculation is taking place.

If you want to play with this I would be willing to put the data on a file
sharing site (its around 7.1G of data) together with the code and you could
play with it from there. The code is not too many lines - under 100 lines,
and I am sure I could trim it down from there.

Let me know if you are interested.
cheers,
K

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32898383.html


Sent from the Numpy-discussion mailing list archive at Nabble.com.

_______________________________________________

Joe Kington

unread,
Dec 1, 2011, 5:19:51 PM12/1/11
to Discussion of Numerical Python
On Thu, Dec 1, 2011 at 2:47 PM, kneil <magnetot...@gmail.com> wrote:

Hi Pierre,
I was thinking about uploading some examples but strangely, when I store the
array using for example: np.save('Y',Y)
and then reload it in a new workspace, I find that the problem does not
reproduce.  It would seem somehow to be
associated with the 'overhead' of the workspace I am in...

The context here is that I read in 24 files, totaling about 7GB, and then
forming data matrices of size 24 x N, where N varies.  I tried for example
this morning to run the same code, but working with only 12 of the files -
just to see if NaNs appeared.  No NaN appeared however when the machine was
being less 'taxed'.
 
Are you using non-ECC RAM, by chance?  (Though if you have >4GB of ram, I can't imagine that you wouldn't be using ECC...)

Alternately, have you run memtest lately?  That sound suspiciously like bad ram...
 

kneil

unread,
Dec 1, 2011, 11:46:40 PM12/1/11
to numpy-di...@scipy.org

Hi Pierre,

I confirmed with the guy who put together the machine that it is non-ECC
RAM. You know, now that i think about it, this machine seems to crash a
fair amount more often than its identical twin which sits on a desk near me.
I researched memtest a bit... downloaded and compiled it, but I do not quite
understand the finer points of using it... it seems that I want to remove my
RAM cards and test them one at a time. Do you know a good reference for
using it.

I think at this point the best thing to do will be to dump my data/code to
portable HDD and load it on the other computer with same specs as this one.
If it runs without generating any NaN then I will proceed to a full memtest.

Thanks for the advice.
-Karl

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32900355.html

kneil

unread,
Dec 2, 2011, 10:10:51 PM12/2/11
to numpy-di...@scipy.org

OK - here is the status update on this problem:
1. To test for bad RAM, I saved the code plus data onto a usb drive and
manually transferred it to a colleague's desktop machine with same specs as
mine. The NaN values continued to appear at random, so it is unlikely to be
bad RAM - unless its contagious.

2. Here is how I am temporarily working around the problem: Right before
performing the multiplication of X*X.H where X=asmatrix(Y), I save X to file
using np.save('X',X). Then I reload it via X=np.load('X.npy'). Cast as
matrix: X=asmatrix(X); and carry on: S=X*X.H. I have not seen any NaN
although I have only run it a few times... but it seems to work.

I'd be very curious to hear any ideas on why this problem exists in the
first place, and why save/load gets me around it.

For the record I am running ubuntu 11.04, have 16GB RAM (not 12) and use
python2.7.
cheers,
Karl

Hi Pierre,

I confirmed with the guy who put together the machine that it is non-ECC
RAM. You know, now that i think about it, this machine seems to crash a
fair amount more often than its identical twin which sits on a desk near me.
I researched memtest a bit... downloaded and compiled it, but I do not quite
understand the finer points of using it... it seems that I want to remove my
RAM cards and test them one at a time. Do you know a good reference for
using it.

I think at this point the best thing to do will be to dump my data/code to
portable HDD and load it on the other computer with same specs as this one.
If it runs without generating any NaN then I will proceed to a full memtest.

Thanks for the advice.
-Karl

On Thu, Dec 1, 2011 at 2:47 PM, kneil <magnetot...@gmail.com> wrote:

Are you using non-ECC RAM, by chance? (Though if you have >4GB of ram, I
can't imagine that you wouldn't be using ECC...)

Alternately, have you run memtest lately? That sound suspiciously like bad
ram...

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32906553.html

Nathaniel Smith

unread,
Dec 2, 2011, 10:30:56 PM12/2/11
to Discussion of Numerical Python

If save/load actually makes a reliable difference, then it would be useful to do something like this, and see what you see:

save("X", X)
X2 = load("X.npy")
diff = (X == X2)
# did save/load change anything?
any(diff)
# if so, then what changed?
X[diff]
X2[diff]
# any subtle differences in floating point representation?
X[diff][0].view(np.uint8)
X2[diff][0].view(np.uint8)

(You should still run memtest. It's very easy - just install it with your package manager, then reboot. Hold down the shift key while booting, and you'll get a boot menu. Choose memtest, and then leave it to run overnight.)

- Nathaniel

kneil

unread,
Dec 5, 2011, 11:50:37 PM12/5/11
to numpy-di...@scipy.org

Hi Nathaniel,
Thanks for the suggestion. I more or less implemented it:

np.save('X',X);
X2=np.load('X.npy')
X2=np.asmatrix(X2)
diffy = (X != X2)
if diffy.any():
print X[diffy]
print X2[diffy]
print X[diffy][0].view(np.uint8)
print X2[diffy][0].view(np.uint8)
S=X*X.H/k
S2=X2*X2.H/k

nanElts=find(isnan(S))
if len(nanElts)!=0:
print 'WARNING: Nans in S:'+str(find(isnan(S)))
print 'WARNING: Nans in S2:'+str(find(isnan(S2)))

My ouput, (when I got NaN) mostly indicated that both arrays are numerically
identical, and that they evaluated to have the same nan-value entries.

For example
>>WARNING: Nans in S:[ 6 16]
>>WARNING: Nans in S2:[ 6 16]

Another time I got as output:

>>WARNING: Nans in S:[ 26 36 46 54 64 72 82 92 100 110 128 138 146
156 166 174 184 192
202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342
352 360 370 380 388 398 416 426 434 444 454 464 474]
>>WARNING: Nans in S2:[ 26 36 46 54 64 72 82 92 100 110 128 138 146
156 166 174 184 192
202 212 220 230 240 250 260 268 278 279 296 297 306 314 324 334 335 342
352 360 370 380 388 398 416 426 434 444 454 464 474]

These were different arrays I think. At anyrate, those two results appeared
from two runs of the exact same code. I do not use any random numbers in
the code by the way. Most of the time the code runs without any nan showing
up at all, so this is an improvement.

*I am pretty sure that one time there were nan in S, but not in S2, yet
still no difference was observed in the two matrices X and X2. But, I did
not save that output, so I can't prove it to myself, ... but I am pretty
sure I saw that.

I will try and run memtest tonight. I am going out of town for a week and
probably wont be able to test until next week.
cheers,
Karl

I also think What was beyond w:
1. I have many less NaN than I used to, but still get NaN in S,
but NOT in S2!

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32922174.html

kneil

unread,
Dec 7, 2011, 2:20:44 AM12/7/11
to numpy-di...@scipy.org

Hi Nathaniel,
The results of running memtest was a pass with no errors.
-Karl

Nathaniel Smith wrote:
>
> (You should still run memtest. It's very easy - just install it with your
> package manager, then reboot. Hold down the shift key while booting, and
> you'll get a boot menu. Choose memtest, and then leave it to run
> overnight.)
>
> - Nathaniel
>
>

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32927196.html

Olivier Delalleau

unread,
Dec 7, 2011, 8:43:14 AM12/7/11
to Discussion of Numerical Python
I was trying to see if I could reproduce this problem, but your code fails with numpy 1.6.1 with:
AttributeError: 'numpy.ndarray' object has no attribute 'H'
Is X supposed to be a regular ndarray with dtype = 'complex128', or something else?

-=- Olivier

2011/12/5 kneil <magnetot...@gmail.com>

kneil

unread,
Dec 13, 2011, 1:13:31 PM12/13/11
to numpy-di...@scipy.org

Hi Olivier,
Sorry for the late reply - I have been on travel.
I have encountered the error in two separate cases; when I was using numpy
arrays, and when I was using numpy matrices.
In the case of a numpy array (Y), the operation is:
dot(Y,Y.conj().transpose())
and in the case of a matrix, with X=asmatrix(Y) and then the operation is:
X*X.H
-Karl


Olivier Delalleau-2 wrote:
>
> I was trying to see if I could reproduce this problem, but your code fails
> with numpy 1.6.1 with:
> AttributeError: 'numpy.ndarray' object has no attribute 'H'
> Is X supposed to be a regular ndarray with dtype = 'complex128', or
> something else?
>
> -=- Olivier
>
>

--
View this message in context: http://old.nabble.com/Apparently-non-deterministic-behaviour-of-complex-array-multiplication-tp32893004p32969114.html

Reply all
Reply to author
Forward
0 new messages