The 2-D data file is here:
http://dl.dropbox.com/u/139035/data.npy
Then:
In [3]: data.mean()
Out[3]: 3067.0243839999998
In [4]: data.max()
Out[4]: 3052.4343
In [5]: data.shape
Out[5]: (1000, 1000)
In [6]: data.min()
Out[6]: 3040.498
In [7]: data.dtype
Out[7]: dtype('float32')
A mean value calculated in a loop over the data gives me 3045.747251076416.
I first thought I was still misunderstanding how data.mean() works (per axis
and so on), but I did the same with a flattened version and got the same
results.
Am I really so tired that I can't see what I am doing wrong here?
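For what it's worth, the discrepancy can be reproduced by hand by comparing a float32 running sum against a float64 one. A minimal sketch, assuming the data.npy linked above; the running_mean helper is just for illustration (pure Python, so slow on a million elements):

import numpy as np

data = np.load('data.npy')   # the float32 (1000, 1000) array linked above

def running_mean(arr, acc_dtype):
    """Naive running-sum mean with an explicit accumulator dtype."""
    total = acc_dtype(0.0)
    for x in arr.ravel():
        total = acc_dtype(total + acc_dtype(x))   # every partial sum rounded to acc_dtype
    return float(total) / arr.size

print(data.mean())                     # numpy 1.6 accumulates float32 input in float32
print(running_mean(data, np.float32))  # drifts in a similar way (exact value may differ)
print(running_mean(data, np.float64))  # close to the per-loop value of ~3045.75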
For completeness, the data was read by an osgeo.gdal dataset method called
ReadAsArray().
My numpy.__version__ gives me 1.6.1 and my whole setup is based on
Enthought's EPD.
Best regards,
Michael
Otherwise you are left with using some alternative approach to calculate
the mean.
Bruce
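One such alternative, and probably the simplest for float32 input, is the optional dtype argument that ndarray.mean() (and sum()) accept, which selects the accumulator type without copying the whole array. A short sketch, reusing the data.npy from the original post:

import numpy as np

data = np.load('data.npy')             # float32 array from the original post

print(data.mean())                     # float32 accumulator: 3067.02... (above the max!)
print(data.mean(dtype=np.float64))     # float64 accumulator: close to the true mean
print(data.astype(np.float64).mean())  # same idea, at the cost of a temporary float64 copy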
I get the same result:
In [1]: import numpy
In [2]: data = numpy.load('data.npy')
In [3]: data.mean()
Out[3]: 3067.0243839999998
In [4]: data.max()
Out[4]: 3052.4343
In [5]: data.min()
Out[5]: 3040.498
In [6]: numpy.version.version
Out[6]: '2.0.0.dev-433b02a'
This is on OS X 10.7.2 with Python 2.7.1, on an Intel Core i7. Running Python as a 32- vs. 64-bit process doesn't make a difference.
The data matrix doesn't look too strange when I view it as an image -- all pretty smooth variation around the (min, max) range. But maybe it's still somehow floating-point pathological?
This is fun too:
In [12]: data.mean()
Out[12]: 3067.0243839999998
In [13]: (data/3000).mean()*3000
Out[13]: 3020.8074375000001
In [15]: (data/2).mean()*2
Out[15]: 3067.0243839999998
In [16]: (data/200).mean()*200
Out[16]: 3013.6754000000001
Zach
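The reason rescaling changes the answer is that each float32 addition is rounded to the spacing (ulp) of the current partial sum, and how coarse that spacing is relative to the element size depends on the magnitude the data sits at; the rounding therefore plays out differently for /2, /200 and /3000. A rough back-of-the-envelope sketch (the element value 3045 is just a stand-in for the data's typical magnitude):

import numpy as np

# Each addition to the running float32 sum can only be resolved to the
# spacing (ulp) of the partial sum.  Rescaling moves both the elements and
# the partial sums to a different binade, so the rounding errors -- and the
# final mean -- come out differently.
n = 1000 * 1000
for scale in (1.0, 2.0, 200.0, 3000.0):
    elem = 3045.0 / scale                 # typical element after rescaling
    final_sum = np.float32(elem * n)      # magnitude the running sum ends up at
    print(scale, elem, float(np.spacing(final_sum)))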
Interesting -- I knew that float64 accumulators were used with integer arrays, and I had just assumed that 64-bit or higher accumulators would be used with floating-point arrays too, instead of the array's own dtype. This is actually quite a gotcha for floating-point imaging-type tasks -- good to know!
Zach
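A small sketch of that gotcha, contrasting an integer array (which gets a float64 accumulator, so the mean is exact) with the same values stored as float32 (which, on 1.6-era numpy, is accumulated in float32); a constant-valued array makes the drift easy to spot, though newer numpy versions that sum pairwise may not show it:

import numpy as np

ints = np.ones((1000, 1000), dtype=np.int32) * 3000
floats = ints.astype(np.float32)

print(ints.mean())                    # integer input -> float64 accumulator -> exactly 3000.0
print(floats.mean())                  # float32 accumulator on 1.6-era numpy -> drifts away from 3000
print(floats.mean(dtype=np.float64))  # explicit float64 accumulator -> 3000.0 again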
Thank you Bruce and all,
I knew I was doing something wrong (I should have read the mean method docs more closely). I am of course glad it is so easily understandable.
But: if the error can get this big, wouldn't it be a better idea for the accumulator always to be of type 'float64' and then convert back to the type of the original array afterwards?
As one can see in this case, the result would be much closer to the true value.
Michael
> Or does the results of calculations depend more on the platform?
Floating point operations often do, sadly (not saying that this is the case
here, but you'd need to try both versions on the same machine [or at least
the same architecture/bit-width and platform] to be certain).
David
I found something similar, with a very simple example.
On 64-bit linux, python 2.7.2, numpy development version:
In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
In [23]: a.mean()
Out[23]: 4034.16357421875
In [24]: np.version.full_version
Out[24]: '2.0.0.dev-55472ca'
But, a Windows XP machine running python 2.7.2 with numpy 1.6.1 gives:
>>> a = 4000*np.ones((1024,1024),dtype=np.float32)
>>> a.mean()
4000.0
>>> np.version.full_version
'1.6.1'
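The size of that drift is consistent with plain float32 accumulation: the true sum is 4000 x 1024 x 1024 = 4,194,304,000, where the float32 spacing is 256, so each late addition of 4000 gets rounded by up to half that spacing, and the bias adds up over a million additions. A sketch that reproduces the effect independently of the numpy version by accumulating explicitly in float32 (slow, pure Python loop):

import numpy as np

n = 1024 * 1024
val = np.float32(4000.0)
total = np.float32(0.0)
for _ in range(n):
    total = np.float32(total + val)    # every partial sum rounded back to float32
print(float(total) / n)                # drifts well above 4000; exact value depends on summation order
print(float(np.spacing(np.float32(4000.0 * n))))  # float32 spacing near the true sum: 256.0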
Hi,

On Wed, Jan 25, 2012 at 1:21 AM, Kathleen M Tacina <Kathleen...@nasa.gov> wrote:
<snip>
This indeed looks very nasty, regardless of whether it is a version- or platform-related problem.
Just to confirm: on the same computer as before, but with a 64-bit Python 3.2,
I now get the "Linux" result:
Python 3.2 (r32:88445, Feb 20 2011, 21:30:00) [MSC v.1500 64 bit
(AMD64)] on win32
>>> import numpy as np
>>> np.__version__
'1.5.1'
>>> a = 4000*np.ones((1024,1024),dtype=np.float32)
>>> a.mean()
4034.16357421875
>>> a.mean(0).mean(0)
4000.0
>>> a.mean(dtype=np.float64)
4000.0
Josef
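The axis-by-axis mean works because it is effectively a blocked sum: each column mean accumulates only 1024 float32 values, so the partial sums stay near 4000 x 1024 = 4,096,000, where the float32 spacing is far finer than the values being added, and the second mean again sums only 1024 numbers. A short sketch of that intuition:

import numpy as np

a = 4000 * np.ones((1024, 1024), dtype=np.float32)

col_means = a.mean(axis=0)   # 1024 means, each accumulated over only 1024 values
print(col_means.mean())      # 4000.0 -- partial sums never get large enough to lose 4000-sized steps

print(float(np.spacing(np.float32(4000.0 * 1024))))         # spacing of one column's sum: 0.25
print(float(np.spacing(np.float32(4000.0 * 1024 * 1024))))  # spacing of the full sum: 256.0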