It seems that using numpy.array([a,b,c]) is around 20 times slower
than creating an empty array and adding the individual elements.
Other things that don't work well either:
numpy.concatenate([a,b,c]).reshape(3,-1)
numpy.concatenate([[a],[b],[c]])
Is there a better way to efficiently create the array?
See the following snippet:
---------------------------------------
import time
import numpy as nx
print 'numpy version', nx.version.version
t = time.time()
# test array
a = nx.arange(1000*1000)
print 'a ',time.time()-t
t = time.time()
# create array in the normal way..
b0 = nx.array([a,a,a])
print 'b0',time.time()-t
t = time.time()
# create using empty array
b1 = nx.empty((3,) + a.shape)
b1[0] = a
b1[1] = a
b1[2] = a
print 'b1',time.time()-t
print 'equal', nx.all(b0 == b1)
-----------------------------------------
Produces the output:
numpy version 1.3.0
a 0.0019519329071
b0 0.286643981934
b1 0.0116579532623
equal True
What was your timing for concatenate? It wins for me given the shape of a.
In [1]: import numpy as np
In [2]: a = np.arange(1000*1000)
In [3]: timeit b0 = np.array([a,a,a])
1 loops, best of 3: 216 ms per loop
In [4]: timeit b1 = np.empty(((3,)+a.shape)); b1[0]=a;b1[1]=a;b1[2]=a
100 loops, best of 3: 19.3 ms per loop
In [5]: timeit b2 = np.c_[a,a,a].T
10 loops, best of 3: 30.5 ms per loop
In [6]: timeit b3 = np.concatenate([a,a,a]).reshape(3,-1)
100 loops, best of 3: 9.33 ms per loop
Skipper
I've always just built arrays using nx.array([...]) in the past, though,
and was surprised that it performs so badly.
Can anyone provide an explanation (or a pointer) for such differences?
Thanks
One more.
In [26]: timeit b4 = np.vstack((a,a,a))
100 loops, best of 3: 9.46 ms per loop
Skipper
The difference between the numpy.concatenate version and numpy.array is fairly
impressive, though; I get a factor of > 50x. It would be nice to know why.
Ah, then you might want hstack. There are also column_stack and
row_stack if you need to go that route.
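For 1-d inputs those helpers behave roughly like this (a quick illustrative check):

import numpy as np

a = np.arange(3)
print(np.hstack((a, a, a)).shape)        # (9,)   1-d arrays are joined end to end
print(np.vstack((a, a, a)).shape)        # (3, 3) each input becomes a row
print(np.column_stack((a, a, a)).shape)  # (3, 3) each 1-d input becomes a column
print(np.row_stack((a, a, a)).shape)     # (3, 3) row_stack is an alias for vstack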
> The difference between the numpy.concatenate version and numpy.array is fairly
> impressive, though; I get a factor of > 50x. It would be nice to know why.
>
Sorry, I don't have any deep insight here. There is probably just
overhead in the array creation. Consider what happens if you try to use
hstack and company on lists.
In [1]: import numpy as np
In [2]: a = np.arange(1000*1000)
In [3]: b = a.tolist()
In [4]: timeit b0 = np.array((a,a,a))
1 loops, best of 3: 217 ms per loop
In [5]: timeit b1 = np.vstack((b,b,b))
1 loops, best of 3: 380 ms per loop
Skipper
This thread reminds me of something I've thought about for a while: Would
NumPy benefit from an np.ndarraylist subclass of np.ndarray, that has an
O(1) amortized append like Python lists? (Other methods of Python lists
(pop, extend) would be worth considering as well.) Or will we get the
same performance by combining a Python list and ndarray?
Sturla
> This thread reminds me of something I've thought about for a while: Would
> NumPy benefit from an np.ndarraylist subclass of np.ndarray, that has an
> O(1) amortized append like Python lists? (Other methods of Python lists
> (pop, extend) would be worth considering as well.) Or will we get the
> same performance by combining a Python list and ndarray?
This idea, an appendable ndarray, has been discussed before; the
conclusion was that yes, it's sometimes useful, and in fact I think
there was code written for it. The idea was an ndarray subclass that
allowed appending through realloc and simple indexing, but made copies
when slices were taken (so that realloc wouldn't move the memory out
from under it). It could be "frozen" to an ordinary ndarray when
appending was done.
To answer the OP's question, np.array is a do-what-I-mean function
that examines its argument and deduces the shape, size, and dtype for
the new array that it constructs. For example, if you pass it a python
list, it must walk through the list and examine the objects to find
the numeric dtype that contains them (e.g. integer, real, or complex);
if there are any peculiar objects in there it will construct an object
array. (I don't know whether it tries int(), float(), and complex() on
every object or whether it examines their type information.) In any
case, all this imposes some bookkeeping overhead that is unnecessary
for np.concatenate. For three-dimensional arrays you might try
np.dstack, by the way, or you can concatenate along a new axis (not
that reshaping is usually expensive):
In [1]: a = np.random.randn(4,5)
In [2]: b = a[np.newaxis,...]
In [3]: np.concatenate([b,b,b], axis=0).shape
Out[3]: (3, 4, 5)
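As a quick illustration of that dtype deduction (the exact integer width is
platform dependent):

import numpy as np

print(np.array([1, 2, 3]).dtype)         # an integer dtype, e.g. int64
print(np.array([1, 2.0, 3]).dtype)       # float64  - promoted because of the float
print(np.array([1, 2.0, 3j]).dtype)      # complex128 - promoted because of the complex
print(np.array([1, 2, object()]).dtype)  # object - a peculiar element forces an object array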
Anne
> Would NumPy benefit from an np.ndarraylist subclass of np.ndarray, that has an
> O(1) amortized append like Python lists?
yes, it would.
> (Other methods of Python lists
> (pop, extend) would be worth considering as well.) Or will we get the
> same performance by combining a Python list and ndarray?
I don't think so.
In general, the advice is to accumulate in a Python list, then turn it
into an array. And this works pretty well.
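Something like this, for instance (a minimal sketch):

import numpy as np

values = []
for x in range(1000000):
    values.append(x * 0.5)   # every element is stored as a full Python float object
result = np.array(values)    # one conversion into a compact float64 array at the end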
However, Python lists hold Python objects, so it's a bit inefficient, at
least in terms of memory use.
I wrote such an accumulating array a while back. It uses a numpy array
internally to store the data -- oversized at first, and expanded as needed
with ndarray.resize().
The result was that it was a bit slower than the list approach in most
cases -- though it should use less memory, and it had an advantage in
accumulating more complex numpy data types.
I've thought about trying to write it in C or Cython, rather than Python
-- I think the extra function call overhead on indexing and all may be
fairly expensive. There may be other ways to optimize it as well.
It's also only 1-d, though it wouldn't be hard to make it n-d, but
expandable only on the first dimension.
I've enclosed my current draft, along with some code to test and help
profile it.
Consider it a prototype, but I've used it successfully for a few things.
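Roughly, the idea looks something like this (an illustrative sketch only, not
the actual draft):

import numpy as np

class Accumulator(object):
    """1-d appendable buffer backed by an over-allocated ndarray (sketch)."""

    def __init__(self, dtype=np.float64, grow=0.25):
        self._data = np.empty(16, dtype=dtype)  # oversized internal buffer
        self._len = 0                           # number of valid elements
        self._grow = grow                       # over-allocate by 25% by default

    def append(self, value):
        if self._len == self._data.shape[0]:
            newsize = int(self._data.shape[0] * (1.0 + self._grow)) + 1
            # refcheck=False is safe only because no view of _data ever escapes
            self._data.resize(newsize, refcheck=False)
        self._data[self._len] = value
        self._len += 1

    def __len__(self):
        return self._len

    def __getitem__(self, index):
        # always hand out copies, never views of the internal buffer
        return np.copy(self._data[:self._len][index])

    def toarray(self):
        # compact to an ordinary, exactly-sized ndarray
        return self._data[:self._len].copy()

The copy in __getitem__ is what keeps the in-place resize safe, at the cost of
the result not being a first-class ndarray.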
-Chris
list of ndarrays:
    N*elsize when building up
    2*N*elsize when compacting
    N*elsize when used after compacting

appendable array:
    2*N*elsize when building up
    2*N*elsize when used before compacting
    3*N*elsize when compacting
    N*elsize when used after compacting
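To put rough numbers on that (assuming N = 10 million float64 values, so
elsize = 8 bytes):

N, elsize = 10**7, 8.0          # ten million float64 values
MB = 1024**2

print(N * elsize / MB)          # ~76 MB  - list of chunks while building up
print(2 * N * elsize / MB)      # ~153 MB - peak while compacting the chunk list,
                                #           or an appendable array built with 2x growth
print(3 * N * elsize / MB)      # ~229 MB - peak while compacting the appendable array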
Personally I am not convinced about an appendable array, either speed-
or memory-wise; I just wanted to hear what other people think. And
there is one more thing that makes me sceptical: We cannot use realloc()
to grow an appendable array. That is because it would segfault if there
are array views in use (the views might suddenly get a dangling data
pointer). And prohibiting realloc(), the performance of an appendable
array might not be fantastic (still O(1) on average though). But as Anne
said, this has probably been discussed in detail before.
not very.
> At least not enough
> to convince me to spend my time on it. The memory issue is more
> important though.
I agree.
> But arrays with O(1) amortized append are memory
> inefficient on their own.
well, yes, but only by the amount that you over-allocate -- I've seen
references to methods that allocate twice as much memory as needed on each
expansion, so that would be a maximum of a 2X hit.
In poking around the Python code, it looks like lists and array.arrays
over-allocate by 1/16 (once the arrays are large, anyway). In my code,
it defaults to over-allocating by 1/4. In my very limited testing, I
didn't see a performance gain from allocating more than that. So it's a
maximum of a 25% memory hit -- and it's user configurable.
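A rough way to see the trade-off between the over-allocation factor and the
number of reallocations (an illustration only, assuming a starting size of 16
elements and a target of a million):

def n_resizes(factor, start=16, target=10**6):
    # count how many times the buffer has to grow to reach `target` elements
    size, count = start, 0
    while size < target:
        size = int(size * (1.0 + factor)) + 1
        count += 1
    return count

print(n_resizes(1.0 / 16))   # roughly 180 reallocations
print(n_resizes(1.0 / 4))    # roughly 50 reallocations
print(n_resizes(1.0))        # roughly 16 reallocations (doubling)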
> So it might make more sense to save a list of
> equally long short buffers (e.g. ndarrays), and flatten to a contiguous
> buffer when needed.
I did that in a C extension I wrote a while back for reading big text
files. I found that the size of the buffer hardly mattered to performance.
In my newer code, I found it a lot easier to over-allocate, and even if
you over-allocate by 1/10 or 1/16, performance is still pretty good.
> Personally I am not convinced about an appendable array, neither speed
> nor memory wise, I just wanted to hear what other people think.
I still think it's a good idea -- though clearly not important enough
for me to have gone further with my code (though I am using it now).
For me, the reasons are memory and the ability to deal natively with
numpy data types that don't map directly to Python types. I think there
could be a real benefit to loadtxt() for instance.
> there is one more thing that makes me sceptical: We cannot use realloc()
> to grow an appendable array. That is because it would segfault if there
> are array views in use (the views might suddenly get a dangling data
> pointer). And prohibiting realloc(), the performance of an appendable
> array might not be fantastic
I used ndarray.resize, which I'm pretty sure uses realloc(). And that's
why I didn't subclass ndarray, but rather keep an ndarray as the buffer,
which never gets a view on it. When slicing, a copy is returned.
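For what it's worth, this is the failure mode that returning copies avoids --
ndarray.resize refuses to reallocate while another array references the buffer
(a small check):

import numpy as np

buf = np.empty(16)
view = buf[:4]        # the view holds a reference to buf's data
try:
    buf.resize(32)    # refcheck (the default) refuses to move the memory
except ValueError as err:
    print(err)        # "cannot resize an array that references or is referenced ..."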
But you're right that this limits the utility, as it's not a first-class
citizen of numpy -- you still use it like you do a list: accumulate in
it, then convert to a regular numpy array for further processing.
I'd love to see someone with more C chops than me pick this up -- but
Sturla's right -- it's hard to find the motivation for the relatively
small gains.
-Chris