[Numpy-discussion] Fastest way to save a dictionary of numpy record arrays


Vishal Rana

Jun 14, 2010, 8:00:23 PM
to Discussion of Numerical Python
Hi,

I have a dictionary of numpy record arrays. What would be the fastest way to save it to and load it from disk? I tried numpy.save(), but my dictionary is lost, and cPickle seems to be slow.

Thanks
Vishal

Robert Kern

Jun 14, 2010, 8:08:15 PM
to Discussion of Numerical Python

numpy.savez() will save a dictionary of arrays out to a .zip file.
Each key/value pair will map to a file in the .zip file with a file
name corresponding to the key.
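
A minimal sketch of that round trip, assuming the dictionary keys are plain strings that can be passed as keyword arguments. The array contents and the names `records`, `grid`, and `restored` are made up for illustration and do not come from the thread.

```python
import os
import tempfile

import numpy as np

# Hypothetical example data: one record (structured) array and one plain array.
records = np.array([(b"foo", 1), (b"bar", 2)],
                   dtype=[("name", "S8"), ("code", "i4")])
data = {"records": records, "grid": np.arange(6).reshape(2, 3)}

path = os.path.join(tempfile.mkdtemp(), "mydata.npz")

# Expand the dictionary into keyword arguments so that each key becomes
# its own member of the .npz archive.
np.savez(path, **data)

# Rebuild an ordinary dictionary from the archive members.
with np.load(path) as archive:
    restored = {key: archive[key] for key in archive.files}
```

Because the keys are passed as keyword arguments, a key named `file` would collide with the first parameter of `numpy.savez()` and would need to be renamed first.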

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco
_______________________________________________
NumPy-Discussion mailing list
NumPy-Di...@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Warren Weckesser

Jun 14, 2010, 8:27:44 PM
to Discussion of Numerical Python
Robert Kern wrote:
> On Mon, Jun 14, 2010 at 19:00, Vishal Rana <ranav...@gmail.com> wrote:
>
>> Hi,
>> I have a dictionary of numpy record arrays. What would be the fastest way to
>> save it to and load it from disk? I tried numpy.save(), but my dictionary is
>> lost, and cPickle seems to be slow.
>>
>
> numpy.savez() will save a dictionary of arrays out to a .zip file.
> Each key/value pair will map to a file in the .zip file with a file
> name corresponding to the key.
>
>

Hey Robert,

If I expand the dictionary to keyword arguments to savez, it works
beautifully:

-----
In [4]: a = np.array([[1,2,3],[4,5,6]])

In [5]: b = np.array([('foo',1),('bar',2)], dtype=[('name', 'S8'), ('code', int)])

In [6]: d = dict(a=a, b=b)

In [7]: np.savez('mydata.npz', **d)

In [8]: q = np.load('mydata.npz')

In [9]: q['a']
Out[9]:
array([[1, 2, 3],
       [4, 5, 6]])

In [10]: q['b']
Out[10]:
array([('foo', 1), ('bar', 2)],
      dtype=[('name', '|S8'), ('code', '<i4')])
-----


But if I just pass in the dictionary to savez:

-----
In [26]: np.savez('mydata2.npz', d)

In [27]: q2 = np.load('mydata2.npz')

In [28]: q2.files
Out[28]: ['arr_0']

In [29]: q2['arr_0']
Out[29]:
array({'a': array([[1, 2, 3],
       [4, 5, 6]]), 'b': array([('foo', 1), ('bar', 2)],
      dtype=[('name', '|S8'), ('code', '<i4')])}, dtype=object)
-----

What would be the canonical way to pull this apart to get the arrays?

Warren

Robert Kern

Jun 14, 2010, 8:30:45 PM
to Discussion of Numerical Python

Don't.
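
The thread gives no reason for this advice. One likely reason is that a dictionary passed positionally is wrapped in a 0-d object array, and object arrays are stored with pickle, which brings back the overhead the original question was trying to avoid. The sketch below assumes a recent NumPy release, where `np.load()` refuses to unpickle unless `allow_pickle=True` is given. The names `blocked` and `recovered` are hypothetical.

```python
import os
import tempfile

import numpy as np

d = {"a": np.array([[1, 2, 3], [4, 5, 6]])}
path = os.path.join(tempfile.mkdtemp(), "mydata2.npz")

# Passing the dict positionally stores it as a single pickled
# object array under the default name 'arr_0'.
np.savez(path, d)

with np.load(path) as archive:
    names = archive.files
    try:
        archive["arr_0"]
        blocked = False
    except ValueError:
        # Assumption: recent NumPy defaults to allow_pickle=False on load.
        blocked = True

# Opting in to pickle recovers the dictionary via .item().
# Only do this for files from a trusted source.
with np.load(path, allow_pickle=True) as archive:
    recovered = archive["arr_0"].item()
```

Expanding the dictionary with `**d` avoids pickle entirely and keeps each array as a separate member of the archive.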


Warren Weckesser

Jun 14, 2010, 8:34:08 PM
to Discussion of Numerical Python

That's what I suspected. Thanks.

Vishal Rana

Jun 14, 2010, 9:31:21 PM
to Discussion of Numerical Python
Thanks Robert

Vishal Rana

Jun 15, 2010, 2:56:38 AM
to Discussion of Numerical Python
Robert, 

As you said, I was able to get the results, but I now have a question: np.load('np.npz') returns a <class 'numpy.lib.io.NpzFile'> object, so does that mean the data is read directly from the npz file and not all of it is loaded into memory?

Thanks
Vishal Rana



Robert Kern

Jun 15, 2010, 11:49:50 PM
to Discussion of Numerical Python
On Tue, Jun 15, 2010 at 01:56, Vishal Rana <ranav...@gmail.com> wrote:
> Robert,
> As you said, I was able to get the results, but I now have a question:
> np.load('np.npz') returns a <class 'numpy.lib.io.NpzFile'> object, so does
> that mean the data is read directly from the npz file and not all of it is
> loaded into memory?

Correct. The data is loaded lazily, on request.
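
A small sketch of what lazy loading looks like in practice. The file name and the array names `small` and `large` are hypothetical. Opening the archive lists its members, and an array is read from disk only when its key is indexed.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "lazy.npz")
np.savez(path, small=np.arange(3), large=np.zeros((1000, 1000)))

with np.load(path) as archive:
    # Listing the members does not decode any array data.
    member_names = sorted(archive.files)
    # Indexing reads and decodes only this one member; 'large' is never read.
    small = archive["small"]
```

An array obtained this way is an ordinary in-memory array and stays usable after the archive is closed. As far as I know, the archive object does not cache results, so indexing the same key repeatedly reads it from disk each time; keep a reference if the array is needed more than once.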

Vishal Rana

Jun 16, 2010, 11:36:18 AM
to Discussion of Numerical Python
Thanks