[SciPy-user] Save a list of arrays

Mathieu Dubois

Apr 22, 2009, 9:09:20 AM
to SciPy Users List
[I have posted this message this morning but apparently it is stuck
somewhere - sorry for multi-posting]

Hello,

I would like to save a list of arrays (each one has a different shape)
to a file.
For instance:
>>> array1 = numpy.ones(2);
>>> array2 = numpy.ones(5);
>>> array3 = numpy.ones(1000);
>>> list = [array1, array2, array3]

As my arrays are huge (each one contains several thousand values), I
would like a compressed file.
numpy.savez would be perfect (because it produces an archive of binary
files) but unfortunately numpy.savez(list) doesn't work because savez
needs each array individually.

So what's the best way to do that?

Maybe I could create an archive file in Python (using ZipFile), save each
array to an .npy file, and then write each file into the archive. But then,
will numpy.load be able to open the file?

Any help appreciated.
Kind regards,
Mathieu
_______________________________________________
SciPy-user mailing list
SciPy...@scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user
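
On the ZipFile question above, here is a minimal sketch, assuming a reasonably recent NumPy and Python 3. The helper name save_arrays_to_zip, the member names arr_0.npy, arr_1.npy, ... and the temporary path are illustrative choices, not something from the thread. It writes each array in .npy format into a deflate-compressed zip archive and then opens the result with numpy.load, which recognizes the zip container and returns a dict-like object keyed by member name without the .npy suffix.

```python
import io
import os
import tempfile
import zipfile

import numpy as np


def save_arrays_to_zip(arrays, path):
    """Write each array as an .npy member of a deflate-compressed zip archive."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for index, array in enumerate(arrays):
            buffer = io.BytesIO()
            np.save(buffer, array)  # serialize in .npy format, in memory
            # Hypothetical naming scheme: arr_0.npy, arr_1.npy, ...
            archive.writestr("arr_%d.npy" % index, buffer.getvalue())


arrays = [np.ones(2), np.ones(5), np.ones(1000)]
path = os.path.join(tempfile.mkdtemp(), "arrays.zip")
save_arrays_to_zip(arrays, path)

# numpy.load detects the zip container and exposes each member by its
# name with the ".npy" suffix removed.
with np.load(path) as data:
    loaded = [data["arr_%d" % i] for i in range(len(arrays))]
```

So a hand-built archive is readable as long as every member is a valid .npy file.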

Nuttall, Brandon C

Apr 22, 2009, 9:35:58 AM
to SciPy Users List
I thought that is what pickling does, see http://docs.python.org/library/pickle.html

Brandon Nuttall
bnut...@uky.edu (KGS)
________________________________________
From: scipy-use...@scipy.org [scipy-use...@scipy.org] On Behalf Of Mathieu Dubois [mathieu...@limsi.fr]
Sent: Wednesday, April 22, 2009 9:09 AM
To: SciPy Users List
Subject: [SciPy-user] Save a list of arrays
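
A minimal sketch of the pickle route, assuming Python 3 and that gzip compression is acceptable; the file name and the use of gzip are illustrative choices, not part of the suggestion above. The whole list is pickled in one call, so arrays of different shapes need no special handling.

```python
import gzip
import os
import pickle
import tempfile

import numpy as np

arrays = [np.ones(2), np.ones(5), np.ones(1000)]
path = os.path.join(tempfile.mkdtemp(), "arrays.pkl.gz")

# Pickle the whole list into a gzip-compressed file.
with gzip.open(path, "wb") as handle:
    pickle.dump(arrays, handle, protocol=pickle.HIGHEST_PROTOCOL)

# Read it back; the result is again a list of NumPy arrays.
with gzip.open(path, "rb") as handle:
    restored = pickle.load(handle)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.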

Pauli Virtanen

Apr 22, 2009, 9:43:48 AM
to scipy...@scipy.org
Wed, 22 Apr 2009 15:09:20 +0200, Mathieu Dubois wrote:

> [I have posted this message this morning but apparently it is stuck
> somewhere - sorry for multi-posting]
>
> Hello,
>
> I would like to save a list of arrays (each one has a different shape)
> to a file.
> For instance:
> >>> array1 = numpy.ones(2);
> >>> array2 = numpy.ones(5);
> >>> array3 = numpy.ones(1000);
> >>> list = [array1, array2, array3]
>
> As my arrays are huge (each one contains several thousands values) I
> would like a compressed file.
> numpy.savez would be perfect (because it produces an archive of binary
> files) but unfortunately numpy.savez(list) doesn't work because savez
> needs each array individually.
>
> So what's the best way to do that?

savez(filename, *list)

The star is Python syntax for unpacking a sequence to arguments.

--
Pauli Virtanen
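
A minimal sketch of the star-unpacking call, with a round trip through numpy.load. The temporary path is illustrative. It also assumes a NumPy version that provides numpy.savez_compressed, which takes the same arguments as numpy.savez; plain savez stores the members without compression, so the compressed variant is the closer match to the original request.

```python
import os
import tempfile

import numpy as np

arrays = [np.ones(2), np.ones(5), np.ones(1000)]
path = os.path.join(tempfile.mkdtemp(), "arrays.npz")

# The star unpacks the list into positional arguments; positional arrays
# are stored under the names arr_0, arr_1, ...
np.savez_compressed(path, *arrays)

# Rebuild the list in its original order from the generated names.
with np.load(path) as data:
    restored = [data["arr_%d" % i] for i in range(len(data.files))]
```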

Lev Givon

Apr 22, 2009, 9:43:28 AM
to SciPy Users List
Received from Mathieu Dubois on Wed, Apr 22, 2009 at 09:09:20AM EDT:

> [I have posted this message this morning but apparently it is stuck
> somewhere - sorry for multi-posting]
>
> Hello,
>
> I would like to save a list of arrays (each one has a different shape)
> to a file.
> For instance:
> >>> array1 = numpy.ones(2);
> >>> array2 = numpy.ones(5);
> >>> array3 = numpy.ones(1000);
> >>> list = [array1, array2, array3]
>
> As my arrays are huge (each one contains several thousands values) I
> would like a compressed file.
> numpy.savez would be perfect (because it produces an archive of binary
> files) but unfortunately numpy.savez(list) doesn't work because savez
> needs each array individually.
>
> So what's the best way to do that?
>
> Maybe I could create an archive file in Python (using ZipFile) save each
> array in a npy file and then write each file in the archive. But then,
> will numpy.load be able to open the file?
>
> Any help appreciated.
> Kind regards,
> Mathieu

You might find hdf5pickle useful:

http://www.elisanet.fi/ptvirtan/software/hdf5pickle

(If you would like a compressed file, you can specify a compression
filter when creating the HDF5 file in which to store your data.)

L.G.

Gabriel Beckers

Apr 22, 2009, 9:53:06 AM
to SciPy Users List
If you want your data saved in a generic way, you could use PyTables
( http://www.pytables.org/ ) to save your arrays in hdf5 format.

A script like the one below already does the trick, although it could be
greatly improved in terms of compression (look into pytables' CArray).
Also if the number of arrays gets really large you have to adapt the
approach, and avoid saving everything in root.

It would be trivial to write a function to read. You can also look at
the contents of the file visually by using ViTables
( http://vitables.berlios.de/ )

Gabriel

------------------------------------------
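import numpy
import tables


def write_arraylist(arraylist, filename):
    # The original 2009 post used openFile/createArray; PyTables 3.0 and
    # later spell these calls open_file and create_array.
    f = tables.open_file(filename, 'w')
    for i, array in enumerate(arraylist):
        # Store each array directly under the root group as a0, a1, ...
        f.create_array(f.root, 'a%d' % i, array)
    f.close()


array1 = numpy.ones(2)
array2 = numpy.ones(5)
array3 = numpy.ones(1000)

arraylist = [array1, array2, array3]

write_arraylist(arraylist, 'testfile.h5')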

--
Dr. Gabriël J.L. Beckers

Max Planck Institute for Ornithology
Department of Behavioural Neurobiology

Web: http://www.gbeckers.nl

Mathieu Dubois

Apr 22, 2009, 10:19:07 AM
to SciPy Users List
Pauli Virtanen wrote:
> Wed, 22 Apr 2009 15:09:20 +0200, Mathieu Dubois wrote:
>
>
>> [I have posted this message this morning but apparently it is stuck
>> somewhere - sorry for multi-posting]
>>
>> Hello,
>>
>> I would like to save a list of arrays (each one has a different shape)
>> to a file.
>> For instance:
>> >>> array1 = numpy.ones(2);
>> >>> array2 = numpy.ones(5);
>> >>> array3 = numpy.ones(1000);
>> >>> list = [array1, array2, array3]
>>
>> As my arrays are huge (each one contains several thousands values) I
>> would like a compressed file.
>> numpy.savez would be perfect (because it produces an archive of binary
>> files) but unfortunately numpy.savez(list) doesn't work because savez
>> needs each array individually.
>>
>> So what's the best way to do that?
>>
>
>
Hello Pauli,

> savez(filename, *list)
>
> The star is Python syntax for unpacking a sequence to arguments
Thank you for the tip; this opens up interesting possibilities.

Do you know of something that works with named arguments (keyword arguments)?
This would make it possible to set the name of each array in the archive
file (by default it's 'arr_0.npy').

Mathieu

Ryan May

Apr 22, 2009, 10:29:40 AM
to SciPy Users List

A dictionary and two stars:

arrays = {'a':a, 'b':b, 'c':c}
savez(filename, **arrays)

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
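
A minimal sketch of the dictionary form applied to a list of arrays. The naming scheme signal_0, signal_1, ... and the temporary path are hypothetical. Each dictionary key becomes the name of one array in the archive, and the same key retrieves it after numpy.load.

```python
import os
import tempfile

import numpy as np

arrays = [np.ones(2), np.ones(5), np.ones(1000)]

# Build a name -> array mapping; the "signal_%d" names are illustrative.
named = {"signal_%d" % i: array for i, array in enumerate(arrays)}

path = os.path.join(tempfile.mkdtemp(), "named.npz")

# Two stars unpack the dictionary into keyword arguments, so each key
# becomes the name of one member of the archive.
np.savez(path, **named)

with np.load(path) as data:
    names = sorted(data.files)
    restored = {name: data[name] for name in data.files}
```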

Gabriel Beckers

Apr 22, 2009, 10:31:06 AM
to SciPy Users List
On Wed, 2009-04-22 at 09:43 -0400, Lev Givon wrote:
> You might find hdf5pickle useful:
>
> http://www.elisanet.fi/ptvirtan/software/hdf5pickle

Extremely useful link, thanks!
Gabriel

Mathieu Dubois

Apr 22, 2009, 1:14:29 PM
to SciPy Users List
Hello Ryan,

Thank you very much, this is exactly what I wanted to do. Python is very
powerful.

Thanks also to Lev and Gabriel, but using PyTables and HDF5 would have
forced me to change all my scripts...

Kind regards,
