Hi,
I use h5py and I saved 92 groups, each containing a float E_tot and a 1D numpy array of doubles (of size 19 or less), using the following code:
from h5py import File
f = File("data.hdf5", "w")
# iterate over Z = 92, 91, ..., 1
r = range(1, 93)
r.reverse()
for Z in r:
    # one group per Z, holding a scalar and a 1D array
    f.create_group("Z%02d" % Z)
    E_tot, ks_energies = get_energies(Z, True)
    f.create_dataset("/Z%02d/E_tot" % Z, data=E_tot)
    f.create_dataset("/Z%02d/ks_energies" % Z, data=ks_energies)
    print "%d %e" %(Z, E_tot)
I then read it back and write a simple text file with ASCII output of the numbers:
from h5py import File
from numpy import array
f = File("nonrel_energies.hdf5")
g = open("xx.txt", "w")
for Z in range(1, 93):
    E_tot = array(f["/Z%02d/E_tot" % Z])
    ks_energies = f["/Z%02d/ks_energies" % Z][...]
    g.write("Z = %02d\n" % Z)
    g.write("E_tot = %20.14f\n" % E_tot)
    g.write("ks_energies =\n")
    for e in ks_energies:
        g.write("%20.14f\n" % e)
    g.write("\n")
The resulting file xx.txt is 24K, while the hdf5 file is 159K. I thought that one of the advantages of the hdf5 format was that it is smaller than saving the numbers as ASCII, so I must be doing something wrong. What would be the most efficient way to save my data? The 92 arrays have sizes from 1 to 19.
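For scale, here is a rough back-of-the-envelope estimate of how much of the file can be the numbers themselves. It is only a sketch: it assumes every value is stored as an 8-byte double, that every array has the maximum length of 19, and that "159K" means 159 * 1024 bytes. The variable names are made up for illustration.

```python
# Upper bound on the raw numeric payload of the hdf5 file.
n_groups = 92          # one group per Z (from the post)
max_array_len = 19     # largest ks_energies array (from the post)
bytes_per_double = 8   # assumption: IEEE 754 double precision

values_per_group = 1 + max_array_len   # E_tot plus ks_energies
raw_bytes = n_groups * values_per_group * bytes_per_double

# Reported file size, reading "159K" as KiB (assumption).
hdf5_bytes = 159 * 1024

# Spread the remainder over every stored object: 92 groups,
# each holding 2 datasets, i.e. 3 objects per Z.
n_objects = n_groups * 3
other_bytes_per_object = (hdf5_bytes - raw_bytes) / float(n_objects)

print("raw payload upper bound: %d bytes" % raw_bytes)
print("remaining bytes per stored object: %.0f" % other_bytes_per_object)
```

Under these assumptions the numbers account for well under a tenth of the file, so the rest is presumably per-group and per-dataset bookkeeping rather than data.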
If hdf5 is not a good format for that, what would be the best way to do it, so that I don't have to worry about precision or platform-dependent issues, and so that the file is small?
Thanks,
Ondrej Certik