Writing a user block

Craig

Apr 29, 2011, 2:12:50 PM
to h5py
Hi,

How can one write a user block efficiently from within h5py/Python?

Reading a user block is easy enough: just open the HDF5 file as a regular
file and use regular reads to access the user block. (For the files
I'm reading, one readline() gets the entire user block.) Then close
the file and reopen it with h5py.File().
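
A minimal sketch of the read step described above, using only the standard library. It assumes the HDF5 superblock signature sits at byte offset 0 or at a power-of-two offset of 512 or more, which is where the user block ends. The helper names find_userblock_size and read_userblock are hypothetical, not part of h5py.

```python
import os

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_userblock_size(path):
    """Return the offset of the HDF5 superblock, i.e. the user block size in bytes."""
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < file_size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            # Candidate offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    raise ValueError("no HDF5 signature found in %r" % path)


def read_userblock(path):
    """Read the raw bytes of the user block with ordinary file I/O."""
    size = find_userblock_size(path)
    with open(path, "rb") as f:
        return f.read(size)
```

After read_userblock() returns, the plain file handle is already closed, so the same path can be passed to h5py.File() as usual.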

But how to write the user block efficiently is a tougher question.

The normal way to create an HDF5 file with something in the user block
seems to be to use the h5jam tool. But as the Caveat in the h5jam
manual page says:
"These tools copy all the data sequentially in the file(s) to new
offsets. For a large file, this copy will take a long time."

The solution that the manual then proposes seems obvious at first
blush:
"The most efficient way to create a user block is to create the file
with a user block (see H5Pset_user_block), and write the user block
data into that space from a program."

It's easy enough to create the HDF5 file with a user block, but how,
within h5py or (more likely) base Python, can one write into that
space? I can't find any sample code that does that.

Is there some way to treat a freshly-created HDF5 file both as an HDF5
object and as a regular Python file (in which one could seek to
location 0 and write data there)? Or is there some other trick for
writing into the beginning of the file? Something with ctypes.Union
(which I've never tried using)?

The best approach I can come up with is to create and write into the
user block file, create and close an "empty" HDF5 file, run h5jam on
these small files, and then open the resulting file in 'r+' mode to
write the real HDF5 data into it. Seems a little kludgy, but ought to
work.

Any code or suggestions would be greatly appreciated!

Thanks!
Craig

Andrew Collette

Apr 30, 2011, 5:17:40 PM
to h5...@googlegroups.com
Hi Craig,

> It's easy enough to create the HDF5 file with a user block, but how,
> within h5py or (more likely) base Python, can one write into that
> space?  I can't find any sample code that does that.

If you already have a file with a user block, it may simply be
possible to open it as a regular Python file alongside the HDF5 file.
When a user block is present, HDF5 won't ever touch that section, so
as long as you're careful not to write past the end of the user block
and stomp on the beginning of the "HDF5 section", then I think it
would be OK.
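
A minimal sketch of that idea, assuming the file was already created with a user block of known size and that the HDF5 handle has been closed before the plain write happens. The helper name write_userblock is hypothetical. The signature check is an extra safety step of this sketch, not something HDF5 or h5py requires.

```python
# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def write_userblock(path, data, userblock_size):
    """Write bytes into the user block without touching the HDF5 section."""
    if len(data) > userblock_size:
        raise ValueError(
            "data is %d bytes but the user block holds only %d"
            % (len(data), userblock_size)
        )
    # "r+b" opens for in-place update; it neither truncates nor appends.
    with open(path, "r+b") as f:
        # Confirm the HDF5 section really starts where we think it does.
        f.seek(userblock_size)
        if f.read(len(HDF5_SIGNATURE)) != HDF5_SIGNATURE:
            raise ValueError("no HDF5 signature at offset %d" % userblock_size)
        f.seek(0)
        f.write(data)
```

Doing the two steps one after the other (close the HDF5 file, then call write_userblock) avoids having two handles open on the same file, so there is no question about closing order. Later h5py releases, as far as I know, accept a userblock_size keyword on h5py.File for creating the file with the block in place; HDF5 requires that size to be a power of two and at least 512 bytes.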

> The best approach I can come up with is to create and write into the
> user block file, create and close an "empty" HDF5 file, run h5jam on
> these small files, and then open the resulting file in 'r+' mode to
> write the real HDF5 data into it.  Seems a little kludgy, but ought to
> work.

I think a useful first step on the h5py side would be to add a
userblock keyword to File, so you can at least create files with a
userblock in one go. I'll see whether it's feasible to add this.

Andrew

Craig

May 2, 2011, 2:02:25 PM
to h5py
Thanks, Andrew!

I agree that your proposed enhancement to the File method would be an
excellent, and altogether appropriate, solution. In the meantime,
I'll try your suggestion of opening the file in Python "alongside" the
HDF5 file. Wasn't sure that would be legal or would work right, and
still am not sure how to close them both, so that everything gets
saved to the same physical file, but I'll try a few things.

I appreciate your help, and all your work on this great piece of
software!

Cheers,
Craig