Hi,
> I know that it is technically possible to store arbitrary files in hdf5
> using opaque types, but I can't seem to implement a methodology in h5py
> where this works as I would have expected. Is there a straight forward
> way of doing this with the h5py api? How should accessing the arbitrary data
> work from h5py?
Right now the h5py type mapping system can't represent opaque types.
You could create an opaque dataset using the low-level interface, but
there would be no way to read or write its data through h5py.
There are ongoing discussions about how to support this feature, as
part of the discussion about improved Unicode support in future
versions of h5py. One possibility is to add support for NumPy void
(kind "V") arrays and scalars, and map these to opaque types in the
file. Another is to reclassify NumPy byte strings (kind "S") as
opaque, although this would be a big change. We welcome community
input on this topic.
If you absolutely need to store binary data right now, you could use
fixed-length NumPy byte strings (kind "S"), or an array of numpy.uint8
with one element per byte.
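
As a rough sketch of the uint8 route (the helper names store_bytes and
load_bytes, the dataset name "blob" and the file name "blobs.h5" are
made up for illustration, not part of h5py): treat the file contents as
a one-dimensional array of bytes and write it as an ordinary dataset.
I'd lean towards uint8 over kind "S" here because NumPy drops trailing
NUL bytes when you read an "S" element back, which can silently
truncate binary data.

```python
import os
import tempfile

import h5py
import numpy as np


def store_bytes(h5_path, name, payload):
    """Write raw bytes into a 1-D uint8 dataset (hypothetical helper)."""
    data = np.frombuffer(payload, dtype=np.uint8)  # one element per byte
    with h5py.File(h5_path, "a") as f:
        f.create_dataset(name, data=data)


def load_bytes(h5_path, name):
    """Read a 1-D uint8 dataset back as a bytes object (hypothetical helper)."""
    with h5py.File(h5_path, "r") as f:
        return f[name][...].tobytes()


# Example payload with embedded and trailing NUL bytes.
payload = b"\x00\x01binary\x00payload\xff\x00"
h5_path = os.path.join(tempfile.mkdtemp(), "blobs.h5")

store_bytes(h5_path, "blob", payload)
restored = load_bytes(h5_path, "blob")
```

To store an actual file, read it with open(path, "rb").read() and pass
the result as payload. Note that other HDF5 tools will see this as a
plain integer array, not as an opaque type.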
Andrew