> I'm trying to read an array of strings from an HDF5-format MATLAB file
> into numpy. The field I want is an array of strings. h5py shows this
> as an array of 'HDF5 object references'.
MATLAB does use some odd conventions for storing things. Could you
post a (simple) example file?
Andrew
>> MATLAB does use some odd conventions for storing things. Could you
>> post a (simple) example file?
>
> This file contains just my problem variable.
It looks like MATLAB is really abusing the HDF5 format here. The good
news is you're dereferencing the datasets correctly. Each element of
trans_threshold appears to be a reference to a dataset under the
group "/#refs#". Each dataset is a 1D collection of ints. By
inspection, the ints look like they are meant to be ASCII character
codes. There's also a little bit of metadata in the form of attributes
on each "#refs#" dataset, which I think is meant to indicate the
character set.
Given the (silly) complexity of what MATLAB is doing here, I don't
think there's a way to simplify your code that much. If you'll be
dealing with this kind of data a lot, you might want to write a little
wrapper which recognizes the attributes, reads the datasets and spits
out a Python string.
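As a rough illustration of what such a wrapper might look like, here is a minimal sketch. It assumes that each referenced dataset holds integer character codes that can be passed straight to chr(), and it ignores the attributes entirely rather than checking the character set. The names codes_to_str and read_string_refs are made up for this example; the only h5py behaviour it relies on is that indexing a file with an object reference returns the referenced dataset and that ds[()] reads the whole dataset.

```python
def _flatten(values):
    """Yield the leaf items of an arbitrarily nested sequence or array."""
    for v in values:
        if hasattr(v, "__iter__"):
            # A nested row, e.g. one row of a 2D array.
            yield from _flatten(v)
        else:
            yield v


def codes_to_str(codes):
    """Join integer character codes into a Python string.

    Assumption: every code is a valid code point for chr(), such as
    the ASCII values described above.
    """
    return "".join(chr(int(c)) for c in _flatten(codes))


def read_string_refs(h5file, name):
    """Return a list of strings for the array of references at h5file[name].

    h5file is expected to be an open h5py.File (or anything that behaves
    the same way when indexed by name or by reference).
    """
    refs = h5file[name][()]  # read the whole array of object references
    return [codes_to_str(h5file[ref][()]) for ref in _flatten(refs)]
```

With a file opened as f = h5py.File("example.mat", "r") (the file name is a placeholder), read_string_refs(f, "trans_threshold") should then give a plain list of Python strings. The result is a flat list, so the original array shape is not preserved.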
HTH,
Andrew
Thanks very much for taking a look. Once I get round to writing it,
I'll post my wrapper routine here for converting these arrays of
references into the corresponding arrays of values.
Angus