Column names for an HDF5 File


G

May 4, 2014, 4:41:12 PM
to h5...@googlegroups.com

I am trying to write an HDF5 file. The file basically contains a large timeseries matrix in the following format

TimeStamp    Property1      Property2

I have managed to write the data successfully: I created a dataset and used the H5Dwrite function.

Now my question is how do I create a file header, in other words, if I want to write the following array to the file...

['TimeStamp', 'Property1', 'Property2']

...and tag it to the columns for ease of later use (I am planning to analyze the matrix in Python). How do I do that?

I tried to use H5Dwrite to write a string array but failed. I guess it wanted a consistent datatype, so it accepted only floats, which is the datatype of my data. Then I read about metadata, but I am a bit lost as to how to use it. Any help would be much appreciated.

A related side question: can the first row of a matrix be a string while the other rows contain doubles?

Matthew Zwier

May 5, 2014, 10:07:26 AM
to h5...@googlegroups.com
Hi,


On Sun, May 4, 2014 at 3:41 PM, G <gauravb...@gmail.com> wrote:

I am trying to write an HDF5 file. The file basically contains a large timeseries matrix in the following format

TimeStamp    Property1      Property2

I have managed to write the data successfully: I created a dataset and used the H5Dwrite function.

Now my question is how do I create a file header, in other words, if I want to write the following array to the file...

['TimeStamp', 'Property1', 'Property2']

...and tag it to the columns for ease of later use (I am planning to analyze the matrix in Python). How do I do that?

This is an ideal use case for attributes. For example:

h5file['dataset'] = array_to_assign
h5file['dataset'].attrs['column_names'] = ['TimeStamp','Property1','Property2'] 

You can then access the column names as:
h5file['dataset'].attrs['column_names']
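
A slightly fuller sketch of the same idea, in case it helps. The data, the file name, and the in-memory "core" driver are placeholders chosen for illustration, and storing the names as fixed-length byte strings is an assumption made here for portability, since older h5py releases may reject a plain list of Python str:

```python
import numpy as np
import h5py

# Hypothetical data: 5 rows of (TimeStamp, Property1, Property2) as floats.
data = np.arange(15, dtype="f8").reshape(5, 3)
column_names = ["TimeStamp", "Property1", "Property2"]

# driver="core" with backing_store=False keeps the file in memory only;
# drop those two arguments to write a real file to disk.
with h5py.File("example_attrs.h5", "w", driver="core", backing_store=False) as h5file:
    dset = h5file.create_dataset("dataset", data=data)

    # Store the names as fixed-length byte strings (assumption: chosen so the
    # same code runs on both old and new h5py versions).
    dset.attrs["column_names"] = np.array(column_names, dtype="S")

    # Read the names back and normalise them to str.
    names = [n.decode() if isinstance(n, bytes) else str(n)
             for n in dset.attrs["column_names"]]

    # Look a column up by name instead of by position.
    prop1 = dset[:, names.index("Property1")]
```

The attribute travels with the dataset, so any later reader can recover the column order without a separate header.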

A related side question: can the first row of a matrix be a string while the other rows contain doubles?

Not possible, at least in any clean way. I'm sure you could make it happen in some way with compound data types, but that gets ugly really fast. Attributes are probably the way to go.
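
For reference, here is a rough sketch of what a compound data type can look like from h5py. It does not put a string row inside the matrix; it only shows that a compound type carries a name (and a type) per field. The dataset name "table", the values, and the in-memory driver are made up for the example:

```python
import numpy as np
import h5py

# Each row is one record with named, individually typed fields.
row_dtype = np.dtype([("TimeStamp", "f8"), ("Property1", "f8"), ("Property2", "f8")])
records = np.zeros(4, dtype=row_dtype)
records["TimeStamp"] = [0.0, 0.5, 1.0, 1.5]
records["Property1"] = [10.0, 11.0, 12.0, 13.0]
records["Property2"] = [-1.0, -2.0, -3.0, -4.0]

# In-memory file (assumption for the sketch); use a normal path to write to disk.
with h5py.File("example_compound.h5", "w", driver="core", backing_store=False) as h5file:
    dset = h5file.create_dataset("table", data=records)

    # The field names are part of the dataset's dtype.
    field_names = dset.dtype.names

    # Read a single field by name.
    prop2 = dset["Property2"]
```

The trade-off is that the dataset becomes a 1-D array of records rather than a plain 2-D float matrix, which is part of why attributes are usually simpler.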

Cheers,
Matt Z.

Michael Boyle

Jul 8, 2020, 9:42:05 AM
to h5py
Sorry I'm so late to the party, but this is the first hit when I google "hdf5 column labels".  I agree with Matt's suggestion of using attributes for the labels.  But I want to suggest a better answer for the side question of strings for the first column and doubles for the others.

It's much better to store these as separate datasets and associate them with each other using "dimension scales".  Here, TimeStamp would go into its own dataset, Property1 and Property2 would stay together in theirs, and the TimeStamp dataset would be attached as a scale for dims[0] of the other dataset.
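
A minimal sketch of that layout, assuming h5py 2.10 or later (for `make_scale`). The dataset names, axis labels, values, and the in-memory driver are placeholders, not anything prescribed by the thread:

```python
import numpy as np
import h5py

timestamps = np.array([0.0, 0.5, 1.0, 1.5])
properties = np.array([[10.0, -1.0],
                       [11.0, -2.0],
                       [12.0, -3.0],
                       [13.0, -4.0]])

# In-memory file (assumption for the sketch); use a normal path to write to disk.
with h5py.File("example_scales.h5", "w", driver="core", backing_store=False) as h5file:
    time_ds = h5file.create_dataset("TimeStamp", data=timestamps)
    prop_ds = h5file.create_dataset("properties", data=properties)

    # Mark TimeStamp as a dimension scale and attach it to axis 0.
    time_ds.make_scale("TimeStamp")
    prop_ds.dims[0].attach_scale(time_ds)

    # Axes can also carry plain text labels.
    prop_ds.dims[0].label = "time"
    prop_ds.dims[1].label = "property"

    # Read back: the first scale attached to axis 0, and the axis labels.
    scale_values = prop_ds.dims[0][0][:]
    axis_labels = [dim.label for dim in prop_ds.dims]
```

The column names for Property1 and Property2 could still be stored as an attribute on "properties", as Matt suggested.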

Valentyn Stadnytskyi

Aug 21, 2020, 10:40:14 AM
to h5...@googlegroups.com
Hello,

I am trying to create an HDF5 file on a server drive mounted with autofs at /net/control/mnt/. The error doesn't say exactly why it failed. Does it have something to do with permissions? I don’t know what flags and o_flags mean. I searched docs.h5py.org for information (https://docs.h5py.org/en/stable/search.html?q=o_flags&check_keywords=yes&area=default), but there is only one example.

OSError: Unable to create file (unable to open file: name = '/net/control/mnt/datatop_spray-2_0.tmpraw.hdf5', errno = 45, error message = 'Operation not supported', flags = 15, o_flags = a02)
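
On the meaning of the two numbers: as far as I can tell, HDF5 prints both in hexadecimal, `flags` being its own file access flags and `o_flags` the flags handed to the operating system's open() call. The sketch below only decodes them and does not diagnose the failure. The bit values in both tables are assumptions copied from HDF5's H5Fpublic.h and from macOS's fcntl.h, and `decode_flags` is a helper made up for this example:

```python
# Assumed values of the H5F_ACC_* constants from HDF5's H5Fpublic.h.
HDF5_ACCESS_FLAGS = {
    0x01: "H5F_ACC_RDWR",
    0x02: "H5F_ACC_TRUNC",
    0x04: "H5F_ACC_EXCL",
    0x10: "H5F_ACC_CREAT",
}

# Assumed values from macOS <fcntl.h>; other platforms number these differently.
MACOS_OPEN_FLAGS = {
    0x0001: "O_WRONLY",
    0x0002: "O_RDWR",
    0x0200: "O_CREAT",
    0x0400: "O_TRUNC",
    0x0800: "O_EXCL",
}


def decode_flags(hex_text, table):
    """Return the names of all bits from `table` that are set in `hex_text`."""
    value = int(hex_text, 16)
    return [name for bit, name in sorted(table.items()) if value & bit]


# Values taken from the error message above: flags = 15, o_flags = a02.
hdf5_flags = decode_flags("15", HDF5_ACCESS_FLAGS)
open_flags = decode_flags("a02", MACOS_OPEN_FLAGS)
```

Under those assumptions, both numbers describe the same request: create the file for reading and writing, and fail if it already exists.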


More information:

Summary of the h5py configuration
---------------------------------

h5py    2.10.0
HDF5    1.10.4
Python  3.7.5 (default, Nov  1 2019, 02:16:23)
[Clang 11.0.0 (clang-1100.0.33.8)]
sys.platform    darwin
sys.maxsize     9223372036854775807
numpy   1.19.1

 * Operating System: macOS Catalina 10.15.6
 * Python version: 3.7.5 (default, Nov  1 2019, 02:16:23)
 * h5py version: 2.10.0
 * HDF5 version: 1.10.4
 * The full traceback/stack trace shown:
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    184         try:
--> 185             fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
    186         except IOError:

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

OSError: Unable to open file (unable to open file: name = '/net/control/mnt/datatop_spray-2_0.tmpraw.hdf5', errno = 2, error message = 'No such file or directory', flags = 1, o_flags = 2)

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
<ipython-input-33-9b3f09345225> in <module>
----> 1 camera.recording_stop();sleep(1); camera.recording_init(256,'spray-2',True); camera.queue.reset(); camera.recording_start()

/System/Volumes/Data/net/server/C/All Projects/LaserLab/Software/lcp-video/lcp_video/flir_camera/flir_camera_DL.py in recording_init(self, N_frames, name, overwrite)
    779         self.recording_chunk_pointer = 0
    780         filename = self.recording_basefilename + '_' + str(self.recording_chunk_pointer) + '.tmpraw.hdf5'
--> 781         self.recording_create_file(filename = filename, N_frames = N_frames, overwrite = overwrite)
    782         self.recording_Nframes = N_frames
    783         self.recording_pointer = 0

/System/Volumes/Data/net/server/C/All Projects/LaserLab/Software/lcp-video/lcp_video/flir_camera/flir_camera_DL.py in recording_create_file(self, filename, N_frames, overwrite)
    801         else:
    802             info(f': The HDF5 was created. The file name is {filename}')
--> 803             with File(filename,file_action) as f:
    804                 f.create_dataset('pixel format', data = self.pixel_format)
    805                 f.create_dataset('exposure time', data = self.exposure_time)

/usr/local/lib/python3.7/site-packages/h5py/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, **kwds)
    406                 fid = make_fid(name, mode, userblock_size,
    407                                fapl, fcpl=make_fcpl(track_order=track_order),
--> 408                                swmr=swmr)
    409 
    410             if isinstance(libver, tuple):

/usr/local/lib/python3.7/site-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    185             fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
    186         except IOError:
--> 187             fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
    188     elif mode is None:
    189         warn("The default file mode will change to 'r' (read-only) in h5py 3.0. "

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.create()

OSError: Unable to create file (unable to open file: name = '/net/control/mnt/datatop_spray-2_0.tmpraw.hdf5', errno = 45, error message = 'Operation not supported', flags = 15, o_flags = a02)


Best,
Valentyn