TypeError: h5py objects cannot be pickled

Debvrat Varshney

Mar 14, 2020, 10:42:41 AM
to h5py

I am trying to run some PyTorch code that is meant to work on the SBD dataset.

The training labels are provided in .bin format and are converted to HDF5 (.h5) files.

When I run the training script, I get the error "TypeError: h5py objects cannot be pickled".

I think the error comes from torch.utils.data.DataLoader, but I am not sure.

Am I missing something here? I have read that pickling is generally discouraged, but for now my dataset is only available in HDF5 format.

For your reference, the error's stack trace is as follows:

  File "G:\My Drive\Debvrat - shared\Codes\CASENet PyTorch Implementations\SBD-lijiaman\main.py", line 130, in <module>

  File "G:\My Drive\Debvrat - shared\Codes\CASENet PyTorch Implementations\SBD-lijiaman\main.py", line 85, in main
    win_feats5, win_fusion, viz, global_step)

  File "G:\My Drive\Debvrat - shared\Codes\CASENet PyTorch Implementations\SBD-lijiaman\train_val\model_play.py", line 31, in train
    for i, (img, target) in enumerate(train_loader):

  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)

  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__

  File "C:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)

  File "C:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)

  File "C:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)

  File "C:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)

  File "C:\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)

  File "C:\Anaconda3\lib\site-packages\h5py\_hl\base.py", line 308, in __getnewargs__
    raise TypeError("h5py objects cannot be pickled")


Thomas Kluyver

Mar 15, 2020, 7:46:41 AM
to h5...@googlegroups.com
Hi Debvrat,

In your traceback, it looks like it's trying to pickle the object as it launches a subprocess to load the data. Windows has no 'fork', so multiprocessing starts a fresh process and pickles whatever it needs to send to it (that's the popen_spawn_win32 frame). If the code you're using was written by someone on Linux or Mac, they probably didn't hit this issue, because those platforms can 'fork' processes, which avoids the need to pickle things. So you might be able to work around it by running the code on Linux. Alternatively, adjust the code to pass the filename and object name in, and open the HDF5 file within the data loader process.
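A minimal sketch of that second option, under stated assumptions: the class name LazyH5Labels, the file name "labels.h5", the dataset name "labels" and the explicit length argument are all made up for illustration and are not taken from the code in question. The object keeps only strings and opens the file the first time an item is read, so the copy sent to each worker process contains nothing h5py-specific. In a real project the class would normally subclass torch.utils.data.Dataset; that is left out here to keep the sketch small.

```python
import pickle


class LazyH5Labels:
    """Map-style dataset that stores only picklable values and opens the
    HDF5 file on first access, i.e. in whichever process reads from it."""

    def __init__(self, h5_path, dataset_name, length):
        self.h5_path = h5_path            # plain str, safe to pickle
        self.dataset_name = dataset_name  # name of the dataset inside the file
        self.length = length              # passed in so __len__ never opens the file
        self._file = None                 # h5py.File handle, created lazily

    def _dataset(self):
        if self._file is None:
            import h5py  # deferred: only needed once data is actually read
            self._file = h5py.File(self.h5_path, "r")
        return self._file[self.dataset_name]

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        return self._dataset()[index]

    def __getstate__(self):
        # Drop any open handle so the object stays picklable even if the
        # parent process has already read from it.
        state = self.__dict__.copy()
        state["_file"] = None
        return state


# Hypothetical file and dataset names, for illustration only.
labels = LazyH5Labels("labels.h5", "labels", length=100)

# Roughly what happens when a worker process is spawned on Windows:
clone = pickle.loads(pickle.dumps(labels))
```

Each worker ends up with its own copy and therefore its own read-only file handle. Setting num_workers=0 on the DataLoader is another way to sidestep the error, since no subprocess is started at all, at the cost of loading data in the main process.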

It's a deliberate design decision for h5py to disallow pickling its objects. In many simple cases it would be easy to transfer a reference consisting of a filename and an object path inside that file, but that relies on the process deserialising it having access to the same filesystem. Even if it does, there are other things that can go wrong, e.g. if the sender still has the file open for writing, the receiver can't open it again. For cases where these restrictions don't matter, there's a separate project, h5pickle, that enables pickling for simple cases.
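To make the "filename plus object path" idea concrete, here is a rough sketch. H5Ref is a hypothetical helper, not part of h5py or h5pickle, and the file and object names are invented. It shows that such a reference pickles trivially, and also that nothing in it guards against the problems described above: open() will simply fail on the receiving side if the file isn't reachable there or can't be opened again.

```python
import pickle
from dataclasses import dataclass


@dataclass(frozen=True)
class H5Ref:
    """Hypothetical picklable stand-in for an h5py object: just two strings."""

    filename: str     # must be reachable from the receiving process too
    object_path: str  # name of the object inside the file, e.g. "/labels"

    def open(self, mode="r"):
        """Reopen the file and return (file, object); the caller closes the file."""
        import h5py  # deferred so creating and pickling a ref never touches HDF5
        f = h5py.File(self.filename, mode)
        return f, f[self.object_path]


# Hypothetical file and object names, for illustration only.
ref = H5Ref("labels.h5", "/labels")

# Only the two strings travel between processes.
received = pickle.loads(pickle.dumps(ref))
```

The receiving process would call received.open() to get a fresh handle of its own.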

Best wishes,
