I am having trouble getting a WESTPA simulation to run. On my first submission to the queue system, the simulation ran without any issues and exited normally when the queue time expired. When I resubmitted the simulation, I got the following error:
exception caught; shutting down
-- ERROR 2015-10-26 15:22:08,434 PID 19211 TID 47435852512768
from logger "w_run"
at location /gscratch3/lchong/ajd98/apps/westpa_8.25.15/westpa/lib/cmds/w_run.py:73 [<module>()]
::
Traceback (most recent call last):
  File "/gscratch3/lchong/ajd98/apps/westpa_8.25.15/westpa/lib/cmds/w_run.py", line 65, in <module>
    sim_manager.run()
  File "/gscratch3/lchong/ajd98/apps/westpa_8.25.15/westpa/src/west/sim_manager.py", line 643, in run
    self.propagate()
  File "/gscratch3/lchong/ajd98/apps/westpa_8.25.15/westpa/src/west/sim_manager.py", line 501, in propagate
    self.data_manager.update_segments(self.n_iter, incoming)
  File "/gscratch3/lchong/ajd98/apps/westpa_8.25.15/westpa/src/west/data_manager.py", line 915, in update_segments
    dset.id.write(source_sel, dest_sel, auxdataset)
  File "h5d.pyx", line 219, in h5py.h5d.DatasetID.write (h5py/h5d.c:2936)
  File "_proxy.pyx", line 132, in h5py._proxy.dset_rw (h5py/_proxy.c:1585)
  File "_proxy.pyx", line 93, in h5py._proxy.H5PY_H5Dwrite (h5py/_proxy.c:1334)
IOError: Can't write data (Inflate() failed)
The simulation runs long enough for one segment to complete. After that segment returns its data, WESTPA appears to try to write the data to disk using h5py (the traceback ends in update_segments, at the write of an auxiliary dataset), and the write fails. Interestingly, the error does not occur with the serial work manager (I let WESTPA run about 25 segments; finishing the iteration in serial mode would take quite a long time). However, it does occur with both the ZMQ and "processes" work managers. I have attached a log file from a WESTPA run in debug mode using the processes work manager. My WESTPA install includes the ZMQ work manager rewrite; it reports its version as "WEST version 1.0.0 beta," and I downloaded it on 8/25/15.
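One check that might help narrow this down is to see whether the datasets already stored in the HDF5 file can still be read back. The following is a minimal sketch, not part of WESTPA: the function name find_unreadable_datasets is hypothetical, and the file name west.h5 and group path in the usage note are assumptions about the usual WESTPA layout. It only uses standard h5py calls (h5py.File, Group.visititems, Dataset indexing) and reports any dataset whose data raises an I/O error on read.

```python
import h5py


def find_unreadable_datasets(h5_path, group_path="/"):
    """Return (dataset name, error message) pairs for datasets that fail to read.

    Hypothetical helper for illustration; it opens the file read-only and
    never modifies it.
    """
    failures = []

    def check(name, obj):
        # visititems calls this for every group and dataset below group_path.
        if not isinstance(obj, h5py.Dataset):
            return None
        try:
            if obj.shape is None or obj.shape == ():
                # Empty or scalar dataset: a single read covers it.
                obj[()]
            else:
                # Read one slab at a time along the first axis so that large
                # datasets are never loaded into memory all at once.
                for i in range(obj.shape[0]):
                    obj[i]
        except (OSError, IOError) as exc:
            failures.append((obj.name, str(exc)))
        # Returning None tells visititems to keep walking the file.
        return None

    with h5py.File(h5_path, "r") as h5file:
        h5file[group_path].visititems(check)
    return failures
```

A call such as find_unreadable_datasets("west.h5", "/iterations/iter_00000012") would restrict the scan to a single iteration group; the iteration number here is only a placeholder. An empty result would suggest the stored data can still be decompressed, while a non-empty result would point at specific datasets to look at more closely. It is best to run this on a copy of the file while no simulation is writing to it.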
Here are some links that I have been reading through, where other people have similar problems (not related to WESTPA):
Some of these sources lead me to believe the error could be related to a bug in h5py. In any case, I will keep reading about how to solve this. Any help is appreciated.