Error when running with ZMQ on cluster: no workers available

Razvan Marinescu

Jan 13, 2023, 9:42:42 PM
to westpa-users
When I run WESTPA with the ZMQ work manager on my cluster, it finishes all the segments in the first iteration, but just before starting the next iteration I get the following error:

-- ERROR    [w_run] -- Traceback (most recent call last):
  File "/projects/bbpa/westpa_source/src/westpa/cli/core/w_run.py", line 62, in run_simulation
    sim_manager.run()
  File "/projects/bbpa/westpa_source/src/westpa/core/sim_manager.py", line 777, in run
    self.prepare_iteration()
  File "/projects/bbpa/westpa_source/src/westpa/core/binning/mab_manager.py", line 177, in prepare_iteration
    self.work_manager.submit(wm_ops.prep_iter, args=(self.n_iter, segments)).get_result()
  File "/projects/bbpa/westpa_source/src/westpa/work_managers/core.py", line 343, in get_result
    raise self._exception.with_traceback(self._traceback)
westpa.work_managers.zeromq.core.ZMQWorkerMissing: no workers available

What could be the cause? I'm trying to debug it, but that's hard since I didn't write the code. Darian, you also seem to have encountered this error here: 


Any idea how to fix it? I've tried re-running it multiple times, to no avail.

Thanks,
Razvan

Razvan Marinescu

Jan 13, 2023, 10:04:33 PM
to westpa-users
Never mind, I just found the problem. There was an error when computing the progress coordinates in the workers, so every worker exited. That left no workers available when the next iteration was being prepared.
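
For anyone who hits the same error: one way to surface this kind of failure earlier is to sanity-check the progress coordinate before it is handed back, so a bad value produces a clear message instead of a worker that exits. Below is a minimal sketch in plain Python. It is not part of WESTPA; the function name `validate_pcoord`, its arguments, and the assumption that the progress coordinate is a list of per-timepoint lists of floats are all hypothetical and would need adapting to your own setup.

```python
import math


def validate_pcoord(pcoord, expected_len, expected_ndim):
    """Return pcoord unchanged if it looks sane, otherwise raise ValueError.

    Hypothetical helper, not a WESTPA function. Assumes pcoord is a list of
    timepoints, each of which is a list of floats.
    """
    # The number of timepoints must match what the simulation expects.
    if len(pcoord) != expected_len:
        raise ValueError(
            f"expected {expected_len} timepoints, got {len(pcoord)}"
        )
    for t, point in enumerate(pcoord):
        # Every timepoint must have the expected dimensionality.
        if len(point) != expected_ndim:
            raise ValueError(
                f"timepoint {t}: expected {expected_ndim} dimensions, "
                f"got {len(point)}"
            )
        # NaN or infinite values usually mean the calculation failed upstream.
        for value in point:
            if not math.isfinite(value):
                raise ValueError(f"timepoint {t}: non-finite value {value!r}")
    return pcoord
```

Calling something like this at the end of the progress coordinate calculation, and checking the per-worker logs for the resulting message, should make it easier to tell a failed calculation apart from a genuine ZMQ connectivity problem.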

Razvan 
