Worker process did not connect within the allotted time.


Dylan Noone

Dec 1, 2025, 10:28:42 AM
to Warp
Hey guys,

I am having a problem with Slurm scripts that seemed to work fine before.

I am trying to export particles to RELION; however, I get the following error:

Running command ts_export_particles with:
input_star = null
input_directory = etomo_patches_1000_tilt_axis_corrected/reconstruction/completed/compat_warp_export
input_pattern = *particles.star
coords_angpix = null
normalized_coords = True
output_star = relion_15793_pytom/matching.star
output_angpix = 5
box = 64
diameter = 230
relative_output_paths = True
2d = True
3d = False
dont_normalize_input = False
dont_normalize_3d = False
n_tilts = null
max_missing_tilts = 5
device_list = {  }
perdevice = 1
workers = {  }
settings = warp_tiltseries.settings
input_data = {  }
input_data_recursive = False
input_processing = null
output_processing = relion_15793_pytom
input_norawdata = False
strict = False

No alternative input specified, will use input parameters from warp_tiltseries.settings
File search will be relative to /bbsrc/home/dnoone/nooned/Dylan/warp/tomostar
200 files found
Parsing previous results for each item, if available...
0/200
10/200, previous metadata found for 10
20/200, previous metadata found for 20
30/200, previous metadata found for 30
40/200, previous metadata found for 40
50/200, previous metadata found for 50
60/200, previous metadata found for 60
70/200, previous metadata found for 70
80/200, previous metadata found for 80
90/200, previous metadata found for 90
100/200, previous metadata found for 100
110/200, previous metadata found for 110
120/200, previous metadata found for 120
130/200, previous metadata found for 130
140/200, previous metadata found for 140
150/200, previous metadata found for 150
160/200, previous metadata found for 160
170/200, previous metadata found for 170
180/200, previous metadata found for 180
190/200, previous metadata found for 190
200/200, previous metadata found for 200
Found 199 files in etomo_patches_1000_tilt_axis_corrected/reconstruction/completed/compat_warp_export matching *particles.star;
Found 99500 particles in 199 tilt series
Connecting to workers...

Unhandled exception. System.TimeoutException: Worker process did not connect within the allotted time.
   at Warp.WorkerWrapper.ListenForPort(String pipeName, Int32 timeoutMilliseconds) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpLib/WorkerWrapper.cs:line 168
   at Warp.WorkerWrapper..ctor(Int32 deviceID, Boolean silent, Boolean attachDebugger) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpLib/WorkerWrapper.cs:line 100
   at WarpTools.Commands.DistributedOptions.GetWorkers(Boolean attachDebugger) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpTools/Commands/DistributedOptions.cs:line 57
   at WarpTools.Commands.ExportParticlesTiltseries.Run(Object options) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpTools/Commands/Tiltseries/ExportParticlesTiltseries.cs:line 214
   at WarpTools.WarpTools.Run(Object options) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpTools/Program.cs:line 30
   at Warp.Tools.CommandLineParserHelper.ParseAndRun(String[] args, Func`2 run, Type[] verbs, String appName) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpLib/Tools/CommandLineParserHelper.cs:line 27
   at WarpTools.WarpTools.Main(String[] args) in /home/runner/micromamba/envs/package-build/conda-bld/warp_1757710231489/work/WarpTools/Program.cs:line 17
   at WarpTools.WarpTools.<Main>(String[] args)
/var/spool/slurmd/job37933/slurm_script: line 97: 2756720 Aborted                 (core dumped) WarpTools ts_export_particles --settings warp_tiltseries.settings --input_directory etomo_patches_1000_tilt_axis_corrected/reconstruction/completed/compat_warp_export --input_pattern "*particles.star" --output_processing relion_15793_pytom --normalized_coords --output_star relion_15793_pytom/matching.star --output_angpix 5 --box 64 --diameter 230 --relative_output_paths --2d


I have attached my Slurm script for inspection.

Please note that I have already read several other threads, e.g. https://github.com/warpem/warp/issues/28#issuecomment-2436353364, which suggest adding:

export no_proxy=localhost,127.0.0.1:$(hostname)
or
export no_proxy=localhost,127.0.0.1

This does not seem to remedy the problem. WarpTools was running quite nicely for a couple of months and then suddenly this issue started to occur.
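For completeness, here is roughly where the export sits in my sbatch script. This is only a sketch: the SBATCH headers are placeholders, and the comma-separated no_proxy form follows the usual proxy-variable convention rather than anything Warp-specific.

```shell
#!/bin/bash
#SBATCH --job-name=warp_export   # placeholder header
#SBATCH --nodes=1                # placeholder header

# Bypass any site proxy for loopback and for the compute node itself.
# no_proxy entries are comma-separated; $(hostname) resolves on the
# allocated node at run time.
export no_proxy=localhost,127.0.0.1,$(hostname)
export NO_PROXY=$no_proxy   # some tools only read the uppercase form

echo "no_proxy=$no_proxy"
# WarpTools ts_export_particles ...   (actual command as in the log above)
```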

Best,
Dylan Noone

37_export_relion_pytom_15793_relion_2D.sh

Sylvain Trepout

Dec 15, 2025, 3:48:24 PM
to Warp
Hi everyone,

I have exactly the same error.
I do not use job submission to start my Warp commands.

I have two different Warp environments on our cluster (dev34 and dev36), and both output the same error.
I was working on a new dataset, so I thought something was wrong with it, but if I go back to a previous project and try to re-extract the particles, the error is also there.

I recently installed dev36. I am surprised that dev34 is also not working, because I haven't updated it.

Thanks for any suggestion.
Cheers,
Sylvain

Sylvain Trepout

Dec 18, 2025, 6:03:20 PM
to Warp
Hi,

Since my last message, I have been trying several things on several computers and Warp environments.
It seems my problem originated from heavy I/O load on the cluster.
This was slowing down file access, and since the IT guys partially resolved the issue, I can extract particles as before.

@Dylan, hope your issue is as trivial as mine.

Cheers,
Sylvain

Dylan Noone

Dec 19, 2025, 3:48:33 AM
to Warp
Hey Sylvain,

Thanks for your input.

Out of interest, how did your cluster managers get around the heavy I/O?

Best,
Dylan

Dylan Noone

Dec 20, 2025, 7:52:41 AM
to Warp

I managed to get it working.

We’re using a cluster with a head node (not uncommon, I know), and jobs are normally submitted via sbatch to one of the many other nodes (as in my original script above).

If I use salloc to reserve an entire node and then SSH into that node to run the jobs locally, everything works as expected.

It’s a bit inconvenient, but it does seem to align with what you were suggesting as well.

It is strange that this is a Warp-specific problem; RELION, cryoSPARC, etc. all run fine using the normal sbatch.
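To spell the workaround out (node name, time limit, and project path are placeholders from my setup, not tested values). These steps are meant to be typed by hand from the head node, so this sketch only prints them:

```shell
#!/bin/bash
# Sketch of the interactive workaround: reserve a whole node with salloc,
# ssh to it, and run the export script locally on that node.
steps="salloc --nodes=1 --exclusive --time=04:00:00   # reserve a whole node
squeue -u \$USER -o %N                          # note the allocated node, e.g. node042
ssh node042                                     # log into that node
cd /path/to/project                             # placeholder project path
bash 37_export_relion_pytom_15793_relion_2D.sh  # run WarpTools locally"
printf '%s\n' "$steps"
```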

Sylvain Trepout

Dec 20, 2025, 6:09:10 PM
to Warp
Hi Dylan,

Great that you found a solution.

In my case, there was an error on the cluster filesystem, so they had to run a heavy I/O check of all files, which slowed everything else down.
If the system is busy or if the connection to another node takes a bit of time, then the error will pop up.

Maybe it is possible to change the default delay when building Warp.
Judging from the stack trace above, the timeout lives in WarpLib/WorkerWrapper.cs (the timeoutMilliseconds argument of ListenForPort), but I might be wrong.

Cheers,
Sylvain
