On 07/29/2015 09:57 PM, Trevor Gale wrote:
> I recently rebooted one of my nodes, and when it came back up slurm was running fine, but when I run “sinfo” I see that its state is set to down. When I run scontrol show node compute0 it says that the reason is “unexpectedly rebooted”.
I have the same problem. According to the slurm.conf man page (Slurm
14.11.8), a node that I reboot using 'scontrol reboot_nodes <node>'
should be returned to normal use, but instead it stays down (Reason=Node
unexpectedly rebooted).
I should not have to set ReturnToService=2 for this, right?
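For reference, a down node can be put back into service by hand. A
minimal sketch, assuming the node name compute0 from the quoted message:

```shell
# Show the node's current state and the reason it was marked down
scontrol show node compute0

# Clear the DOWN state so the node accepts jobs again
scontrol update NodeName=compute0 State=RESUME
```

This only clears the state after the fact; it does not explain why the
reboot was treated as unexpected.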
Thanks,
Robbert
--
Robbert Eggermont Intelligent Systems
R.Egg...@tudelft.nl Electr.Eng., Mathematics & Comp.Science
+31 15 27 83234 Delft University of Technology