Hi
I am having problems with my jobs terminating because of 'Node fail'.
For example, I get this message in the e-mail alert:
Run time 01:09:50, NODE_FAIL, ExitCode 0
Does this have anything to do with my job, or is it a problem with the nodes? It happens to different jobs, and the time at which they fail seems random.
Cheers
Palle
SLURM Job_id=1340896 Name=CV Failed, Run time 00:07:57, NODE_FAIL, ExitCode 0
SLURM Job_id=1338030 Name=binaryDMU Failed, Run time 00:11:36, NODE_FAIL, ExitCode