Alison
The error message indicates that there are no resources to execute jobs. Since you haven’t defined any compute nodes you will get this error.
I would suggest that you create at least one compute node. Once you do that, this error should go away.
Jeff
From: Alison Peterson via slurm-users <slurm...@lists.schedmd.com>
Sent: Tuesday, April 9, 2024 2:52 PM
To: slurm...@lists.schedmd.com
Subject: [slurm-users] Nodes required for job are down, drained or reserved
Alison
Can you provide the output of the following commands:
and the job command that you're trying to run?
Alison
The sinfo output shows that your head node is down due to some configuration error.
Are you running slurmd on the head node? If slurmd is running, find its log file and pass along the entries from it.
Can you redo the scontrol command? “node name” should be “nodename”, one word.
I need to see what’s in the test.sh file to get an idea of how your job is set up.
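A rough sketch of how to check this from a shell on the head node. The helper name slurmd_logfile is made up for illustration, and the sketch assumes "scontrol show config" reports a SlurmdLogFile line of the form "SlurmdLogFile = /path"; if no file is configured, the entries may be in syslog instead.

```shell
#!/bin/sh
# Hypothetical helper: read "scontrol show config" output on stdin and
# print the value of the SlurmdLogFile line, if there is one.
slurmd_logfile() {
    awk -F' *= *' '$1 == "SlurmdLogFile" { print $2 }'
}

# Only touch Slurm when its tools are actually installed on this host.
if command -v scontrol >/dev/null 2>&1; then
    # Is a slurmd process running on this host?
    if pgrep -x slurmd >/dev/null 2>&1; then
        echo "slurmd is running"
    else
        echo "slurmd is not running"
    fi

    # Show the last entries of the slurmd log, if a readable file is configured.
    log=$(scontrol show config | slurmd_logfile)
    if [ -n "$log" ] && [ -r "$log" ]; then
        tail -n 50 "$log"
    else
        echo "no readable slurmd log file ('$log'); check syslog instead"
    fi
fi
```

The last 50 lines are usually enough to show why slurmd refused to start or register.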
jeff
Alison
In your case, since you are using head as both a Slurm management node and a compute node, you’ll need to set up slurmd on the head node.
Once slurmd is running, use “sinfo” to see what the status of the node is. Most likely it will be down, hopefully without an asterisk. If that’s the case, then use
scontrol update node=head state=resume
and then check the status again. Hopefully the node will show idle, meaning that it should be ready to accept jobs.
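The check-then-resume steps above could be sketched roughly like this. node_action is a made-up helper, not part of Slurm, and the node name head is taken from this thread. The sketch only prints the suggested scontrol command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: map a compact sinfo state string to a next step.
# A trailing "*" in sinfo output means the node is not responding.
node_action() {
    case "$1" in
        idle*)  echo "ready" ;;          # node can accept jobs
        down\*) echo "check-slurmd" ;;   # down and not responding
        down)   echo "resume" ;;         # down but responding
        *)      echo "inspect" ;;        # anything else: look closer
    esac
}

NODE="${NODE:-head}"

# Only query Slurm when sinfo is actually installed on this host.
if command -v sinfo >/dev/null 2>&1; then
    # -h: no header, -n: restrict to this node, %t: compact node state
    state=$(sinfo -h -n "$NODE" -o '%t' | head -n 1)
    action=$(node_action "$state")
    echo "$NODE is '$state' -> $action"
    if [ "$action" = "resume" ]; then
        echo "suggested: scontrol update node=$NODE state=resume"
    fi
fi
```

If the state comes back with an asterisk, resuming the node is unlikely to help until slurmd on that node is reachable.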
Jeff
Alison
I’m glad I was able to help. Good luck.