[slurm-users] Unable to submit job (ReqNodeNotAvail, UnavailableNodes)

JP Ebejer

Nov 7, 2023, 4:13:51 AM
to slurm...@lists.schedmd.com
Hi there,

First of all, apologies for the rather verbose email.

Newbie here, wanting to set up a minimal Slurm cluster on Debian 12. I installed slurm-wlm (22.05.8) on the head node and slurmd (also 22.05.8) on the compute node via apt. I have one head node, one compute node, and one partition.
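
For completeness, the install was nothing more than the stock Debian packages, roughly:

# On the head node (controller, node daemon and client tools)
sudo apt install slurm-wlm

# On the compute node (just the node daemon)
sudo apt install slurmd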

I have written the simplest of jobs (slurm_hello_world.sh):

#!/bin/env bash
#SBATCH --job-name=hello_word    # Job name
#SBATCH --output=hello_world_%j.log   # Standard output and error log

echo "Hello world, I am running on node $HOSTNAME"
sleep 5
date


I submit it with sbatch slurm_hello_world.sh, but the jobs just sit in the queue:

$ squeue --long -u $USER
Tue Nov 07 08:37:58 2023
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
                 7 all_nodes hello_wo  myuser  PENDING       0:00 UNLIMITED      1 (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)
                 9 all_nodes hello_wo  myuser  PENDING       0:00 UNLIMITED      1 (ReqNodeNotAvail, UnavailableNodes:compute-0)


sinfo shows that the node is drained (even though the node is idle and running nothing):

$ sinfo --Node --long
Tue Nov 07 08:29:51 2023
NODELIST   NODES  PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              
compute-0        1 all_nodes*     drained 32      2:8:2  60000        0      1   (null) batch job complete f



The slurm.conf (an exact copy on both the head and compute nodes) is below; most of it is commented-out defaults:

#
# Example slurm.conf file. Please run configurator.html
# (in doc/html) to build a configuration file customized
# for your environment.
#
#
# slurm.conf file generated by configurator.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ClusterName=mycluster
SlurmctldHost=head
#SlurmctldHost=
#
#DisableRootJobs=NO
#EnforcePartLimits=NO
#Epilog=
#EpilogSlurmctld=
#FirstJobId=1
#MaxJobId=67043328
#GresTypes=
#GroupUpdateForce=0
#GroupUpdateTime=600
#JobFileAppend=0
#JobRequeue=1
#JobSubmitPlugins=lua
#KillOnBadExit=0
#LaunchType=launch/slurm
#Licenses=foo*4,bar
#MailProg=/bin/mail
#MaxJobCount=10000
#MaxStepCount=40000
#MaxTasksPerNode=512
MpiDefault=none
#MpiParams=ports=#-#
#PluginDir=
#PlugStackConfig=
#PrivateData=jobs
ProctrackType=proctrack/cgroup
#Prolog=
#PrologFlags=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#RebootProgram=
ReturnToService=1
SlurmctldPidFile=/var/run/slurm/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
#SrunEpilog=
#SrunProlog=
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
#TaskEpilog=
TaskPlugin=task/affinity
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFS=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
#UsePAM=0
#
#
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
InactiveLimit=0
KillWait=30
#MessageTimeout=10
#ResvOverRun=0
MinJobAge=300
#OverTimeLimit=0
SlurmctldTimeout=120
SlurmdTimeout=300
#UnkillableStepTimeout=60
#VSizeFactor=0
Waittime=0
#
#
# SCHEDULING
#DefMemPerCPU=0
#MaxMemPerCPU=0
#SchedulerTimeSlice=30
SchedulerType=sched/backfill
SelectType=select/cons_tres
#
#
# JOB PRIORITY
#PriorityFlags=
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=
#PriorityCalcPeriod=
#PriorityFavorSmall=
#PriorityMaxAge=
#PriorityUsageResetPeriod=
#PriorityWeightAge=
#PriorityWeightFairshare=
#PriorityWeightJobSize=
#PriorityWeightPartition=
#PriorityWeightQOS=
#
#
# LOGGING AND ACCOUNTING
#AccountingStorageEnforce=0
#AccountingStorageHost=
#AccountingStoragePass=
#AccountingStoragePort=
AccountingStorageType=accounting_storage/none
#AccountingStorageUser=
#AccountingStoreFlags=
#JobCompHost=
#JobCompLoc=
#JobCompPass=
#JobCompPort=
JobCompType=jobcomp/none
#JobCompUser=
#JobContainerType=
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=debug3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=debug3
SlurmdLogFile=/var/log/slurm/slurmd.log
#SlurmSchedLogFile=
#SlurmSchedLogLevel=
#DebugFlags=
#
#
# POWER SAVE SUPPORT FOR IDLE NODES (optional)
#SuspendProgram=
#ResumeProgram=
#SuspendTimeout=
#ResumeTimeout=
#ResumeRate=
#SuspendExcNodes=
#SuspendExcParts=
#SuspendRate=
#SuspendTime=
#
#
# COMPUTE NODES
NodeName=compute-0 RealMemory=60000 Sockets=2 CoresPerSocket=8 ThreadsPerCore=2 State=UNKNOWN
PartitionName=all_nodes Nodes=ALL Default=YES MaxTime=INFINITE State=UP
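
A useful cross-check for the COMPUTE NODES section, for reference: slurmd -C on the compute node prints the hardware it detects, in slurm.conf syntax, and it should agree with the NodeName line above (a minimal sketch; a mismatch between the two can cause the node to drain):

# Run on compute-0: print detected Sockets/CoresPerSocket/ThreadsPerCore/RealMemory
# and compare with the NodeName=compute-0 line in slurm.conf
sudo slurmd -C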


To my untrained eye, there is nothing obviously wrong in slurmd.log (compute) and slurmctld.log (head). In slurmctld.log:

[...SNIP...]
[2023-11-07T08:58:35.804] debug2: sched: JobId=10. unable to schedule in Partition=all_nodes (per _failed_partition()). Retaining previous scheduling Reason=ReqNodeNotAvail. Desc=ReqNodeNotAvail, UnavailableNodes:compute-0. Priority=4294901753.
[2023-11-07T08:58:36.396] debug:  sched/backfill: _attempt_backfill: beginning
[2023-11-07T08:58:36.396] debug:  sched/backfill: _attempt_backfill: 4 jobs to backfill
[2023-11-07T08:58:36.652] debug2: Processing RPC: REQUEST_SUBMIT_BATCH_JOB from UID=1002
[2023-11-07T08:58:36.652] debug3: _set_hostname: Using auth hostname for alloc_node: head
[2023-11-07T08:58:36.652] debug3: JobDesc: user_id=1002 JobId=N/A partition=(null) name=hello_word
[2023-11-07T08:58:36.652] debug3:    cpus=1-4294967294 pn_min_cpus=-1 core_spec=-1
[2023-11-07T08:58:36.652] debug3:    Nodes=4294967294-[4294967294] Sock/Node=65534 Core/Sock=65534 Thread/Core=65534
[2023-11-07T08:58:36.652] debug3:    pn_min_memory_job=18446744073709551615 pn_min_tmp_disk=-1
[2023-11-07T08:58:36.653] debug3:    immediate=0 reservation=(null)
[2023-11-07T08:58:36.653] debug3:    features=(null) batch_features=(null) cluster_features=(null) prefer=(null)
[2023-11-07T08:58:36.653] debug3:    req_nodes=(null) exc_nodes=(null)
[2023-11-07T08:58:36.653] debug3:    time_limit=-1--1 priority=-1 contiguous=0 shared=-1
[2023-11-07T08:58:36.653] debug3:    kill_on_node_fail=-1 script=#!/bin/env bash
#SBATCH --job-name=hello...
[2023-11-07T08:58:36.653] debug3:    argv="/home/myuser/myuser-slurm/tests/hello_world_slurm.sh"
[2023-11-07T08:58:36.653] debug3:    environment=SHELL=/bin/bash,LANGUAGE=en_GB:en,EDITOR=vim,...
[2023-11-07T08:58:36.653] debug3:    stdin=/dev/null stdout=/home/myuser/myuser-slurm/tests/hello_world_%j.log stderr=(null)
[2023-11-07T08:58:36.653] debug3:    work_dir=/home/myuser/ansible-slurm/tests alloc_node:sid=head:721
[2023-11-07T08:58:36.653] debug3:    power_flags=
[2023-11-07T08:58:36.653] debug3:    resp_host=(null) alloc_resp_port=0 other_port=0
[2023-11-07T08:58:36.653] debug3:    dependency=(null) account=(null) qos=(null) comment=(null)
[2023-11-07T08:58:36.653] debug3:    mail_type=0 mail_user=(null) nice=0 num_tasks=-1 open_mode=0 overcommit=-1 acctg_freq=(null)
[2023-11-07T08:58:36.653] debug3:    network=(null) begin=Unknown cpus_per_task=-1 requeue=-1 licenses=(null)
[2023-11-07T08:58:36.653] debug3:    end_time= signal=0@0 wait_all_nodes=-1 cpu_freq=
[2023-11-07T08:58:36.653] debug3:    ntasks_per_node=-1 ntasks_per_socket=-1 ntasks_per_core=-1 ntasks_per_tres=-1
[2023-11-07T08:58:36.653] debug3:    mem_bind=0:(null) plane_size:65534
[2023-11-07T08:58:36.653] debug3:    array_inx=(null)
[2023-11-07T08:58:36.653] debug3:    burst_buffer=(null)
[2023-11-07T08:58:36.653] debug3:    mcs_label=(null)
[2023-11-07T08:58:36.653] debug3:    deadline=Unknown
[2023-11-07T08:58:36.653] debug3:    bitflags=0x1e000000 delay_boot=4294967294
[2023-11-07T08:58:36.654] debug2: found 1 usable nodes from config containing compute-0
[2023-11-07T08:58:36.654] debug3: _pick_best_nodes: JobId=11 idle_nodes 1 share_nodes 1
[2023-11-07T08:58:36.654] debug2: select/cons_tres: select_p_job_test: evaluating JobId=11
[2023-11-07T08:58:36.654] debug2: select/cons_tres: select_p_job_test: evaluating JobId=11
[2023-11-07T08:58:36.654] debug3: select_nodes: JobId=11 required nodes not avail
[2023-11-07T08:58:36.654] _slurm_rpc_submit_batch_job: JobId=11 InitPrio=4294901752 usec=822
[2023-11-07T08:58:38.807] debug:  sched: Running job scheduler for default depth.
[2023-11-07T08:58:38.807] debug3: sched: JobId=7. State=PENDING. Reason=Resources. Priority=4294901756. Partition=all_nodes.
[2023-11-07T08:58:38.807] debug2: sched: JobId=8. unable to schedule in Partition=all_nodes (per _failed_partition()). Retaining previous scheduling Reason=ReqNodeNotAvail. Desc=ReqNodeNotAvail, UnavailableNodes:compute-0. Priority=4294901755.
[2023-11-07T08:58:38.807] debug2: sched: JobId=9. unable to schedule in Partition=all_nodes (per _failed_partition()). Retaining previous scheduling Reason=ReqNodeNotAvail. Desc=ReqNodeNotAvail, UnavailableNodes:compute-0. Priority=4294901754.
[2023-11-07T08:58:38.807] debug2: sched: JobId=10. unable to schedule in Partition=all_nodes (per _failed_partition()). Retaining previous scheduling Reason=ReqNodeNotAvail. Desc=ReqNodeNotAvail, UnavailableNodes:compute-0. Priority=4294901753.
[2023-11-07T08:58:38.807] debug2: sched: JobId=11. unable to schedule in Partition=all_nodes (per _failed_partition()). Retaining previous scheduling Reason=ReqNodeNotAvail. Desc=ReqNodeNotAvail, UnavailableNodes:compute-0. Priority=4294901752.
[2023-11-07T08:58:39.008] debug3: create_mmap_buf: loaded file `/var/spool/slurmctld/job_state` as buf_t
[2023-11-07T08:58:39.008] debug3: Writing job id 12 to header record of job_state file


Can you help me figure out what is wrong with my setup please?

Many thanks
Jean-Paul Ebejer
University of Malta


Diego Zuccato

Nov 7, 2023, 4:35:48 AM
to slurm...@lists.schedmd.com
On 07/11/2023 10:12, JP Ebejer wrote:

> sinfo shows that the node is drained (but this node is idle and has no
> processing)
>
> $ sinfo --Node --long
> Tue Nov 07 08:29:51 2023
> NODELIST   NODES  PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK
> WEIGHT AVAIL_FE REASON
> compute-0        1 all_nodes*     drained 32      2:8:2  60000        0
>      1   (null) batch job complete f
You have to RESUME the node so it starts accepting jobs.
scontrol update nodename=compute-0 state=resume
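
For example (a minimal sketch, run as the slurm user or root); scontrol show node also prints the Reason field explaining why the node was drained:

# Inspect the node record, including the Reason= for the drain
scontrol show node compute-0

# Clear the drain and return the node to service
scontrol update nodename=compute-0 state=resume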

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786

JP Ebejer

Nov 7, 2023, 5:16:55 AM
to Slurm User Community List
Hi there Diego,

Thank you for your help.

I had to use sudo to switch to the slurm user, as with myuser I got "slurm_update error: Invalid user id".

$ sudo -u slurm scontrol update nodename=compute-0 state=resume

This works (I think, as it returns no visual cue), but on running sinfo right after, the node is still "drained".

$ sinfo --Node --long
Tue Nov 07 10:08:27 2023

NODELIST   NODES  PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON              
compute-0        1 all_nodes*     drained 32      2:8:2  60000        0      1   (null) batch job complete f


In squeue I still see the jobs that failed to start:

$ squeue --long -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 9 all_nodes hello_wo   myuser PD       0:00      1 (ReqNodeNotAvail, UnavailableNodes:compute-0)
                11 all_nodes hello_wo   myuser PD       0:00      1 (ReqNodeNotAvail, UnavailableNodes:compute-0)

What am I missing here please?
--
Prof. Jean-Paul Ebejer | Associate Professor
BSc (Hons) (Melita), MSc (Imperial), DPhil (Oxon.)
Centre for Molecular Medicine and Biobanking
Office 320, Biomedical Sciences Building,
University of Malta, Msida, MSD 2080. MALTA.
T: (00356) 2340 3263
Department of Artificial Intelligence, Associate Member
Join the Bioinformatics@UM mailing list!
https://twitter.com/dr_jpe https://bitsilla.com/blog/ https://github.com/jp-um

Diego Zuccato

Nov 7, 2023, 5:34:01 AM
to slurm...@lists.schedmd.com
On 07/11/2023 11:15, JP Ebejer wrote:
> Hi there Diego,
>
> Thank you for your help.
>
> I had to use sudo to switch to the slurm user, as with myuser I got
> "slurm_update error: Invalid user id".

Ok, that's normal.

> $ sudo -u slurm scontrol update nodename=compute-0 state=resume
>
> This works (I think, as it returns no visual cue),

Still normal: silent when there's no error.

> but on running sinfo
> right after, the node is still "drained".

That's not normal :(
Look at the slurmd log on the node for a reason. Probably the node
detects an error and sets itself to drained. Another possibility is that
slurmctld detects a mismatch between the node and its config: in this
case you'll find the reason in slurmctld.log.
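
For example, something along these lines (a rough sketch; log paths as set in the slurm.conf above):

# On the compute node
grep -iE 'error|drain' /var/log/slurm/slurmd.log | tail -n 20

# On the head node
grep -iE 'error|drain|reason' /var/log/slurm/slurmctld.log | tail -n 20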

JP Ebejer

Nov 7, 2023, 11:44:33 AM
to Slurm User Community List
On Tue, 7 Nov 2023 at 11:34, Diego Zuccato <diego....@unibo.it> wrote:
On 07/11/2023 11:15, JP Ebejer wrote:
> but on running sinfo
> right after, the node is still "drained".

That's not normal :(
Look at the slurmd log on the node for a reason. Probably the node
detects an error and sets itself to drained. Another possibility is that
slurmctld detects a mismatch between the node and its config: in this
case you'll find the reason in slurmctld.log.

Ok, great. So I cleared slurmd.log on the compute-0 node and restarted the service (after changing the logging level from debug3 to verbose).
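
Roughly (assuming the stock systemd units shipped with the Debian packages):

# On compute-0, after setting SlurmdDebug=verbose in slurm.conf
sudo truncate -s 0 /var/log/slurm/slurmd.log
sudo systemctl restart slurmd

The fresh slurmd.log then shows: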

[2023-11-07T16:34:17.575] topology/none: init: topology NONE plugin loaded
[2023-11-07T16:34:17.575] route/default: init: route default plugin loaded
[2023-11-07T16:34:17.577] task/affinity: init: task affinity plugin loaded with CPU mask 0xffffffff
[2023-11-07T16:34:17.578] cred/munge: init: Munge credential signature plugin loaded
[2023-11-07T16:34:17.578] slurmd version 22.05.8 started
[2023-11-07T16:34:17.579] error:  mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:195: pmi/pmix: can not load PMIx library
[2023-11-07T16:34:17.579] error: Couldn't load specified plugin name for mpi/pmix: Plugin init() callback failed
[2023-11-07T16:34:17.579] error: MPI: Cannot create context for mpi/pmix
[2023-11-07T16:34:17.580] error:  mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:195: pmi/pmix: can not load PMIx library
[2023-11-07T16:34:17.580] error: Couldn't load specified plugin name for mpi/pmix_v4: Plugin init() callback failed
[2023-11-07T16:34:17.580] error: MPI: Cannot create context for mpi/pmix_v4

[2023-11-07T16:34:17.580] slurmd started on Tue, 07 Nov 2023 16:34:17 +0000
[2023-11-07T16:34:17.580] CPUs=32 Boards=1 Sockets=2 Cores=8 Threads=2 Memory=64171 TmpDisk=1031475 Uptime=87818 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)

I am not sure I understand this, as my MPI setting is none (MpiDefault=none) and the jobs I intend to run do not use MPI.

Could this be the cause, and how do I fix this (on Debian 12)?

Also, if I stop slurmctld, truncate its log file, and start the service again, I see similar errors:

[2023-11-07T16:40:22.888] error: Configured MailProg is invalid
[2023-11-07T16:40:22.889] slurmctld version 22.05.8 started on cluster mycluster
[2023-11-07T16:40:22.890] cred/munge: init: Munge credential signature plugin loaded
[2023-11-07T16:40:22.892] select/cons_res: common_init: select/cons_res loaded
[2023-11-07T16:40:22.892] select/cons_tres: common_init: select/cons_tres loaded
[2023-11-07T16:40:22.892] select/cray_aries: init: Cray/Aries node selection plugin loaded
[2023-11-07T16:40:22.893] preempt/none: init: preempt/none loaded
[2023-11-07T16:40:22.894] ext_sensors/none: init: ExtSensors NONE plugin loaded
[2023-11-07T16:40:22.895] error:  mpi/pmix_v4: init: (null) [0]: mpi_pmix.c:195: pmi/pmix: can not load PMIx library
[2023-11-07T16:40:22.895] error: Couldn't load specified plugin name for mpi/pmix_v4: Plugin init() callback failed
[2023-11-07T16:40:22.895] error: MPI: Cannot create context for mpi/pmix_v4

[2023-11-07T16:40:22.899] accounting_storage/none: init: Accounting storage NOT INVOKED plugin loaded
[2023-11-07T16:40:22.901] No memory enforcing mechanism configured.
[2023-11-07T16:40:22.902] topology/none: init: topology NONE plugin loaded
[2023-11-07T16:40:22.904] sched: Backfill scheduler plugin loaded
[2023-11-07T16:40:22.904] route/default: init: route default plugin loaded
[2023-11-07T16:40:22.905] Recovered state of 1 nodes
[2023-11-07T16:40:22.905] Recovered JobId=8 Assoc=0
[2023-11-07T16:40:22.905] Recovered JobId=9 Assoc=0
[2023-11-07T16:40:22.905] Recovered JobId=10 Assoc=0
[2023-11-07T16:40:22.905] Recovered JobId=11 Assoc=0
[2023-11-07T16:40:22.905] Recovered information about 4 jobs
[2023-11-07T16:40:22.906] select/cons_tres: select_p_node_init: select/cons_tres SelectTypeParameters not specified, using default value: CR_Core_Memory
[2023-11-07T16:40:22.906] select/cons_tres: part_data_create_array: select/cons_tres: preparing for 1 partitions
[2023-11-07T16:40:22.906] Recovered state of 0 reservations
[2023-11-07T16:40:22.906] State of 0 triggers recovered
[2023-11-07T16:40:22.906] read_slurm_conf: backup_controller not specified
[2023-11-07T16:40:22.906] select/cons_tres: select_p_reconfigure: select/cons_tres: reconfigure
[2023-11-07T16:40:22.906] select/cons_tres: part_data_create_array: select/cons_tres: preparing for 1 partitions
[2023-11-07T16:40:22.906] Running as primary controller
[2023-11-07T16:40:22.907] No parameter for mcs plugin, default values set
[2023-11-07T16:40:22.907] mcs: MCSParameters = (null). ondemand set.



Is this a step closer to resolution?
 


JP Ebejer

Nov 12, 2023, 5:13:01 AM
to Slurm User Community List
Ok, a step further (I hope), but I am still stuck with a non-working cluster.

I managed to solve both errors above by installing two Debian packages (sudo apt install mailutils libpmix-dev) on both the head and compute nodes.
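
That is, on both machines (roughly; followed by a restart of the respective daemons):

# mailutils provides a mail binary for MailProg; libpmix-dev pulls in the PMIx
# library that the mpi/pmix plugins were failing to load
sudo apt install mailutils libpmix-dev

# Restart so the daemons pick up the change
sudo systemctl restart slurmctld   # head node
sudo systemctl restart slurmd      # compute node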

I have no errors in the two log files, but somehow the node is still drained.

How do I get around this please?
