[slurm-users] srun: error: Unable to allocate resources: Invalid partition name specified


vale...@cbpf.br

Jul 26, 2018, 11:58:04 AM
to slurm...@lists.schedmd.com
Hi all,

I don't understand why this occurs!

user: john
group: courseit
partition: course

[john@master ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
course up infinite 8 idle node[02-04,06,09-12]

/etc/group
courseit:x:1002:john

/etc/passwd
john:x:1001:1002::/home/john:/bin/bash

/etc/slurm/slurm.conf
PartitionName=course Nodes=node[02-04,06,09-12] AllowGroups=courseit Default=YES MaxTime=INFINITE State=UP


[john@master ~]$ srun -N3 -l /bin/hostname
srun: error: Unable to allocate resources: User's group not permitted to use this partition

And if I add -p course, it works:

[john@master ~]$ srun -p course -N3 -l /bin/hostname
2: node04
1: node03
0: node02

Does anyone have an idea?

Thanks in advance!
Valeriana

Michael Robbert

Jul 26, 2018, 1:14:34 PM
to slurm...@lists.schedmd.com
The line you list from your slurm.conf sets the "course" partition as
the default partition. On our system, though, the sinfo command shows
the default partition with a * at the end of its name, and your output
doesn't show that, so I'm wondering whether another partition is being
defined as the default.

Can you post the full output of 'sinfo -a'? The output of 'grep -i
^Partition /etc/slurm/slurm.conf' would help us debug as well.

Mike Robbert
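
The check described above can be scripted. This is a minimal sketch, assuming a POSIX shell with sed and relying on sinfo's `%P` format field, which appends `*` to the default partition's name. The function name `default_from_sinfo` is made up for illustration and is not part of Slurm.

```shell
# Print the name of the partition that sinfo flags as the default.
# Reads partition names (one per line) on stdin, where the default
# partition carries a trailing '*'.
default_from_sinfo() {
    # -n suppresses normal output; the 'p' flag prints only the lines
    # where a trailing '*' was found and stripped.
    sed -n 's/\*$//p'
}

# Intended use on a cluster (not run here). The -a flag matters because
# sinfo otherwise hides partitions the user's group cannot use:
#   sinfo -a -h -o '%P' | default_from_sinfo
```

If this prints a partition the user's group is not allowed to use, a plain srun without -p will likely fail the way it does in the original post.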

Lachlan Musicman

Jul 26, 2018, 8:08:43 PM
to Slurm User Community List
On 27 July 2018 at 03:13, Michael Robbert <mrob...@mines.edu> wrote:
> The line that you list from your slurm.conf shows the "course" partition being set as the default partition, but on our system the sinfo command shows our default partition with a * at the end and your output doesn't show that so I'm wondering if you've got another partition that is getting defined as the default partition.
>
> Can you post the full output of 'sinfo -a' and maybe the output of 'grep -i ^Partition /etc/slurm/slurm.conf' would help us debug as well.

I've done this in the past as well. As Michael has noted, the most likely scenario is that more than one PartitionName line in your slurm.conf has Default=YES.

Only the last listed (I think) will be Default.

Cheers
L.
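
A quick way to spot this situation is to list every partition that sets Default=YES. This is a minimal sketch, assuming a POSIX shell with awk and assuming each PartitionName definition sits on a single line of the file. The function name `list_default_partitions` is hypothetical.

```shell
# List every partition in a slurm.conf-style file that sets Default=YES.
# Assumes each PartitionName definition sits on a single line.
list_default_partitions() {
    awk 'tolower($1) ~ /^partitionname=/ {
        name = $1
        sub(/^[^=]*=/, "", name)          # strip the "PartitionName=" key
        for (i = 2; i <= NF; i++)
            if (tolower($i) == "default=yes") print name
    }' "$1"
}

# Intended use (path taken from the thread):
#   list_default_partitions /etc/slurm/slurm.conf
```

More than one line of output means several partitions are competing to be the default.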

vale...@cbpf.br

Jul 27, 2018, 9:21:19 AM
to slurm...@lists.schedmd.com
Hi Merlin

[root@masters3 ~]# scontrol show partition
PartitionName=course
AllowGroups=courseit AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=node[02-04,06,09-12]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=64 TotalNodes=8 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

PartitionName=test
AllowGroups=testcluster AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=node[01,05,07,08]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=32 TotalNodes=4 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

/etc/slurm/slurm.conf
PartitionName=course Nodes=node[02-04,06,09-12] AllowGroups=courseit Default=NO MaxTime=INFINITE State=UP
PartitionName=test Nodes=node[01,05,07,08] AllowGroups=testcluster Default=YES MaxTime=INFINITE State=UP


> Do you accidentally have more than one partition with Default=YES?
I did. I changed it to NO and I still get the same error.

Thanks!!!


Quoting Merlin Hartley <mer...@mrc-mbu.cam.ac.uk>:

> Do you accidentally have more than one partition with Default=YES?
>
>
> --
> Merlin Hartley
> Computer Officer
> MRC Mitochondrial Biology Unit
> University of Cambridge
> Cambridge, CB2 0XY
> United Kingdom
----- End of forwarded message -----

vale...@cbpf.br

Jul 27, 2018, 9:35:12 AM
to slurm...@lists.schedmd.com
Hi Merlin

> Do you accidentally have more than one partition with Default=YES?
I did. I changed it to NO and I still get the same error.

[root@master ~]# scontrol show partition
PartitionName=course
AllowGroups=courseit AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=node[02-04,06,09-12]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=64 TotalNodes=8 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

PartitionName=test
AllowGroups=testcluster AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=node[01,05,07,08]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=32 TotalNodes=4 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

/etc/slurm/slurm.conf
PartitionName=course Nodes=node[02-04,06,09-12] AllowGroups=curseit Default=NO MaxTime=INFINITE State=UP
PartitionName=test Nodes=node[01,05,07,08] AllowGroups=testcluster Default=YES MaxTime=INFINITE State=UP

If I have a lot of partitions, how can I set a different default
partition for each group?

In my slurm.conf file, I think I have to set Default=YES on the main
partition of each group.

For example:

/etc/slurm/slurm.conf

PartitionName=course Nodes=node[02-04,06,09-12] AllowGroups=curseit Default=YES MaxTime=INFINITE State=UP
PartitionName=courset Nodes=node[13-20] AllowGroups=curseit Default=NO MaxTime=INFINITE State=UP

PartitionName=test Nodes=node[01,05,07,08] AllowGroups=testcluster Default=YES MaxTime=INFINITE State=UP
PartitionName=testc Nodes=node[21-30] AllowGroups=testcluster Default=NO MaxTime=INFINITE State=UP

Thanks!!!

Valeriana

Brian Andrus

Jul 27, 2018, 11:00:19 AM
to slurm...@lists.schedmd.com
You show you still have more than one partition with Default=YES.

There should be one and only one partition set to YES.
That is the partition used when none is specified.

Brian Andrus
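
Since Default=YES can only pick one partition for the whole cluster, a per-group default has to come from somewhere else. One possible workaround, not proposed by anyone in this thread, is to set the partition in each user's login environment: srun, sbatch and salloc read SLURM_PARTITION, SBATCH_PARTITION and SALLOC_PARTITION respectively. Below is a minimal sketch, assuming a POSIX shell and a hypothetical profile script such as /etc/profile.d/slurm_partition.sh. The function name is made up, and the group-to-partition mapping is taken from the example in the thread.

```shell
# Hypothetical profile snippet: pick a default partition from the
# user's Unix groups. The mapping below is an assumption based on the
# group and partition names shown in the thread.
default_partition_for_groups() {
    for g in "$@"; do
        case "$g" in
            courseit)    echo course; return 0 ;;
            testcluster) echo test;   return 0 ;;
        esac
    done
    return 1    # no known group: leave the cluster-wide default alone
}

# 'id -Gn' prints the current user's group names separated by spaces;
# it is left unquoted on purpose so each name becomes one argument
# (this would break on group names that contain spaces).
if part=$(default_partition_for_groups $(id -Gn)); then
    export SLURM_PARTITION="$part"    # read by srun
    export SBATCH_PARTITION="$part"   # read by sbatch
    export SALLOC_PARTITION="$part"   # read by salloc
fi
```

An explicit -p on the command line still takes precedence over these variables, so users can override the per-group choice when they need to.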