[slurm-users] Use all cores when submitting to heterogeneous nodes

Richard Ems

Mar 22, 2022, 10:30:42 AM
to slurm...@schedmd.com
Hi all,

I am looking for an option to use all cores when submitting to heterogeneous nodes.
In this case I have 2 partitions:
part1:  #N1 nodes, each node has 40 cores
part2:  #N2 nodes, each node has 48 cores

I want to submit to both partitions, requesting a number of nodes, and then have Slurm set
--ntasks=40*#nodes
 or
--ntasks=48*#nodes
depending on which partition it selects.
Can this be done?

An option similar to --ntasks=USE_ALL_CORES would be great.
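For context, without such an option the choice has to be made at submit time with two separate commands, e.g. (job.sh is a placeholder):

    sbatch --partition=part1 --nodes=2 --ntasks=80 job.sh   # 40 cores/node x 2 nodes
    sbatch --partition=part2 --nodes=2 --ntasks=96 job.sh   # 48 cores/node x 2 nodes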

Many thanks,
Richard

--
Richard Ems     /     aiduit     /     r....@aiduit.com

Tina Friedrich

Mar 22, 2022, 10:43:21 AM
to slurm...@lists.schedmd.com
Hi Richard,

...what's wrong with using '--exclusive'? I mean, if you want all the
cores on the node anyway, wouldn't requesting it exclusively be pretty
much the same thing?
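Something along these lines, using the partitions from your example
(job.sh is a placeholder):

    sbatch --exclusive --nodes=2 --partition=part1,part2 job.sh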

Tina

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk

Brian Andrus

Mar 22, 2022, 11:18:28 AM
to slurm...@lists.schedmd.com

You are putting the cart before the horse here. While you can get access to all of a node using --exclusive, when you request cores you will not know whether you have more. For example, you request 80 cores and land on a 40-core and a 48-core node with exclusive access. You would need to do some sort of discovery to find out what is available versus what you asked for. With --exclusive, the request becomes more like "I want at least X cores" and the answer is "OK, here are X cores or more".

Within your script, you could check the total core count, for example by running 'srun lscpu' and parsing the output.
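A rough sketch of that discovery step (note that lscpu counts logical
CPUs, so the numbers include hyper-threads if those are enabled):

    #!/bin/bash
    #SBATCH --exclusive
    #SBATCH --nodes=2
    #SBATCH --partition=part1,part2

    # Run lscpu once per allocated node and sum the per-node CPU counts.
    total_cpus=$(srun --ntasks-per-node=1 lscpu | awk '$1 == "CPU(s):" { sum += $2 } END { print sum }')
    echo "total CPUs in allocation: ${total_cpus}"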

Brian Andrus

Baer, Troy

Mar 22, 2022, 11:20:36 AM
to Slurm User Community List
Requesting --exclusive and then using $SLURM_CPUS_ON_NODE to determine the number of tasks or threads to use inside the job script would be my recommendation.
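A minimal sketch of that approach; SLURM_CPUS_ON_NODE holds the CPU
count of the node running the batch script, which works here because
each of the example partitions is homogeneous ('./my_app' is a
placeholder):

    #!/bin/bash
    #SBATCH --exclusive
    #SBATCH --nodes=2
    #SBATCH --partition=part1,part2

    # Per-node CPU count as seen on the batch host; both example
    # partitions are internally homogeneous, so it holds on every node.
    srun --ntasks-per-node="${SLURM_CPUS_ON_NODE}" ./my_app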

--Troy

Richard Ems

Mar 22, 2022, 11:45:33 AM
to Slurm User Community List
Hi all,

Thanks for your comments and suggestions.
Using --exclusive does not solve my issue, because I need ntasks to be set by Slurm to the maximum possible.
What I want is to request a fixed number of nodes with --nodes=N and --ntasks=MAX, so that Slurm provides the N nodes and then sets ntasks to the maximum possible for the nodes selected.
For example, if I request 2 nodes and get them from the 40-core partition, Slurm should set ntasks to 80. If the nodes come from the second partition with 48 cores per node, Slurm should set ntasks to 96.
The request seems a bit odd, but I am running commercial software (Altair) that uses "scontrol show ..." output to set the number of processes/tasks to start, and I cannot manipulate that in my script.
But I will probably have to stop telling Altair that I am using Slurm, so it does not call "scontrol show ...", and create all the parallel settings myself from Slurm's environment variables, as Troy suggested.
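Something like the following is what I would end up with (a sketch,
relying on the fact that a multi-partition submission still runs in
exactly one partition and that each partition is homogeneous;
'./solver' is a placeholder):

    #!/bin/bash
    #SBATCH --exclusive
    #SBATCH --nodes=2
    #SBATCH --partition=part1,part2

    # Total tasks = per-node CPUs (uniform across the allocation) times
    # the number of allocated nodes.
    ntasks=$(( SLURM_CPUS_ON_NODE * SLURM_JOB_NUM_NODES ))
    srun --ntasks="${ntasks}" ./solver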

Thanks,
Richard