On 06/15/2022 02:48 PM, Tina Friedrich wrote:
> Hi Guillaume,
>
Hi Tina,
> in that example you wouldn't need the 'srun' to run more than one task,
> I think.
>
You are correct. To start a program like sleep I could simply run:
sleep 20s &
sleep 30s &
wait
However, my objective is to use mpirun in combination with srun to avoid
having to define a rankfile manually.
>
> I'm not 100% sure, but to me it sounds like you're currently assigning
> whole nodes to jobs rather than cores (i.e have
> 'SelectType=select/linear' and no OverSubscribe) and find that to be
> wasteful - is that correct?
>
In my first email I copied parts of my slurm.conf. I'm using
"SelectType=select/cons_res"
with
"SelectTypeParameters=CR_Core_Memory"
Until now I had no OverSubscribe setting. I tried to activate
"OverSubscribe=YES" on the partition with
PartitionName=short Nodes=node[01-08] Default=NO MaxTime=0-02:00:00 State=UP DefaultTime=00:00:00 MinNodes=1 PriorityTier=100 OverSubscribe=YES
But it did not solve the issue with
srun -vvv --exact -n1 -c1 sleep 20 > srun1.log 2>&1 &
srun -vvv --exact -n1 -c1 sleep 30 > srun2.log 2>&1 &
wait
> If it is, I'd say the more obvious solution to that would be to change
> the SelectType to either select/cons_res or select/cons_tres, so that
> cores (not nodes) are allocated to jobs?
>
How can I be sure that my Slurm installation is actually using the
"select/cons_res" setting defined in my /etc/slurm/slurm.conf?
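One check I can think of (only a sketch, not verified on my cluster) is to
compare the value in the file with what the running controller reports
through "scontrol show config". The function names conf_select_type and
running_select_type below are hypothetical helpers of my own, not part of
Slurm, and the sed pattern assumes the key is written exactly as
"SelectType" in the file.

```shell
#!/bin/sh
# Sketch: compare SelectType in slurm.conf with what slurmctld reports.
# Both function names are hypothetical helpers, not Slurm commands.

# Print the last SelectType value set in a slurm.conf-style file.
# Assumes the key is spelled exactly "SelectType"; it does not match
# SelectTypeParameters and it skips commented-out lines.
conf_select_type() {
    sed -n 's/^[[:space:]]*SelectType[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" | tail -n 1
}

# Print the SelectType* lines reported by the running controller.
# Needs scontrol in PATH and a reachable slurmctld.
running_select_type() {
    scontrol show config | grep -i '^SelectType'
}

# Usage on a cluster node (not executed here):
#   conf_select_type /etc/slurm/slurm.conf
#   running_select_type
```

If the two values differ, my guess would be that slurmctld was not
restarted after the file was edited, but I am not sure about that.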
Thanks a lot
Guillaume