[slurm-users] job requesting licenses would not be scheduled as expected


刘文晓

Mar 15, 2022, 2:54:54 AM
to slurm...@schedmd.com
Hi there,
A job requesting licenses is not being scheduled as I expected.
In my local environment (Slurm 19.05) I have 2 compute nodes (2 CPUs per node) and 40 fluent licenses.
I get the same result with either of the following configurations:
  Licenses=fluent:40
  SchedulerType=sched/backfill
  PriorityType=priority/multifactor
or
  Licenses=fluent:40
  SchedulerType=sched/builtin
  PriorityType=priority/basic

The test steps are:
1. Submit a job requesting 35 licenses and 1 CPU; it runs.
2. Submit a job requesting 7 licenses and 1 CPU; it stays pending (reason: licenses), with a higher priority than the job in step 3.
3. Submit a job requesting 1 license and 1 CPU; it runs.

In my view this is not right: the 7-license job has the higher priority, so it should run before the 1-license job.
Is my understanding correct?

thanks





 

Brian Andrus

Mar 15, 2022, 12:23:31 PM
to slurm...@lists.schedmd.com

Depending on other variables, it is fine.

The 7-license job cannot run because only 5 licenses are available (40 total minus the 35 in use), so it has to wait.
Since 5 are available, the 1-license job can run, so it does.

That is the simple view. Other variables such as job time could affect that.
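
The simple view can be sketched as a toy model in Python. This is only an illustration of the arithmetic, not Slurm's actual scheduling code: the function name `schedule`, the job names, and the tuple layout are all hypothetical, and the total of 4 CPUs is taken from the 2 nodes with 2 CPUs each described in the original question. Job run time, which could change the outcome, is ignored.

```python
def schedule(jobs, total_licenses, total_cpus):
    """Toy model: walk jobs in priority order and start every job that fits.

    `jobs` is a list of (name, licenses, cpus) tuples, assumed to be
    already sorted from highest to lowest priority.
    """
    free_licenses = total_licenses
    free_cpus = total_cpus
    states = {}
    for name, licenses, cpus in jobs:
        if licenses <= free_licenses and cpus <= free_cpus:
            # The job fits, so it starts and consumes resources.
            free_licenses -= licenses
            free_cpus -= cpus
            states[name] = "RUNNING"
        else:
            # The job does not fit; the scheduler moves on to the next one.
            states[name] = "PENDING"
    return states


# The three jobs from the question, in priority order (hypothetical names).
jobs = [("job35", 35, 1), ("job7", 7, 1), ("job1", 1, 1)]
states = schedule(jobs, total_licenses=40, total_cpus=4)
print(states)
```

In this model the 7-license job stays pending while the 1-license job starts, matching what was observed.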

Brian Andrus

Williams, Gareth (IM&T, Black Mountain)

Mar 15, 2022, 5:48:13 PM
to Slurm User Community List

This is what I would expect. It doesn’t matter that the 7-license job has higher priority, as it is ineligible to run due to the lack of licenses. The scheduler moves on and prioritises (and starts) the next eligible job.

 

I will suggest a workaround.  You could periodically run a script that looks for jobs requesting licenses (running and queued) and have the script make decisions and take action. I think the only easy action that is relevant in this case is to hold or release jobs. You might want to submit jobs with a hold (or do that via a submit plugin) to avoid newly submitted jobs sneaking past held jobs before your script takes action.
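
A minimal sketch of the decision part of such a script, in Python, might look like the following. Everything here is an assumption for illustration: the `Job` record, the simplified `"HELD"` state, and the `plan_actions` function are hypothetical, and the policy (hold every license job queued behind the first one that cannot get its licenses) is just one possible choice. The sketch only returns a list of actions; a real script would have to gather the job data from the scheduler and apply the actions with `scontrol hold` and `scontrol release`. It also assumes the original priority of a held job is remembered somewhere, since Slurm reports a held job as pending with priority 0.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    state: str      # "RUNNING", "PENDING" or "HELD" (simplified, hypothetical)
    priority: int   # original priority, assumed to be remembered for held jobs
    licenses: int   # number of licenses requested


def plan_actions(jobs, total_licenses):
    """Return a list of ("hold" | "release", job_id) decisions."""
    in_use = sum(j.licenses for j in jobs if j.state == "RUNNING")
    free = total_licenses - in_use
    waiting = sorted(
        (j for j in jobs if j.state in ("PENDING", "HELD") and j.licenses > 0),
        key=lambda j: j.priority,
        reverse=True,
    )
    actions = []
    blocked = False
    for job in waiting:
        if blocked:
            # A higher-priority job is waiting for licenses: keep this one back.
            if job.state == "PENDING":
                actions.append(("hold", job.job_id))
            continue
        # This job is at the front of the line, so it must not stay held.
        if job.state == "HELD":
            actions.append(("release", job.job_id))
        if job.licenses <= free:
            free -= job.licenses
        else:
            # First job that cannot get its licenses; everything behind it waits.
            blocked = True
    return actions


# The situation from the original question (job IDs and priorities are made up).
queue = [
    Job(101, "RUNNING", 300, 35),
    Job(102, "PENDING", 200, 7),
    Job(103, "PENDING", 100, 1),
]
actions = plan_actions(queue, total_licenses=40)
print(actions)
```

For the queue from the question this plans a hold on the 1-license job, so it cannot start ahead of the 7-license job. As noted above, newly submitted jobs can still slip through between two runs of the script unless they are submitted held.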

 

By the way, if only cores and nodes were considered, the highest-priority job would reserve resources so that it runs as soon as possible, and lower-priority jobs could backfill around it. It would be complicated to make backfill consider licenses too and I doubt it has been or will be done – but I’ve not actually checked code or documentation.

 

Gareth
