Hello,
I'm going to describe a problem I have detected in my SLURM cluster. If I configure a partition with a "TimeLimit" of, for example, 15 minutes and a user later submits a job that requests a bigger "TimeLimit" (for example, 20 minutes), the job remains in the PENDING state because the TimeLimit requested by the user is bigger than the one configured in the queue. My question is: is there any way to force the partition TimeLimit onto the job if the user requests a bigger value?
Thanks.
Hello,
In your case, the 15-minute partition "TimeLimit" is a default value. It should only apply if the user has not specified a time limit for the job in the sbatch script or srun command, has specified a value lower than the partition default, or has specified it incorrectly.
From: slurm-users <slurm-use...@lists.schedmd.com>
On Behalf Of Gestió Servidors
Sent: Thursday, December 2, 2021 8:18 AM
To: slurm...@lists.schedmd.com
Subject: [EXT] [slurm-users] TimeLimit parameter
Hi,
Answering between lines...
> Hi;
>
> The EnforcePartLimits parameter in slurm.conf should be set to ALL or ANY
> to enforce the time limit for the partition.
>
> Regards.
>
> Ahmet M.
I have not configured "EnforcePartLimits" in my slurm.conf file, so I suppose that my SLURM is running with the default value "NO". My job will therefore be accepted and remain queued until the partition limits are altered (as the SLURM documentation says, and as I have seen too ;) )
>I look at it this way (so it makes sense):
>
>It goes into a pending state because it is possible for the time to become available (you could just run a command that increases the
>timelimit), so it is waiting for that to happen. This is useful because some users may have a job that does indeed need to run that long, but they have to let you know so that you can allow it.
>
>If you do not want ANY jobs to queue up if they are asking for more time than is available, you can add some code to the job_submit.lua
>
>Here is a snippet from mine:
>
>  if time_limit > part_max_time then
>    slurm.log_info("job from uid %d with request for more than max_time: Denying.", job_desc.user_id)
>    slurm.log_user("You cannot request more than %s minutes in partition %s!!", part_max_time, partition)
>    return slurm.ESLURM_INVALID_TIME_LIMIT
>  end
>
>The time_limit, part_max_time and partition variables are mapped from job_desc and part_list
>
>Brian Andrus
I think modifying the "lua" script is a good solution. Could I do something like this?
if time_limit > part_max_time then
  slurm.log_info("job from uid %d with request for more than max_time: Reconfiguring your job.", job_desc.user_id)
  slurm.log_user("You cannot request more than %s minutes in partition %s!!", part_max_time, partition)
  time_limit=part_max_time
end
return slurm.ESLURM_INVALID_TIME_LIMIT
In other words: if a user submits a job with a TimeLimit bigger than the partition TimeLimit, assign the partition TimeLimit to his/her job.
Thanks.
With regards to the lua scripting, yes, you can modify a job and
set the time limit to something if it is not appropriate.
Treydock has one that does exactly that (among other things) at
https://gist.github.com/treydock/b964c5599fd057b0aa6a#file-job_submit-lua
You will have to distill the parts you need out of it.
Brian Andrus
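
For reference, here is a minimal sketch of the "clamp instead of deny" idea discussed in this thread. It is not taken from any poster's script. It assumes that slurmctld provides the global "slurm" table, that part_list is keyed by partition name, that each partition entry exposes a max_time field in minutes, and that an unset job time limit is reported as slurm.NO_VAL; please check these against the job_submit/lua plugin of your SLURM version. The helper name clamp_time_limit is made up for this example. Two things differ from the snippet proposed above: the new value is written to job_desc.time_limit (changing a local variable has no effect on the job), and the function returns slurm.SUCCESS, since returning slurm.ESLURM_INVALID_TIME_LIMIT unconditionally would reject every job.

```lua
-- Sketch only: lower a job's time limit to the partition maximum
-- instead of leaving the job pending or rejecting it.
-- Assumed (verify for your version): part_list[name].max_time, slurm.NO_VAL.

-- Hypothetical helper, not part of the SLURM API.
local function clamp_time_limit(job_desc, part_list)
   local partition = job_desc.partition
   if partition == nil then
      return  -- no partition requested; this sketch leaves the job alone
   end

   local part = part_list[partition]
   if part == nil or part.max_time == nil then
      return  -- unknown name or comma-separated list; not handled here
   end

   local time_limit = job_desc.time_limit
   if time_limit == nil or time_limit == slurm.NO_VAL then
      return  -- user did not request a time limit
   end

   local part_max_time = part.max_time
   if time_limit > part_max_time then
      slurm.log_info("job from uid %d with request for more than max_time: lowering to %d minutes.",
                     job_desc.user_id, part_max_time)
      slurm.log_user("You requested more than %s minutes in partition %s; your time limit was lowered to %s minutes.",
                     part_max_time, partition, part_max_time)
      -- Write the change back to the job description.
      job_desc.time_limit = part_max_time
   end
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   clamp_time_limit(job_desc, part_list)
   return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

With a 15-minute partition, a job submitted with a 20-minute limit would be accepted with a 15-minute limit and the user would see the message from slurm.log_user. Jobs that request less time, or no time at all, are not touched. If I remember correctly, the sample job_submit.lua shipped with SLURM also ends with a top-level "return slurm.SUCCESS"; add it if your installation expects it.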