[slurm-users] How to assign temporary priority bonuses or penalties?


Luke Yeager

Dec 10, 2020, 12:21:22 PM
to slurm...@lists.schedmd.com

(originally posted at https://bugs.schedmd.com/show_bug.cgi?id=10322)

There are some great tools for assigning discounts or penalties to jobs before they are allocated resources (QOS.UsageFactor, Partition.TRESBillingWeights, etc.).

But what if I want to change the cost of a job after the fact? I might want to avoid penalizing users who spent their allocated resources on jobs that failed for reasons outside their control (hardware failure, parallel FS glitch, etc.). Or I might want to charge extra for jobs that require node reboots to clean up afterwards. Either way, I want to be able to adjust how the job affects the user's current fairshare priority for queued jobs.

Are there any existing solutions for this?

The only solutions I've found so far are:

  1. 'sacctmgr modify ... set RawUsage=0' - obviously this is too big of a hammer. I only want to edit a single job, and I might want to *increase* the usage for the job - not decrease it.
  2. For clusters using "banking" (limits on TRESMins and PriorityDecayHalfLife=0), you can essentially accomplish this by editing the limit after the fact (increasing the limit for a refund, decreasing it for a penalty). See https://github.com/jcftang/slurm-bank/blob/master/src/sbank-refund, for example. But we don't use that accounting strategy at our site. And that seems a little sketchy anyway since you’d need to remember to reset the limits back to their intended values at each usage reset.
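For sites that do use the banking approach in option 2, the adjustment amounts to simple arithmetic on the limit. The following is only a rough sketch, not the sbank-refund implementation: it assumes the limit is a CPU-minute cap set through GrpTRESMins on the account association, and the function names and the account name "physics" are hypothetical. It only builds the sacctmgr command string and does not run it.

```python
import shlex


def adjusted_limit(current_limit_min: int, job_cpu_min: int, refund: bool = True) -> int:
    """Return the new CPU-minute limit after refunding or penalizing one job.

    A refund raises the limit by the job's usage; a penalty lowers it.
    The result is clamped at zero so the limit never goes negative.
    """
    delta = job_cpu_min if refund else -job_cpu_min
    return max(0, current_limit_min + delta)


def sacctmgr_command(account: str, new_limit_min: int) -> str:
    """Build (but do not execute) a sacctmgr call that sets the new limit.

    Assumption: the site enforces a cpu-minute cap via GrpTRESMins on the
    account; adapt the TRES name and entity to the local setup.
    """
    args = [
        "sacctmgr", "-i", "modify", "account",
        "where", f"name={account}",
        "set", f"GrpTRESMins=cpu={new_limit_min}",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # Hypothetical numbers: a 100000-minute limit and a failed job
    # that consumed 2400 CPU-minutes.
    new_limit = adjusted_limit(current_limit_min=100_000, job_cpu_min=2_400)
    print(sacctmgr_command("physics", new_limit))
```

As noted in option 2, any limit changed this way would have to be restored to its intended value at the next usage reset.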

The official answer I got on the bug is “I don't think what you are looking for is possible with Slurm at the moment.” I’m posting here in the hope that someone else has a creative solution. How do y’all handle this?

Thanks!

Luke

Search keywords: priority bump refund penalty accounting

Alex Chekholko

Dec 10, 2020, 12:59:00 PM
to Slurm User Community List
Hi Luke,

Yes, I think your request is unusual.

I believe there have been a number of middleware packages in the past that helped with this kind of bureaucracy, things like http://docs.adaptivecomputing.com/gold/

Regards,
Alex