[slurm-users] Time spent in PENDING/Priority

Chip Seraphine

Dec 7, 2023, 3:09:49 PM

to Slurm User Community List

Hi all,

I am trying to find some good metrics for our Slurm cluster, and I want them to reflect a factor that is very important to users: how long they had to wait because resources were unavailable. This is a key metric for us because it is a decent approximation of how much life could be improved if we had more capacity, so it would be an important consideration when doing growth planning, setting user expectations, etc. So we are specifically interested in how long jobs were in the PENDING state for reason Priority.

Unfortunately, I’m finding that this is difficult to pull out of squeue or the accounting data. My first thought was that I could simply subtract SubmitTime from EligibleTime (or StartTime), but that includes time spent in expected ways, e.g. waiting while an array chugs along. The delta between StartTime and EligibleTime does not reflect the time spent PENDING at all, so it’s not useful either.

I can grab some of my own metrics by polling squeue or the REST interface, I suppose, but those will be less accurate, more work, and will not allow me to see my past history. I was wondering if there was something I was missing that someone on the list has figured out? Perhaps some existing bit of accounting data that can tell me how long a job was stuck behind other jobs?

--

Chip Seraphine
Grid Operations
For support please use help-grid in email or slack.

Groner, Rob

Dec 7, 2023, 3:26:38 PM

to Slurm User Community List

Ya, I'm kinda looking at exactly this right now as well. For us, I know we're under-utilizing our hardware currently, but I still want to know if the number of pending jobs is growing because that would probably point to something going wrong somewhere. It's a good metric to have.

We are going the route of using pyslurm/graphite/grafana to get our answers. I know there is also a prometheus slurm data tool/grafana dashboards that might work just as well.

With pyslurm, I end up with an array of all current jobs and can then grab my metrics as needed. We currently measure the "queue" time by comparing when the job was submitted vs. current time, as long as the job is Pending. Once it's running, then the time spent in the queue is start time minus submit time.
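The queue-time rule described above could be sketched roughly as follows. This is only an illustration in plain Python: the function name and arguments are hypothetical and are not pyslurm's API, so the state and timestamps would have to be pulled out of whatever job records your tooling returns.

```python
from datetime import datetime
from typing import Optional


def queue_seconds(state: str, submit: datetime,
                  start: Optional[datetime], now: datetime) -> Optional[float]:
    """Return seconds spent queued, following the rule in the message above.

    Hypothetical helper: 'state' is assumed to be a Slurm state name such as
    "PENDING" or "RUNNING"; it does not mirror any particular library's fields.
    """
    if state == "PENDING":
        # Still waiting: queue time so far is "now" minus submit time.
        return (now - submit).total_seconds()
    if start is not None:
        # Job has started: queue time is start time minus submit time.
        return (start - submit).total_seconds()
    # Never started (e.g. cancelled while pending): no start-based value.
    return None
```

Note that this measures all waiting since submission, whatever the pending reason was.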

You could view the job Reason to determine if it is for Resources, or for QOS limits, etc. I kinda only care about Resource-related pending, but we could also use the QOS/group CPU limit-related pending as a way to show users that if they purchased more CPU time, they'd wait much less.

Some of what I'm saying is hypothetical, we aren't actually graphing queue time yet, or at least, not like I want to. But that is how I plan to go about it.

Rob

Ryan Novosielski

Dec 7, 2023, 3:28:00 PM

to Slurm User Community List

I can’t quite answer the question, but I know that Open XDMoD does provide a field that gives this exact information, so they must have a formula they are using. They use exclusively the accounting database, AFAIK.

--
#BlackLivesMatter

____
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novo...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark
`'

Chip Seraphine

Dec 7, 2023, 3:34:23 PM

to Slurm User Community List

We use Prometheus as our primary metric tool, and I recently added a metric for jobs in PENDING for the specific reason of “Priority”. So we’ll have some nice data when we are preparing for FY 2025, I suppose. The problem is that for this past year we are stuck with what Slurm gathered, unless I can find a better way to determine whether the PD reason is “Priority” than “run a query on an active job and see”.
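For anyone wanting to try something similar, a metric like this might be built roughly as below. This is a sketch under assumptions, not the exporter described above: it assumes squeue is run with `--format=%i|%r` (job ID and pending reason), and the metric name `slurm_pending_jobs` is made up for illustration.

```python
from collections import Counter

# Assumed invocation; the output could be captured with, for example,
# subprocess.run(SQUEUE_CMD, capture_output=True, text=True, check=True).stdout
SQUEUE_CMD = ["squeue", "--noheader", "--states=PENDING", "--format=%i|%r"]


def count_pending_by_reason(squeue_text):
    """Count pending jobs per reason from 'jobid|reason' lines."""
    counts = Counter()
    for line in squeue_text.splitlines():
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or unexpected lines
        _job_id, reason = line.split("|", 1)
        counts[reason.strip()] += 1
    return counts


def to_prometheus(counts, metric="slurm_pending_jobs"):
    """Render the counts in Prometheus text exposition format.

    The metric name is hypothetical; pick whatever fits your naming scheme.
    """
    lines = [f"# TYPE {metric} gauge"]
    for reason in sorted(counts):
        lines.append(f'{metric}{{reason="{reason}"}} {counts[reason]}')
    return "\n".join(lines) + "\n"
```

The rendered text could be written to a file for node_exporter's textfile collector or served by a small exporter. Either way it only captures data from the moment polling begins.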

To put things another way, what I am trying to find out is: for a given past job, is there any way to determine how long its start was delayed due to lack of available resources?

Oren Shani

Dec 10, 2023, 12:32:35 AM

to Slurm User Community List

Hi Chip,

As others already answered, there is no full solution for this problem, because SLURM does not record the breakdown of the wait time into the various states and causes of waiting. As far as I know, the best thing you can do is to consider just StartTime - EligibleTime as the actual wait time. You are correct that this still includes some expected waiting, but this expected waiting time is usually very short. So what I do is look for jobs that have a relatively long period between EligibleTime and StartTime, and then I try to correlate that with other factors, such as how many free resources were available at that time.
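A rough sketch of that search over accounting data, assuming the input comes from something like `sacct -a -X -n -P -S <start> -E <end> --format=JobID,Submit,Eligible,Start` and that timestamps use sacct's default ISO-style format (a custom SLURM_TIME_FORMAT would need a different parser). The function names and the one-hour threshold are illustrative only.

```python
from datetime import datetime, timedelta


def parse_time(field):
    """Parse a sacct timestamp; return None for values like 'Unknown'."""
    try:
        return datetime.strptime(field, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None


def long_waits(sacct_text, threshold=timedelta(hours=1)):
    """Return (job_id, wait) pairs where Start - Eligible >= threshold.

    Expects 'JobID|Submit|Eligible|Start' lines (sacct -P output).
    The threshold default is an arbitrary example value.
    """
    results = []
    for line in sacct_text.splitlines():
        parts = line.strip().split("|")
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        job_id, _submit, eligible, start = parts
        t_eligible, t_start = parse_time(eligible), parse_time(start)
        if t_eligible is None or t_start is None:
            continue  # job never became eligible or never started
        wait = t_start - t_eligible
        if wait >= threshold:
            results.append((job_id, wait))
    # Longest waits first, so the worst cases are easy to inspect.
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

The resulting list only says which jobs waited a long time after becoming eligible. Working out why still means correlating those windows with cluster load by hand.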