[slurm-users] sinfo history

Steve Kirk via slurm-users

Jul 28, 2025, 12:55:25 PM
to slurm-users list
Hi,

Am I correct in thinking that the history of a *node* as shown by sinfo
isn't stored anywhere by Slurm?

I'm interested to know whether Slurm can tell me when a node was
draining, drained, etc. in the past.

Regards,
Steve

--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com

Paul Edmon via slurm-users

Jul 28, 2025, 1:00:31 PM
to slurm...@lists.schedmd.com
Correct. We run Prometheus collectors that pull node state so we can
graph it over time.

https://github.com/fasrc/prometheus-slurm-exporter

-Paul Edmon-

Michael Gutteridge via slurm-users

Jul 28, 2025, 1:00:34 PM
to slurm-users list
Hi

I think the events you're looking for would be tracked in the events tables in the accounting database:

sacctmgr show event where node=<nodename>

 -- Michael
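
For anyone who wants to reuse the command above, here is a minimal sketch of a wrapper. The function name node_events is hypothetical and not part of Slurm. It assumes the start= condition described in the sacctmgr man page; as far as I recall, without it sacctmgr only goes back to 00:00:00 of the previous day, so older events need an explicit start time. Please check both points against your Slurm version.

```shell
# Hypothetical helper (not part of Slurm): list recorded events for one node.
# Usage: node_events <nodename> [start-time]
# The start time uses Slurm's usual format, e.g. 2025-07-01 or 2025-07-01T08:00.
node_events() {
    ne_node=$1
    ne_start=${2:-}
    if [ -z "$ne_node" ]; then
        echo "usage: node_events <nodename> [start-time]" >&2
        return 2
    fi
    if [ -n "$ne_start" ]; then
        # Assumption: start= widens the reporting window beyond the default.
        sacctmgr show event where nodes="$ne_node" start="$ne_start"
    else
        sacctmgr show event where nodes="$ne_node"
    fi
}
```

Calling node_events with a node name and a date would then list the down and drained periods the accounting database holds for that node from that date onwards.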

Christopher Samuel via slurm-users

Jul 28, 2025, 8:19:16 PM
to slurm...@lists.schedmd.com
On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:

> I think the events you're looking for would be tracked in the events
> tables in the accounting database:

Be aware that down and drainED nodes are there, but not drainING.

So (unless something has changed in 25.05) until a draining node is
empty of jobs it doesn't get recorded in slurmdbd's events table.

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
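
Since drainING is not recorded in the events table, one way to capture it is to sample sinfo periodically, in the same spirit as the Prometheus approach mentioned earlier in the thread. The following is only a sketch under assumptions: the function name log_draining_nodes and the log file layout are made up for illustration, and it relies on the standard sinfo options -h, -N, -t and -o with the %N (node), %T (state) and %E (reason) format fields.

```shell
# Hypothetical helper (not part of Slurm): append a timestamped snapshot of
# nodes currently in the DRAINING state to a log file.
# Usage: log_draining_nodes <logfile>
log_draining_nodes() {
    ld_logfile=${1:-}
    if [ -z "$ld_logfile" ]; then
        echo "usage: log_draining_nodes <logfile>" >&2
        return 2
    fi
    ld_now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # -h: no header, -N: one line per node, -t: filter by node state.
    # sort -u drops duplicates for nodes that belong to several partitions.
    sinfo -h -N -t draining -o "%N %T %E" | sort -u |
    while IFS= read -r ld_line; do
        printf '%s %s\n' "$ld_now" "$ld_line"
    done >> "$ld_logfile"
}
```

Run from cron or a systemd timer, this would give a coarse history of draining nodes whose resolution is the sampling interval, so short draining periods between samples would be missed.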

Ole Holm Nielsen via slurm-users

Jul 29, 2025, 3:00:57 AM
to slurm...@lists.schedmd.com
On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:
> On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:
>
>> I think the events you're looking for would be tracked in the events
>> tables in the accounting database:

Thanks, "sacctmgr show event where node=<nodename>" is extremely useful
for monitoring nodes, and I wasn't aware of this command. I've added some
further examples to my Wiki page now at
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-events

> Be aware that down and drainED nodes are there, but not drainING.
>
> So (unless something has changed in 25.05) until a draining node is empty
> of jobs it doesn't get recorded in slurmdbd's events table.

So the sacctmgr manual page is not quite correct when it states "event:
Events like downed or draining nodes on clusters." I've opened a ticket
https://support.schedmd.com/show_bug.cgi?id=23337 suggesting a
documentation update.

Best regards,
Ole

Ole Holm Nielsen via slurm-users

Jul 29, 2025, 4:43:14 AM
to slurm...@lists.schedmd.com
On 7/29/25 08:58, Ole Holm Nielsen wrote:
> On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:
>> On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:
> Thanks, "sacctmgr show event where node=<nodename>" is extremely useful
> for monitoring nodes, and I wasn't aware of this command.  I've added some
> further examples to my Wiki page now at
> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-events

If you're interested in a general node status, I've added the "sacctmgr
show event" command to my shownode script:
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/shownode

/Ole

Steve Kirk via slurm-users

Aug 11, 2025, 8:26:47 AM
to slurm...@lists.schedmd.com
On Tue, 2025-07-29 at 08:58 +0200, Ole Holm Nielsen via slurm-users
wrote:
> On 7/29/25 02:17, Christopher Samuel via slurm-users wrote:
> > On 7/28/25 9:58 am, Michael Gutteridge via slurm-users wrote:
> >
> > > I think the events you're looking for would be tracked in the
> > > events
> > > tables in the accounting database:
>
> Thanks, "sacctmgr show event where node=<nodename>" is extremely
> useful
> for monitoring nodes, and I wasn't aware of this command.  I've added
> some
> further examples to my Wiki page now at
> https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_operations/#listing-node-events

Thanks for the replies; I was also not aware of that command and now
feel like I should have read the documentation better! That wiki is
also a nice resource.

> > Be aware that down and drainED nodes are there, but not drainING.

Noted; I think down and drained will give me what I'm looking for. We
do have monitoring across our whole cluster that likely has this
information, but this gives me something I can use quickly from within
the cluster.

Cheers,
Steve