[slurm-users] monitor draining/drain nodes

Rodrigo Santibáñez

Jun 12, 2021, 4:30:24 PM
to Slurm User Community List
Hi SLURM users,

Does anyone have a cronjob or similar to monitor and warn via e-mail when a node is in draining/drain status?

Thank you.

Best regards.
Rodrigo Santibáñez

Fulcomer, Samuel

Jun 12, 2021, 4:47:07 PM
to Slurm User Community List
...something like "sinfo | grep drain && mail -s 'drain nodes' <recipient address> "

...will work...

Substitute "draining" or "drained" for "drain" to taste...

Fulcomer, Samuel

Jun 12, 2021, 4:53:37 PM
to Slurm User Community List
...sorry... "sinfo | grep drain && sinfo | grep drain | mail -s 'drain nodes' <recipient address> "
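
A minimal sketch of the one-liner above as a script that cron could run. The recipient address, the function names, and the `sinfo` format string (`%N %T`, node name and state) are assumptions for illustration, not something taken from this thread.

```shell
#!/bin/sh
# Hypothetical recipient; replace with a real address.
RECIPIENT="${RECIPIENT:-admin@example.com}"

# Keep only the lines of sinfo-style output (read from stdin) that mention
# "drain". This also matches "draining" and "drained".
drain_lines() {
    grep -i 'drain' || true
}

# Query Slurm once and mail the result only if something matched.
report_drain() {
    report=$(sinfo -N -h -o '%N %T' | drain_lines)
    if [ -n "$report" ]; then
        printf '%s\n' "$report" | mail -s 'drain nodes' "$RECIPIENT"
    fi
}

# Only run the check where sinfo is installed.
if command -v sinfo >/dev/null 2>&1; then
    report_drain
fi
```

Capturing the output in a variable means `sinfo` runs once instead of twice, and no mail is sent when no node is draining or drained.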


Marcus Boden

Jun 14, 2021, 1:51:42 AM
to slurm...@lists.schedmd.com
Hi,

Slurm provides the strigger[1] utility for that. You can set it up to
automatically send mails when nodes go into drain.

Best,
Marcus

[1] https://slurm.schedmd.com/strigger.html
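
A rough sketch of what such a trigger could look like, modeled on the node-down example in the strigger documentation. The script path `/usr/local/sbin/notify_drained`, the recipient address, and the helper name `drain_subject` are hypothetical. Slurm is assumed to pass the affected node names as arguments to the program.

```shell
#!/bin/sh
# Hypothetical notify script, e.g. saved as /usr/local/sbin/notify_drained.
# Register it once as the SlurmUser; PERM keeps the trigger after it fires:
#   strigger --set --node --drained --flags=PERM \
#            --program=/usr/local/sbin/notify_drained

# Hypothetical recipient; replace with a real address.
RECIPIENT="${RECIPIENT:-admin@example.com}"

# Build the mail subject from the node names passed as arguments.
drain_subject() {
    printf 'Nodes drained: %s' "$*"
}

# Send the notification only when node names were passed and mail exists.
if [ "$#" -gt 0 ] && command -v mail >/dev/null 2>&1; then
    printf 'Slurm reported drained nodes: %s\n' "$*" |
        mail -s "$(drain_subject "$@")" "$RECIPIENT"
fi
```

`strigger --get` lists the registered triggers, which is a quick way to confirm the registration took effect.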
--
Marcus Vincent Boden, M.Sc.
Arbeitsgruppe eScience, HPC-Team
Tel.: +49 (0)551 201-2191, E-Mail: mbo...@gwdg.de

Ole Holm Nielsen

Jun 14, 2021, 8:08:00 AM
to slurm...@lists.schedmd.com
On 6/14/21 7:50 AM, Marcus Boden wrote:
> Slurm provides the strigger[1] utility for that. You can set it up to
> automatically send mails when nodes go into drain.

I provide some Slurm triggers examples in
https://github.com/OleHolmNielsen/Slurm_tools/tree/master/triggers

> On 12.06.21 22:29, Rodrigo Santibáñez wrote:
>> Hi SLURM users,
>>
>> Does anyone have a cronjob or similar to monitor and warn via e-mail when a
>> node is in draining/drain status?

/Ole

Rodrigo Santibáñez

Jun 14, 2021, 2:53:12 PM
to Slurm User Community List
Thank you Marcus, Ole and Samuel.

Regarding Samuel's answer, I added ifne from moreutils before mail to not have empty emails.
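
A sketch of the `ifne` variant, with a plain-shell fallback for hosts where moreutils is not installed. The wrapper name `run_if_nonempty` is hypothetical, and the fallback only approximates `ifne`: it treats input consisting solely of newlines as empty.

```shell
# Run the given command on stdin only when stdin is non-empty.
# Uses ifne from moreutils when available; otherwise buffers the input
# in a shell variable and checks it by hand.
run_if_nonempty() {
    if command -v ifne >/dev/null 2>&1; then
        ifne "$@"
    else
        input=$(cat)
        if [ -n "$input" ]; then
            printf '%s\n' "$input" | "$@"
        fi
    fi
}

# Intended use (recipient left as a placeholder, as in the thread):
#   sinfo | grep drain | run_if_nonempty mail -s 'drain nodes' <recipient address>
```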

Regarding strigger, I don't know how to become the slurm user. "su slurm" complains "This account is currently not available.". The user "slurm" exists and is the SlurmUser.

Best,

Marcus Boden

Jun 15, 2021, 1:30:32 AM
to slurm...@lists.schedmd.com
I think your slurm user has /sbin/nologin as the shell in
/etc/passwd. Try `su -s /bin/bash slurm`.
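
One way to check this before switching users, as a small sketch. The helper name `passwd_shell` is hypothetical; the login shell is the seventh colon-separated field of a passwd entry.

```shell
# Print the login shell (7th colon-separated field) of a passwd-format
# entry read from stdin.
passwd_shell() {
    cut -d: -f7
}

# Check the real account database (prints nothing if the user is absent):
#   getent passwd slurm | passwd_shell
#
# If that prints /sbin/nologin or similar, override the shell for one session:
#   su -s /bin/bash slurm
```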

Best,
Marcus

Rodrigo Santibáñez

Jun 15, 2021, 7:08:53 PM
to Slurm User Community List
Thanks, Marcus. You're right.

Best regards.