[slurm-users] slurmdbd and slurmctld prevent alma9 login


Pols, Maarten via slurm-users

Apr 23, 2026, 4:31:56 AM
to John Hearns via slurm-users
Dear Community,
 
Our Slurm cluster has been running without any issues for several months on version 25.05.3.
 
Last Friday, we experienced a power outage which required us to restart the server. After the restart, we were unable to log in to the master node. Eventually, we managed to access the system via a safe mode workaround. Through a process of elimination, we identified the slurmdbd and slurmctld services as the root cause of the issue.
 
Would you happen to have any idea what might have caused this behavior?
 
We have since upgraded to version 25.11.5, which appears to be running smoothly. However, we would still like to understand the underlying cause of the problem.
 
Thank you in advance for your help.
 
Kind regards,
Maarten

Ole Holm Nielsen via slurm-users

Apr 28, 2026, 5:06:41 AM
to slurm...@lists.schedmd.com
Hi Maarten,

On 4/23/26 09:33, Pols, Maarten via slurm-users wrote:
> Last Friday, we experienced a power outage which required us to restart
> the server. After the restart, we were unable to log in to the master
> node. Eventually, we managed to access the system via a safe mode
> workaround. Through a process of elimination, we identified the slurmdbd
> and slurmctld services as the root cause of the issue.
> Would you happen to have any idea what might have caused this behavior?
> We have since upgraded to version 25.11.5, which appears to be running
> smoothly. However, we would still like to understand the underlying cause
> of the problem.

Console or SSH login to a server should not in any way be related to the
slurmctld/slurmdbd daemons.

Maybe one of the server's filesystems had become full? A full root (/) or
/tmp filesystem could prevent logins on any Linux system, because files
need to be written to /tmp.
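
A quick way to rule this out is to look at how full the relevant filesystems are. The following is only a minimal sketch using Python's standard library; the function names, the paths checked, and the 95% threshold are assumptions for illustration, not anything Slurm provides. Note that `shutil.disk_usage` reports blocks only, so a filesystem that has run out of inodes would not show up here.

```python
import shutil


def usage_percent(path: str) -> float:
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def nearly_full(paths, threshold=95.0):
    """Return {path: percent} for filesystems at or above `threshold` percent.

    The threshold is an arbitrary illustrative value.
    """
    report = {}
    for path in paths:
        pct = usage_percent(path)
        if pct >= threshold:
            report[path] = round(pct, 1)
    return report
```

Calling `nearly_full(["/", "/tmp"])` on the head node returns an empty dict when neither filesystem is close to full.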

IHTH,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark

--
slurm-users mailing list -- slurm...@lists.schedmd.com
To unsubscribe send an email to slurm-us...@lists.schedmd.com

John Hearns via slurm-users

Apr 28, 2026, 10:13:23 AM
to Pols, Maarten, John Hearns via slurm-users
You did look carefully at the logs?

If you were starting the services manually you can use journalctl to echo the log in a separate terminal.

In the old days I would have said use tail -f, but that shows my age.
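
For reference, following the journal for both daemons could look roughly like the sketch below. It only builds and runs a `journalctl` command line; the helper names are made up for illustration, and the unit names `slurmdbd` and `slurmctld` are assumed to match the installed systemd units. Looking at a previous boot with `--boot=-1` only works if the journal is stored persistently.

```python
import subprocess


def journalctl_cmd(units, follow=True, boot=None):
    """Build a journalctl command line for the given systemd units.

    boot=0 restricts output to the current boot, boot=-1 to the previous one.
    """
    cmd = ["journalctl", "--no-pager"]
    for unit in units:
        cmd += ["-u", unit]  # -u may be given several times
    if boot is not None:
        cmd.append(f"--boot={boot}")
    if follow:
        cmd.append("-f")  # keep printing new entries, like tail -f
    return cmd


def follow_slurm_logs():
    """Stream slurmdbd and slurmctld journal entries until interrupted (Ctrl-C)."""
    subprocess.run(journalctl_cmd(["slurmdbd", "slurmctld"]), check=False)
```

Running `follow_slurm_logs()` in one terminal while starting the services in another shows their messages as they appear.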

Pols, Maarten via slurm-users

May 6, 2026, 5:39:43 AM
to John Hearns, John Hearns via slurm-users

Dear John,

 

We don't see any strange warnings in the logs that could explain this.

The disks aren't full either.

We have now created a crontab entry that starts slurmdbd and slurmctld 1 minute after the server boots, and that works fine.

It seems that somehow, services are waiting for each other, which in turn prevents logging in.

I don't see any logic in this and couldn't find anything in the Slurm logs or in messages.
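
One way to look into the "services waiting for each other" idea is to ask systemd which units a service is ordered after and before. The sketch below is an assumption about how one might do that, not a diagnosis of this cluster: it runs `systemctl show` with the `After` and `Before` properties and splits the `Key=value` lines it prints. The function names are hypothetical.

```python
import subprocess


def parse_ordering(show_output: str) -> dict:
    """Parse `Key=unit1 unit2 ...` lines as printed by `systemctl show -p ...`."""
    ordering = {}
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without a Key=value shape
            ordering[key] = value.split()
    return ordering


def unit_ordering(unit: str) -> dict:
    """Ask systemd which units `unit` is ordered After and Before."""
    result = subprocess.run(
        ["systemctl", "show", "-p", "After", "-p", "Before", unit],
        capture_output=True, text=True, check=True,
    )
    return parse_ordering(result.stdout)
```

Calling `unit_ordering("slurmctld.service")` and `unit_ordering("slurmdbd.service")` on the head node would list what each daemon is ordered after and before, which may help to spot an unexpected ordering against the units involved in login.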

 

Kind regards,
Maarten

 

 

From: John Hearns <hea...@gmail.com>
Sent: Tuesday, 28 April 2026 15:35
To: Pols, Maarten <po...@hkv.nl>
CC: John Hearns via slurm-users <slurm...@lists.schedmd.com>
Subject: Re: [slurm-users] slurmdbd and slurmctld prevent alma9 login

 
