[slurm-users] Slurmctld and log file


Gestió Servidors

Sep 8, 2020, 5:39:48 AM
to slurm...@lists.schedmd.com

Hello,

 

I don’t know why, but my SLURM server (which is running fine) has a slurmdctl.log file of size 0 bytes, so where is it writing its logs? The log file seems to have been at 0 bytes since the logrotate run early this morning. My logrotate configuration for SLURM is this:

[root@server logrotate.d]# cat slurm
/var/log/slurmdctl.log
/var/log/slurmdbd.log
{
    rotate 7
    notifempty
    missingok
    create
    weekly
}

 

 

Gestió Servidors

Sep 8, 2020, 5:41:51 AM
to slurm...@lists.schedmd.com

Now I have run “scontrol reconfigure” and, voilà, the file /var/log/slurmdctl.log has appeared... but it doesn’t contain any log entries from between the logrotate run and the scontrol run, so I have lost that log information.

Is this a logrotate problem or a SLURM one?

 

Thanks.

Timo Rothenpieler

Sep 8, 2020, 6:18:15 AM
to slurm...@lists.schedmd.com
My slurm logrotate file looks like this:

> /var/log/slurm/*.log {
> weekly
> compress
> missingok
> nocopytruncate
> nocreate
> nodelaycompress
> nomail
> notifempty
> noolddir
> rotate 5
> sharedscripts
> size=5M
> create 640 slurm slurm
> postrotate
> systemctl reload slurmd > /dev/null 2>&1 || true
> systemctl reload slurmdbd > /dev/null 2>&1 || true
> systemctl reload slurmctld > /dev/null 2>&1 || true
> endscript
> }

The reload section is probably the most important part yours is missing.
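
Why a reload (or signal) matters can be shown without Slurm at all. The sketch below is only an illustration under stated assumptions: the function name `simulate_rotation` and the temporary `daemon.log` path are hypothetical, and file descriptor 3 stands in for a daemon that keeps its log open. It shows that after a rename plus `create`, writes still land in the rotated file until the log is reopened by name.

```shell
#!/usr/bin/env bash
# Simulate a daemon that keeps its log open while logrotate renames the file.
# All paths are temporary and hypothetical; no Slurm component is involved.
simulate_rotation() {
    local dir
    dir=$(mktemp -d)

    exec 3>>"$dir/daemon.log"                  # the "daemon" opens its log once
    echo "before rotation" >&3

    mv "$dir/daemon.log" "$dir/daemon.log.1"   # logrotate renames the file ...
    : > "$dir/daemon.log"                      # ... and "create" makes a new, empty one

    echo "after rotation" >&3                  # still written to the old inode
    echo "new file bytes before reopen: $(wc -c < "$dir/daemon.log" | tr -d ' ')"

    exec 3>&-                                  # a reload or signal makes the daemon
    exec 3>>"$dir/daemon.log"                  # close and reopen its log by name
    echo "after reopen" >&3
    exec 3>&-

    echo "rotated file lines: $(wc -l < "$dir/daemon.log.1" | tr -d ' ')"
    echo "new file lines after reopen: $(wc -l < "$dir/daemon.log" | tr -d ' ')"
    rm -rf "$dir"
}

simulate_rotation
```

The new file stays empty until the reopen step, which is the role the postrotate script plays in a logrotate configuration.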

Steffen Grunewald

Sep 8, 2020, 7:00:47 AM
to Slurm User Community List
Is that a typo? The daemon is "slurmctld" not "slurmdctl", so you have
rotated (and created) the wrong file, which will not be written to...

Cheers,
Steffen

--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~

Gestió Servidors

Sep 8, 2020, 9:03:11 AM
to slurm...@lists.schedmd.com

Hello,

 

> My slurm logrotate file looks like this:
>
> /var/log/slurm/*.log {
>     weekly
>     compress
>     missingok
>     nocopytruncate
>     nocreate
>     nodelaycompress
>     nomail
>     notifempty
>     noolddir
>     rotate 5
>     sharedscripts
>     size=5M
>     create 640 slurm slurm
>     postrotate
>         systemctl reload slurmd > /dev/null 2>&1 || true
>         systemctl reload slurmdbd > /dev/null 2>&1 || true
>         systemctl reload slurmctld > /dev/null 2>&1 || true
>     endscript
> }
>
> The reload section is probably the most important part yours is missing.

 

Mmm, maybe. I’m going to add a “postrotate” section. Thanks!

> Is that a typo? The daemon is "slurmctld" not "slurmdctl", so you have
> rotated (and created) the wrong file, which will not be written to...

The daemon is “slurmctld”, but I don’t know why I wrote “slurmdctl.conf” as the log file... The daemon name is one thing and the file it writes to is another, so I suppose this is not the problem. Thanks!

Brian Andrus

Sep 8, 2020, 12:59:01 PM
to slurm...@lists.schedmd.com

This seems to imply you had some changes in your slurm.conf

I'm presuming you are running CentOS 7 or similar.

Do you see anything when you run 'journalctl -u slurmctld'?

I'm wondering if you were only logging to the journal and then added the bits to also/instead log to a separate file.

I do both. I do high debug to the journal and info to the log file.

Brian Andrus
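
One way to check where slurmctld is configured to log is to read the running configuration. The following is a hedged sketch, not an official procedure: the helper name `parse_slurmctld_logfile` is hypothetical, and it assumes `scontrol show config` prints its settings in the usual "Key = Value" layout.

```shell
#!/usr/bin/env bash
# Pull the SlurmctldLogFile value out of `scontrol show config` output (stdin).
# Assumes the usual "Key = Value" layout of that output.
parse_slurmctld_logfile() {
    awk -F' *= *' '$1 == "SlurmctldLogFile" { print $2 }'
}

# Only query the controller when the Slurm client tools are installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show config | parse_slurmctld_logfile
fi
```

If a path is printed, it is the file that the logrotate configuration has to name. According to the slurm.conf documentation, slurmctld logs via syslog when no log file is configured, and in that case 'journalctl -u slurmctld' is the place to look.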

Gestió Servidors

Sep 9, 2020, 3:42:03 AM
to slurm...@lists.schedmd.com

Hello,

 

> This seems to imply you had some changes in your slurm.conf
>
> I'm presuming you are running Centos 7 or such.
>
> Do you see anything when you do 'journalctl -u slurmctld'
>
> I'm wondering if you were only logging to the journal and then added the
> bits to also/instead log to a separate file.
>
> I do both. I do high debug to the journal and info to the log file.
>
> Brian Andrus

 

Yes, I’m running CentOS 7.7.1908.

If I run “journalctl -u slurmctld” I see some log entries from 3 months ago.

I don’t know how to answer your question “I'm wondering if you were only logging to the journal and then added the bits to also/instead log to a separate file”. How could I find out?

 

 

Thanks.

Andrew Elwell

Sep 9, 2020, 6:00:48 AM
to Slurm User Community List
As an aside, I've seen on one of the talk slides that using systemctl
reload is a Bad Thing to do with logrotation for slurm - Simply send
SIGUSR2 (or HUP for pre-17.11 versions apparently)
https://bugs.schedmd.com/show_bug.cgi?id=4393

Andrew
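
A signal-based variant of the configuration posted earlier in this thread might look like the sketch below. It is an assumption-laden example, not a tested recipe: the /var/log/slurm/*.log path and the slurm owner and group are taken from that earlier post and must match the log files your daemons actually write to, and the `pkill` lines replace the `systemctl reload` calls. A later message in this thread reports that SIGUSR2 does not always behave as expected (bug 9264), so check the behaviour on your Slurm version first.

```
# Sketch only: path, owner and group are assumptions and must match your site.
/var/log/slurm/*.log {
    weekly
    compress
    missingok
    notifempty
    rotate 5
    sharedscripts
    create 640 slurm slurm
    postrotate
        # Ask each daemon to reopen its log file instead of reloading it.
        pkill -x --signal SIGUSR2 slurmctld
        pkill -x --signal SIGUSR2 slurmd
        pkill -x --signal SIGUSR2 slurmdbd
        exit 0
    endscript
}
```

The trailing `exit 0` keeps logrotate from reporting an error on hosts where one of the daemons is not running, since `pkill` returns non-zero when nothing matches.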

Ole Holm Nielsen

Sep 9, 2020, 7:26:40 AM
to slurm...@lists.schedmd.com
See also the last section "LOGGING" in
https://slurm.schedmd.com/slurm.conf.html

/Ole

Ole Holm Nielsen

Sep 9, 2020, 7:55:25 AM
to slurm...@lists.schedmd.com
Apparently the SIGUSR2 isn't working as expected, see
https://bugs.schedmd.com/show_bug.cgi?id=9264

/Ole
