Getting "permission denied" error on "Running Maintenance failed" on Alertmanager in journalctl


Sutirtha Das

Apr 17, 2019, 12:34:07 AM
to Prometheus Users
Looking at the journalctl logs, I see messages like the following every 15 minutes. Alertmanager is running smoothly nonetheless, but I was worried this might point to a bigger underlying problem. How do I fix this?


Apr 17 13:08:30 alertmanager.dev.mine.com alertmanager[8654]: level=error ts=2019-04-17T04:08:30.035693317Z caller=nflog.go:363 component=nflog msg="Running maintenance failed" err="open /path/to/alertmanager/nflog.46dfe3ee664d56ad: permission denied"
Apr 17 13:08:30 alertmanager.dev.mine.com alertmanager[8654]: level=info ts=2019-04-17T04:08:30.036944666Z caller=silence.go:291 component=silences msg="Running maintenance failed" err="open /path/to/alertmanager/silences.1cd6f8bf90d29f49: permission denied"

Simon Pasquier

Apr 17, 2019, 9:43:50 AM
to Sutirtha Das, Prometheus Users
The log messages tell you that the AlertManager process can't write to
the directory.
The problem is that all silences and notification logs would be lost
if you restart AlertManager. If you're running a cluster of
AlertManager instances, it is less of an issue because after a restart,
AlertManager would receive the missing data from its peers. But I would
still recommend that you fix the permission issue.
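
A minimal sketch of why write access to the existing files alone may not be enough. The random-looking suffix in the error path (nflog.46dfe3ee664d56ad) suggests the snapshot is first written to a new temporary file next to the real one and then renamed over it. That reading is an assumption based on the log line, and snapshot_replace below is a hypothetical helper, not Alertmanager code.

```shell
#!/bin/sh
# Sketch of a write-to-temp-then-rename snapshot (hypothetical helper).
# Both steps act on the directory, so the directory must be writable
# by the user running the process, not just the existing files.
snapshot_replace() {
    target=$1           # final file, e.g. /path/to/alertmanager/nflog
    content=$2          # new snapshot content
    tmp="$target.$$"    # temp file created in the same directory

    # Creating the temp file needs write permission on the directory;
    # this is the step that would fail with "permission denied".
    printf '%s\n' "$content" > "$tmp" || return 1

    # Renaming over the old file also modifies the directory.
    mv -f "$tmp" "$target"
}
```

With a root-owned directory and a process running as another user, the first step fails even when the old nflog and silences files are owned by that user.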
> --
> You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
> To post to this group, send email to promethe...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/8e7cec90-5464-4671-b40d-acb7a60f4fec%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Sutirtha Das

Apr 17, 2019, 9:47:11 PM
to Prometheus Users
@Simon Thanks for the explanation. Looking at the source (nflog.go and silences.go) suggested this was the case, but I wasn't sure.
Currently the ownership is set to nobody:nobody. Should it be different?

$ ls -l /path/to/alertmanager/
drwxr-xr-x 2 nobody nobody     6 Jul 13  2018 data
-rw-r--r-- 1 nobody nobody   637 Mar 15 12:55 nflog
-rw-r--r-- 1 nobody nobody 27831 Mar 15 12:55 silences

I'm running a cluster, so data won't be lost when a single instance is restarted, but it is still a risk: if someone mistakenly deploys to all instances concurrently, the data is gone.


Simon Pasquier

Apr 18, 2019, 6:30:17 AM
to Sutirtha Das, Prometheus Users
On Thu, Apr 18, 2019 at 3:47 AM Sutirtha Das <sutirt...@gmail.com> wrote:
>
> @Simon Thanks for the explanation. Looking at the source (nflog.go and silences.go) suggested this was the case, but I wasn't sure.
> Currently the ownership is set to nobody:nobody. Should it be different?

It depends on which user/group AlertManager is running as.

Sutirtha Das

Apr 18, 2019, 6:43:39 AM
to Simon Pasquier, Prometheus Users
The Alertmanager process is running as nobody.
The alertmanager binary itself is owned by root:root, though.

Sutirtha Das

Apr 22, 2019, 11:11:19 PM
to Prometheus Users

The message has persisted until now. I'm still not sure what permissions I need on Alertmanager and/or the files/directory.

$ ls -l /path/to/alertmanager/
drwxr-xr-x 2 nobody nobody     6 Jul 13  2018 data
-rw-r--r-- 1 nobody nobody   637 Mar 15 12:55 nflog
-rw-r--r-- 1 nobody nobody 27831 Mar 15 12:55 silences

$ ps aux | grep alertmanager
nobody   xxxxx  0.2  0.7 122100 57708 ?        Ssl  Mar15 150:37 /usr/bin/alertmanager --config.file=/path/to/alertmanager/config/alertmanager.yml --storage.path=/path/to/alertmanager/storage/alertmanager --cluster.peer=alertmanager.other.peer.instance.co.jp:9094

$ ls -l /usr/bin/alertmanager
-rwxr-xr-x 1 root root 19246224 Feb 19 08:35 /usr/bin/alertmanager

Sutirtha Das

May 30, 2019, 4:12:47 AM
to Prometheus Users
@Simon Pasquier

Sorry to re-tag you, but this problem still persists and may become a problem for us in the future. Is this a common occurrence or the result of some mistake in the settings?

Simon Pasquier

Jun 3, 2019, 11:29:12 AM
to Sutirtha Das, Prometheus Users
Are you sure that the user "nobody" can write to the
/path/to/alertmanager/ directory?
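
One way to answer that question is to test the directory itself rather than the files in it. The sketch below is only an illustration: check_dir_writable is a hypothetical helper, and the sudo invocation in the comments assumes sudo is available and the service user is nobody, as in this thread.

```shell
#!/bin/sh
# check_dir_writable (hypothetical helper): succeeds only if the current
# user can create files in the directory, which requires both write and
# search (execute) permission on the directory itself.
check_dir_writable() {
    dir=$1
    [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]
}

# Run the same test as the service user (assumes sudo is installed):
#   sudo -u nobody sh -c '[ -w /path/to/alertmanager/ ] && echo writable || echo "not writable"'
#
# Show owner, group and mode of the directory itself, not its contents:
#   ls -ld /path/to/alertmanager/
```

Note that plain `ls -l /path/to/alertmanager/` lists the entries inside the directory, so it does not show who owns the directory itself; `ls -ld` does.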

Sutirtha Das

Jun 4, 2019, 4:05:13 AM
to Prometheus Users
Yes, thank you, that solved it. The parent directory of [data, nflog, silences] had its owner and group set to root. I changed the owner and group of /path/to/alertmanager/ to nobody, and now it seems to be working fine.

Thank you for the help :)
As a follow-up, I also changed the ownership of the Alertmanager service conf file, the alertmanager.yaml file, and all template files to nobody (basically all files generated by the Chef recipe). It still seems to work fine.
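
For later readers, the fix described above could look roughly like the following. This is a sketch under the thread's assumptions (service user nobody, redacted path /path/to/alertmanager/); dir_owner is a hypothetical helper for verifying the result.

```shell
#!/bin/sh
# dir_owner (hypothetical helper): prints "owner:group" of the directory
# itself (ls -ld), not of the files inside it.
dir_owner() {
    ls -ld "$1" | awk '{ print $3 ":" $4 }'
}

# The change from this thread as a command (needs root; adjust user and path):
#   sudo chown nobody:nobody /path/to/alertmanager/
#
# Afterwards this should print nobody:nobody:
#   dir_owner /path/to/alertmanager/
```

Once the directory is owned by the user Alertmanager runs as, the "Running maintenance failed" messages should stop appearing at the next maintenance interval.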