Alertmanager: only inhibit exactly matching alerts (minus severity)


Alin Sinpalean

Sep 28, 2017, 4:43:58 AM
to Prometheus Users
I have 3 levels of alerts -- info, warning and critical -- and have them set up such that warning inhibits info and critical inhibits both warning and info.

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match_re:
    severity: '^(warning|info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']

- source_match:
    severity: 'warning'
  target_match_re:
    severity: '^(info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']

But I just realized now there may be an issue with the way inhibition works. Assuming I have the following 3 alerts firing:

ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="warning"}
ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server2",severity="warning"}
ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="critical"}

What will happen given my inhibit configuration is that both severity="warning" alerts will be inhibited by the one severity="critical" alert.

What if I actually want to be alerted that both server1 and server2 are running low on disk space? I mean apart from adding instance to the list of equal fields. Because I might also end up adding a disk label to this alert or a networkInterface to another and so on and I don't want to have to remember to add each of those.
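For reference, the workaround mentioned above would look roughly like this for the critical rule (a sketch only; the warning rule would need the same change). With it, the critical alert on server1 should inhibit only the server1 warning, and the server2 warning would still notify:

```yaml
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match_re:
    severity: '^(warning|info|)$'
  # 'instance' added: inhibition now applies only between alerts
  # that come from the same server.
  equal: ['alertname', 'job', 'env', 'instance']
```

The drawback is the one described above: every further distinguishing label (a disk, a network interface) has to be added to this list by hand.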

So I guess the question is: is there a way to default to matching all alert labels *except* severity or some specific set of labels? And if not, should there be one? E.g. either specify inhibit_rules.equal and default to the current behavior or, say, inhibit_rules.ignore and default to matching all other labels (but obviously not both)?

Cheers,
Alin.

Brian Brazil

Sep 28, 2017, 4:54:17 AM
to Alin Sinpalean, Prometheus Users
There is no such feature, nor do I think there should be. In cases like this it's safest and simplest to allow both notifications to fire. 

Inhibition is intended for whole-datacenter-down notification suppression, not micro-management.


Alin Sinpalean

Sep 28, 2017, 5:30:26 AM
to Brian Brazil, Prometheus Users
Got it, thanks.

But (there is always a but, isn't there?) I picked up this specific usage from simple.yml and the alertmanager README.md, so maybe they should be updated with a less misleading example?

Cheers,
Alin.


Christian Platzer

Jul 15, 2019, 7:22:13 AM
to Prometheus Users
I just stumbled upon your post and had the same problem. There is an easy way to solve it:

1.) In your inhibit rule, add a label to the equal list that is not currently used, e.g.
    equal: ['alertname', 'job', 'env', 'alertgrouplabel']
2.) In your alert definition, set this label to whatever should distinguish one alert from another:
- alert: LowDiskSpace
  expr: <your_expression_here>
  for: 0s
  labels:
    severity: high
    alertgrouplabel: "{{ $labels.instance }}"

Depending on how you fill this label, you can decide if something is inhibited or not.
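Applied to the original LowDiskSpace case, a rule file along these lines might work. This is only a sketch: the group name, thresholds, and expressions are placeholders, the metric names assume node_exporter, and the severities are the warning/critical values from the inhibit rules at the top of the thread. The important part is that both severities set alertgrouplabel to the same value, so the critical alert for server1 only matches the warning for server1:

```yaml
groups:
- name: disk  # hypothetical group name
  rules:
  - alert: LowDiskSpace
    # Placeholder expression and threshold; metric names assume node_exporter.
    expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.20
    labels:
      severity: warning
      # Must be identical on the source and target alerts
      # for the inhibit rule's 'equal' check to pass.
      alertgrouplabel: "{{ $labels.instance }}"
  - alert: LowDiskSpace
    expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.10
    labels:
      severity: critical
      alertgrouplabel: "{{ $labels.instance }}"
```

Alerts that do not set alertgrouplabel at all should keep their previous behavior, since Alertmanager treats a missing label and an empty label as equal. To distinguish further, for example per disk, the template could combine several labels, such as "{{ $labels.instance }}/{{ $labels.mountpoint }}" (mountpoint being an assumption about which labels the metric carries).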