Alertmanager: only inhibit exactly matching alerts (minus severity)

Alin Sinpalean

Sep 28, 2017, 4:43:58 AM

to Prometheus Users

I have 3 levels of alerts -- info, warning and critical -- and have them set up such that warning inhibits info and critical inhibits both warning and info.

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match_re:
    severity: '^(warning|info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']
- source_match:
    severity: 'warning'
  target_match_re:
    severity: '^(info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']

But I just realized now there may be an issue with the way inhibition works. Assuming I have the following 3 alerts firing:

ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="warning"}

ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server2",severity="warning"}

ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="critical"}

What will happen given my inhibit configuration is that both severity="warning" alerts will be inhibited by the one severity="critical" alert.

What if I actually want to be alerted that both server1 and server2 are running low on disk space? I mean apart from adding instance to the list of equal fields. Because I might also end up adding a disk label to this alert or a networkInterface to another and so on and I don't want to have to remember to add each of those.
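For reference, the workaround mentioned above (adding instance to the equal list) might look roughly like the following. This is only a sketch that reuses the critical rule from the configuration earlier in this post; the warning rule would need the same change, and every further distinguishing label (disk, networkInterface, and so on) would have to be appended by hand in the same way.

```yaml
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match_re:
    severity: '^(warning|info|)$'
  # 'instance' added: a critical alert now only inhibits alerts that
  # also share its instance label, e.g. server1 no longer silences server2.
  equal: ['alertname', 'job', 'env', 'instance']
```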

So I guess the question is: is there a way to default to matching all alert labels *except* severity or some specific set of labels? And if not, should there be one? E.g. either specify inhibit_rules.equal and default to the current behavior or, say, inhibit_rules.ignore and default to matching all other labels (but obviously not both)?

Cheers,

Alin.

Brian Brazil

Sep 28, 2017, 4:54:17 AM

to Alin Sinpalean, Prometheus Users

There is no such feature, nor do I think there should be. In cases like this it's safest and simplest to allow both notifications to fire.

Inhibition is intended for whole-datacenter-down notification suppression, not micro-management.

--

Brian Brazil

www.robustperception.io

Alin Sinpalean

Sep 28, 2017, 5:30:26 AM

to Brian Brazil, Prometheus Users

Got it, thanks.

But (there is always a but, isn't there?) I picked up this specific usage from simple.yml and the Alertmanager README.md, so maybe they should be updated with a less misleading example?

Cheers,

Alin.

Christian Platzer

Jul 15, 2019, 7:22:13 AM

to Prometheus Users

I just stumbled upon your post and had the same problem. There is an easy way to solve it:

1.) In your inhibit rule, add a label name to the equal list that is not currently used by any of your alerts, e.g.

equal: ['alertname', 'job', 'env', 'alertgrouplabel']

2.) In your alert definition, set this label to whatever should distinguish one alert from another:

- alert: LowDiskSpace
    expr: <your_expression_here>
    for: 0s
    labels:
      severity: high
      alertgrouplabel: "{{ $labels.instance }}"

Depending on how you fill this label, you can decide if something is inhibited or not.
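One detail worth spelling out: Alertmanager compares the equal labels on both the inhibiting (source) and the inhibited (target) alert, and a label that is missing or empty on both sides counts as equal. So the extra label only has an effect if the alerts at each severity level carry it. A minimal sketch of a warning/critical pair, assuming the severity values from the original post; the group name, the node_exporter metrics and the thresholds are illustrative assumptions, not taken from this thread:

```yaml
groups:
- name: disk  # hypothetical group name
  rules:
  - alert: LowDiskSpace
    # Hypothetical expression and threshold; substitute your own.
    expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.20
    labels:
      severity: warning
      alertgrouplabel: "{{ $labels.instance }}"
  - alert: LowDiskSpace
    # Same alert name at a stricter (again hypothetical) threshold.
    expr: node_filesystem_avail_bytes / node_filesystem_size_bytes < 0.05
    labels:
      severity: critical
      alertgrouplabel: "{{ $labels.instance }}"
```

With both rules templating alertgrouplabel from the same source labels, a critical alert for server1 should only inhibit the warning for server1, while the warning for server2 still notifies.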
