I have 3 levels of alerts -- info, warning and critical -- and have them set up so that warning inhibits info, and critical inhibits both warning and info:
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match_re:
    severity: '^(warning|info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']
- source_match:
    severity: 'warning'
  target_match_re:
    severity: '^(info|)$'
  # Apply inhibition if the alertname, job and environment are the same.
  equal: ['alertname', 'job', 'env']
But I have just realized there may be an issue with the way inhibition works. Assume the following 3 alerts are firing:
ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="warning"}
ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server2",severity="warning"}
ALERT{alertname="LowDiskSpace",env="test",job="myjob",instance="server1",severity="critical"}
Given my inhibit configuration, both severity="warning" alerts will be inhibited by the single severity="critical" alert, because instance is not among the equal labels and so server1 and server2 are not told apart.
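To make the effect concrete, here is a rough Python sketch of how I understand the matching to work for the first rule. It is a simplified model and not Alertmanager's actual code: the helper names (inhibits, suppressed) are made up for illustration, and I am assuming that a missing label is treated like an empty value and that an alert does not inhibit itself.

```python
import re

# The three example alerts from above, as plain label dicts.
ALERTS = [
    {"alertname": "LowDiskSpace", "env": "test", "job": "myjob",
     "instance": "server1", "severity": "warning"},
    {"alertname": "LowDiskSpace", "env": "test", "job": "myjob",
     "instance": "server2", "severity": "warning"},
    {"alertname": "LowDiskSpace", "env": "test", "job": "myjob",
     "instance": "server1", "severity": "critical"},
]

# First rule from the config above: critical inhibits warning and info.
RULE = {
    "source_match": {"severity": "critical"},
    "target_match_re": {"severity": r"^(warning|info|)$"},
    "equal": ["alertname", "job", "env"],
}


def inhibits(source, target, rule):
    """Simplified model: does `source` inhibit `target` under `rule`?"""
    if source is target:
        return False  # assumption: an alert never inhibits itself
    # The source must match every source_match label exactly.
    if any(source.get(k, "") != v for k, v in rule["source_match"].items()):
        return False
    # The target must match every target_match_re pattern in full.
    if any(re.fullmatch(p, target.get(k, "")) is None
           for k, p in rule["target_match_re"].items()):
        return False
    # Only the labels listed in `equal` are compared; `instance` is not one.
    return all(source.get(l, "") == target.get(l, "") for l in rule["equal"])


def suppressed(alerts, rule):
    """Return the alerts that at least one other firing alert inhibits."""
    return [t for t in alerts if any(inhibits(s, t, rule) for s in alerts)]


for alert in suppressed(ALERTS, RULE):
    print(alert["instance"], alert["severity"])
# prints:
# server1 warning
# server2 warning
```

The server2 warning is dropped even though nothing critical is firing for server2.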
What if I actually want to be alerted that both server1 and server2 are running low on disk space, without having to add instance to the list of equal labels? I might also end up adding a disk label to this alert, or a networkInterface label to another, and so on, and I don't want to have to remember to add each of those.
So I guess the question is: is there a way to default to matching all alert labels *except* severity, or some other specific set of labels? And if not, should there be one? For example, a rule could specify either inhibit_rules.equal, keeping the current behavior, or, say, inhibit_rules.ignore, which would match on all other labels (but obviously not both).
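As a sketch of what I mean by ignore, the comparison could look roughly like this in Python. This is purely hypothetical: inhibit_rules.ignore is only the idea proposed here, and the function name equal_except is made up. I am again assuming a missing label counts as an empty value.

```python
def equal_except(source, target, ignore):
    """Hypothetical `ignore` semantics: every label must match except those ignored."""
    # Compare the union of label names so a label present on only one
    # side (treated as empty on the other) still counts as a difference.
    keys = (set(source) | set(target)) - set(ignore)
    return all(source.get(k, "") == target.get(k, "") for k in keys)


critical_server1 = {"alertname": "LowDiskSpace", "env": "test", "job": "myjob",
                    "instance": "server1", "severity": "critical"}
warning_server1 = dict(critical_server1, severity="warning")
warning_server2 = dict(warning_server1, instance="server2")

# Same instance: only severity differs, and severity is ignored.
print(equal_except(critical_server1, warning_server1, ignore=["severity"]))  # True
# Different instance: labels differ, so no inhibition would apply.
print(equal_except(critical_server1, warning_server2, ignore=["severity"]))  # False
```

With this, a disk or networkInterface label added later would be compared automatically, without touching the rule.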
Cheers,
Alin.