(Alertmanager) Ignore instance label to prevent same alert multiple times


Robin Ong

Feb 26, 2018, 3:45:29 AM
to Prometheus Users
Hi all,

I have a question regarding filtering messages to prevent getting multiple alerts on the same error.

Case:
I have a RabbitMQ cluster with 3 nodes, each running on a different server. The cluster duplicates the messages on a queue to all instances.
So when an error hits an error queue I receive 3 alerts, one per instance. The only label that differs between the alerts is the instance value.

I would like Alertmanager to look only at the vhost and queue name; that way I should only get 1 alert, right?


Do you guys know a solution?

Thanks in advance.


Brian Brazil

Feb 26, 2018, 3:49:11 AM
to Robin Ong, Prometheus Users
It sounds like you want to remove instance from the group_by in your alertmanager.yml.
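
For example (a minimal sketch; the receiver name is a placeholder):

route:
  receiver: 'msteams'
  # Group only on the labels that identify the problem; leaving
  # `instance` out collapses the three per-node alerts into one.
  group_by: [alertname, vhost, queue]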


Robin Ong

Feb 26, 2018, 4:45:07 AM
to Prometheus Users
I did, with the route below in the Alertmanager config:

route:
  receiver: 'msteams'
  repeat_interval: 15m
  group_interval: 5m
  group_wait: 1m
  routes:
  - receiver: 'msteams-rabbitmq'
    group_by: [vhost, queue]
    group_wait: 30s
    match:
      service: rabbitmq
But it still gives me 3 alerts.
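
One thing that may be worth checking (assuming amtool is available; the label values below are made up): whether the alerts actually carry service: rabbitmq, since anything without it falls through to the top-level route and never sees the child group_by. The routing can be tested offline with:

amtool config routes test --config.file=alertmanager.yml \
    service=rabbitmq vhost=/ queue=error

This prints the receiver the alert would be routed to.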

tommy....@gmail.com

May 9, 2019, 11:29:58 AM
to Prometheus Users
Did you ever find a solution for this? I'm looking at the exact same issue.

// Tommy

pin...@hioscar.com

Jun 4, 2020, 1:48:27 PM
to Prometheus Users
+1

We get the same alert multiple times in the same email, because the monitor label (the Prometheus instance) is different in our simple replicated setup. It would be nice to be able to ignore certain labels so that alert bodies are higher signal.

Murali Krishna Kanagala

Jun 4, 2020, 3:03:54 PM
to pin...@hioscar.com, Prometheus Users
I guess your scrape config has some static labels like environment: prod, region: central, etc. You could add another static label with the cluster name, then do a label replacement in your alert query to replace the instance value with your cluster name. That way you have 1 instance name for the entire cluster. If you want to add your vhost to it, try label_join, which can append your vhost name to the instance label.
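
A sketch of what that could look like in PromQL (the metric name rabbitmq_queue_messages_ready and the static cluster label are assumptions):

# Overwrite `instance` with the value of the static `cluster` label,
# so all three nodes produce the same label set:
label_replace(
  rabbitmq_queue_messages_ready > 0,
  "instance", "$1", "cluster", "(.*)"
)

# Or combine cluster and vhost into `instance` with label_join:
label_join(
  rabbitmq_queue_messages_ready > 0,
  "instance", "-", "cluster", "vhost"
)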



Christian Hoffmann

Jun 4, 2020, 4:07:32 PM
to pin...@hioscar.com, Prometheus Users
Hi,

On 6/4/20 7:48 PM, 'pin...@hioscar.com' via Prometheus Users wrote:
> We get the same alert multiple times in the same email, because the
> monitor label (the Prometheus instance) is different in our simple
> replicated setup. It would be nice to be able to ignore certain
> labels so that alert bodies are higher signal.

This sounds like it could be done on the Prometheus side using existing
(standard) features.
Any reason why just aggregating away the unwanted label would not work?

E.g.

avg without(instance) (some_metric)

(Depending on the value, other aggregation functions such as sum, min or
max might make more sense)
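
As a full rule, this could look something like the following (metric name, threshold, and queue regex are hypothetical):

groups:
- name: rabbitmq
  rules:
  - alert: ErrorQueueNotEmpty
    # Aggregate away `instance` so the alert fires once per
    # vhost/queue rather than once per cluster node:
    expr: max without(instance) (rabbitmq_queue_messages_ready{queue=~".*error.*"}) > 0
    for: 5m
    labels:
      service: rabbitmq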

Kind regards,
Christian

pin...@hioscar.com

Jun 4, 2020, 11:40:01 PM
to Prometheus Users
Thanks for your replies, guys. We have two replicated Prometheus instances scraping the same metrics and sending the same alerts in parallel to Alertmanager. We add a label to alerts indicating which Prometheus instance the alert was fired from, so that if one Prometheus instance goes bad we can silence alerts from that instance. The pain point is that the (grouped) alert email body is bloated with duplicated alert texts from both instances with only one label being different.

We would like to keep the label so that we can silence alerts at the Prometheus instance level, so label replacement or aggregating labels in Prometheus doesn't seem like the right way for us. I think it would work for us if Alertmanager could be configured to ignore or collapse certain labels in email texts, like:

From prometheus instance 1:
label_A = value1

From prometheus instance 2:
label_A = value2

In alert email:
label_A = value1, value2

Regards,

Ping

Matthias Rampke

Jun 6, 2020, 2:47:28 AM
to pin...@hioscar.com, Prometheus Users
Unfortunately you cannot have it both ways: either Alertmanager knows about separate alert instances that can be silenced separately, or it doesn't.

I would try to eliminate the need to silence by Prometheus instance, for example by making the alert expressions resistant to gaps in the data.
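
For example (a sketch; the metric and window are placeholders), alert on a value observed over a range instead of the instantaneous sample, so a short scrape gap on one replica doesn't resolve and re-fire the alert:

max_over_time(rabbitmq_queue_messages_ready[10m]) > 0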

/MR
