Clarification on Alertmanager High Availability Setup


Juan Bran

Apr 6, 2020, 7:27:25 PM
to Prometheus Users
The Alertmanager High Availability documentation states that one should configure Prometheus to send to multiple Alertmanagers. I take this to mean that when a single HA cluster has multiple Alertmanager instances behind a load balancer, you should point Prometheus at the instances themselves and never at the load balancer service IP. There is disagreement within my group as to whether that is the correct reading of the documentation, or whether it only applies to multiple HA clusters.

If we have a load balanced HA cluster do we need to enumerate each cluster member in the Prometheus configuration?
What happens if we don't? I'm guessing we can end up in a situation where an alert is sent to the LB service address, which may then forward it to a cluster node that is in the process of failing, so the alert could be missed.
What would be the use case to have more than one non-clustered alertmanager or HA alertmanager cluster?

Thanks!
-Juan

Matthias Rampke

Apr 7, 2020, 12:13:44 PM
to Juan Bran, Prometheus Users
Your reading is correct. The idea is to keep alerting even under difficult networking conditions. For example, one of your Alertmanagers might not be able to reach the internet, yet still be reachable from your load balancer, due to routing shenanigans or a firewall configuration issue.
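For illustration, a Prometheus configuration that targets each cluster member directly, rather than a load balancer address, might look like the fragment below. This is a minimal sketch: the three hostnames are hypothetical placeholders, and 9093 is assumed as the Alertmanager listen port.

```yaml
# prometheus.yml (fragment)
# Hostnames are hypothetical; list every Alertmanager instance, not the LB address.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager-1.example.com:9093
            - alertmanager-2.example.com:9093
            - alertmanager-3.example.com:9093
```

With a list like this, Prometheus sends each alert to every listed instance, so one unreachable or misbehaving instance does not prevent the others from receiving it.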

The basic procedure is:

- Prometheus sends a notification to each Alertmanager instance separately
- optionally, the Alertmanager instances use clustering to deduplicate notifications
- each Alertmanager instance sends the notification unless it knows for sure that it has already been sent

This way, we ensure that even if clustering falls apart, or only some of your Alertmanager instances can actually alert, you are still getting notifications (possibly more than one, but that's better than none).
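The procedure above can be sketched as a toy simulation. This is not how Alertmanager is implemented internally; the `Alertmanager` class, `build_cluster`, and `fan_out` are hypothetical names, and "clustering" is reduced to instantly telling reachable peers that a notification went out. It only shows why fanning out to every instance gives at least one notification in each failure case described.

```python
from dataclasses import dataclass, field


@dataclass
class Alertmanager:
    """Toy model of one instance (hypothetical, not the real implementation)."""
    name: str
    can_notify: bool = True                        # can it reach the receiver?
    known_sent: set = field(default_factory=set)   # alerts it knows were already sent
    peers: list = field(default_factory=list)      # peers reachable via clustering

    def receive(self, alert, outbox):
        # Send unless this instance knows for sure the alert was already sent.
        if alert in self.known_sent or not self.can_notify:
            return
        outbox.append((self.name, alert))
        self.known_sent.add(alert)
        for peer in self.peers:                    # share the fact with reachable peers
            peer.known_sent.add(alert)


def build_cluster(names, clustered=True, cannot_notify=()):
    """Create instances; with clustered=False no instance can see its peers."""
    instances = [Alertmanager(n, can_notify=n not in cannot_notify) for n in names]
    if clustered:
        for am in instances:
            am.peers = [p for p in instances if p is not am]
    return instances


def fan_out(alert, instances):
    """Prometheus side: deliver the alert to every instance separately."""
    outbox = []
    for am in instances:
        am.receive(alert, outbox)
    return outbox


names = ["am-1", "am-2", "am-3"]
print(fan_out("DiskFull", build_cluster(names)))                          # healthy cluster
print(fan_out("DiskFull", build_cluster(names, clustered=False)))         # clustering broken
print(fan_out("DiskFull", build_cluster(names, cannot_notify={"am-1"})))  # am-1 cannot alert
```

In this sketch the healthy cluster produces one notification, the cluster with broken clustering produces three (duplicates, but nothing is lost), and when am-1 cannot alert, am-2 sends the notification instead.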

/MR
