Prometheus Alert handle/resolve handling


EnthuDeveloper

Mar 3, 2020, 10:54:51 AM
to Prometheus Developers
Hi,
If we have implemented a custom webhook to receive alerts from Prometheus Alertmanager, what should we consider when implementing the custom logic for resolved alerts?
My question comes from this standpoint: if repeat_interval is set to a short interval, our system may have received multiple notifications with firing status for the same alert. When Prometheus determines that the alert is resolved, will it send just one notification with resolved status?

My confusion is this: once Prometheus sees an alert transition from firing to resolved, shouldn’t our custom logic mark all existing alert instances with the matching alert name as resolved?

Note: this logic is needed on our custom framework side to avoid listing resolved alerts on the alert dashboard.


Any input would be greatly appreciated.


Thanks.

Matthias Rampke

Mar 3, 2020, 2:34:10 PM
to EnthuDeveloper, Prometheus Developers
I think it helps to think about Alertmanager webhooks differently.

Alertmanager does not notify about individual alerts but about *groups* of alerts. These groups come into being, and the number of alert instances in them may change over time. Subsequent webhooks about the same group are updates about its status, not a separate instance of anything. The resolution notification closes the group.

By design, you cannot rely on 1 webhook call = 1 alert. To ensure delivery, Alertmanager will err on the side of notifying more than needed if it is not sure what you already got. This is especially the case with clustered Alertmanager.

The webhooks contain a groupKey field. Use this field to identify which group a notification is about, and update that one if you already have it in your UI. That way, there is only one thing to close as well.
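
As a rough illustration of the groupKey approach, here is a minimal Python sketch of a receiver that keeps one record per group. It assumes the JSON body has already been parsed into a dict and relies only on the payload's groupKey, status and alerts fields. The `groups` dict and the `handle_notification` name are made up for this example, and the dict stands in for whatever storage the dashboard really uses.

```python
from typing import Dict

# Hypothetical in-memory stand-in for the dashboard's storage:
# exactly one record per Alertmanager group, keyed by groupKey.
groups: Dict[str, dict] = {}


def handle_notification(payload: dict) -> str:
    """Upsert the record for this group and report what happened to it."""
    key = payload["groupKey"]
    existed = key in groups

    if payload["status"] == "resolved":
        # The resolution notification closes the group. A repeated
        # "resolved" for a group we no longer hold is simply ignored.
        groups.pop(key, None)
        return "closed" if existed else "ignored"

    # A firing notification either opens the group or refreshes it.
    groups[key] = {
        "status": payload["status"],
        "alerts": payload.get("alerts", []),
    }
    return "updated" if existed else "created"
```

With this shape, a notification repeated because of repeat_interval, or sent twice by a clustered Alertmanager, overwrites the same record instead of adding a new one.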


/MR


EnthuDeveloper

Mar 3, 2020, 4:13:00 PM
to Prometheus Developers
Thank you Matthias.

Actually, our webhook implementation depends on the status of the alert sent, i.e. firing vs. resolved.

If we receive a notification for a given group containing some number of alerts, we simply assume they are new alert instances and end up creating new alert records in the database.

It is quite hard to determine whether we are receiving a notification with different labels/content or just an update of an existing alert group.

For example, if I have a service-down alert configured and the webhook receives the notification for service A first and for service B later, then they are two different alert instances for us.

So the main question is: each time our webhook receives alert notifications as a group, how do we ensure that we are not repeatedly creating alert records and marking them as resolved each time a notification arrives with a different status?
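
One possible way to handle this, sketched in Python under the assumption that records are stored per alert rather than per group: treat an alert's full label set as its identity, so service A and service B get separate rows, while a repeated notification for either one only updates the existing row. The `records` dict and the `alert_key` and `sync_alerts` names are hypothetical. Only the alerts, labels and status fields come from the webhook payload.

```python
from typing import Dict, Tuple

AlertKey = Tuple[Tuple[str, str], ...]

# Hypothetical in-memory table: one row per distinct label set.
records: Dict[AlertKey, dict] = {}


def alert_key(labels: dict) -> AlertKey:
    """Turn a label set into a stable, hashable key (sorted label pairs)."""
    return tuple(sorted(labels.items()))


def sync_alerts(payload: dict) -> None:
    """Create a row the first time a label set is seen, then only update it."""
    for alert in payload.get("alerts", []):
        key = alert_key(alert["labels"])
        row = records.setdefault(key, {"labels": alert["labels"]})
        # Each entry in "alerts" carries its own status: "firing" or "resolved".
        row["status"] = alert["status"]
```

The dashboard can then list only the rows whose status is still firing, however many times each notification was delivered.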

Thanks.
