> I am using alertmanager to post alerts on slack. Here is the configuration
> of my alert:
>
> expr: <a query that takes 5 seconds>
> for: 60m
>
> Here are the settings on my alertmanager:
>
> global:
>   resolve_timeout: 5m
> route:
>   group_by: ['alertname', 'cluster']
>   group_interval: 5m
>   group_wait: 30s
>   receiver: "slack"
>   repeat_interval: 12h
[...]
> 2. What is the behavior for slack to send messages? I would assume that
> it would send messages on the following situations:
> 1. Alert goes into alarm
> 2. Alert goes out of alarm
> 3. num_firing on alert either increases or decreases
[...]
One of the things you are likely running into is how group_interval
works. Once an alert group is active, Alertmanager will only send
out further notifications at every group_interval after the initial
trigger, regardless of when alerts in the group resolve or further
alerts are triggered. So if your initial alert goes out at 6:43 AM, the
next notification about the alert group's state will only be sent by
Alertmanager at exactly 6:48 AM, then 6:53 AM, and so on.

If there are no state changes in the alert group at the next
notification time, Alertmanager doesn't send out a new notification.
But if there is a state change that arrives later, it is still not
immediately sent out; it has to wait until the next tick. So if you
have an alert that you get notified about at 6:43 AM and that is
resolved at 6:49 AM, you will not get another alert group notification
until 6:53 AM.
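
To make the tick arithmetic concrete, here is a rough Python sketch of
the timing described above. It is only a simplified model, not
Alertmanager's actual code; the function name next_notification and
the date used are made up for illustration.

```python
from datetime import datetime, timedelta
import math


def next_notification(first_sent, group_interval, change_at):
    """Return the first group_interval tick at or after change_at.

    Simplified model (an assumption, not Alertmanager's real logic):
    ticks fall at first_sent + n * group_interval, and a state change
    is only reported on the first tick at or after it happens.
    """
    if change_at <= first_sent:
        return first_sent
    elapsed = change_at - first_sent
    # timedelta / timedelta yields a float count of intervals.
    ticks = math.ceil(elapsed / group_interval)
    return first_sent + ticks * group_interval


first = datetime(2024, 1, 1, 6, 43)      # initial notification (date is arbitrary)
resolved = datetime(2024, 1, 1, 6, 49)   # alert resolves one minute after the 6:48 tick
print(next_notification(first, timedelta(minutes=5), resolved))
```

With group_interval at 5m the resolution is reported at 6:53; under the
same model, a 1m group_interval would report it at 6:49.
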
This may mean that you want a relatively short group_interval
setting. That can lead to a lot of notifications if a bunch of
alerts in an alert group trigger one after another, but this may be
a feature in your environment.
- cks