How to get "Resolved" message from Webhook Notify?


iono sphere

Sep 23, 2020, 9:02:31 AM
to Prometheus Users
I am new to Prometheus and Alertmanager.

I am using this alertmanager.yml:

global:
  resolve_timeout: 5m

route:
  #group_by: ['alertname']
  group_wait: 1s
  group_interval: 1s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://localhost:2222/alert'

My Prometheus rules.yml is:
groups:
  - name: default
    rules:
      - alert: RequestRate
        expr: rate(my_counter[1m]) > 0
        #for: 1s #this is optional.
        labels:
          severity: high
        annotations:
          summary: rate counter 1m more than 0

The relevant part in my node.js is simply: 
app.use( "/alert", (req, res) => {
    console.log("alert received")
    res.status(200).send("ok");
    return;
});

This works as expected: Alertmanager calls http://localhost:2222/alert and I see "alert received". It calls the endpoint twice, which I expected: once when the alert fires and once when the alert has been resolved. However, I cannot tell which call is for "firing" and which is for "resolved". I have looked through the req parameter but do not see anything useful, unless I am missing something...

So:

1. How can we differentiate between "the alert fires" and "the alert has been resolved" in this case?

2. What is the "resolve" time? Is the alert resolved immediately once the expr in rules.yml evaluates to false at the next evaluation_interval?

Regards,
Iono


Christian Hoffmann

Sep 23, 2020, 4:07:02 PM
to iono sphere, Prometheus Users
Hi,

On 9/23/20 3:02 PM, iono sphere wrote:
> This is working as expected. AlertManager calls
> http://localhost:2222/alert and I got "alert received". But here's the
> thing, it calls two times - as expected, but the first time is when the
> alert goes off, but the second time is when the alert has been resolved.
> I do not know which is for "the alert goes off" and "resolve". I have
> checked in the req parameter, but I do not see anything that could be of
> use, unless I am missing something...
>
> So:
>
> 1. how can we differentiate the "alert goes off" and the "alert has been
> resolved" in this case?

The webhook payload contains a "status" field, which should be set to
"resolved" in these cases:
https://prometheus.io/docs/alerting/latest/configuration/#webhook_config

I'm not exactly sure what your example is (Node.js?) or how the web
handler works. However, if "req" gives you the other JSON fields of the
payload, it should also give you the "status" field.

If it does not, you might have to look at some other part of the
request.
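
For illustration, here is a minimal sketch of reading that field in Node.js.
It assumes the payload layout described in the webhook_config documentation
linked above (a top-level "status" plus an "alerts" array whose entries each
carry their own "status" and "labels"). The names classifyAlerts and
createAlertServer are made up for this example, and it uses Node's built-in
http module rather than whatever framework the original handler is built on,
so treat it as a starting point rather than a drop-in replacement.

```javascript
const http = require("http");

// Split a parsed Alertmanager webhook payload into firing and resolved
// alert names. The field names follow the documented payload format;
// the function name itself is hypothetical.
function classifyAlerts(payload) {
  const alerts = Array.isArray(payload.alerts) ? payload.alerts : [];
  const namesWithStatus = (wanted) =>
    alerts
      .filter((alert) => alert.status === wanted)
      .map((alert) => (alert.labels || {}).alertname);
  return {
    groupStatus: payload.status, // "firing" or "resolved" for the whole group
    firing: namesWithStatus("firing"),
    resolved: namesWithStatus("resolved"),
  };
}

// Build (but do not start) a server that parses the JSON request body
// and logs the classification for every notification it receives.
function createAlertServer() {
  return http.createServer((req, res) => {
    let raw = "";
    req.on("data", (chunk) => {
      raw += chunk;
    });
    req.on("end", () => {
      try {
        console.log(classifyAlerts(JSON.parse(raw)));
        res.writeHead(200);
        res.end("ok");
      } catch (err) {
        // The body was not valid JSON.
        res.writeHead(400);
        res.end("invalid JSON");
      }
    });
  });
}

// To try it on the port from the question: createAlertServer().listen(2222);
```

The point to note is that the status lives in the JSON request body, so the
body has to be read and parsed before "status" becomes visible. A handler
that only inspects the request object without parsing the body will not see
it.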

> 2. What is the "resolve" time? Immediately if the expr in rules.yml is
> false in the next evaluation_interval?
Without having looked into the actual code: Prometheus only knows that an
alert has changed once the rule has been re-evaluated, which happens every
evaluation_interval. Only then can it notice that the alert is resolved
and notify Alertmanager, which in turn notifies your webhook after its
own aggregation steps (e.g. group_wait, group_interval).

So I would say: the notification is not immediate, but it follows
directly after the configured grace periods.

Kind regards,
Christian