Alertmanager doesn't send "endsAt" parameter in json

nemes...@gmail.com

May 29, 2018, 9:43:57 AM
to Prometheus Users
Hi everyone,

My configuration is:
prometheus : 2.2.1
alertmanager: 0.15.0-rc.1
telegrambot (via webhook in alertmanager) : https://github.com/inCaller/prometheus_bot

I have run into a problem: my Alertmanager doesn't send the "endsAt" parameter in the JSON it posts to the webhook.
Steps:
1. I trigger an alert (for example, high CPU load on somenodename).
2. Prometheus sends the alert to Alertmanager.
3. Alertmanager sends the following JSON to the Telegram bot via the webhook:
{
   "alerts":[
      {
         "annotations":{
            "resolve_message":"CPU usage \u003c 80% during 5 minutes",
            "summary":"CPU usage is too high (crnt: 100%; condition: 80%; during 5 minutes)"
         },
         "sendsAt":"",
         "generatorURL":"11111",
         "labels":{
            "alertname":"HighCPU",
            "dashboard":"1111111",
            "environment":"11111",
            "monitor":"11111111",
            "nodename":"somenodename",
            "project":"1111111",
            "severity":"warning"
         },
         "startsAt":"2018-05-29T13:09:45.392819306Z"
      }
   ],
   "commonAnnotations":{
      "resolve_message":"CPU usage \u003c 80% during 5 minutes",
      "summary":"CPU usage is too high (crnt: 100%; condition: 80%; during 5 minutes)"
   },
   "commonLabels":{
      "alertname":"HighCPU",
      "dashboard":"111111",
      "environment":"11111",
      "monitor":"1111111",
      "nodename":"somenodename",
      "project":"1111111111",
      "severity":"warning"
   },
   "externalURL":"http://alertmanager:9093",
   "groupKey":0,
   "groupLabels":{
      "alertname":"HighCPU",
      "dashboard":"1111111",
      "nodename":"somenodename",
      "project":"1111111"
   },
   "receiver":"telegram",
   "status":"firing",
   "version":0
}

4. I stop the CPU load on somenodename.
5. Prometheus sends the resolve.
6. Alertmanager posts the following JSON (without "endsAt"):
{
   "alerts":[
      {
         "annotations":{
            "resolve_message":"CPU usage \u003c 80% during 5 minutes",
            "summary":"CPU usage is too high (crnt: 100%; condition: 80%; during 5 minutes)"
         },
         "sendsAt":"",
         "generatorURL":"11111",
         "labels":{
            "alertname":"HighCPU",
            "dashboard":"11111",
            "environment":"11111",
            "monitor":"111111",
            "nodename":"somenodename",
            "project":"111111",
            "severity":"warning"
         },
         "startsAt":"2018-05-29T13:09:45.392819306Z"
      }
   ],
   "commonAnnotations":{
      "resolve_message":"CPU usage \u003c 80% during 5 minutes",
      "summary":"CPU usage is too high (crnt: 100%; condition: 80%; during 5 minutes)"
   },
   "commonLabels":{
      "alertname":"HighCPU",
      "dashboard":"111111",
      "environment":"11111",
      "monitor":"11111",
      "nodename":"somenodename",
      "project":"11111111",
      "severity":"warning"
   },
   "externalURL":"http://alertmanager:9093",
   "groupKey":0,
   "groupLabels":{
      "alertname":"HighCPU",
      "dashboard":"111111",
      "nodename":"somenodename",
      "project":"111111"
   },
   "receiver":"telegram",
   "status":"resolved",
   "version":0
}

The main idea: I want to compare "endsAt" and "startsAt" and include the alert's active time in the resolve message, for example "alert was firing for 7 minutes".
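For illustration, here is a minimal Python sketch of that calculation, assuming the receiver gets a resolved alert with both "startsAt" and "endsAt" populated as RFC 3339 strings. The helper names `parse_rfc3339` and `describe_active_time` and the "endsAt" value are made up for this example; only the "startsAt" value comes from the payload above. Fractional seconds are truncated to microseconds because Python's `datetime` does not store nanoseconds.

```python
import re
from datetime import datetime, timedelta, timezone

# RFC 3339 timestamp: date, time, optional fraction, then "Z" or a numeric offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp, truncating the fraction to microseconds."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, zone = match.groups()
    microseconds = int((fraction or "0")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    parsed = parsed.replace(microsecond=microseconds)
    if zone == "Z":
        tz = timezone.utc
    else:
        sign = 1 if zone[0] == "+" else -1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
        tz = timezone(sign * offset)
    return parsed.replace(tzinfo=tz)


def describe_active_time(alert):
    """Build the resolve-message text from one alert's startsAt and endsAt."""
    started = parse_rfc3339(alert["startsAt"])
    ended = parse_rfc3339(alert["endsAt"])
    minutes = int((ended - started).total_seconds() // 60)
    return f"alert was active for {minutes} minutes"


# "startsAt" is the value from this thread; "endsAt" is a hypothetical value.
example_alert = {
    "startsAt": "2018-05-29T13:09:45.392819306Z",
    "endsAt": "2018-05-29T13:16:45.392819306Z",
}
print(describe_active_time(example_alert))  # alert was active for 7 minutes
```

This only makes sense for notifications with "status":"resolved"; a receiver would have to skip or special-case alerts that are still firing.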



I have read the Alertmanager documentation several times and found the section https://prometheus.io/docs/alerting/configuration/#webhook_config , which says that Alertmanager sends alerts in the following JSON format:
  "alerts": [
    {
      "labels": <object>,
      "annotations": <object>,
      "startsAt": "<rfc3339>",
      "endsAt": "<rfc3339>"
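A quick way to compare a captured payload against this excerpt is to list which of the documented per-alert keys are absent. The sketch below is only an illustration: `missing_alert_keys` is a made-up helper, the key list is copied from the four keys in the excerpt, and the sample payload is a trimmed copy of the resolved notification pasted earlier.

```python
# Per-alert keys as listed in the documentation excerpt quoted in this post.
DOCUMENTED_ALERT_KEYS = ("labels", "annotations", "startsAt", "endsAt")


def missing_alert_keys(payload):
    """Return, for each alert in the payload, the documented keys it lacks."""
    return [
        [key for key in DOCUMENTED_ALERT_KEYS if key not in alert]
        for alert in payload.get("alerts", [])
    ]


# Trimmed copy of the resolved notification pasted earlier in this thread.
payload = {
    "alerts": [
        {
            "annotations": {},
            "sendsAt": "",
            "generatorURL": "11111",
            "labels": {"alertname": "HighCPU"},
            "startsAt": "2018-05-29T13:09:45.392819306Z",
        }
    ],
    "status": "resolved",
}

print(missing_alert_keys(payload))  # [['endsAt']]
```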

My Alertmanager config:
# Ansible managed
global:
    resolve_timeout: 2m
receivers:
-   name: telegram
    webhook_configs:
    -   send_resolved: true

route:
    group_by:
    - alertname
    - nodename
    - project
    - instance
    - queue_name
    - replication_status
    - dashboard
    - domain
    - host
    - queue
    receiver: telegram
    repeat_interval: 8736h
    routes:
    -   continue: true
        group_by:
        - alertname
        - nodename
        - project
        - instance
        - queue_name
        - replication_status
        - dashboard
        - domain
        - host
        - queue
        match_re:
            severity: ^(critical|warning)$
        receiver: telegram
        repeat_interval: 8736h



Perhaps I have misunderstood the Alertmanager documentation.

Sorry for my English; it is not my native language.

Simon Pasquier

May 30, 2018, 8:15:18 AM
to nemes...@gmail.com, Prometheus Users
IIUC the JSON data you've pasted is from the Telegram bot. It seems that the bot doesn't properly deserialize the payload sent by Alertmanager (see [1], the key is "sendsAt" instead of "endsAt").
I suggest filing an issue in the bot repository.
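
To illustrate why a misspelled key produces an empty value rather than an error, here is a small Python sketch. It is not the bot's code: `decode_alert` is a hypothetical stand-in for a decoder that reads the end time from a configurable key, and the "endsAt" value in the sample body is invented. A decoder that looks for "sendsAt" simply finds nothing and falls back to an empty string, which matches the `"sendsAt":""` seen in the pasted JSON.

```python
import json


def decode_alert(raw, ends_at_key):
    """Decode one alert, reading the end time from the given JSON key.

    A key that is absent from the payload yields an empty string instead of
    an error, so a misspelled key fails silently.
    """
    data = json.loads(raw)
    return {
        "startsAt": data.get("startsAt", ""),
        "endsAt": data.get(ends_at_key, ""),
    }


# Hypothetical alert body as Alertmanager might send it; "endsAt" is invented.
raw = (
    '{"startsAt": "2018-05-29T13:09:45.392819306Z",'
    ' "endsAt": "2018-05-29T13:16:45.392819306Z"}'
)

broken = decode_alert(raw, "sendsAt")  # misspelled key: end time is lost
fixed = decode_alert(raw, "endsAt")    # correct key: end time is preserved

print(repr(broken["endsAt"]))  # ''
print(repr(fixed["endsAt"]))   # '2018-05-29T13:16:45.392819306Z'
```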



--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/c099fc7e-59f6-4ff7-b7ac-b40624f9e4a1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

nemes...@gmail.com

May 31, 2018, 2:51:00 AM
to Prometheus Users
Oh, thank you for pointing that out, Simon. I'll try to recompile the Telegram bot with the fix.
