Alert State


Rahul Srivastava

Apr 20, 2017, 1:01:43 AM
to Prometheus Developers
[trying to post again as my earlier post did not show up]

Hi,

How can I query the current state of an alert?

When I do a GET on http://localhost:9090/api/v1/query?query=ALERTS{alertname=%22MyAlertRule%22}, the JSON returned has multiple states for the same alert. How do I tell which one is the current state from the response below?

[[
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "MyAlertRule",
          "alertstate": "firing",
          "monitor": "codelab-monitor",
          "severity": "critical",
          "test": "rahul"
        },
        "value": [
          1492620446.568,
          "1"
        ]
      },
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "MyAlertRule",
          "alertstate": "pending",
          "severity": "critical",
          "test": "rahul"
        },
        "value": [
          1492620446.568,
          "0"
        ]
      },
      {
        "metric": {
          "__name__": "ALERTS",
          "alertname": "MyAlertRule",
          "alertstate": "firing",
          "severity": "critical",
          "test": "rahul"
        },
        "value": [
          1492620446.568,
          "1"
        ]
      }
    ]
  }
}
]]

Thanks,
Rahul.

Julius Volz

Apr 25, 2017, 4:02:07 PM
to Rahul Srivastava, Prometheus Developers
You are running into Prometheus's staleness behavior here: a time series is still returned as long as its last sample is less than 5 minutes old (in this case, the "pending" series). You can tell the series apart by their sample value, though (pending = 0, firing = 1). The one with a value of 1 is the current state. See also https://prometheus.io/docs/alerting/rules/#inspecting-alerts-during-runtime

You can also filter out the 0 ones by changing your query to:

ALERTS{alertname="MyAlertRule"} == 1

or only the firing ones:

ALERTS{alertname="MyAlertRule",alertstate="firing"} == 1
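
The same filtering can be done on the client side once the JSON has been fetched. Here is a minimal sketch, assuming the response shape shown in the first post; the helper names `build_query_url` and `active_alerts` are made up for illustration, and the base URL is simply the localhost address from the question:

```python
from urllib.parse import urlencode


def build_query_url(base_url, expr):
    """Build an instant-query URL for /api/v1/query with the expression URL-encoded."""
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


def active_alerts(response):
    """Return the label sets of all series whose sample value is 1."""
    active = []
    for series in response["data"]["result"]:
        _timestamp, value = series["value"]
        # The API returns sample values as strings, e.g. "1" or "0".
        if float(value) == 1.0:
            active.append(series["metric"])
    return active


# Trimmed-down copy of the response from the first post.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"alertname": "MyAlertRule", "alertstate": "firing",
                        "monitor": "codelab-monitor"},
             "value": [1492620446.568, "1"]},
            {"metric": {"alertname": "MyAlertRule", "alertstate": "pending"},
             "value": [1492620446.568, "0"]},
            {"metric": {"alertname": "MyAlertRule", "alertstate": "firing"},
             "value": [1492620446.568, "1"]},
        ],
    },
}

url = build_query_url("http://localhost:9090", 'ALERTS{alertname="MyAlertRule"} == 1')
states = [metric["alertstate"] for metric in active_alerts(sample)]
print(url)
print(states)
```

Appending `== 1` to the query itself, as above, is usually simpler because the server then never returns the stale series; the client-side filter is only useful if you already have the unfiltered response in hand.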

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-developers+unsub...@googlegroups.com.
To post to this group, send email to prometheus-developers@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/cb8ea169-796a-4923-9c47-8a46608ce558%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rahul Srivastava (र।हुल श्रीवास्तव)

Apr 26, 2017, 12:11:14 AM
to Julius Volz, Prometheus Developers
Thanks Julius. 

But then why do we see two blocks in the JSON response (referred to earlier) with value=1? An alert can only be in one state at any given point in time, so shouldn't there be just one block in the response with value=1? Two blocks both with the value set to 1 seems a bit confusing. The only difference between the two blocks is the additional tag "monitor": "codelab-monitor". What does this tag imply?

Thanks,
Rahul.



Julius Volz

Apr 26, 2017, 12:25:54 AM
to Rahul Srivastava (र।हुल श्रीवास्तव), Prometheus Developers
As your example shows, one of the alerts has the label "monitor": "codelab-monitor", while the other one doesn't. Thus, they are two different alerts. Is the one with that label perhaps referring to another Prometheus with "monitor": "codelab-monitor" as an external label and you are federating from it? Without knowing your exact alerting rules and Prometheus configuration, that's hard to tell.
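
To see why these count as two separate alerts, it can help to compare the label sets directly, since any difference in labels makes a distinct time series. A small sketch, using the two firing label sets from the response in the first post; `label_differences` is a hypothetical helper, not part of any Prometheus client:

```python
def label_differences(a, b):
    """Return {label: (value_in_a, value_in_b)} for every label that differs."""
    diffs = {}
    for key in sorted(set(a) | set(b)):
        # A label missing from one side shows up as None on that side.
        if a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs


# The two series with alertstate="firing" from the original response.
with_monitor = {
    "__name__": "ALERTS",
    "alertname": "MyAlertRule",
    "alertstate": "firing",
    "monitor": "codelab-monitor",
    "severity": "critical",
    "test": "rahul",
}
without_monitor = {
    "__name__": "ALERTS",
    "alertname": "MyAlertRule",
    "alertstate": "firing",
    "severity": "critical",
    "test": "rahul",
}

diff = label_differences(with_monitor, without_monitor)
print(diff)
```

Here the only differing label is `monitor`, which is present on one series and absent on the other.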

Rahul Srivastava (र।हुल श्रीवास्तव)

Apr 26, 2017, 1:57:39 AM
to Julius Volz, Prometheus Developers
Well, it's just a single Prometheus server instance running out of the box (no federation), with the following alert rule defined:
[[
  ALERT MyAlertRule
    IF sum(http_requests_total) > 0.0
    FOR 5s
    LABELS {
      severity = "critical",
      test = "rahul"
    }
]]

I thought "monitor": "codelab-monitor" was added as part of Prometheus's default behaviour, but from your reply it seems it is not. I would be interested to know, then, where this tag is coming from. Maybe that would help explain the two blocks in the JSON response with value=1 and alertstate=firing for the same alert.

Thanks much,
Rahul.

Julius Volz

Apr 26, 2017, 2:43:23 AM
to Rahul Srivastava (र।हुल श्रीवास्तव), Prometheus Developers
Thank you - this pointed out a real bug in Prometheus!

Rahul Srivastava (र।हुल श्रीवास्तव)

Apr 26, 2017, 2:55:03 AM
to Julius Volz, Prometheus Developers
Thanks Julius. Much appreciated.