"vector contains metrics with the same labelset after applying alert labels" error


Denis Trunov

May 17, 2020, 8:46:43 AM
to Prometheus Users
Hi,
Given the rule:
  - alert: SnellIQModules
    expr: ({state=~".*?FAIL.*?|.*?WARN.*?"})
    for: 1s
    labels:
      alertname: snelliqmodulefail
      severity: warning
    annotations:
      summary: "Module {{ $labels.instance }} state of {{ $labels.__name__ }} is {{ $labels.state }}"
      description: "Module {{ $labels.instance }} state of {{ $labels.__name__ }} is {{ $labels.state }}"

In some cases this rule works as expected, but in most cases I get the "vector contains metrics with the same labelset after applying alert labels" error, and I can't understand why.
When the rule works OK, I see the following result for the query ({state=~".*?FAIL.*?|.*?WARN.*?"}) in the Prometheus web interface:

Element                                                                                                                                      Value
snelltrapINPUT_2_CLOSED_CAPTION_STATE{dept="TV",instance="IQSAM00-3G 1",job="snelltrap",state="WARN:No"}                                     1
ALERTS_FOR_STATE{alertname="SnellIQModules",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}            1589688003
ALERTS{alertname="SnellIQModules",alertstate="firing",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}  1

When the rule produces the "vector contains metrics with the same labelset after applying alert labels" error, I see a very similar result, but it causes an error:

Element                                                                                                                                      Value
snelltrapINPUT_2_CLOSED_CAPTION_STATE{dept="TV",instance="IQSAM00-3G 1",job="snelltrap",state="WARN:No"}                                     1
ALERTS_FOR_STATE{alertname="SnellIQModules",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}            1589688003
ALERTS{alertname="SnellIQModules",alertstate="firing",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}  1
snelltrapRULES_STATE{dept="TV",instance="CHOVER R2",job="snelltrap",state="WARN:Off"}                                                        1

Julius Volz

May 17, 2020, 9:34:53 AM
to Denis Trunov, Prometheus Users
After evaluating an alerting expression, Prometheus removes the metric name from the output alert series before further processing them as alerts. In your query you completely ignore the metric name, so you might select two time series that are otherwise labeled the same (just had a different metric name initially, which was removed). Another issue here is that your alerting expression will also select the ALERTS and ALERTS_FOR_STATE series that were written out by the alerting rule in the first place, making the alerting rule effectively feed into itself.
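To sketch the mechanism with the series from the error case above: once the rule's labels (alertname, severity) are applied and the metric name is dropped, the scraped source series and the previously written ALERTS_FOR_STATE series collapse to the same labelset (the exact override behavior of the alertname label is an assumption here):

```
# Scraped series, after dropping __name__ and applying the rule's labels:
snelltrapINPUT_2_CLOSED_CAPTION_STATE{dept="TV",instance="IQSAM00-3G 1",job="snelltrap",state="WARN:No"}
  -> {alertname="snelliqmodulefail",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}

# ALERTS_FOR_STATE series written by the rule itself, after the same processing:
ALERTS_FOR_STATE{alertname="SnellIQModules",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}
  -> {alertname="snelliqmodulefail",dept="TV",instance="IQSAM00-3G 1",job="snelltrap",severity="warning",state="WARN:No"}
  # identical labelset -> "vector contains metrics with the same labelset" error
```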

You'll probably want to reformulate your alerting rule in such a way that it doesn't completely ignore the metric name, and/or move relevant differentiating dimensionality from the metric name into a label using either the label_replace() function, metric relabeling upon scrape, or best, fixing the data right where it's being exported.
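As a sketch of the metric-relabeling option: a metric_relabel_configs rule at scrape time could copy part of the metric name into a label before the samples are stored (the job name, target address, and "snelltrap" prefix regex below are assumptions for illustration):

```yaml
scrape_configs:
  - job_name: snelltrap
    static_configs:
      - targets: ['snelltrap-exporter:9116']   # hypothetical target
    metric_relabel_configs:
      # Copy the metric name (minus the "snelltrap" prefix) into a "metric" label,
      # so the distinguishing dimension survives when __name__ is dropped.
      - source_labels: [__name__]
        regex: 'snelltrap(.+)'
        target_label: metric
        replacement: '$1'
```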



--
Julius Volz
PromLabs - promlabs.com

Denis Trunov

May 17, 2020, 10:25:03 AM
to Prometheus Users
Many thanks, Julius! Your reply really helped!
I fixed the rule according to your hints, and it is now working like a charm:
  - alert: SnellIQModules
    expr: label_replace({__name__!~"ALERTS.*?",state=~".*?FAIL.*?|.*?WARN.*?"}, "metric", "$1", "__name__", "snelltrap(.+)")
    for: 1s
    labels:
      alertname: snelliqmodulefail
      severity: warning
    annotations:
      summary: "Module {{ $labels.instance }} state of {{ $labels.metric }} is {{ $labels.state }}"
      description: "Module {{ $labels.instance }} state of {{ $labels.metric }} is {{ $labels.state }}"
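For illustration, the label_replace() call in the expression above adds a "metric" label derived from the metric name, so series stay distinguishable after __name__ is dropped, while the __name__!~"ALERTS.*?" matcher keeps the rule from feeding on its own ALERTS and ALERTS_FOR_STATE output. Using one of the series from the thread (assuming this is how the regex matches):

```
# Input series (from the scrape):
snelltrapRULES_STATE{dept="TV",instance="CHOVER R2",job="snelltrap",state="WARN:Off"}

# After label_replace(..., "metric", "$1", "__name__", "snelltrap(.+)"):
snelltrapRULES_STATE{dept="TV",instance="CHOVER R2",job="snelltrap",metric="RULES_STATE",state="WARN:Off"}
```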

Thank you! :)

