Include hostname in the alert summary message


Zhang Zhao

Aug 31, 2020, 6:38:06 PM
to Prometheus Users
Hi,
Can anyone tell me how to include the nodename (highlighted in the screenshot) in the alert? The labels variable doesn't include the nodename, and I want to show it in the alert summary message.
summary: "Host high CPU load (instance {{ $labels.instance }})"

Zhang
Snip20200831_4.png

Brian Candler

Sep 1, 2020, 4:49:16 AM
to Prometheus Users
You can join your query with node_uname_info, using group_left, so that the query result gains additional labels from node_uname_info.  The approach is described here:

Zhang Zhao

Sep 1, 2020, 4:02:26 PM
to Brian Candler, Prometheus Users
Brian,
I added the group_left below as highlighted. However, the output was not what I expected. Any advice on what went wrong?
summary: Host out of memory (instance {{ $labels.instance }} group_left(nodename) node_uname_info{job="node-exporter-vm"})

Output:
"commonAnnotations_summary":"Host out of memory (instance 172.25.35.85:9100 group_left(nodename) node_uname_info{job=\"node-exporter-vm\"})"

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/2c849844-a562-4546-adeb-031d8bc0fa37o%40googlegroups.com.

Brian Candler

Sep 1, 2020, 4:30:02 PM
to Prometheus Users
On Tuesday, 1 September 2020 21:02:26 UTC+1, Zhang Zhao wrote:
I added the group_left below as highlighted. However, the output was not what I expected. Any advice on what went wrong?
summary: Host out of memory (instance {{ $labels.instance }} group_left(nodename) node_uname_info{job="node-exporter-vm"})


I don't know what you did, but the entire PromQL expression (including group_left) goes in the "expr:" part of your alerting rule, not in the template.

You need to use a binary operator and join on a common label, usually "instance".  Something like:

groups:
- name: UpDown
  rules:
  - alert: UpDown
    expr: (up == 0) * on (instance) group_left(domainname,nodename,sysname) node_uname_info

Zhang Zhao

Sep 1, 2020, 4:42:39 PM
to Brian Candler, Prometheus Users
What I need is to display the hostname in the summary so that I can extract it on the ServiceNow side. Is that possible?


    - alert: HostOutOfMemory
      annotations:
        message: |
          Node memory is filling up (< 10% left)
            VALUE = {{ $value }}
        summary: Host out of memory (instance {{ $labels.instance }} group_left(nodename) node_uname_info{job="node-exporter-vm"})
      expr: node_memory_MemAvailable_bytes {job="node-exporter-vm"} / node_memory_MemTotal_bytes{job="node-exporter-vm"}
        * 100 < 10
      for: 5m
      labels:
        inc: servicenow
        severity: warning


Brian Candler

Sep 1, 2020, 4:50:41 PM
to Prometheus Users
On Tuesday, 1 September 2020 21:42:39 UTC+1, Zhang Zhao wrote:
What I need is to display the hostname in the summary so that I can extract it on the ServiceNow side. Is that possible?


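group_left(x,y,z) means that the result gains the labels x, y and z from the right-hand side expression, so you can use $labels.nodename in the alert template. node_uname_info always has the value 1, so multiplying by it leaves $value unchanged.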

    - alert: HostOutOfMemory
      annotations:
        message: |
          Node memory is filling up (< 10% left)
            VALUE = {{ $value }}
        summary: Host out of memory (instance {{ $labels.instance }} nodename {{ $labels.nodename }})
      expr: |
        (node_memory_MemAvailable_bytes{job="node-exporter-vm"} / node_memory_MemTotal_bytes{job="node-exporter-vm"} * 100 < 10)
        * on(instance) group_left(nodename) node_uname_info
      for: 5m
      labels:
        inc: servicenow
        severity: warning
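
The same join should carry over to the CPU alert from the first message. This is only a sketch: the CPU expression, the 80% threshold, the group name and the alert name are assumptions that do not appear in this thread, so substitute whatever your existing rule uses.

```yaml
groups:
  - name: host-alerts              # hypothetical group name
    rules:
      - alert: HostHighCpuLoad     # hypothetical alert name
        # Assumed CPU expression and threshold; replace with your own.
        # "avg by(instance)" keeps the instance label, which the join needs.
        expr: |
          (100 - avg by(instance) (rate(node_cpu_seconds_total{job="node-exporter-vm",mode="idle"}[5m])) * 100 > 80)
          * on(instance) group_left(nodename) node_uname_info{job="node-exporter-vm"}
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Host high CPU load (instance {{ $labels.instance }} nodename {{ $labels.nodename }})
```

Restricting node_uname_info to the same job keeps the right-hand side to one series per instance, which group_left requires.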

Another option is to set meaningful "instance" labels at scrape time, so that $labels.instance can be used directly in the alert without having to do a join.
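
A minimal sketch of that option, assuming a static scrape config. The hostname vm-host-01 is hypothetical; the target address is the one from the output earlier in the thread.

```yaml
scrape_configs:
  - job_name: node-exporter-vm
    static_configs:
      - targets:
          - 172.25.35.85:9100
        labels:
          # Hypothetical hostname. Prometheus only falls back to the target
          # address for "instance" when no instance label has been set.
          instance: vm-host-01
```

Labels in a static_configs entry apply to every target listed in it, so this layout needs one entry per host. Existing dashboards or queries that match on the old address-style instance value would need updating.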

Zhang Zhao

Sep 1, 2020, 5:17:46 PM
to Brian Candler, Prometheus Users
Thank you for the explanation.


