How to set the email subject line based on a condition in Alertmanager


Sri man

Nov 18, 2021, 3:54:08 AM
to Prometheus Users
Hi Team,

I want to set the subject line in Alertmanager based on the send_resolved status. I mean that while the alert is firing, the subject line should be "DOWN", and after the alert is resolved, the subject line should be set to "UP".

Could someone please guide me on how to achieve this, or provide the syntax?

Best Regards,
Sriman

Brian Candler

Nov 18, 2021, 6:54:53 AM
to Prometheus Users
This is already done for you.  With the default templates, the subject line has a prefix like [FIRING:2] or [RESOLVED].

However, if your alerts are being grouped and, say, one alert resolves, you might then get an E-mail with subject [FIRING:1], which shows one alert still firing and one alert resolved. That is: a single E-mail may contain both "DOWN" and "UP" notifications.

You can change the templates if you like.  There is documentation here.  The default subject line is created here and you can override this with your own template string.
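
For example, a custom template file could simply redefine "__subject"; Alertmanager loads user templates after the defaults, so a later definition with the same name wins. A minimal sketch (the file name is illustrative, and this is untested against your setup):

{{/* my_subject.tmpl: redefines the built-in subject */}}
{{ define "__subject" }}{{ if eq .Status "firing" }}DOWN{{ else }}UP{{ end }}{{ end }}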

But to be honest, I've switched off all "resolved" notifications now.  There's a good explanation why here:

In short: the fact that an alert condition has ceased is not an excuse for your staff to say "oh that's OK, the problem has gone; I can ignore it now". There *was* a problem and it *still* needs to be investigated. You can use an alert to open a ticket, but you should close the ticket manually.

This document is also *very* well worth reading:

Sri man

Nov 19, 2021, 1:58:08 AM
to Prometheus Users

Hi Brian,

Thanks for your inputs. Could you please tell me where and what to modify in the below line of code to get an email subject with the keyword "DOWN" for firing alerts and "UP" for resolved alerts?

{{ define "__subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}

Best Regards,
Suman

Brian Candler

Nov 19, 2021, 4:33:08 AM
to Prometheus Users
To replace any header with a new template, follow the documentation of email_config here.

# Further headers email header key/value pairs. Overrides any headers
# previously set by the notification implementation.
[ headers: { <string>: <tmpl_string>, ... } ]

The default value comes from here which sets:
{{ define "email.default.subject" }}{{ template "__subject" . }}{{ end }}
That is, if you don't override it, then it uses the value of the "__subject" variable, the definition of which I pointed you to before.
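
If you go the template-file route instead of (or as well as) the headers override, remember the file has to be loaded via the templates section of alertmanager.yml. A minimal sketch, with an illustrative path:

templates:
- '/etc/alertmanager/templates/*.tmpl'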

Sri man

Nov 23, 2021, 3:40:17 AM
to Prometheus Users
Hello Brian,

I have edited the default template as below.

{{ define "__subject" }}[{{ if eq .Status "firing" }} "DOWN" {{ else }} "UP" {{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }}
{{ define "__description" }}{{ end }}

But I am getting the below error.
./amtool check-config /home/sam_r/alertmanager.yml
Checking '/home/snr_r/alertmanager.yml'  SUCCESS
Found:
 - global config
 - route
 - 0 inhibit rules
 - 2 receivers
 - 1 templates
  FAILED: template: statuspal.tmpl:4: function "DOWN" not defined

amtool: error: failed to validate 1 file(s)

Could you please help me?

Best Regards,
Suman



Sri man

Nov 24, 2021, 3:08:39 AM
to Prometheus Users
Hi Brian,

I am a beginner and have minimal knowledge of Alertmanager. Could you please provide some guidance?

I tried to add the subject in alertmanager.yml as below, but still no luck.
global:
  resolve_timeout: 5m
  smtp_smarthost: *********
  smtp_auth_username: '*********'
  smtp_auth_identity: '********'
  smtp_auth_password: '******'
  smtp_from: 'x...@gmail.com'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 24h
  receiver: 'App-Kafka-Support'

receivers:
- name: 'App-Kafka-Support'
  email_configs:
  - to: 'x...@gmail.com'
    send_resolved: true
    headers:
      subject: '{{ if eq .Status "firing" }} DOWN {{ else if eq .Status "resolved" }} UP {{end}}'

Best Regards,
Suman

Brian Candler

Nov 24, 2021, 3:15:34 AM
to Prometheus Users
Once I had fixed the smtp_smarthost line to include quotes and a port number, the config you gave validates just fine for me.

So if you have a problem with this config, you need to describe what the symptoms are (since the symptoms you gave previously no longer apply).  I'm afraid I can't guess what your system may or may not be doing.

root@prometheus:~# cat test.yml
global:
  resolve_timeout: 5m
  smtp_smarthost: '*********:25'
  smtp_auth_username: '*********'
  smtp_auth_identity: '********'
  smtp_auth_password: '******'
  smtp_from: 'x...@gmail.com'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 24h
  receiver: 'App-Kafka-Support'

receivers:
- name: 'App-Kafka-Support'
  email_configs:
  - to: 'x...@gmail.com'
    send_resolved: true
    headers:
      subject: '{{ if eq .Status "firing" }} DOWN {{ else if eq .Status "resolved" }} UP {{end}}'
root@prometheus:~# /opt/alertmanager/amtool check-config test.yml
Checking 'test.yml'  SUCCESS
Found:
 - global config
 - route
 - 0 inhibit rules
 - 1 receivers
 - 0 templates

root@prometheus:~#

Brian Candler

Nov 24, 2021, 3:19:27 AM
to Prometheus Users
For completeness, this is the version of alertmanager I tested with:

root@prometheus:~# /opt/alertmanager/alertmanager --version
alertmanager, version 0.23.0 (branch: HEAD, revision: 61046b17771a57cfd4c4a51be370ab930a4d7d54)
  build user:       root@e21a959be8d2
  build date:       20210825-10:48:55
  go version:       go1.16.7
  platform:         linux/amd64

Brian Candler

Nov 24, 2021, 3:49:21 AM
to Prometheus Users
I also tested your "headers: subject: ..." section under "email_configs".  This is what Thunderbird showed me for a "resolved" E-mail:

[screenshot: resolved.png]
Therefore, it all works as expected as far as I can see.  Did you remember to signal to alertmanager that you'd changed its config?  (Send it a HUP signal)
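
For example, either of these should work (assuming Alertmanager runs locally on the default port; the pidof invocation is illustrative):

kill -HUP $(pidof alertmanager)
# or ask it to reload over HTTP:
curl -X POST http://localhost:9093/-/reload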

Do note that you included an extra space before and after the words "DOWN" and "UP" in your template.  Hence the E-mail subject has a leading space:

Subject:  UP
        ^^
 

Sri man

Nov 24, 2021, 4:21:54 AM
to Prometheus Users
Hi Brian,

I am assuming that if we have set send_resolved: true, then once the alert is resolved, .Status will be set to "resolved" and the subject will be updated accordingly.

I am not able to see any errors in the log file. The Alertmanager service and Prometheus are up and running fine. Alerts are active in Prometheus, but no emails have been triggered so far.

Best Regards,
Suman

Brian Candler

Nov 24, 2021, 5:16:59 AM
to Prometheus Users
On Wednesday, 24 November 2021 at 09:21:54 UTC muktha...@gmail.com wrote:
I am assuming that if we have set send_resolved: true, then once the alert is resolved, .Status will be set to "resolved" and the subject will be updated accordingly.

I am not able to see any errors in the log file. The Alertmanager service and Prometheus are up and running fine. Alerts are active in Prometheus, but no emails have been triggered so far.

Then it's up to you to either create an alert condition or resolve an alert condition, to force an alert to be sent out.  Or wait for the repeat_interval (24h) if there are already active alerts.
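
For instance, you can inject a synthetic alert by hand with amtool and watch for the firing E-mail, then let it expire to see the resolved one (label values and URL are illustrative):

amtool alert add alertname=TestAlert severity=warning --alertmanager.url=http://localhost:9093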

Sri man

Nov 26, 2021, 6:31:37 AM
to Prometheus Users
Hi Brian,

Many thanks for your help. I have almost achieved what I wanted, but only with one receiver. When I add a second receiver and try to send alerts, firing alerts are matched and sent to the receiver, but resolved alerts are not going out. Could you please verify my config and let me know if I am on the right track? I am not able to get any errors or other insight from alertmanager.log.

global:
  resolve_timeout: 5m
  smtp_smarthost: *******:25
  smtp_auth_username: '*****'
  smtp_auth_identity: '*****'
  smtp_auth_password: '******'
  smtp_from: '********'
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 5m
  receiver: 'App-Kafka-Support'
  routes:
  - receiver: 'Statuspal'
    matchers:
    - typeofalert="statuspal"
    continue: true

receivers:
- name: 'App-Kafka-Support'
  email_configs:
  - to: 'a...@gmail.com'
    send_resolved: true
- name: 'Statuspal'
  email_configs:
  - to: 'x...@gmail.com'
    headers:
      subject: '{{ if eq .Status "firing" }}DOWN{{ else if eq .Status "resolved" }}UP{{end}}'
    send_resolved: true

Best Regards,
Suman

Brian Candler

Nov 26, 2021, 8:32:16 AM
to Prometheus Users
What your config says is:

- if the alert has label typeofalert="statuspal", send it to "x...@gmail.com" only (with the updated subject header)
- otherwise, send it to "a...@gmail.com" only

(Notice that "continue: true" doesn't make any difference on the final routing rule. If this rule matches, it would move on to another rule if there were one, but it will not fall back to the default receiver.)

I don't see any problem, although I'd be inclined to simplify the template to

     subject: '{{ if eq .Status "firing" }}DOWN{{ else }}UP{{end}}'

Look at the stdout/stderr output from alertmanager (maybe "journalctl -eu alertmanager" if you're running it under systemd). I think there's a debug mode too.

Trace the resolved E-mails - look at your E-mail logs, look at the logs on the receiver, check your spam folder etc.

You can also look at the counters which alertmanager itself creates (scrape localhost:9093/metrics to see them), to convince yourself that the resolved E-mails *are* being sent.
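
For example (metric names as of alertmanager 0.23):

curl -s http://localhost:9093/metrics | grep '^alertmanager_notifications'
# alertmanager_notifications_total{integration="email"} counts attempted notifications;
# alertmanager_notifications_failed_total{integration="email"} counts failures.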

Sri man

Nov 26, 2021, 11:04:37 AM
to Prometheus Users
Hi Brian,

What should be added or changed if I want to send the alerts that match the label typeofalert="statuspal" to a...@gmail.com as well, along with z...@gmail.com?

Could you please provide more information on the continue: true parameter and its significance?

Best Regards,
Suman

Brian Candler

Nov 26, 2021, 11:45:09 AM
to Prometheus Users
On Friday, 26 November 2021 at 16:04:37 UTC muktha...@gmail.com wrote:
What should be added or changed if I want to send the alerts that match the label typeofalert="statuspal" to a...@gmail.com as well, along with z...@gmail.com?

The simplest way is to make a new receiver with multiple entries, and route to that:

receivers:
- name: 'App-Kafka-Support-and-Statuspal'
  email_configs:
  - to: 'a...@gmail.com'
    send_resolved: true
  - to: 'x...@gmail.com'
    send_resolved: true
    headers:
      subject: '{{ if eq .Status "firing" }}DOWN{{ else if eq .Status "resolved" }}UP{{end}}'

But you can also have separate receivers.  For example, if you want to send all alerts to receiver App-Kafka-Support *in addition* to any other receivers with explicit routes, then you can put a catch-all route at the top:

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 5m
  receiver: 'dontcare'
  routes:
  - receiver: 'App-Kafka-Support'
    continue: true
  - receiver: 'Statuspal'
    matchers:
    - typeofalert="statuspal"

 
Could you please provide more information on the continue: true parameter and its significance?


If continue is set to false, it stops after the first matching child. If continue is true on a matching node, the alert will continue matching against subsequent siblings. If an alert does not match any children of a node (no matching child nodes, or none exist), the alert is handled based on the configuration parameters of the current node.

I think that's pretty clear.  If it doesn't match rule N, then it moves to rule N+1.  If it does match N, but has continue: true, then it also moves to rule N+1.  The top-level receiver is only used if *none* of the rules matched, with or without "continue: true".

Therefore, "continue: true" on the final route makes no difference.
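
To make that concrete, here is an illustrative routing tree (receiver names are made up). An alert with typeofalert="statuspal" matches the first child route and, because of continue: true, is also tested against (and matches) the second, so it goes to both receivers. An alert without that label matches neither child and falls back to 'dontcare':

route:
  receiver: 'dontcare'
  routes:
  - receiver: 'first'
    matchers:
    - typeofalert="statuspal"
    continue: true
  - receiver: 'second'
    matchers:
    - typeofalert="statuspal"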
