Alertmanager & OpsGenie Configuration

Hasene Ceren Yıkılmaz

Jan 17, 2022, 7:55:49 AM
to Prometheus Users
When I restart Alertmanager it runs fine, but I can't see any of my alerts in OpsGenie.
Is there anything I should check about the Alertmanager & OpsGenie integration?

This is my alertmanager.yml file:

global:
  resolve_timeout: 5m

route:
  receiver: slack_general
  group_by: ['instance']
  group_wait: 1s
  group_interval: 1s
  routes:
   - match:
       severity: critical
     continue: true
     receiver: slack_general
   - match:
       severity: warning
     continue: true
     receiver: slack_general
   - match:
       severity: critical
     continue: true
   - match:
       severity: warning
     continue: true
# added receivers for opsgenie
   - match:
       severity: critical
     receiver: 'netmera_opsgenie'
   - match:
       severity: warning
     receiver: 'netmera_opsgenie'

receivers:
- name: slack_general
  slack_configs:
  - api_url: 'slack api url'
    channel: '#netmera-prometheus'
    send_resolved: true
    title: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}"
    text: "{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}"
# added opsgenie configs
- name: 'netmera_opsgenie'
  opsgenie_configs:
  - api_key: opsgenie api key
    api_url: https://api.eu.opsgenie.com/
    message: '{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}'
    description: '{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}'
    priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P2{{else}}P3{{end}}{{end}}'

I contacted OpsGenie support and they checked their logs, but they couldn't see anything coming from Alertmanager.
Could you please help me with that?

Thank you!

Brian Candler

Jan 17, 2022, 9:44:11 AM
to Prometheus Users
1. Are you sure that the alerts you generate from alerting rules have the 'severity: critical' or 'severity: warning' labels on them?  If not, they won't match any of the routes, so they'll fall back to the default you set:

route:
  receiver: slack_general


2. Why do you have these empty routes?

   - match:
       severity: critical
     continue: true
   - match:
       severity: warning
     continue: true


They don't do anything - delete them.

3. In order to see if alertmanager is attempting to send to opsgenie (and failing):
* Look at the logs of the alertmanager process (e.g. "journalctl -eu alertmanager" if running it under systemd)
* Look at the notification metrics which alertmanager itself generates:

curl -Ss localhost:9093/metrics | grep 'alertmanager_notifications.*opsgenie'

If you see:

alertmanager_notifications_failed_total{integration="opsgenie"} 0
alertmanager_notifications_total{integration="opsgenie"} 0


then no delivery to opsgenie has been attempted.  If there are attempts and failures, you'll see these metrics going up.

BTW, it's useful to scrape alertmanager from your prometheus, so you can query these metrics and get history of them (and indeed, alert on them if necessary):

  - job_name: alertmanager
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9093']
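
For example, something like this as an alerting rule (a rough sketch - the group name, rule name and thresholds are just illustrative, adjust to taste):

groups:
- name: alertmanager_health
  rules:
  - alert: AlertmanagerNotificationsFailing
    # fires if any integration (e.g. opsgenie) had failed notifications over the last 5 minutes
    expr: rate(alertmanager_notifications_failed_total[5m]) > 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: 'Alertmanager is failing to deliver notifications via {{ $labels.integration }}'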


4. If you want to deliver a particular alert to multiple destinations, a much cleaner way of doing it is to use a subtree of routes to list multiple destinations:

   - match:
       severity: critical
     routes: [ {receiver: slack_general, continue: true}, {receiver: netmera_opsgenie} ]

Then you don't have to duplicate your matching logic (in this case "severity: critical"), and you don't get into confusion over when to use "continue: true".
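
The same route written out in block style, if you find that easier to read (equivalent, just a different YAML spelling):

   - match:
       severity: critical
     routes:
     - receiver: slack_general
       continue: true
     - receiver: netmera_opsgenie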

OTOH, if you want *all* alerts to go to slack regardless, then just put a catch-all route at the top:

route:
  receiver: slack_general   # this is never used because the first rule below always matches
  routes:
   - receiver: slack_general
     continue: true

   - match:
       severity: critical
     receiver: 'netmera_opsgenie'
   - match:
       severity: warning
     receiver: 'netmera_opsgenie'


5. "match" and "match_re" are deprecated, better to start using the new matchers syntax:

   - matchers:
       - 'severity =~ "warning|critical"'
     receiver: 'netmera_opsgenie'
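
Also, after editing the config you can sanity-check it before reloading - assuming you have amtool installed alongside alertmanager (adjust the path to wherever your config lives):

amtool check-config /etc/alertmanager/alertmanager.yml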

Hasene Ceren Yıkılmaz

Jan 26, 2022, 4:24:08 PM
to Prometheus Users
Hi!

I sent the curl request and got:

alertmanager_notifications_failed_total{integration="opsgenie"} 0
alertmanager_notifications_total{integration="opsgenie"} 0

So in this case, the first thing to check is whether the alerts I generate from alerting rules have the 'severity: critical' or 'severity: warning' labels on them, right?
But how can I check this?
On Monday, January 17, 2022, at 17:44:11 UTC+3, Brian Candler wrote:

Brian Candler

Jan 27, 2022, 3:44:40 AM
to Prometheus Users
In the alerting rules themselves, e.g.:

groups:
- name: UpDown
  rules:
  - alert: UpDown
    expr: up == 0
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: 'Scrape failed: host is down or scrape endpoint down/unreachable'


You can check the currently-firing alerts in the Prometheus web UI (by default at x.x.x.x:9090).  It will show you what labels each alert carries.
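
If you prefer the command line, the same information is available from the Prometheus HTTP API - e.g. something like this, assuming jq is installed and Prometheus is on localhost:9090:

curl -s localhost:9090/api/v1/alerts | jq '.data.alerts[] | {state, labels}'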