Want to receive only specific alerts in teams channel


Sampada Thorat

Feb 26, 2023, 9:53:26 AM
to Prometheus Users
Hello Everyone,

I want to receive alerts with the alertnames 'HostOutOfDiskSpace', 'HostHighCpuLoad', 'HostOutOfMemory' and 'KubeNodeNotReady' in the "elevate_alerts" channel, and all other alerts in the "default_receiver_test" channel. But with the configuration below, I'm getting all the alerts in "elevate_alerts" only.

This is my ConfigMap:

apiVersion: v1
data:
  connectors.yaml: |
    connectors:
      - test: https://sasoffice365.webhook.office.com/webhookb2/d2415be1-2360-49c3-af48-7baf41aa1371@b1c14d5c-3625-45b3-a430-9552373a0c2f/IncomingWebhook/c7c62c1315d24c1fb5d1c731d2467dc6/5c8c1e6c-e827-4114-a893-9a1788ad41b5
      - alertmanager: https://sasoffice365.webhook.office.com/webhookb2/a7cb86de-1543-4e6d-b927-387c1f1e35ad@b1c14d5c-3625-45b3-a430-9552373a0c2f/IncomingWebhook/687a7973ffe248d081f58d94a090fb4c/05be66ae-90eb-42f5-8e0c-9c10975012ca
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus-msteams
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2023-02-26T12:33:36Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: prometheus-msteams-config
  namespace: monitoring
  resourceVersion: "18040490"
  uid: 795c96d5-8318-4885-804f-71bba707c885


This is my alertmanager.yaml:

global:
  resolve_timeout: 5m
receivers:
- name: elevate_alerts
  webhook_configs:
  - url: "http://prometheus-msteams.default.svc.cluster.local:2000/alertmanager"
    send_resolved: true
- name: default_receiver_test
  webhook_configs:
  - url: "http://prometheus-msteams.default.svc.cluster.local:2000/test"
    send_resolved: true
route:
  group_by:
  - alertname
  - severity
  group_interval: 5m
  group_wait: 30s
  repeat_interval: 3h
  receiver: default_receiver_test
  routes:
  - matchers:
      alertname:['HostOutOfDiskSpace','HostHighCpuLoad','HostOutOfMemory','KubeNodeNotReady']
    receiver: elevate_alerts

Please help



Brian Candler

Feb 27, 2023, 5:21:33 AM
to Prometheus Users
>   routes:
>   - matchers:
>       alertname:['HostOutOfDiskSpace','HostHighCpuLoad','HostOutOfMemory','KubeNodeNotReady']

That's invalid: Alertmanager should not even start. I tested your config, and I got the following error:

ts=2023-02-27T10:17:54.702Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=tmp.yaml err="yaml: unmarshal errors:\n  line 22: cannot unmarshal !!str `alertna...` into []string"

'matchers' is a list of strings, not a map.  This should work:

route:
  routes:
  - matchers:
    - alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostHighCpuLoad|KubeNodeNotReady"
  receiver: elevate_alerts
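
If you have amtool to hand (it ships with the Alertmanager releases), you can also validate the whole file before deploying it; the filename below is just an example:

amtool check-config alertmanager.yaml

It either summarises what it found (global config, route, receivers) or fails with the same kind of unmarshal error shown above.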

See: https://prometheus.io/docs/alerting/latest/configuration/#route

Sampada Thorat

Feb 27, 2023, 8:22:04 AM
to Prometheus Users
Hello Brian, I tried your change, yet my Alertmanager isn't picking up the config change and still shows the older config. Can you have a look?

global:
  resolve_timeout: 5m
receivers:
  - name: pdmso_alerts
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/pdmsoalert"

        send_resolved: true
  - name: default_receiver_test
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/test"

        send_resolved: true
route:
  group_by:
    - alertname
    - severity
  group_interval: 5m
  group_wait: 30s
  repeat_interval: 3h
  receiver: default_receiver_test
  routes:
  - matchers:
      alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostHighCpuLoad|KubeNodeNotReady"
  receiver: pdmso_alerts

Brian Candler

Feb 27, 2023, 10:04:06 AM
to Prometheus Users
On Monday, 27 February 2023 at 13:22:04 UTC Sampada Thorat wrote:
Hello Brian, I tried your change, yet my Alertmanager isn't picking up the config change and still shows the older config. Can you have a look?

You mentioned a ConfigMap, which suggests that you are deploying Prometheus on a Kubernetes cluster. It looks like your problem is primarily with Kubernetes, not Prometheus.

If you deployed Prometheus using one of the various third-party Helm charts, then you could ask on the issue tracker for that Helm chart. They might be able to tell you how it's supposed to work when you change the ConfigMap, e.g. whether you're supposed to destroy and recreate the pod manually to pick up the change.
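
For example (the workload and pod names below are guesses - substitute whatever your chart actually created), forcing the pod to be recreated so that it re-reads its mounted config usually looks something like this:

# find the alertmanager workload in your namespace
kubectl -n monitoring get statefulsets,deployments,pods
# restart it so the updated config gets mounted and loaded
kubectl -n monitoring rollout restart statefulset alertmanager
# or simply delete the pod and let its controller recreate it
kubectl -n monitoring delete pod alertmanager-0

Bear in mind that ConfigMap changes can also take a minute or so to propagate into the pod's mounted volume, even without a restart.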

Alternatively, it might be that your config has errors in it, and Alertmanager is sticking with the old config.

I tested the config you posted by writing it to tmp.yaml and then running a standalone instance of Alertmanager by hand:

/opt/alertmanager/alertmanager  --config.file tmp.yaml  --web.listen-address=:19093 --cluster.listen-address="0.0.0.0:19094"

It gave me the following error:

ts=2023-02-27T14:56:01.186Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=tmp.yaml err="yaml: unmarshal errors:\n  line 22: cannot unmarshal !!str `alertna...` into []string\n  line 23: field receiver already set in type config.plain"

(I would expect such errors to appear in pod logs too)
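
For example, something along these lines (the pod name is hypothetical, and add -c <container> if your pod runs a config-reloader sidecar):

kubectl -n monitoring logs alertmanager-0 | grep -i configuration

You should see either a "Completed loading of configuration file" line or the same "Loading configuration file failed" error as above.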

It's complaining that you have duplicate values for the same "receiver" key:

route:
  ...
  receiver: default_receiver_test
  ...
  receiver: pdmso_alerts


This is because you did not indent the second 'receiver:' correctly. It has to sit inside the bullet point under 'routes:':

route:
  receiver: default_receiver_test
  routes:
  - matchers:
      - alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostHighCpuLoad|KubeNodeNotReady"
      ^ dash required here because 'matchers' is a list
    receiver: pdmso_alerts
    ^ should be here, to line up with "matchers" as it's part of the same route (list element under "routes")


Sampada Thorat

Feb 27, 2023, 12:32:57 PM
to Brian Candler, Prometheus Users
I did put the second value under the first value's bullet point as well.
Yet some minor mistake must still be there, because the change isn't being reflected in Alertmanager.

Here's my Config:

global:
  resolve_timeout: 5m
receivers:
  - name: pdmso_alerts
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/pdmsoalert"
        send_resolved: true
  - name: default_receiver_test
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/test"
        send_resolved: true
route:
  group_by:
    - namespace
  group_interval: 5m
  group_wait: 30s
  repeat_interval: 3h
  receiver: default_receiver_test
  routes:
    - matchers:
        - alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostOutOfMemory|KubeNodeNotReady"
    receiver: pdmso_alerts
 



Thanks & Regards,



Brian Candler

Feb 27, 2023, 12:45:06 PM
to Prometheus Users
Sorry, but this is the last config I am going to test for you. I think it would be better if you ran Alertmanager locally yourself; then you could see the errors yourself and correct them. Or at least, please paste your config into an online YAML validator like yamllint.com - it can highlight structural errors like this one.
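
If you would rather check locally, the yamllint command-line tool does a similar job (it checks YAML structure only, not Alertmanager semantics); the filename is just an example:

pip install yamllint
yamllint alertmanager.yaml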

The config you just posted gives the following error from alertmanager:

ts=2023-02-27T17:37:25.399Z caller=coordinator.go:118 level=error component=configuration msg="Loading configuration file failed" file=tmp.yaml err="yaml: line 21: did not find expected '-' indicator"

Again, this is because you have not lined up "receiver" with "matchers".  Here it is again, this time replacing spaces with asterisks to try to make it 100% clear.

WRONG:

**routes:
****- matchers:
********- alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostOutOfMemory|KubeNodeNotReady"
****receiver: pdmso_alerts


CORRECT:

**routes:
****- matchers:
********- alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostOutOfMemory|KubeNodeNotReady"
******receiver: pdmso_alerts


The first three lines are the same, but notice the different indentation of the last line: it needs 6 spaces, not 4, so that the "r" of receiver lines up with the "m" of matchers (they are two keys in the same object).
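
For reference, here is the whole thing with that indentation applied, assembled from the receivers and webhook URLs you posted earlier in this thread (double-check the names and URLs against your own setup, and make sure each URL path matches a connector name defined in your prometheus-msteams ConfigMap):

global:
  resolve_timeout: 5m
receivers:
  - name: pdmso_alerts
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/pdmsoalert"
        send_resolved: true
  - name: default_receiver_test
    webhook_configs:
      - url: "http://prometheus-msteams.monitoring.svc.cluster.local:2000/test"
        send_resolved: true
route:
  group_by:
    - namespace
  group_interval: 5m
  group_wait: 30s
  repeat_interval: 3h
  receiver: default_receiver_test
  routes:
    - matchers:
        - alertname=~"HostOutOfDiskSpace|HostHighCpuLoad|HostOutOfMemory|KubeNodeNotReady"
      receiver: pdmso_alerts

With that, the four elevated alerts should match the child route and go to pdmso_alerts, and everything else should fall through to default_receiver_test.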