Repeat interval and absent hours on instance.

Sebastian Glock

Feb 25, 2021, 8:27:18 AM

to Prometheus Users

Hi, I have a problem with repeat_interval. I want to set a different repeat interval for each route.

alertmanager.yml

```
global:
route:
  receiver: alert-emailer-default-30m
  group_by: ['alertname', 'priority', 'instance']
  group_wait: 1m
  group_interval: 1m
  repeat_interval: 30s
  routes:
    - receiver: alert-emailer-default-30m
      group_by: ['alertname', 'priority', 'instance']
      group_wait: 1m
      group_interval: 1m
      repeat_interval: 5m
      match:
        severity: "[Disaster] {{ $labels.instance }}"
    - receiver: alert-emailer-default-1h
      group_by: ['alertname', 'priority', 'instance']
      group_wait: 1m
      group_interval: 1m
      repeat_interval: 1m
      match:
        severity: "[High] {{ $labels.instance }}"
    - receiver: alert-emailer-default-3h
      group_by: ['alertname', 'priority', 'instance']
      group_wait: 1m
      group_interval: 1m
      repeat_interval: 3m
      match:
        severity: "[Average] {{ $labels.instance }}"
```

alert.rules.yml:

```
groups:
  - name: alert.rules
    rules:
      # Windows
      # CPU
      - alert: CPU load is more than 70%!
        expr: 100 - (1 -avg(irate(windows_cpu_time_total{mode="user"}[10m])) by (instance)) * 100 >= 40
        for: 30s
        labels:
          severity: "[Average] {{ $labels.instance }}"
        annotations:
          summary: "CPU load is more than 70%!"
          description: "{{ humanize $value }}%"
      - alert: CPU load is more than 80%!
        expr: 100 - (1 -avg(irate(windows_cpu_time_total{mode="user"}[10m])) by (instance)) * 100 >= 40
        # AND ON() absent(hour() >= 0 < 18{instance="10.16.155.150"})
        for: 10s
        labels:
          severity: "[High] {{ $labels.instance }}"
        annotations:
          summary: "CPU load is more than 80%!"
          description: "{{ humanize $value }}%"
      - alert: CPU load is more than 90%!
        expr: 100 - (1 -avg(irate(windows_cpu_time_total{mode="user"}[10m])) by (instance)) * 100 >= 40
        for: 50s
        labels:
          severity: "[Disaster] {{ $labels.instance }}"
        annotations:
          summary: "CPU load is more than 90%!"
          description: "{{ humanize $value }}%"
```

But the main route's repeat_interval still applies: I'm getting a reminder every 30 seconds, which is the value set in the main route. How can I make the e-mails arrive at the different intervals defined in the child routes?

My second question is about absent hours:

```
100 - (1 -avg(irate(windows_cpu_time_total{mode="user"}[10m])) by (instance)) * 100 AND ON() absent(hour() >0 <12) AND ON() absent(nonexistent{instance="10.16.22.22"})
```

How can I specify absent hours for given instances between 8 pm (20:00) and 6 am?

Thanks for all your help!
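For the second question, one possible approach is sketched below. It is not taken from the thread: it assumes the goal is to keep the alert for every instance but drop it for `10.16.22.22` between 20:00 and 06:00. It relies on `hour()` returning the current hour in UTC, so the bounds would need shifting for a local time zone. The `count by (instance)` selector is only an assumed way to pick out the instance to silence; `labels` and `annotations` are omitted and would stay as in the rule above.

```yaml
groups:
  - name: alert.rules
    rules:
      - alert: CPU load is more than 80%!
        # Sketch: the right-hand side of "unless" is non-empty only for the
        # chosen instance while the UTC hour is >= 20 or < 6, so the alert
        # for that instance is removed during those hours.
        expr: |
          (
            100 - (1 - avg by (instance) (irate(windows_cpu_time_total{mode="user"}[10m]))) * 100 >= 40
          )
          unless on (instance) (
            count by (instance) (windows_cpu_time_total{instance="10.16.22.22"})
            and on () (hour() >= 20 or hour() < 6)
          )
        for: 10s
```

Other instances never match the right-hand side, so their alerts are unaffected at any hour.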

Sebastian Glock

Feb 25, 2021, 1:16:22 PM

to Prometheus Users

OK, I have solved the first problem:

```

severity: "[Average] {{ $labels.instance }}"

```

The severity value can't contain {{ $labels.instance }}. Without it, the routes work great.
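A minimal sketch of what the working routing might look like, assuming static severity values. Prometheus expands `{{ $labels.instance }}` in rule labels, but Alertmanager compares `match` values literally, so the templated string in a route never equals the expanded label on the alert and everything falls through to the main route. The values `disaster`, `high` and `average` are placeholders chosen for illustration; the receivers and intervals are the ones from the original config.

```yaml
route:
  receiver: alert-emailer-default-30m
  group_by: ['alertname', 'priority', 'instance']
  group_wait: 1m
  group_interval: 1m
  repeat_interval: 30s          # used only when no child route matches
  routes:
    # Child routes inherit group_by, group_wait and group_interval
    # from the parent, so only the differing settings are listed.
    - receiver: alert-emailer-default-30m
      repeat_interval: 5m
      match:
        severity: disaster      # assumed static label value
    - receiver: alert-emailer-default-1h
      repeat_interval: 1m
      match:
        severity: high          # assumed static label value
    - receiver: alert-emailer-default-3h
      repeat_interval: 3m
      match:
        severity: average       # assumed static label value
```

The alert rules would then set the same static values, for example `severity: average`. The instance can still appear in an annotation such as `summary`, where templating is harmless for routing.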
