Inhibit all severities, but how?


Danny de Waard

Apr 15, 2020, 8:17:44 AM
to Prometheus Users
I have an Alertmanager and Prometheus setup that are working well. I can inhibit severity 'warning' alerts when maintenance is on.

But how do I also inhibit critical and/or info alerts when maintenance is on?

So when maintenance_mode == 1, all warning alerts are inhibited, but I also want critical (and info) alerts to be inhibited.

How do I do that?

prometheus rules
groups:
- name: targets
  rules:
  - alert: MaintenanceMode
    expr: maintenance_mode == 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "This is a maintenance alert"
      description: "Fires during maintenance mode and is routed to a blackhole by Alertmanager"
  - alert: monitor_service_down
    expr: up == 0
    for: 40s
    labels:
      severity: critical
    annotations:
      summary: "An exporter service is non-operational"
      description: "One of the exporters on {{ $labels.instance }} is or was down. Check the up/down dashboard in grafana. http://lsrv2289.linux.rabobank.nl:3000/d/RSNFpMXZz/up-down-monitor?refresh=1m"
  - alert: server_down
    expr: probe_success == 0
    for: 30s
    labels:
      severity: critical
    annotations:
      summary: "Server is down (no probes are up)"
      description: "Server {{ $labels.instance }} is down."
  - alert: loadbalancer_down
    expr: loadbalancer_stats < 1
    for: 30s
    labels:
      severity: critical
    annotations:
      summary: "A loadbalancer is down"
      description: "Loadbalancer for {{ $labels.instance }} is down."
  - alert: high_cpu_load15
    expr: node_load15 > 4.5
    for: 900s
    labels:
      severity: critical
    annotations:
      summary: "Server under high load (load 15m) for 15 minutes."
      description: "Host is under high load, the avg load 15m is at {{ $value}}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
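
As an aside, the thread doesn't show where the maintenance_mode metric comes from. One common way to expose such a flag (this is an assumption, not something stated in the thread) is node_exporter's textfile collector, toggled from a maintenance script:

```shell
# Sketch: expose maintenance_mode via node_exporter's textfile collector.
# The directory is an assumption -- use whatever you pass to node_exporter's
# --collector.textfile.directory flag.
TEXTFILE_DIR=./textfile_collector
mkdir -p "$TEXTFILE_DIR"

# Enter maintenance: write the metric, then rename so the collector
# never reads a half-written file.
printf 'maintenance_mode 1\n' > "$TEXTFILE_DIR/maintenance.prom.$$"
mv "$TEXTFILE_DIR/maintenance.prom.$$" "$TEXTFILE_DIR/maintenance.prom"

cat "$TEXTFILE_DIR/maintenance.prom"
```

Leaving maintenance would write `maintenance_mode 0` (or remove the file) the same way.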






alertmanager.yml

global:
route:
  group_by: [instance,severity,job]
  receiver: 'default'
  routes:
   - match:
      alertname: 'MaintenanceMode'
     receiver: 'blackhole'
   - match:
      severity: warning
      job: PAT
     receiver: 'pat'
   - match:
      severity: warning
      job: PROD
     receiver: 'prod'
   - match:
      severity: critical
      job: PAT
     receiver: 'pat-crit'
   - match:
      severity: critical
      job: PROD
     receiver: 'prod-crit'
     continue: true
   - match:
      severity: critical
      job: PROD
     receiver: 'sms-waard'
   - match:
      severity: info
     receiver: 'info'
   - match:
      severity: atombomb
     receiver: 'webhook'
receivers:
  - name: 'default'
    email_configs:
     - to: 'mailaddress' ##fill in your email
       from: 'alertmanag...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'pat'
    email_configs:
     - to: 'mailaddress' ##fill in your email
       from: 'alertman...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'prod'
    email_configs:
     - to: 'mailaddress, mailaddress' ##fill in your email
       from: 'alertman...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'pat-crit'
    email_configs:
     - to: 'mailaddress' ##fill in your email
       from: 'critical-ale...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'prod-crit'
    email_configs:
     - to: 'mailaddress, mailaddress' ##fill in your email
       from: 'critical-aler...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'info'
    email_configs:
     - to: 'mailaddress' ##fill in your email
       from: 'alertman...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'sms-waard'
    email_configs:
     - to: 'mailaddress' ##fill in your email
       from: 'alertman...@superheroes.com'
       smarthost: 'localhost:25'
       require_tls: false
  - name: 'webhook'
    webhook_configs:
      - url: 'http://127.0.0.1:9000'
  - name: 'blackhole'
      
inhibit_rules:
- source_match:
    alertname: MaintenanceMode
  target_match:
    severity: warning


Brian Candler

Apr 15, 2020, 8:55:01 AM
to Prometheus Users
- source_match:
    alertname: MaintenanceMode
  target_match_re:
    severity: 'warning|critical'

In your case, to mute everything you can probably just remove the target_match section entirely; then I expect it will match all alerts (except for MaintenanceMode, as there's a special case, https://prometheus.io/docs/alerting/configuration/#inhibit_rule, which prevents an alert from inhibiting itself).
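
Spelled out, the two variants described above would look something like this (a sketch, pick one; the severity list in the first is an assumption about which severities exist in this setup):

```yaml
inhibit_rules:
# Variant 1: explicitly list the severities to mute during maintenance
- source_match:
    alertname: MaintenanceMode
  target_match_re:
    severity: 'warning|critical|info'
# Variant 2: no target_match at all -- expected to inhibit every alert
# except MaintenanceMode itself, thanks to the self-inhibition special case
- source_match:
    alertname: MaintenanceMode
```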

Jakub Jakubik

Apr 15, 2020, 9:19:24 AM
to Brian Candler, Prometheus Users
Why not just create a silence on all alerts in Alertmanager?

This is exactly the use case silences are for: not firing alerts when you know they should not fire, for some reason, during a defined time period.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/11f1622d-2640-4358-8fa6-f3e047c70e11%40googlegroups.com.


--

Kuba Jakubik

SRE Tech Lead

Netguru - Building software for world changers

jakub....@netguru.com
netguru.com

Sandeep Rao Kokkirala

Apr 15, 2020, 11:33:32 AM
to Jakub Jakubik, Brian Candler, Prometheus Users
Could you please share the code to silence all alerts?

Christian Hoffmann

Apr 15, 2020, 11:40:11 AM
to Sandeep Rao Kokkirala, Brian Candler, Prometheus Users
Hi,

you can just create a silence in the Alertmanager UI which matches the
label alertname against the regexp .*.

This can also be done programmatically via the HTTP API (or amtool, which
makes use of it).

In fact, this is how we tied our pre-existing maintenance mode logic to
Alertmanager.
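
For reference, a sketch of what that looks like programmatically (the timestamps, URL, and createdBy value here are placeholders): the silence object below is POSTed to Alertmanager's `/api/v2/silences` endpoint, and the rough amtool equivalent is something like `amtool silence add --comment="maintenance" --duration=2h 'alertname=~".+"'`.

```json
{
  "matchers": [
    { "name": "alertname", "value": ".+", "isRegex": true }
  ],
  "startsAt": "2020-04-15T12:00:00Z",
  "endsAt": "2020-04-15T14:00:00Z",
  "createdBy": "maintenance-script",
  "comment": "maintenance window"
}
```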

Kind regards,
Christian



Murali Krishna Kanagala

Apr 16, 2020, 10:26:50 AM
to Prometheus Users
Hi Danny,

Did you try using a regex for the target_match?
And also adding another inhibit rule with target_match as critical?
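
Spelled out, adding a second inhibit rule with a critical target_match, as suggested, would look like this (a sketch):

```yaml
inhibit_rules:
- source_match:
    alertname: MaintenanceMode
  target_match:
    severity: warning
- source_match:
    alertname: MaintenanceMode
  target_match:
    severity: critical
```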


Danny de Waard

Apr 16, 2020, 1:07:33 PM
to Prometheus Users
I used this option:

- source_match:
    alertname: MaintenanceMode
  target_match_re:
    severity: 'warning|critical'

And it seems to work.
