Sending resolved alert from alertmanager


Nikhil Goenka

Jun 22, 2017, 3:13:29 AM
to Prometheus Users
Hi,
I have configured Alertmanager to send alerts from Prometheus. My setup is as follows:
1. I have a target "snmp" specified in prometheus.yml. When I shut the machine down, I receive an alert in Alertmanager as expected.
2. When I switch the machine back on, I expect a resolved notification for the earlier alert. However, I still see the earlier "critical" alert.

Am I doing something wrong here? Do I need to add a rule for the resolved severity too in alert.rules?

My alert.rules file in Prometheus is as follows:
ALERT system_down
  IF up == 0
  FOR 1m
  LABELS { severity="critical" }
  ANNOTATIONS {
    summary = "is down",
    description = "SNMP has been unreachable for more than 1 minute.",
  }
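
For readers on Prometheus 2.0 or later: the rule above uses the 1.x rule syntax, which was replaced by YAML rule files in 2.0. A rough equivalent might look like the sketch below. The group name `snmp_alerts` is a made-up placeholder and does not come from the original post.

```yaml
groups:
- name: snmp_alerts            # hypothetical group name
  rules:
  - alert: system_down
    expr: up == 0              # fires for any target whose last scrape failed
    for: 1m                    # must stay true for 1 minute before firing
    labels:
      severity: critical
    annotations:
      summary: "is down"
      description: "SNMP has been unreachable for more than 1 minute."
```

No separate rule is needed for the resolved state in either syntax: the alert resolves once `up == 0` stops returning the series.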


My simple.yml file (I have enabled snmptrapper_webhook as well)

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'tes...@gmail.com'
  smtp_auth_username: 'xxxx'
  smtp_auth_password: 'yyy'
receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'y...@gmail.com'
- name: "webhook"
  webhook_configs:
  - send_resolved: true


Brian Brazil

Jun 22, 2017, 3:16:53 AM
to Nikhil Goenka, Prometheus Users
On 22 June 2017 at 08:13, Nikhil Goenka <nikhil...@alefmobitech.com> wrote:
Am I doing something wrong here? Do I need to add a rule for the resolved severity too in alert.rules?

You don't.

To reduce notification spam, the Alertmanager only sends out a notification for a group of alerts every 5 minutes by default. If you wait 5 minutes you should get it.

Resolved messages are also best effort; there are some corner cases where they won't be sent.

Brian
 


Nikhil Goenka

Jun 22, 2017, 5:14:05 AM
to Brian Brazil, Prometheus Users
Thanks Brian.

In order to reduce notification spam, the Alertmanager only sends out a notification for a group of alerts every 5 minutes by default. If you wait 5 minutes you should get it.
>>> So is this 5 minutes configurable, if I wish to receive the notification earlier?
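
For reference, the 5-minute figure matches the default of Alertmanager's `group_interval` route setting, so it is configurable. The fragment below is a minimal sketch of the relevant timing options, with their documented defaults noted in comments. The values chosen are only illustrative, and the receiver name is the one used earlier in this thread.

```yaml
route:
  receiver: 'team-X-mails'
  group_by: ['alertname']
  # How long to wait before sending the first notification for a new group
  # (default 30s).
  group_wait: 10s
  # How long to wait before notifying about changes to a group that has
  # already been notified, including alerts that resolved (default 5m).
  group_interval: 1m
  # How long to wait before re-sending a notification for alerts that are
  # still firing (default 4h).
  repeat_interval: 4h
```

Very small values such as 1s are accepted, but they give up most of the grouping that these settings exist to provide.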


Nikhil Goenka

Jun 23, 2017, 2:26:24 AM
to Brian Brazil, Prometheus Users
I have the following configuration in my simple.yml, but I still receive the resolved trap only after 30 minutes. Is this the intended behavior?

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'tes...@gmail.com'
  smtp_auth_username: 'test0'
  smtp_auth_password: '123'

route:
  receiver: 'team-X-mails'
  group_by: ['alertname']
  group_wait: 1s
  group_interval: 1s
  repeat_interval: 1s

#  receiver: 'slack-notifications'
#  group_by: ['alertname']

  receiver: 'webhook'
  group_by: ['alertname']
  group_wait: 1s
  group_interval: 1s
  repeat_interval: 1s

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  # Apply inhibition if the alertname is the same.
  equal: ['alertname']


- Nikhil
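
One thing that stands out in the config above is that the `route` block repeats the `receiver`, `group_by`, and timing keys. YAML mapping keys must be unique, so depending on the parser the file is either rejected or only one of the two receivers takes effect. If the goal is to notify both receivers, one possible layout is a pair of child routes with `continue: true` on the first, as sketched below. This is an assumption about the intent, not a confirmed fix for the 30-minute delay. The webhook URL is a hypothetical placeholder, the interval values are illustrative, and the `global` SMTP settings are omitted. Note that `send_resolved` defaults to false for `email_configs` and to true for `webhook_configs`, so it is set explicitly on the email receiver here.

```yaml
route:
  # Fallback receiver for anything the child routes do not match.
  receiver: 'webhook'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 4h
  routes:
  # A child route without matchers matches every alert; `continue: true`
  # lets the alert fall through to the next sibling route as well.
  - receiver: 'team-X-mails'
    continue: true
  - receiver: 'webhook'

receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'y...@gmail.com'
    send_resolved: true          # email does not send resolved by default
- name: 'webhook'
  webhook_configs:
  - url: 'http://127.0.0.1:9099/alerts'   # hypothetical placeholder URL
    send_resolved: true
```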



Brian Brazil

Jun 23, 2017, 2:31:15 AM
to Nikhil Goenka, Prometheus Users
On 23 June 2017 at 07:26, Nikhil Goenka <nikhil...@alefmobitech.com> wrote:
I have the following configuration in my simple.yml, but I still receive the resolved trap only after 30 minutes. Is this the intended behavior?
 
Unless you have an evaluation interval of 30m, that's not intended.

Brian

 




Nikhil Goenka

Jun 23, 2017, 2:34:53 AM
to Brian Brazil, Prometheus Users
No, I have set evaluation_interval to 15s in prometheus.yml:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'




Nikhil Goenka

Jun 23, 2017, 4:48:55 AM
to Brian Brazil, Prometheus Users
Can someone please help with this? Am I missing some configuration?


Nikhil Goenka

Jun 27, 2017, 6:01:07 AM
to Brian Brazil, Prometheus Users
Any help?


aashis...@fonantrix.com

Jul 24, 2018, 3:36:44 AM
to Prometheus Users
I am facing the same issue.