Receiving Alerts in Prometheus - Email receiver isn't working


talk...@gmail.com

Aug 30, 2018, 12:24:38 PM
to Prometheus Users
Hello,

I'm new to Prometheus. I've configured ConfigMaps for prometheus-server and prometheus-alertmanager as shown below.

Within Prometheus the alerts fire just fine; however, I'm not receiving the corresponding alert emails. I've confirmed DNS resolution for my email server. Any help is much appreciated, as I'm unable to get past this hurdle.

Alert received in Prometheus (the alert transitions to Firing):

Labels:
  alertname="nginx connections" app="ingress-nginx" controller_revision_hash="xxxxxxxxx" ingress_class="nginx" instance="xxx.xxx.xxx:xxxx" job="kubernetes-pods" kubernetes_namespace="ingress-nginx" kubernetes_pod_name="nginx-ingress-controller-52pkt" pod_template_generation="1" severity="critical" state="writing"

alert: nginx connections
expr: nginx_connections{job="kubernetes-pods"} > 0
for: 1m
labels:
  severity: critical
annotations:
  summary: High NGINX Connections

Prometheus-Alertmanager ConfigMap

Key: alertmanager.yml

Value:

global:
  smtp_smarthost: 'xxxx.xxxx:25'
  smtp_from: 'xx...@xxxxx.com'
  smtp_require_tls: false

receivers:
- name: test-email
  email_configs:
  - to: xx...@xxxx.com

route:
  group_by: ['severity']
  group_interval: 5m
  group_wait: 10s
  receiver: test-email
  repeat_interval: 3h
  routes:
  - match:
      severity: critical
    receiver: test-email

Prometheus-server ConfigMap

Key: alerts

Value:

groups:
- name: example
  rules:
  - alert: nginx connections
    expr: nginx_connections{job="kubernetes-pods"} > 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: High NGINX Connections
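Both ConfigMap values above are plain YAML, and a syntax slip in either one (an unclosed quote, a bad indent) can stop alerting or notification without an obvious symptom. A minimal pre-flight sketch, assuming the value has been saved locally as alertmanager.yml and that python3 with PyYAML is available; with the Prometheus tooling installed, `amtool check-config alertmanager.yml` (and promtool for the rule file) performs a stricter, schema-aware check:

```shell
# Basic YAML syntax check before mounting the ConfigMap; the file name
# alertmanager.yml is an assumption -- use whichever file you saved the value to.
python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1])); print("YAML OK")' alertmanager.yml
```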

Chris Marchbanks

Aug 30, 2018, 12:33:28 PM
to talk...@gmail.com, Prometheus Users
Hello,

I am happy to help, but would you be able to post logs from the Alertmanager? 
Also, just to make sure, is the alert you are expecting visible in the Alertmanager UI?

Chris

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To post to this group, send email to promethe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/c2885362-c0f1-4bd7-a97d-f22a7ebb3076%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Chris Marchbanks | Engineer
FreshTracks.io - Intelligent Alerting for Kubernetes and Prometheus

talk...@gmail.com

Aug 30, 2018, 2:21:34 PM
to Prometheus Users
Thanks Chris,

I configured Alertmanager to start in debug mode; the logging output is below (I wasn't sure which files to pull the logs from for upload, so this is captured from the container).
Yes, the alerts are visible in Alertmanager. I'm able to watch them transition from PENDING to FIRING.


docker logs --follow xxxxx

level=info ts=2018-08-30T18:05:25.214546871Z caller=main.go:136 msg="Starting Alertmanager" version="(version=0.14.0, branch=HEAD, revision=30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)"
level=info ts=2018-08-30T18:05:25.214790313Z caller=main.go:137 build_context="(go=go1.9.2, user=xxxxxxxx, date=20180213-08:16:42)"
level=info ts=2018-08-30T18:05:25.216549296Z caller=main.go:275 msg="Loading configuration file" file=/etc/config/alertmanager.yml
level=info ts=2018-08-30T18:05:25.228497602Z caller=main.go:350 msg=Listening address=:xxxxxx
level=debug ts=2018-08-30T18:06:28.909252137Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:06:38.911693638Z caller=dispatch.go:429 component=dispatcher aggrGroup="{}/{severity=\"critical\"}:{severity=\"critical\"}" msg=Flushing alerts="[nginx connections[c78802c][active]]"
level=debug ts=2018-08-30T18:07:28.906476753Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:08:28.906832033Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:08:28.906919969Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[2364a32][active]"
level=debug ts=2018-08-30T18:08:28.906956135Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[18087bd][active]"
level=debug ts=2018-08-30T18:09:28.906853938Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:09:28.90699774Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[2364a32][active]"
level=debug ts=2018-08-30T18:09:28.907042038Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[18087bd][active]"
level=debug ts=2018-08-30T18:10:28.906800719Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[2364a32][active]"
level=debug ts=2018-08-30T18:10:28.906895159Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[18087bd][active]"
level=debug ts=2018-08-30T18:10:28.906940758Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:11:28.906999353Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[c78802c][active]"
level=debug ts=2018-08-30T18:11:28.90711818Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[2364a32][active]"
level=debug ts=2018-08-30T18:11:28.90715725Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert="nginx connections[18087bd][active]"
level=debug ts=2018-08-30T18:11:38.91199285Z caller=dispatch.go:429 component=dispatcher aggrGroup="{}/{severity=\"critical\"}:{severity=\"critical\"}" msg=Flushing alerts="[nginx connections[c78802c][active] nginx connections[2364a32][active] nginx connections[18087bd][active]]"
level=debug ts=2018-08-30T18:12:28.906669928Z caller=dispatch.go:188 component=dispatcher msg="Receive
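When an email actually fails to send, these logs normally carry an error-level line (for example "Notify attempt failed"). A quick way to filter a saved copy of the container log for such lines; am.log is a placeholder for output captured with `docker logs xxxxx > am.log 2>&1`:

```shell
# Surface only error-level lines and failed notify attempts;
# am.log is a placeholder for a saved copy of the container logs.
grep -iE 'level=error|notify attempt failed' am.log
```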

Chris Marchbanks

Aug 30, 2018, 4:05:41 PM
to talk...@gmail.com, Prometheus Users
Interesting. When emails fail to send, there are usually messages like "Notify attempt failed" along with an error explaining why the attempt failed.
Is it possible that your SMTP host is receiving the emails but not actually sending them on?

Usually I have to set values for smtp_auth_username and smtp_auth_password in the Alertmanager config, but it sounds like your SMTP server does not require auth?
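If the smarthost does turn out to require authentication, the global section would pick up credentials along these lines (a sketch only; the username and password values are placeholders, not from the original config):

```yaml
global:
  smtp_smarthost: 'xxxx.xxxx:25'
  smtp_from: 'xx...@xxxxx.com'
  smtp_auth_username: 'alertmanager'   # placeholder
  smtp_auth_password: 'changeme'       # placeholder
  smtp_require_tls: false
```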


talk...@gmail.com

Aug 30, 2018, 4:39:30 PM
to Prometheus Users
I discovered an incorrect domain in the internal mail configuration. After correcting the domain in the smtp_from and to: values, I'm now receiving emails successfully.
Thanks for your time reviewing the problem.

