alerting doesn't work

aleks.ma...@gmail.com

Aug 17, 2018, 6:04:17 AM
to Prometheus Users
Hi,
I am trying to configure Prometheus and Alertmanager to send emails, but it doesn't work. Have I configured something wrong or forgotten something?
Also, where can I see any logs? For example, I would like to know whether the alertmanager tries to connect to the smtp smarthost.

cat /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
rule_files:
- /etc/prometheus/alert.rules.yml
- /etc/prometheus/prometheus_rules.yaml
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

cat /etc/alertmanager/alertmanager.yml
global:
  smtp_smarthost: 'smtp.office365.com:587'
  smtp_from: 'us...@domain.com'
  smtp_auth_username: 'us...@domain.com'
  smtp_auth_password: 'password1'
templates:
- '/etc/alertmanager/template/*.tmpl'
route:
  repeat_interval: 3h
  receiver: team-X-mails
receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'us...@domain.com'

cat /etc/prometheus/alert.rules.yml
groups:
- name: alert.rules
  rules:
  - alert: memory_high
    expr: node_memory_MemFree_bytes > 100
    for: 15s
    annotations:
      description: '{{ $labels.instance }} has lots of memory man (current value:
        {{ $value }}s)'
      summary: Prometheus using more memory than it should  {{ $labels.instance }} 


cat /etc/prometheus/prometheus_rules.yaml
groups:
- name: prometheus.rules
  rules:
  - record: node_memory_MemFree_bytes
    expr: node_memory_MemFree_bytes

 
Thank you in advance!

Chris Marchbanks

Aug 17, 2018, 1:17:25 PM
to aleks.ma...@gmail.com, Prometheus Users
Hello,

Your configuration looks reasonable at a glance, so more information would be helpful. Prometheus and Alertmanager output logs to standard out, so depending on your deployment you may have to redirect those to a file to access them later.

Another debugging step you can take is to check whether your alert is listed and firing on the "/alerts" page in Prometheus. If not, then your alert rule is not working correctly.
I am not sure about Office 365, but I have had to add smtp_auth_identity: 'us...@domain.com' to the global configs as well.
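
As a rough sketch, the global block from your alertmanager.yml with that one line added would look like the following. The values are copied from your post as-is, and whether Office 365 actually needs the identity line is an assumption on my part, not something I have verified:

```yaml
global:
  smtp_smarthost: 'smtp.office365.com:587'
  smtp_from: 'us...@domain.com'
  smtp_auth_username: 'us...@domain.com'
  # Added line: same address as the username (assumed, untested with Office 365).
  smtp_auth_identity: 'us...@domain.com'
  smtp_auth_password: 'password1'
```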

Also, it doesn't look like your rule in /etc/prometheus/prometheus_rules.yaml does anything, so you can get rid of it.

Hope some of this helps. I am happy to help more if you can share some logs or additional debug information.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/b41ca2a6-0c87-4924-b74f-43c52292f197%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Chris Marchbanks | Engineer
FreshTracks.io - Intelligent Alerting for Kubernetes and Prometheus

aleks.ma...@gmail.com

Aug 20, 2018, 8:03:26 AM
to Prometheus Users
Hello Chris,

I have checked the /alerts page. It shows my alert with status FIRING (I am running a memory test on my server):

[Attached screenshot: alert1.png]


Also, I have checked the Office 365 logs: the account us...@domain.com does not even try to connect. Is it possible to trigger an alert manually and send it via email, just to check that it works?
An extra question: after editing an alert rule, what do I have to restart: Prometheus, Alertmanager, or both? In a specific order?


P.S.: I removed /etc/prometheus/prometheus_rules.yaml, but nothing changed.



Thank you!


On Friday, August 17, 2018 at 19:17:25 UTC+2, Chris Marchbanks wrote:

Simon Pasquier

Aug 20, 2018, 8:54:22 AM
to aleks.ma...@gmail.com, Prometheus Users
Can you see the alert in the Alertmanager UI?
Anything in the Alertmanager logs?



aleks.ma...@gmail.com

Aug 20, 2018, 9:05:37 AM
to Prometheus Users
Yes, I finally found the logs:

Aug 20 12:13:58 prom2 alertmanager[368]: level=error ts=2018-08-20T10:13:58.30051394Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="*notify.loginAuth failed: 535 5.7.3 Authentication unsuccessful [LO2P265CA0146.GBRP265.PROD.OUTLOOK.COM]"

I will work in this direction.




On Monday, August 20, 2018 at 14:54:22 UTC+2, Simon Pasquier wrote:

aleks.ma...@gmail.com

Aug 20, 2018, 11:20:33 AM
to Prometheus Users
I tried using my Gmail account, but I am getting a different authentication error:

Aug 20 17:14:46 prom2 alertmanager[973]: level=error ts=2018-08-20T15:14:46.032729169Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="*notify.loginAuth failed: 534 5.7.14 <https://accounts.google.com/signin/continue?sarp=1&scc=1&plt=-----------5.7.14 ---------------------------------

The config now looks like this:

global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'm...@gmail.com'
  smtp_auth_username: 'm...@gmail.com'
  smtp_auth_password: '------------------------'

I have already enabled untrusted applications for Gmail. Any suggestions?



On Monday, August 20, 2018 at 15:05:37 UTC+2, aleks.ma...@gmail.com wrote:

Chris Marchbanks

Aug 20, 2018, 11:27:16 AM
to aleks.ma...@gmail.com, Prometheus Users
I have had to add the following config as well for Gmail:
  smtp_auth_identity: 'm...@gmail.com'

You can find a blog post by Brian describing how to send emails via gmail here: https://www.robustperception.io/sending-email-with-the-alertmanager-via-gmail


dibya ranjan mishra

Aug 21, 2018, 1:13:16 AM
to Prometheus Users
Make sure you are using an App Password instead of your Gmail password.

aleks.ma...@gmail.com

Aug 21, 2018, 3:25:23 AM
to Prometheus Users
Sorry, but which App?

On Tuesday, August 21, 2018 at 7:13:16 UTC+2, dibya ranjan mishra wrote:

aleks.ma...@gmail.com

Aug 21, 2018, 3:26:04 AM
to Prometheus Users
I have added it, but I still get the same error.



On Monday, August 20, 2018 at 17:27:16 UTC+2, Chris Marchbanks wrote:

aleks.ma...@gmail.com

Aug 21, 2018, 9:08:45 AM
to Prometheus Users
Is this config correct?

global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'm...@gmail.com'
  smtp_auth_identity: 'm...@gmail.com'
  smtp_auth_username: 'prometheus'
  smtp_auth_password: 'alvy orzb lgsc bltl'
templates:
- '/etc/alertmanager/template/*.tmpl'
route:
  repeat_interval: 1h
  receiver: Test-Email
receivers:
- name: 'Test-Email'
  email_configs:
  - to: 'm...@domain.com'


I'm getting an error about the credentials:

Aug 21 15:06:16 prom2 alertmanager[1664]: level=error ts=2018-08-21T13:06:16.027803309Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="*notify.loginAuth failed: 535 5.7.8 Username and Password not accepted. Learn more at\n5.7.8  https://support.google.com/mail/?p=BadCred


On Tuesday, August 21, 2018 at 7:13:16 UTC+2, dibya ranjan mishra wrote:
Make sure you are using App password instead of gmail password

dibya ranjan mishra

Aug 21, 2018, 9:55:34 AM
to Prometheus Users
In the username field, you should give the Gmail address for which you created the App Password.
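
Roughly, that means the global block would look like the sketch below. It assumes everything else in the posted config stays unchanged; the address is the same elided one from the post, and the password value is a placeholder rather than a real credential:

```yaml
global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: 'm...@gmail.com'
  smtp_auth_identity: 'm...@gmail.com'
  # The Gmail address the App Password was created for,
  # not a label such as 'prometheus'.
  smtp_auth_username: 'm...@gmail.com'
  # Placeholder: the generated App Password, not the account password.
  smtp_auth_password: '<app password>'
```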

aleks.ma...@gmail.com

Aug 21, 2018, 10:28:56 AM
to Prometheus Users
Gmail works!!! Thank you very much!

Next step - Office 365!


On Tuesday, August 21, 2018 at 15:55:34 UTC+2, dibya ranjan mishra wrote: