Simple alive monitor for a website

Tom Black

Sep 6, 2020, 1:00:02 AM
to Prometheus Users
Hello members,

Is there a default configuration for monitoring the live status of a
website?

For example, I just want to monitor whether "twitter.com" returns 200 or not.

And I don't want to write an exporter for this purpose.

Thank you.

Tom Black

Sep 6, 2020, 1:16:43 AM
to Prometheus Users
I have used the blackbox exporter:
https://github.com/prometheus/blackbox_exporter

but I just don't know where to set up an email alert when a server is down.

Thanks.

Ben Kochie

Sep 6, 2020, 1:25:18 AM
to Tom Black, Prometheus Users


Brian Candler

Sep 6, 2020, 3:28:21 AM
to Prometheus Users
On Sunday, 6 September 2020 06:16:43 UTC+1, Tom Black wrote:
I have used the blackbox exporter:
https://github.com/prometheus/blackbox_exporter

but I just don't know where to set up an email alert when a server is down.


The blackbox_exporter gives you a metric, "probe_success", which returns 0 or 1. In Prometheus you create an alerting rule on probe_success == 0, and you configure Alertmanager with the email destination.
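
For reference, a minimal sketch of the scrape job that produces probe_success, assuming the blackbox exporter runs on the same host on its default port 9115 and that blackbox.yml defines an http_2xx module (adjust the targets and addresses for your setup):

scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]          # module defined in blackbox.yml
    static_configs:
      - targets:
        - https://twitter.com     # the site to probe
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target    # pass the target as ?target=...
      - source_labels: [__param_target]
        target_label: instance          # keep the probed URL as the instance label
      - target_label: __address__
        replacement: 127.0.0.1:9115     # actually scrape the blackbox exporter

The probe_success series from this job is what the alerting rule then checks.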


Tom Black

Sep 8, 2020, 6:42:11 AM
to Brian Candler, Prometheus Users
Brian,

Please help with this.

Following your suggestion, I have defined my rule file:

groups:
- name: host-down
  rules:

  # Alert for any instance that is unreachable for >3 minutes.
  - alert: InstanceDown
    expr: probe_success == 0
    for: 3m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} down"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."


And here is my alertmanager.yml:

global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'in...@sample.xyz'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: team-X-mails

receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'm...@sample.org'


inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname']


The SMTP smarthost is working fine; I have tested it by hand (I know SMTP servers well).

But I never got an email... there is no info in the local postfix log.

Can you help me dig into this?

Thanks in advance.




Brian Candler

Sep 8, 2020, 6:56:25 AM
to Prometheus Users
Just basic debugging:

1. Use the Prometheus web interface (usually x.x.x.x:9090). It has an "Alerts" tab. It will tell you if the alerts exist, and whether any of them are firing.

If not, find out why. Did you HUP Prometheus to re-read the rules after changing the config? Does your prometheus.yml reference the rules file?

Try:
promtool check config /etc/prometheus/prometheus.yml

(it will tell you which rules files it read, and how many rules were read in from each)
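
For reference, prometheus.yml needs to point at both the rule file and Alertmanager; a minimal sketch (the rule file path and Alertmanager address are assumptions, adjust to your layout):

rule_files:
  - /etc/prometheus/rules/host-down.yml   # assumed path to the rule file posted above

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']     # assumed Alertmanager address

Without the alerting/alertmanagers section, rules will fire in Prometheus but never reach Alertmanager.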

If this is all OK, then:

2. Use the Alertmanager web interface (usually x.x.x.x:9093).  It will tell you if the alerts are active.

If not, find out why.  (Did you HUP alertmanager to re-read the config?)

Similarly, check the configuration:
amtool check-config /etc/alertmanager/alertmanager.yml

If this is all OK, then:

3. Look at prometheus and alertmanager logs, e.g. if you're running under systemd:
journalctl -eu prometheus
journalctl -eu alertmanager 

Increase log verbosity if required. Run the binaries with the "--help" flag to see the available options. It should be something like --log.level=debug.
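
For example (flag names as in recent releases; check --help on your versions):

prometheus --config.file=/etc/prometheus/prometheus.yml --log.level=debug
alertmanager --config.file=/etc/alertmanager/alertmanager.yml --log.level=debug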

4. Use tcpdump to investigate traffic between prometheus and alertmanager, and between alertmanager and postfix.
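
For example, assuming the default ports and a local postfix:

tcpdump -nn -i any port 9093 or port 25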

My guess: you might need to set "smtp_require_tls: false" under global, since it defaults to true.
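
A minimal sketch of that change in the global section of alertmanager.yml, assuming your local postfix relay does not offer STARTTLS:

global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'in...@sample.xyz'
  smtp_require_tls: false   # defaults to true; only disable for a trusted local relay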

Tom Black

Sep 8, 2020, 7:20:31 AM
to Brian Candler, Prometheus Users
Thanks a lot, Brian.
It works now!
I greatly appreciate your help.

Regards.
