connect alertmanager connection refused

like19...@gmail.com

May 17, 2018, 4:50:15 AM
to Prometheus Users
I run Prometheus and Alertmanager in a Docker Swarm environment using the latest images. In prometheus.yml I configured the alerting targets, but no targets appear in the Prometheus web UI. The Prometheus log output is:
level=info ts=2018-05-17T07:13:59.669899464Z caller=main.go:220 msg="Starting Prometheus" version="(version=2.2.1, branch=HEAD, revision=bc6058c81272a8d938c05e75607371284236aadc)"
level=info ts=2018-05-17T07:13:59.669980404Z caller=main.go:221 build_context="(go=go1.10, user=root@149e5b3f0829, date=20180314-14:15:45)"
level=info ts=2018-05-17T07:13:59.6700267Z caller=main.go:222 host_details="(Linux 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 swarm-213 (none))"
level=info ts=2018-05-17T07:13:59.67005277Z caller=main.go:223 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-05-17T07:13:59.674284268Z caller=web.go:382 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-05-17T07:13:59.674202619Z caller=main.go:504 msg="Starting TSDB ..."
level=info ts=2018-05-17T07:13:59.699334072Z caller=main.go:514 msg="TSDB started"
level=info ts=2018-05-17T07:13:59.699466488Z caller=main.go:588 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2018-05-17T07:13:59.7018075Z caller=main.go:491 msg="Server is ready to receive web requests."
level=error ts=2018-05-17T07:14:04.817294851Z caller=notifier.go:473 component=notifier alertmanager=http://swarm-214:9093/api/v1/alerts count=0 msg="Error sending alert" err="Post http://swarm-214:9093/api/v1/alerts: dial tcp 10.110.25.214:9093: connect: connection refused"

Simon Pasquier

May 17, 2018, 5:16:25 AM
to like19...@gmail.com, Prometheus Users
Hello,
Please share Alertmanager's CLI arguments and logs.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/ff14ad61-cc3c-4d13-8251-0aa44ebe26ff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

like19...@gmail.com

May 17, 2018, 10:28:24 PM
to Prometheus Users
Yesterday I created Prometheus and Alertmanager with Swarm in host network mode; today I changed them to overlay network mode. The Prometheus container no longer logs any errors, but there is still no Alertmanager target.
The Docker stack YAML looks like this. There are three such sets of services; I show only one because the others are similar:
version: '3.4'
services:
  prometheus1:
    image: prom/prometheus-volume:v2.2.1
    volumes:
      - /monitor/prometheus:/prometheus
      - /etc/localtime:/etc/localtime:ro
    ports:
      - target: 9090
        published: 9090
        protocol: tcp
        mode: host
    configs:
      - source: prometheus_config
        target: /etc/prometheus/prometheus.yml
        mode: 0664
    networks:
      - monitor
  alertmanager1:
    image: prom/alertmanager:v0.15.0-rc.1
    volumes:
      - /monitor/alertmanager:/alertmanager
      - /etc/localtime:/etc/localtime:ro
    command: --config.file=/etc/alertmanager/config.yml --storage.path=/alertmanager --cluster.listen-address=alertmanager1:6783 --cluster.peer=alertmanager1:6783 --cluster.peer=alertmanager2:6783 --cluster.peer=alertmanager3:6783
    ports:
      - target: 9093
        published: 9093
        protocol: tcp
        mode: host
    configs:
      - source: alertmanager_config
        target: /etc/alertmanager/config.yml
        mode: 0664
    networks:
      - monitor

The Prometheus config looks like this:
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  scrape_timeout:      10s # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 'alertmanager1:9093'
      - 'alertmanager2:9093'
      - 'alertmanager3:9093'


like19...@gmail.com

May 18, 2018, 2:14:08 AM
to Prometheus Users
Today I added Telegraf to every node and added a Telegraf section to prometheus.yml, but the Prometheus server does not show Telegraf under Targets either.

Simon Pasquier

May 18, 2018, 4:21:13 AM
to like19...@gmail.com, Prometheus Users
From what you have shared, you have only configured Prometheus to send alerts to Alertmanager. The alerting section does not create scrape targets, so nothing appears on the Targets page.
If you want Prometheus to scrape Alertmanager and Telegraf, you need to add those targets under the top-level scrape_configs key [1].
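
As a rough sketch of what that could look like in prometheus.yml, alongside the existing alerting section: the job names and the Telegraf service names below are hypothetical, and the Telegraf port assumes its prometheus_client output is enabled on the default listen address (:9273). Adjust both to the actual stack.

```yaml
# Sketch only: job names and Telegraf service names are assumptions,
# not taken from the original stack file.
scrape_configs:
  # Scrape Alertmanager's own metrics; it serves /metrics on its web port.
  - job_name: 'alertmanager'      # hypothetical job name
    static_configs:
      - targets:
          - 'alertmanager1:9093'
          - 'alertmanager2:9093'
          - 'alertmanager3:9093'

  # Scrape Telegraf, assuming the prometheus_client output plugin is enabled.
  - job_name: 'telegraf'          # hypothetical job name
    static_configs:
      - targets:
          - 'telegraf1:9273'      # hypothetical service name; 9273 is the plugin's default port
```

After reloading the configuration, these jobs should be listed on the Targets page, while the alerting section continues to control only where alerts are sent.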

