Monitor multiple Swarm clusters from a single one using the new Prometheus v2.20.0


Tom Kun

Jul 29, 2020, 1:29:52 PM
to Prometheus Users
Hello folks,

I'm a beginner with Swarm and Prometheus, and I wanted to know whether, with the new Prometheus release (v2.20.0) and its dockerswarm_sd_configs feature, it is possible to get metrics from different clusters into my Prometheus Swarm cluster.

I already followed the documentation's example and tried to propagate the /etc/docker/daemon.json changes to each Swarm node, but in Prometheus I can still only see metrics from the Prometheus...

Did I miss something needed to retrieve metrics from the other clusters?

Here is my configuration, prometheus.yml:

global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
  external_labels:
    monitor: monitoring

rule_files:
- rules/alerts.yml
- rules/node.yml
- rules/stack.yml

alerting:
  alertmanagers:
  - static_configs:
    - targets:
        - alertmanager:9093

scrape_configs:
  - job_name: prometheus
    static_configs:
    - targets:
      - prometheus:9090

  - job_name: 'docker-daemon'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: nodes
    relabel_configs:
      # Fetch metrics on port 9323.
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:9323
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance

  - job_name: 'docker-swarm'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks
    relabel_configs:
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        target_label: job
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: node_name
      - source_labels: [__meta_dockerswarm_node_id]
        target_label: node_id

Thank you in advance for your time.

Regards,
Thomas

Julien Pivotto

Jul 29, 2020, 1:50:06 PM
to Tom Kun, Prometheus Users
On 29 Jul 10:29, Tom Kun wrote:
> Hello folks,
>
> I'm a beginner with Swarm and Prometheus and I wanted to know if with the
> new Prometheus release
> <https://prometheus.io/docs/guides/dockerswarm/#docker-swarm-service-discovery-architecture>
Do you have any error message?

Also, the implementation in v2.20.0 only discovers tasks with published ports
( https://docs.docker.com/engine/swarm/services/#publish-a-services-ports-directly-on-the-swarm-node ),
which is something that will change in the next release.
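
To make the published-ports requirement concrete, here is a minimal sketch of a stack file for a service that the `tasks` role could discover. The service name, image, and port are hypothetical placeholders, and it assumes the linked "publish directly on the swarm node" (host mode) style of publishing. The `prometheus-job` service label is what the `__meta_dockerswarm_service_label_prometheus_job` relabel rule in the configuration above reads.

```yaml
version: "3.8"

services:
  my-exporter:                         # hypothetical service name
    image: example/my-exporter:latest  # hypothetical image
    ports:
      # Long syntax: publish the port directly on the node running the task.
      - target: 8080
        published: 8080
        protocol: tcp
        mode: host
    deploy:
      labels:
        # Service-level label, exposed by the service discovery as
        # __meta_dockerswarm_service_label_prometheus_job.
        prometheus-job: my-exporter
```

A service without a published port would simply not appear among the discovered targets in v2.20.0, with no error message.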

--
Julien Pivotto
@roidelapluie

Tom Kun

Jul 29, 2020, 2:05:03 PM
to Prometheus Users
Hi Julien,

No error messages in my Prometheus service.

OK, but what if I need the Swarm node information (like the node_id and the node_name) from the other Swarm clusters?

I read the part of the Docker documentation about exposing service ports to be able to retrieve container metrics.

Julien Pivotto

Jul 29, 2020, 2:20:05 PM
to Tom Kun, Prometheus Users
On 29 Jul 11:05, Tom Kun wrote:
> Hi Julien,
>
> No error messages in my Prometheus service.
>
> OK, but if I need the Swarm nodes (like the node_id and the node_name)
> informations from the other Swarm clusters?
>
> I read that part of the Docker's documentation about exposing service port
> to be able to retrieve containers metrics.

Can you point me to the relevant docs? What do you mean by other swarm
clusters?

--
Julien Pivotto
@roidelapluie

Tom Kun

Jul 29, 2020, 2:28:58 PM
to Prometheus Users
For example, I have a cluster A which has the entire monitoring stack: Prometheus, Grafana, Node Exporter and cAdvisor.
And I have 2 other clusters, B and C, which have other services.

I want to retrieve the node_id and the node_name of the nodes of these 2 clusters.

This is why I don't understand how I can retrieve metrics from the nodes of both clusters into my cluster A, which has Prometheus.

Previously I did it this way:
  - job_name: 'clusterB-node-exporter'
    scrape_interval: 5s
    dns_sd_configs:
    - names:
      - 'node-01'
      - 'node-02'
      - 'node-03'
      - 'node-04'
      type: 'A'
      port: 9100

  - job_name: 'clusterC-node-exporter'
    scrape_interval: 5s
    dns_sd_configs:
    - names:
      - 'node-01'
      - 'node-02'
      - 'node-03'
      - 'node-04'
      type: 'A'
      port: 9100

Do you understand what I want to do now?

Brian Candler

Jul 30, 2020, 3:34:37 AM
to Prometheus Users
Just as an aside: it would be better to use static_configs or file_sd_configs in the case where you list all the nodes explicitly.

dns_sd_configs is intended for when you do a single query, and the response contains a set of A or AAAA records representing all the targets.
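
As a rough sketch of that suggestion, the clusterB job from the earlier message could be written with static_configs as below. The `node_cluster` label is only an illustrative assumption (it mirrors the label name used elsewhere in this thread); the host names and port are taken from the earlier job.

```yaml
scrape_configs:
  - job_name: 'clusterB-node-exporter'
    scrape_interval: 5s
    static_configs:
      - targets:
          - 'node-01:9100'
          - 'node-02:9100'
          - 'node-03:9100'
          - 'node-04:9100'
        labels:
          # Assumed label, attached to every target in this group.
          node_cluster: 'clusterB'
```

Unlike dns_sd_configs, this does not issue a DNS query per listed name on each refresh; the names are resolved only when the targets are scraped.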

Tom Kun

Jul 31, 2020, 5:09:45 AM
to Prometheus Users
I tried to use static_configs but I encountered issues with setting node_id and node_name.
All my targets get the node_name and node_id.

This is the entrypoint.sh I use to set the node_name and the node_id for each node_exporter host:
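#!/bin/sh -e

# Read the cluster name and the node name from files.
NODE_CLUSTER=$(cat /etc/node_cluster)
NODE_NAME=$(cat /etc/nodename)

# NODE_ID is not set in this script; it is assumed to come from the container environment.
# Write a static node_meta metric carrying the node labels.
echo "node_meta{node_id=\"$NODE_ID\", container_label_com_docker_swarm_node_id=\"$NODE_ID\", node_name=\"$NODE_NAME\", node_cluster=\"${NODE_CLUSTER}\"} 1" > /etc/node-exporter/node-meta.prom

# Start node_exporter with the arguments passed to this script.
set -- /bin/node_exporter "$@"
exec "$@"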


So I don't think that static_configs is the solution to this issue...

Do I have to install Prometheus on each of my cluster managers to retrieve the different metrics?
Is it not possible to give a single Prometheus metrics from different Swarm clusters?

Julien Pivotto

Jul 31, 2020, 5:18:08 AM
to Tom Kun, Prometheus Users
Why don't you use the new Docker Swarm configuration?

https://prometheus.io/docs/guides/dockerswarm/

--
Julien Pivotto
@roidelapluie

Tom Kun

Jul 31, 2020, 5:20:39 AM
to Prometheus Users
Because it does the job for the cluster where Prom is embedded but not for the others...

Julien Pivotto

Jul 31, 2020, 5:30:47 AM
to Tom Kun, Prometheus Users

You can specify another cluster with https?

I am also wondering if
a host specified with ssh://example.com could work?
(https://docs.docker.com/engine/reference/commandline/dockerd/)

--
Julien Pivotto
@roidelapluie

Stuart Clark

Jul 31, 2020, 5:33:34 AM
to Tom Kun, Prometheus Users
On 2020-07-31 10:30, Julien Pivotto wrote:
> You can specify another cluster with https?
>
> I am also wondering if
> a host specified with ssh://example.com could work?
> (https://docs.docker.com/engine/reference/commandline/dockerd/)

Can't you use the tcp:// port rather than the socket?
--
Stuart Clark

Julien Pivotto

Jul 31, 2020, 5:42:28 AM
to Stuart Clark, Tom Kun, Prometheus Users
On 31 Jul 10:33, Stuart Clark wrote:
> On 2020-07-31 10:30, Julien Pivotto wrote:
> > You can specify another cluster with https?
> >
> > I am also wondering if
> > a host specified with ssh://example.com could work?
> > (https://docs.docker.com/engine/reference/commandline/dockerd/)
>
> Can't you use the tcp:// port rather than the socket?

Well, Prometheus does support https at least. I did not test ssh://.

--
Julien Pivotto
@roidelapluie

Tom Kun

Jul 31, 2020, 5:48:12 AM
to Prometheus Users
Thank you @Julien and @Stuart.

By default the Docker daemon listens only on the local socket, so Prometheus can only retrieve the metrics from the local Docker daemon.

On a manager node of the other Docker Swarm cluster, set up the Docker daemon to listen over http/https and it should solve the problem.

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -H tcp://<ip_address_of_the_node>:2376

Make sure the port is available!

That was the solution, together with defining a new job in the prometheus.yml configuration file:
  - job_name: 'test-nodes-clusterB'
    dockerswarm_sd_configs:
    - host: tcp://<ip_address>:2376
      role: nodes
    relabel_configs:
      # Fetch metrics on port 9323.
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:9323
      # Set hostname as instance label
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance
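
The same pattern can presumably be repeated for cluster C. The following is only a sketch under that assumption: the job name, the host placeholder, and the `node_cluster` label are illustrative and not taken from a tested setup. The extra relabel rule attaches a fixed label so that nodes from different clusters can be told apart.

```yaml
  - job_name: 'test-nodes-clusterC'
    dockerswarm_sd_configs:
    - host: tcp://<ip_address_of_cluster_C_manager>:2376
      role: nodes
    relabel_configs:
      # Fetch metrics on port 9323.
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:9323
      # Set hostname as instance label
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance
      # Assumed label: tag every target of this job with its cluster name.
      - target_label: node_cluster
        replacement: clusterC
```

The last rule has no source_labels, so the default regex matches and the replacement is written unconditionally.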

Julien Pivotto

Jul 31, 2020, 5:49:49 AM
to Tom Kun, Prometheus Users
On 31 Jul 02:48, Tom Kun wrote:
> Thank you @Julien and @Stuart.
>
> By default the Docker daemon is running on the socket, and can only
> retrieve the metrics from the current Docker daemon.
>
> On the other Docker Swarm manager node, setup the Daemon docker using
> http/https and it should solve the problem.
>
> [Service]
> ExecStart=
> ExecStart=/usr/bin/dockerd -H fd:// -H tcp://<ip_address_of_the_node>:2376

Please note that https with certificates should still be preferred for
security reasons.
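
A minimal sketch of what that could look like on the Prometheus side, assuming the remote daemon has been started with TLS enabled (dockerd's --tlsverify, --tlscacert, --tlscert and --tlskey options) and that client certificates are available to Prometheus. The job name and the certificate paths are hypothetical.

```yaml
  - job_name: 'nodes-clusterB-tls'
    dockerswarm_sd_configs:
    - host: https://<ip_address>:2376
      role: nodes
      tls_config:
        # Hypothetical paths to the CA and the client certificate/key.
        ca_file: /etc/prometheus/certs/clusterB/ca.pem
        cert_file: /etc/prometheus/certs/clusterB/cert.pem
        key_file: /etc/prometheus/certs/clusterB/key.pem
    relabel_configs:
      # Fetch metrics on port 9323.
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:9323
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: instance
```

This tls_config only protects the discovery calls to the Docker API; the scrapes of port 9323 themselves are configured separately at the job level.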

--
Julien Pivotto
@roidelapluie