Retrieve Swarm information with dockerswarm_sd_configs using the "tasks" role


Tom Kun

Aug 5, 2020, 8:24:22 AM
to Prometheus Users
Hi folks,

I'm trying to collect metrics from several Swarm clusters into a Prometheus container deployed in another Swarm cluster, which is dedicated to monitoring the entire infrastructure.

I have set up HTTP access to the Docker Swarm manager daemon, and I can retrieve data from the Docker daemon quite easily.
But I cannot work out what I am missing to retrieve task information from the Swarm managers of the other clusters, because I cannot see any node_meta in the Prometheus web UI.

- All Docker Swarm manager nodes serve the Docker daemon over HTTPS.
- Below is an example of the prometheus.yml configuration using dockerswarm_sd_configs with the "tasks" role.

The following example does not work in Prometheus:
  - job_name: "runner-docker-swarm"
    dockerswarm_sd_configs:
      - host: tcp://10.XXX.XXX.XXX:2376
        role: tasks
    relabel_configs:
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: node_name
      - source_labels: [__meta_dockerswarm_node_id]
        target_label: node_id

For the infrastructure cluster, the dockerswarm_sd_configs section is pretty much the same:
  - job_name: "infra-docker-swarm"
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks
    relabel_configs:
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        target_label: job
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: node_name
      - source_labels: [__meta_dockerswarm_node_id]
        target_label: node_id

Here is a screenshot of the Prometheus node_meta

Targets for the Docker Swarm cluster which embeds Prometheus

Targets for the other Docker Swarm cluster to monitor, which Prometheus cannot contact.

Which configuration am I missing on my Swarm clusters or my Swarm manager nodes?

Thanks in advance for your help.

Thomas

Julien Pivotto

Aug 5, 2020, 8:29:16 AM
to Tom Kun, Prometheus Users
On 05 Aug 05:24, Tom Kun wrote:
> Hi folks,
>
> I'm trying to collect metrics from several Swarm clusters into a
> Prometheus container deployed in another Swarm cluster, which is dedicated
> to monitoring the entire infrastructure.
>
> I have set up HTTP access to the Docker Swarm manager daemon, and
> I can retrieve data from the Docker daemon quite easily.
> But I cannot work out what I am missing to retrieve task information
> from the Swarm managers of the other clusters, because I
> cannot see any node_meta in the Prometheus web UI.

It seems like a network issue. Your Prometheus server cannot reach the
targets.

You should add

regex: (.+)

to the relabel rule that sets the job label from
__meta_dockerswarm_service_label_prometheus_job, so that you do not remove
the default Prometheus job if the label is not set on the service.
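
Applied to the "infra-docker-swarm" job from the first message, the rule could look roughly like this. It is a minimal sketch: only the regex line is new, and the rest is copied from the posted configuration. With (.+), the rule only fires when the service label is non-empty, so targets without the label keep the job name from job_name.

      # Only override "job" when the service actually carries the label.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        regex: (.+)
        target_label: job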


> - source_labels: [__meta_dockerswarm_node_hostname]
> target_label: node_name
> - source_labels: [__meta_dockerswarm_node_id]
> target_label: node_id





--
Julien Pivotto
@roidelapluie

Tom Kun

Aug 5, 2020, 9:16:02 AM
to Prometheus Users


On Wednesday, 5 August 2020 14:29:16 UTC+2, Julien Pivotto wrote:
> It seems like a network issue. Your prometheus server can not join the
> targets.

It cannot reach the targets because the discovered IP addresses are specific to the Docker Swarm network of each cluster.
Do I have to run a Prometheus in each of my clusters to retrieve the metrics of the target services, for example by rewriting the __address__?
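
One way such an __address__ rewrite could look from the central Prometheus is sketched below. It rests on several assumptions that are not confirmed in this thread: the node addresses of the remote cluster are routable from the Prometheus container, the exporter publishes port 8080 on every node (ideally in host publish mode, so that each node answers for its own task), and the service carries a prometheus-job: cadvisor service label. The job name is hypothetical.

  - job_name: "runner-docker-swarm-cadvisor"   # hypothetical job name
    dockerswarm_sd_configs:
      - host: tcp://10.XXX.XXX.XXX:2376
        role: tasks
    relabel_configs:
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      # Assumption: only tasks of the labelled cAdvisor service are wanted.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        regex: cadvisor
        action: keep
      # Replace the overlay-network IP with the node address and the
      # published port (8080 is an assumption).
      - source_labels: [__meta_dockerswarm_node_address]
        target_label: __address__
        replacement: $1:8080
      - source_labels: [__meta_dockerswarm_node_hostname]
        target_label: node_name

If the node addresses are not reachable either, running a Prometheus inside each cluster remains the alternative raised above.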
 

Tom Kun

Aug 5, 2020, 10:14:39 AM
to Prometheus Users
Prometheus does not seem to pick up the labels defined in my Docker Compose service...

x-common-labels: &label-monitoring
  com.docker.swarm.prometheus-job: monitoring 

  cadvisor:
    image: google/cadvisor
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    networks:
      - network-monitoring
    labels:
      <<: *label-monitoring
    deploy:
      mode: global
      resources:
        limits:
          memory: 512M

I saw that the documentation example labels a container that is created non-declaratively. Is it normal that this differs from the declarative way in docker-compose.yml?
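
For reference, a sketch of where Compose labels end up as far as the "tasks" role is concerned. Labels under deploy are set on the Swarm service and surface as __meta_dockerswarm_service_label_<name>, while labels placed directly under the service are set on the containers and surface as __meta_dockerswarm_container_label_<name>. Characters such as "." and "-" in a label name become "_" in the meta label. The label name prometheus-job is chosen to match the relabel rule from the first message; the service definition is trimmed for brevity.

  cadvisor:
    image: google/cadvisor
    labels:
      # Container label, seen as
      # __meta_dockerswarm_container_label_prometheus_job
      prometheus-job: monitoring
    deploy:
      labels:
        # Service label, seen as
        # __meta_dockerswarm_service_label_prometheus_job
        prometheus-job: monitoring
      mode: global

By the same naming rule, the com.docker.swarm.prometheus-job label from the snippet above would not match a rule written against ..._label_prometheus_job, because its meta label would end in com_docker_swarm_prometheus_job.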


Julien Pivotto

Aug 5, 2020, 10:31:29 AM
to Tom Kun, Prometheus Users
Can you provide more details? e.g. a screenshot of the "service
discovery" page?


--
Julien Pivotto
@roidelapluie

Tom Kun

Aug 5, 2020, 10:44:30 AM
to Prometheus Users
Sure


Config prometheus.yml

  cadvisor:
    image: google/cadvisor
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    networks:
      - network-monitoring
    deploy:
      labels:
        ch.globaz.monitoring.prometheus-job: "cadvisor"
        prometheus-job: "cadvisor"
      mode: global
      resources:
        limits:
          memory: 512M