Docker Swarm Service Discovery | Problem in Implementation

Umang Goel

Aug 7, 2020, 5:59:32 AM
to Prometheus Users
Hello Community,

I tried using Docker Swarm service discovery in Prometheus, but I am running into problems. I followed the Docker Swarm support documentation (https://prometheus.io/docs/guides/dockerswarm/), created a daemon.json file, and mounted /var/run/docker.sock into the Prometheus container. The container reports a permission denied error because Prometheus runs as nobody and doesn't have access to the mounted /var/run/docker.sock. Below are my compose service definition and my prometheus.yml.
Prometheus version: v2.20.1

  prometheus:
    image: prom/prometheus
    networks:
      - monitor
    ports:
      - "9090:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention=${PROMETHEUS_RETENTION:-24h}'
    volumes:
      - prometheus:/prometheus
      - /home/efs/devops/dsm:/etc/prometheus:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          memory: 1024M
        reservations:
          memory: 128M

prometheus.yml:

scrape_configs:
  - job_name: 'docker'
    dockerswarm_sd_configs:
    - host: unix:///var/run/docker.sock
      role: nodes

Error:
monitor_promethe...@monitoring.dev.ai | level=error ts=2020-08-06T07:21:19.106Z caller=refresh.go:98 component="discovery manager scrape" discovery=dockerswarm msg="Unable to refresh target groups" err="error while listing swarm nodes: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get \"http://%2Fvar%2Frun%2Fdocker.sock/v1.24/nodes\": dial unix /var/run/docker.sock: connect: permission denied

Tom Kun

Aug 7, 2020, 7:17:23 AM
to Prometheus Users
Hello Umang,

What are your current permissions on /var/run/docker.sock?

I faced the same issue. As a first step, to avoid rebuilding the Prometheus image with the appropriate user, I granted read and write permissions on docker.sock:

sudo chmod 766 /var/run/docker.sock

I hope this helps.

Umang Goel

Aug 7, 2020, 7:36:54 AM
to Prometheus Users
Hello Tom,

Even this does not work; I am still facing the same issue. Can you tell me how you implemented it?

Julien Pivotto

Aug 8, 2020, 4:46:28 PM
to Umang Goel, Prometheus Users
What are your current permissions on /var/run/docker.sock?

ls -l /var/run/docker.sock


--
Julien Pivotto
@roidelapluie

Umang Goel

Aug 10, 2020, 1:48:30 AM
to Prometheus Users
After making the change Tom suggested, the permissions are:

ls -l /var/run/docker.sock
srwxrw-rw- 1 root docker 0 Aug  7 11:31 /var/run/docker.sock

Julien Pivotto

Aug 10, 2020, 2:50:51 AM
to Umang Goel, Prometheus Users

Can you use:

--group-add docker?

or in compose v2 file:

version: "2.4"
services:
  prometheus:
    group_add:
      - docker


--
Julien Pivotto
@roidelapluie

Umang Goel

Aug 10, 2020, 3:10:17 AM
to Prometheus Users
Hello Julien, 

group_add is not allowed in docker swarm. Do you have any other workaround for this?

--
Umang 

Umang Goel

Aug 11, 2020, 1:03:12 AM
to Prometheus Users
Thanks Julien and Tom,

I found the cause of my problem: when we change the permissions on docker.sock to read-write, the change only lasts until the Docker daemon (or the docker service) is restarted. After a restart, the permissions on docker.sock revert to the original ones.

Is there any way to make the permission change on docker.sock permanent, or do we need to file an issue for this? The Docker daemon might be restarted for various reasons.

Carlos Colaço

Nov 15, 2020, 3:20:07 PM
to Prometheus Users
Hi all, I'm having the same issue:

https://github.com/prometheus/prometheus/issues/8185

I also don't think changing the permissions on docker.sock is a good option: that gives anyone permission to access it, which is not desirable.

What I tried instead: since Prometheus runs as nobody (uid 65534), I added that user to the docker group, which changed nothing.

Any hints or solutions for this? It's driving me crazy; I've tried different approaches and nothing seems to work.

Carlos Colaço

Nov 15, 2020, 3:23:57 PM
to Prometheus Users
Sorry, I also tried changing the permissions, which changed nothing:

```
# chmod +r /var/run/docker.sock
# ls -la /var/run/docker.sock
srw-rw-r--. 1 root docker 0 Nov 15 20:12 /var/run/docker.sock
# docker service update --force monitor_private
```

Julien Pivotto

Nov 15, 2020, 3:54:26 PM
to Carlos Colaço, Prometheus Users
Can you run prometheus as nobody:docker?


--
Julien Pivotto
@roidelapluie
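
A note on the nobody:docker suggestion: the `user` field accepts a numeric `uid:gid` pair, so the group does not have to exist inside the image. Below is a minimal sketch of a stack file, assuming the socket on the host is owned by `root:docker` with mode 660 and that the host's docker group has GID 998. The value 998 is hypothetical; the real one can be read on the manager node with `stat -c '%g' /var/run/docker.sock`.

```yaml
version: "3.7"

services:
  prometheus:
    image: prom/prometheus
    # 65534 is the UID of "nobody", the user Prometheus runs as in this thread.
    # 998 is a HYPOTHETICAL GID: replace it with the GID that owns
    # /var/run/docker.sock on the host.
    user: "65534:998"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      placement:
        constraints:
          # Swarm service discovery has to talk to a manager's Docker socket.
          - node.role == manager
```

Because the GID is a property of the host, it can differ between nodes. This sketch has not been verified against the setups described in this thread.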

Kimo

Nov 15, 2020, 4:52:07 PM
to Prometheus Users
Hello,
I've been facing the exact same issue today and it's driving me equally crazy. I tried running Prometheus as root, but I still get:

level=error ts=2020-11-15T21:45:35.983Z caller=refresh.go:98 component="discovery manager scrape" discovery=dockerswarm msg="Unable to refresh target groups" err="error while listing swarm services: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
level=error ts=2020-11-15T21:45:35.984Z caller=refresh.go:98 component="discovery manager scrape" discovery=dockerswarm msg="Unable to refresh target groups" err="error while listing swarm nodes: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"

I think I've exhausted all the options I could try by myself and would gladly appreciate any help at this point.

Carlos Colaço

Nov 15, 2020, 5:19:36 PM
to Prometheus Users
@Julien Pivotto, I'm not sure what you mean. That group does not exist inside the container, so I don't think so. Maybe building a custom image might do something similar.

```
/prometheus $ cat /etc/group
root:x:0:
daemon:x:1:
bin:x:2:
sys:x:3:
adm:x:4:
tty:x:5:
disk:x:6:
lp:x:7:
mail:x:8:
kmem:x:9:
wheel:x:10:root
cdrom:x:11:
dialout:x:18:
floppy:x:19:
video:x:28:
audio:x:29:
tape:x:32:
www-data:x:33:
operator:x:37:
utmp:x:43:
plugdev:x:46:
staff:x:50:
lock:x:54:
netdev:x:82:
users:x:100:
nogroup:x:65534:
```

Also, making docker.sock world-readable doesn't fix it. To me this sounds like a bug in how Prometheus tries to reach docker.sock, or in how the image is built; I don't know which. It also seems to be either ignored or not well documented, and I am apparently not the only one going crazy over it. I wish I could help more. It would be good if this got some traction, and I'd be glad to run any tests to help find the issue!

I'll try to dig in a bit more and see how the images are built to find any hints, but I'm not really sure where to look.

Kind regards,

Alex Duzsardi

Nov 15, 2020, 5:53:50 PM
to Kimo, Prometheus Users
This worked for me, although I'm not sure we should be running Prometheus as root:

version: '3.7'

services:
  prometheus:
    image: prom/prometheus:v2.21.0

    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention=${PROMETHEUS_RETENTION:-48h}'
    user: root
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - target: 9090
        published: 9090
        mode: ingress
    deploy:
      labels:
        - prometheus-job=prometheus

      mode: replicated
      replicas: 1
      resources:
        limits:
          memory: 2048M
        reservations:
          memory: 512M




--
Alexandru Duzsardi,
DevOps Engineer
Skype: alexinno83

InFinIT Partners,
Address: Str. Macinului Nr. 17, Cluj-Napoca, Romania 
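
For context, the `prometheus-job` deploy label in the stack file above fits the relabeling pattern described in the Docker Swarm guide. Below is a sketch of a matching prometheus.yml along the lines of that guide. The job name `dockerswarm` is an arbitrary choice, and the meta label names should be checked against the Prometheus configuration documentation for the version in use.

```yaml
scrape_configs:
  - job_name: 'dockerswarm'
    dockerswarm_sd_configs:
      - host: unix:///var/run/docker.sock
        role: tasks
    relabel_configs:
      # Keep only tasks that are supposed to be running.
      - source_labels: [__meta_dockerswarm_task_desired_state]
        regex: running
        action: keep
      # Keep only services that carry a prometheus-job label.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        regex: .+
        action: keep
      # Use the value of the prometheus-job label as the Prometheus job name.
      - source_labels: [__meta_dockerswarm_service_label_prometheus_job]
        target_label: job
```

With this pattern, services opt in to scraping through their deploy labels instead of being listed by hand in prometheus.yml.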

Carlos Colaço

Nov 15, 2020, 7:30:19 PM
to Prometheus Users
Alright, that fixed it for me too. I was about to test it when I decided to check in here first, so you were just faster :p

Whether Prometheus should run as root or not, I am not sure either. I think it's common practice to run things as root inside containers; cAdvisor seems to run as root, but I am not entirely sure about this, so take my words with a grain of salt.

It should, however, at least be documented with a warning. It's quite late here already, but I can do it early tomorrow morning. If any of you has the chance to try and verify this in the meantime, there are some more tests that come to mind:

Could it be that Docker is not letting "nobody" read the socket? Maybe try running Prometheus as another user instead of nobody or root?

If nobody tries this, I can try it tomorrow and maybe open a PR to the documentation with info about this.

Kind regards.

Julien Pivotto

Nov 15, 2020, 7:44:27 PM
to Carlos Colaço, Prometheus Users

I guess it all depends on your distribution and how you run Docker. Can you explain your setup in more detail?


--
Julien Pivotto
@roidelapluie

Carlos Colaço

Nov 15, 2020, 8:12:48 PM
to Prometheus Users
@julien ... Here:


Let me know if you need more details

Carlos Colaço

Nov 16, 2020, 5:43:44 AM
to Prometheus Users
Hi all,

As promised, I built a new image and ran Prometheus as another user.

Dockerfile:

```
FROM prom/prometheus

USER root
RUN addgroup -g 1000 prometheus
RUN adduser -D -H -u 1000 -G prometheus -s /bin/nologin prometheus
USER prometheus
```

Docker compose file:

```
version: '3.3'

services:
  private:
    image: 4s3ti/prometheus-test
    ports:
      - 9090:9090
    networks:
      - dockadmin_rp
      - private
    volumes:
      - /srv/data/prometheus/config:/etc/prometheus
      - /srv/data/prometheus/test-data:/prometheus

      - /var/run/docker.sock:/var/run/docker.sock:ro
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.labels.type == private

networks:
  private:
  dockadmin_rp:
    external: true

```



The issue still persists. After some googling: apparently, for a process to have access to docker.sock within the container, it needs to run as root, or one should use something like docker-socket-proxy.

If the Prometheus team doesn't want to change the default way Prometheus runs inside a Docker container, which I completely understand, a note about this should be added to the https://prometheus.io/docs/guides/dockerswarm/ page.
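
The docker-socket-proxy route mentioned above could look roughly like the stack file below. This is a sketch under assumptions, not a tested setup: it assumes the `tecnativa/docker-socket-proxy` image, that its `NODES`, `SERVICES`, `TASKS` and `NETWORKS` environment switches expose the read-only API sections Swarm discovery queries, and that it listens on port 2375. All of these should be checked against that project's README. The service name `socket-proxy` and the network name `monitor` are illustrative.

```yaml
version: "3.7"

services:
  socket-proxy:
    # ASSUMPTION: image name and environment switches as documented in the
    # docker-socket-proxy README; verify them before use.
    image: tecnativa/docker-socket-proxy
    environment:
      NODES: 1
      SERVICES: 1
      TASKS: 1
      NETWORKS: 1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - monitor
    deploy:
      placement:
        constraints:
          # Only manager nodes can answer Swarm API queries.
          - node.role == manager

  prometheus:
    # Keeps the image's default unprivileged user; no socket mount is needed.
    image: prom/prometheus
    networks:
      - monitor
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro

networks:
  monitor:
```

In prometheus.yml, the `host` under `dockerswarm_sd_configs` would then point at the proxy, for example `host: tcp://socket-proxy:2375`, instead of `unix:///var/run/docker.sock`. The proxy port is deliberately not published, so the Docker API stays reachable only from the `monitor` overlay network.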