How to check if a docker service is healthy?


Kimo

Nov 17, 2020, 1:36:02 AM
to Prometheus Users

Hello Community,

I have some services that are deployed in global mode in order to spin up a container on each node (this is needed for cAdvisor and node-exporter mostly).

If one of my services is unable to spin up a container on some node, for whatever reason, I would like to raise an alert about it (ideally including which node it failed on).

I looked into the metrics exposed by the Docker daemon, thinking I would find a ready-made metric for this, but didn't find anything helpful.

Please provide any suggestion on how to achieve this and ask me if some relevant information is missing.

H Mit

Nov 17, 2020, 9:20:08 AM
to Prometheus Users
Here is the workaround I'm using so far in my alarm:

expr: count(count(container_last_seen{container_label_com_docker_swarm_service_name="monitoring_cAdvisor"}) by (container_env_node_name)) < count(swarm_node_info)

Basically, it counts the number of nodes that have at least 1 cAdvisor container running and compares it to the total number of nodes. The alert fires when fewer nodes are running cAdvisor than there are nodes in the swarm, so it catches a missing container but does not say which node is affected. It's not optimal, but I think it will do the trick for now. If someone has a better suggestion, feel free to share.
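For anyone wanting to drop this into a rule file, here is a minimal sketch of how the expression above might sit inside a Prometheus alerting rule. The group name, alert name, `for` duration, severity label, and annotation text are all assumptions made for illustration; only the expression itself comes from the post.

```yaml
groups:
  - name: swarm-global-services          # hypothetical group name
    rules:
      - alert: CadvisorMissingOnNode     # hypothetical alert name
        # Nodes with at least one cAdvisor container vs. total swarm nodes.
        expr: |
          count(
            count(container_last_seen{container_label_com_docker_swarm_service_name="monitoring_cAdvisor"})
              by (container_env_node_name)
          ) < count(swarm_node_info)
        for: 5m                          # assumed grace period, tune as needed
        labels:
          severity: warning              # assumed label
        annotations:
          summary: "cAdvisor is not running on every swarm node"
          # $value is the left-hand side: the number of nodes that do have cAdvisor.
          description: "Only {{ $value }} node(s) currently report a cAdvisor container."
```

One thing to watch: count() over an empty selection returns no series rather than 0, so if no cAdvisor container is seen on any node the comparison yields nothing and the alert stays silent. Wrapping the left-hand side as `(count(...) or vector(0))` is one possible way to cover that case.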
