Re: How to properly monitor Kubernetes service up/down status


Ben Kochie

Feb 23, 2017, 4:18:39 AM
to Xiao Han, promethe...@googlegroups.com
Please take this question to the prometheus-users list.

On Thu, Feb 23, 2017 at 5:10 AM, Xiao Han <justl...@gmail.com> wrote:
Hi,

  I'm using Prometheus with Kubernetes in our production environment, and it's great and helpful. But I've run into a problem monitoring the status of k8s Services with Prometheus.

  Here is the basic information and my problem in detail.

  I run Prometheus inside k8s as a Pod and use kubernetes_sd to discover all Pods and Services; that all works well.

  Generally, each application runs in Pods, and we create one Kubernetes Service per application, backed by two Pod replicas. We have a health check endpoint like `/health` that returns 200 OK.

  What I want to achieve is a dashboard showing whether each Service is running correctly, that is, whether it is up or down.

  What I do now is configure Prometheus with kubernetes_sd, role='service', so that for each Service the endpoint to scrape is like `http://<service-name>:<service-port>/health`, and the Prometheus query I can use is like `up{job='kubernetes-service', kubernetes_service_name=<service-name>}`.

  The problem with this approach is that it only works if the `/health` endpoint returns EMPTY content or data in the Prometheus METRICS format. Making a `/health` endpoint return metrics is not the proper way, so for now we have to make it return empty content. We would actually like the `/health` endpoint to return some text like 'ok', but that does not work with Prometheus.

  So my question is: is my solution a proper way to monitor Kubernetes Service status? If yes, does Prometheus have a better way to support a general health check endpoint that responds with something like 'ok'?

Thanks!

--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-developers+unsub...@googlegroups.com.
To post to this group, send email to prometheus-developers@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/5ef7a7cb-edda-4b89-93de-553430066149%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Xiao Han

Feb 23, 2017, 8:45:08 AM
to Prometheus Developers, justl...@gmail.com, promethe...@googlegroups.com
sure



vanb...@gmail.com

Nov 7, 2017, 1:01:43 AM
to Prometheus Users
Any update on this? I also have the same question.



Simon Pasquier

Nov 7, 2017, 4:08:19 AM
to vanb...@gmail.com, Prometheus Users
Hello,
IIUC you need to use the blackbox exporter as a proxy to probe the services. The Kubernetes SD job still discovers the service URLs, but instead of requesting those URLs directly, relabeling rewrites each target so that Prometheus asks the blackbox exporter to probe it.
An example is provided in the Prometheus repository:
HTH,
Simon
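
For readers following along, here is a minimal sketch of what that relabeling amounts to in practice: rather than scraping the Service URL, Prometheus ends up requesting the blackbox exporter's `/probe` endpoint with the Service URL passed as the `target` query parameter. The exporter address `blackbox-exporter:9115`, the Service URL, and the helper name `blackbox_probe_url` are assumptions for illustration; `http_2xx` is the module name used in the blackbox exporter's example configuration, and your own module names may differ.

```python
from urllib.parse import urlencode


def blackbox_probe_url(exporter_addr: str, target: str, module: str = "http_2xx") -> str:
    """Build the URL requested from the blackbox exporter for one target.

    exporter_addr and the default module are illustrative assumptions.
    """
    # The exporter's /probe endpoint takes the probed URL and the module
    # name as query parameters; urlencode percent-escapes the target URL.
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter_addr}/probe?{query}"


if __name__ == "__main__":
    # Hypothetical Service URL of the kind discovered by kubernetes_sd.
    print(blackbox_probe_url("blackbox-exporter:9115", "http://my-service:8080/health"))
```

Because the exporter performs the HTTP request itself, the `/health` endpoint is free to return plain text such as 'ok'; only the exporter's own response has to be in the Prometheus metrics format.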


--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/d9fbcd31-9ba4-4c32-b602-c31d3a9493a9%40googlegroups.com.

vanb...@gmail.com

Nov 9, 2017, 1:16:55 AM
to Prometheus Users
Thanks Simon. I have integrated the blackbox exporter, but I always see the target probe status as 'UP'. The `up` metric also shows 'UP' even when the corresponding pods are down. Am I missing something?

Brian Brazil

Nov 9, 2017, 2:07:22 AM
to vanb...@gmail.com, Prometheus Users

`up` just means that Prometheus can talk to the blackbox exporter. `probe_success` tells you whether the probe worked.
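
To make the distinction concrete, here is a minimal sketch that reads `probe_success` out of the text returned by the blackbox exporter's `/probe` endpoint. The sample body and the helper name `probe_succeeded` are made up for illustration, and the parsing is deliberately simplified (it ignores labels and timestamps), so treat it as an approximation rather than a full parser for the exposition format.

```python
def probe_succeeded(body: str) -> bool:
    """Return True if the exporter's response reports probe_success 1."""
    for line in body.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "probe_success":
            return float(parts[1]) == 1.0
    # No probe_success sample at all: treat as a failed probe.
    return False


# Illustrative response body; real output contains more metrics.
SAMPLE_FAILED = """\
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0
"""

if __name__ == "__main__":
    # The scrape of the exporter itself worked (so up == 1), yet the probe failed.
    print(probe_succeeded(SAMPLE_FAILED))
```

In a dashboard or alert, this corresponds to querying `probe_success` rather than `up` for the probed Services.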

 

Baskar Vangili

Nov 9, 2017, 11:01:03 AM
to Brian Brazil, Prometheus Users
Thanks Brian. That’s the one and it works fine.


 

Matthias Rampke

Nov 10, 2017, 8:16:29 AM
to Baskar Vangili, Brian Brazil, Prometheus Users
If you are using Kubernetes readiness probes (which you should), you can also deploy


to get metrics about what Kubernetes thinks of your application. If the readiness probe probes the same endpoint, then the blackbox exporter isn't strictly necessary, because Kubernetes already probes for you.

/MR


Baskar Vangili

Nov 13, 2017, 12:11:27 AM
to Matthias Rampke, Brian Brazil, Prometheus Users
Thanks Matthias. Can you share an example please?



Matthias Rampke

Nov 13, 2017, 7:11:42 AM
to Baskar Vangili, Brian Brazil, Prometheus Users
Try:

kube_pod_status_ready{condition!="true"} == 1

/MR
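
As a rough illustration of what that expression selects, here is a small sketch that applies the same filter to a handful of made-up samples. It assumes the metric has `pod` and `condition` labels, as `kube_pod_status_ready` from kube-state-metrics does; the pod names and values are invented, and the helper `not_ready_pods` is hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]  # (labels, value)


def not_ready_pods(samples: List[Sample]) -> List[str]:
    """Mimic kube_pod_status_ready{condition!="true"} == 1 on in-memory samples."""
    return sorted(
        labels["pod"]
        for labels, value in samples
        # Keep series whose condition is not "true" and whose value is 1,
        # i.e. pods currently reported as not ready (or unknown).
        if labels.get("condition") != "true" and value == 1
    )


# Invented samples: one series per pod and condition.
SAMPLES: List[Sample] = [
    ({"pod": "app-a", "condition": "true"}, 1),
    ({"pod": "app-a", "condition": "false"}, 0),
    ({"pod": "app-b", "condition": "true"}, 0),
    ({"pod": "app-b", "condition": "false"}, 1),
]

if __name__ == "__main__":
    print(not_ready_pods(SAMPLES))  # app-b is the only pod not ready
```

The query returns a series for each pod whose readiness condition is currently "false" or "unknown", which makes it usable directly as an alert expression or a dashboard table.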