Prometheus in Kubernetes


Natalia

Jul 13, 2016, 5:30:29 AM
to Prometheus Developers
Hi,

I am implementing Prometheus in Kubernetes for production.

Our setup will be:

• 4 datacenters, 3 K8s clusters per DC

• We are collecting system metrics (node_exporter) and custom metrics: Java, Node.js, nginx, MySQL, Cassandra, ...

• We will run Prometheus as a pod in each K8s cluster

• A "main Prometheus" collects all the data via federation, with Alertmanager running against the main Prometheus


I am a little confused about the prometheus-kubernetes.yml example on GitHub (kubernetes_sd_configs):

1. Who defines __meta_kubernetes_service_annotation_prometheus_io_scrape and __meta_kubernetes_pod_annotation_prometheus_io_scrape as "true"?

2. The jobs for __meta_kubernetes_role endpoints, pod, and container pull the same metrics, don't they? If yes, is it enough to define only the job for pod?

3. What is the purpose of the job for __meta_kubernetes_role=service? I already have those metrics from the "pod" job.

4. Is the pod job enough (in the case where a pod has several containers)?

5. Prometheus can't run outside of K8s since it needs access to all pods, is that correct?

6. I am planning to use Alertmanager and create alerts per service/datacenter, not per pod:

- What is the purpose of "templates: - '/etc/alertmanager/template/*.tmpl'" in alertmanager.yml?

- What is the difference between "rule_files" in prometheus-kubernetes.yml and "templates" in alertmanager.yml?
- Where should I define grouping by service and datacenter?

Thanks in advance for any help!

Tom Wilkie

Jul 13, 2016, 5:53:54 AM
to Natalia, Prometheus Developers
> 1. Who defines __meta_kubernetes_service_annotation_prometheus_io_scrape and __meta_kubernetes_pod_annotation_prometheus_io_scrape as "true"?

These (or rather __meta_kubernetes_service_annotation_*) are meta labels introduced by the Prometheus service discovery. They expose the annotations map from the service/pod as meta labels, so annotations like prometheus.io.scrape become __meta_kubernetes_service_annotation_prometheus_io_scrape. You should set these annotations in your service/pod YAML.
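For example, the annotation side might look like this in a Service manifest (a sketch; the service name and port are illustrative, and the key is commonly written prometheus.io/scrape — both '.' and '/' are sanitized to '_' in the meta label):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                     # illustrative service name
  annotations:
    prometheus.io/scrape: "true"   # surfaces in service discovery as
                                   # __meta_kubernetes_service_annotation_prometheus_io_scrape="true"
spec:
  selector:
    app: my-app
  ports:
    - port: 8080
```

Nothing in Prometheus itself sets this annotation; it is a signal you attach to your own objects, which a relabel_configs rule can then match on.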

> 2. The jobs for __meta_kubernetes_role endpoints, pod, and container pull the same metrics, don't they? If yes, is it enough to define only the job for pod?

More or less, yes. We only use the jobs for the endpoints, so we can use the service name as the job name (as opposed to using the pod name).
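A sketch of that approach with the endpoints role, using current kubernetes_sd_configs syntax (the job_name is illustrative):

```yaml
scrape_configs:
  - job_name: kubernetes-service-endpoints   # illustrative default
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Rewrite the job label to the Kubernetes service name, so series
      # are grouped per service rather than per pod.
      - source_labels: [__meta_kubernetes_service_name]
        target_label: job
```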

> 5. Prometheus can't run outside of K8s since it needs access to all pods, is that correct?

You could run Prometheus on the same network as the pods but outside of Kubernetes (for instance, if you use an AWS VPC, you could install the appropriate routes). But yes, Prometheus needs to access all the pods.

We run Prometheus as a pod in Kubernetes, but it has its drawbacks. Kubernetes isn't so good at managing stateful services (ignoring pet sets in 1.3), so you need to be careful not to blow away all your history (by doing a rolling upgrade, for instance). Use of volumes and config maps can help somewhat here, but you need to be careful.
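A hedged sketch of wiring a data volume and a ConfigMap into such a pod (all resource names are illustrative, not from the setup described here); the Recreate strategy avoids a rolling upgrade briefly running two pods against the same data volume:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  strategy:
    type: Recreate                  # never run old and new pods concurrently
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus    # pin a specific version in practice
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus          # TSDB data survives pod restarts
      volumes:
        - name: config
          configMap:
            name: prometheus-config           # illustrative ConfigMap name
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-data        # illustrative PVC name
```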

HTH

Tom

PS: I gave a quick talk last Friday about our K8s + Prometheus setup that might be of some help: http://www.slideshare.net/weaveworks/kubernetes-and-prometheus




Natalia

Jul 13, 2016, 7:39:04 AM
to Prometheus Developers
Many thanks for the quick response!
The presentation is very useful, thanks for it!

Just to verify a few things:

> These (or rather __meta_kubernetes_service_annotation_*) are meta labels introduced by the Prometheus service discovery. They expose the annotations map from the service/pod as meta labels, so annotations like prometheus.io.scrape become __meta_kubernetes_service_annotation_prometheus_io_scrape. You should set these annotations in your service/pod YAML.

If by default I want to pull all metrics, there is no reason to use __meta_kubernetes_service_annotation_prometheus_io_scrape = "true", is that correct?
Only if I want to exclude some metrics would I set it to "false" and define a "drop" action in the job, right?

> More or less, yes. We only use the jobs for the endpoints, so we can use the service name as the job name (as opposed to using the pod name).

You are right. I was thinking about how to add a "service" label to the metrics coming from pods, when instead I should use "endpoints" :-)


> We run Prometheus as a pod in Kubernetes, but it has its drawbacks. Kubernetes isn't so good at managing stateful services (ignoring pet sets in 1.3), so you need to be careful not to blow away all your history (by doing a rolling upgrade, for instance). Use of volumes and config maps can help somewhat here, but you need to be careful.

I am thinking about a solution:
the Prometheus pod runs on a dedicated node and keeps the data on the local disk (outside the pod),
the YAML files we keep on persistent storage that all nodes mount,
and we run 2 Prometheus servers (active-active) for HA (?)

Are you using SSDs?
How many metrics do you have?

Thanks again for your help!

Tom Wilkie

Jul 13, 2016, 8:00:52 AM
to Natalia, Prometheus Developers
> If by default I want to pull all metrics, there is no reason to use __meta_kubernetes_service_annotation_prometheus_io_scrape = "true", is that correct? Only if I want to exclude some metrics would I set it to "false" and define a "drop" action in the job, right?

You can define your relabeling config to be opt-in (requiring you to set the annotation to true to enable scraping) or opt-out. We have it set to opt-out (as per the slides), and we set the prometheus.io.scrape = false annotation on Kubernetes services we do not want to scrape. For instance, we have a static content server which hasn't been instrumented yet, so we set it to false there.
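An opt-out relabel rule along those lines might look like this (a sketch, not the actual config described here):

```yaml
relabel_configs:
  # Drop only targets whose service is annotated prometheus.io/scrape: "false";
  # unannotated services yield an empty label value, so they are still scraped.
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: drop
    regex: "false"
```

Swapping `drop`/`"false"` for `keep`/`"true"` gives the opt-in variant instead.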

> I am thinking about a solution:

That could work; you'll need to ensure your upgrade procedure doesn't end up with two Prometheus instances pointing at the same data, though.

> Are you using SSDs?

We're running on EC2, using the kube-up script. I don't know what storage that provisions; I suspect it's not SSD.

> How many metrics do you have?

Not a huge number; it's a relatively small deployment right now. About 100k.

Thanks

Tom

Natalia

Jul 13, 2016, 8:08:52 AM
to Prometheus Developers
Tom, thanks a lot for your help!