The pod is stable for the time being but eventually gets OOM-killed. We have a Kubernetes resource usage dashboard, which is reloaded every 15 minutes. My question: is it reasonable to see a series count of 4 million for just 18 nodes and 250-300 containers? What is being reported by each node? We are on Kubernetes 1.5; can we get the number of samples scraped down? Our goal is to have a view of the memory, CPU and network usage of each container via Kubernetes. The statistics about each node come from node_exporter, which is scraped by a separate Prometheus. I can provide other statistics as well if needed, but we need to find out why there are 4 million series, and why samples per second swing between 45k and 75k, for such a small cluster.
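In case it helps narrow this down, a quick way to see which scrape targets contribute the samples is to ask Prometheus itself; a minimal sketch, assuming scrape_samples_scraped is available on your version (it is on recent releases):

    # Number of samples pulled from each target on its most recent scrape;
    # the targets at the top are usually where the series explosion lives.
    sort_desc(scrape_samples_scraped)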
Hi Brian,

Thanks for the quick reply. We are now running 2.0, and neither of the two queries in the article for investigating further seems to work. Any other recommendations? I am interested in knowing why a cluster of 18 nodes generates 4 million series, and whether there is anything we can tune down so that we only get the memory and CPU utilization of the containers (pods) reported from the /metrics endpoint of those nodes.
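Not sure which article queries you tried, but on 2.0 something along these lines finds the heaviest metric names; it is an expensive query, so only run it when the server has memory headroom:

    # Top 10 metric names by number of series currently in memory.
    topk(10, count by (__name__) ({__name__=~".+"}))

And if you really only need container CPU/memory/network, a metric_relabel_configs keep rule on the cAdvisor/kubelet scrape job would drop everything else before ingestion; this is only a sketch to be adapted to your own scrape config:

    metric_relabel_configs:
      # Keep only the container CPU, memory and network series and drop
      # every other metric exposed by the kubelet's /metrics endpoint.
      - source_labels: [__name__]
        regex: 'container_(cpu|memory|network)_.*'
        action: keep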
After deleting all the Prometheus data and starting a fresh pod, I was finally able to get the Prometheus visualization after 32 minutes of waiting. All the outer rings are id labels of the containers, prefixed with the respective metrics. So I am now wondering whether there is something wrong with the /metrics endpoint such that it provides a different id each time Prometheus scrapes. However, our pod metrics in Grafana are continuous, and we are graphing them based on the pod_name label.

Is it possible to know the source of the id in the Prometheus metrics, i.e. where the kubelet grabs it from before adding it as a label to the exposed metrics? Is it the Docker id of the container, or an id maintained by the kubelet itself?

We have been running this Kubernetes version (1.5) for a long time, and we only started to see this problem (an explosion of metrics) after we added a couple of nodes with the same Kubernetes version, but I can see the Docker version is different on those nodes.
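A rough way to confirm the id label is the thing churning (rather than pod_name) is to compare the number of distinct id values for one metric against the number of containers you actually have; container_memory_usage_bytes is just used as an example metric here:

    # Distinct id label values for a single cAdvisor metric; if this is far
    # above the container count, the id values are being recreated over time.
    count(count by (id) (container_memory_usage_bytes))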
This is what is confusing me: at this time we have about 316 containers, and we don't have any crash-looping ones. So, for a total of 197 metrics, I was expecting roughly 197 * 316 series, somewhere in the 100k ballpark. But Prometheus is scraping millions of series.
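For what it's worth, 197 * 316 is about 62k series, so the gap to millions has to come from extra label dimensions or churn. A per-metric breakdown of just the container metrics should show where (only a sketch; it assumes the cAdvisor metrics carry the usual container_ prefix):

    # Series count per container_* metric name; any entry far above ~316
    # has extra label dimensions (per-cpu, per-interface) or stale/churning ids.
    topk(10, count by (__name__) ({__name__=~"container_.*"}))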
check_mk.socket