Hello,

Yeah, the subject might be overwhelming...

Firstly, I'm new to Prometheus. So far I have configured 3 targets and am monitoring the containers running on them (around 50 containers so far). The problem I faced is that Grafana hangs as the number of containers increases.

The web application I'm currently working on is used for simulation, so on each server we spawn around 2K containers: 14 servers * 2K containers per host = 28K containers in total. Now I want to monitor the containers running on each host. The priority is to know when a container is consuming extra memory or CPU, or is about to go down.

So can Prometheus handle so much load and monitor each container? Should I rely on host storage, or use InfluxDB?

Please write your suggestions.

Thank you,
Isabel
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/ff164068-8401-4f55-830c-b45f9a44271d%40googlegroups.com.
Yeah, I just checked relabelling. I need only a few metrics and labels, like the ones below, e.g.:

container_memory_usage_bytes
container_cpu_usage_seconds_total

So there are many more metrics to drop than to keep, and I don't want to list every metric to be dropped in my prometheus.yml file. Is there a way to keep only what I need and drop the other metrics? An example would help in understanding relabelling better.

Yes, I'm monitoring containers per host in Grafana using variables. The plan is a query that shows only the top 20 containers exceeding a CPU usage threshold and sends an alert.
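The alerting plan described above could look roughly like the rules file below. This is only a sketch under assumptions: the group name, alert name, the 0.8-core threshold, the 5m windows and the `severity` label are all hypothetical, and it assumes the containers are exported by cAdvisor with its usual `name` label and that `instance` identifies the host.

```yaml
groups:
  - name: container-cpu              # hypothetical group name
    rules:
      - alert: ContainerHighCpu      # hypothetical alert name
        # Per-container CPU usage in cores, averaged over 5 minutes.
        # Only containers above the assumed 0.8-core threshold are kept,
        # and topk limits the result to the 20 highest of those.
        expr: >
          topk(20,
            sum by (instance, name) (
              rate(container_cpu_usage_seconds_total{name!=""}[5m])
            ) > 0.8
          )
        for: 5m                      # assumed: must hold for 5 minutes before firing
        labels:
          severity: warning          # hypothetical label value
        annotations:
          summary: "Container {{ $labels.name }} on {{ $labels.instance }} is above 0.8 CPU cores"
```

The same `expr` can be pasted into a Grafana panel to show the top 20 offenders; sending the notification itself would still need an Alertmanager (or Grafana alerting) setup, which is not shown here.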
On Thursday, April 16, 2020 at 12:39:24 PM UTC+5:30, Brian Candler wrote:

> So can Prometheus handle so much load and monitor each container?

In short yes, although the important figure is the total number of *metrics* (servers x containers-per-server x metrics-per-container), and this will affect how much resource you need to throw at your Prometheus server. If it's too much, you may choose to filter the metrics you ingest to just the ones of interest, using metric relabelling.

Assuming you are using a modern version of Prometheus (2.14 or later), the web interface on port 9090 will tell you the stats you need to know, under Status > Runtime & Build Information.

As for Grafana "hanging": you probably need to configure your dashboards to select a small enough subset of timeseries up-front, e.g. using dashboard variables. If you run an initial query which returns thousands of timeseries, it will indeed take an extremely long time to (a) return the results from Prometheus, and (b) render them.
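As an illustration of the metric relabelling mentioned above, here is a minimal sketch of a scrape job that keeps only the two metrics named in this thread and drops everything else. The job name and target address are hypothetical placeholders; only the `metric_relabel_configs` part matters.

```yaml
scrape_configs:
  - job_name: cadvisor                 # hypothetical job name
    static_configs:
      - targets: ['host1:8080']        # hypothetical target address
    metric_relabel_configs:
      # Applied after each scrape, before samples are stored.
      # "keep" retains only series whose metric name fully matches the regex,
      # so the metrics to be dropped never need to be listed.
      - source_labels: [__name__]
        regex: 'container_memory_usage_bytes|container_cpu_usage_seconds_total'
        action: keep
```

Note that the target still exposes all of its metrics on every scrape; this only reduces what Prometheus stores. Unwanted labels could be removed in a similar way with a `labeldrop` rule, taking care that the remaining labels still identify each series uniquely.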
Since I'm new to Prometheus, could you please point me to any websites, blogs, or tutorials to learn it better?