Memory usage discrepancy


Anoop

Sep 9, 2020, 9:58:49 AM
to Prometheus Users
Hi,

I am using the Prometheus metric "container_memory_working_set_bytes" to display the memory usage graph in %. However, it shows a higher value than what I see when I use the Linux command `free -h`.

The actual usage from the Linux command:
[screenshot: memory_linux.JPG — `free -h` output]

Based on this, it is actually using less than 19%.

Now I am using the below query to get the % value for memory usage:
sum by (node) (container_memory_working_set_bytes{id="/",node="test-instance-e1"}) / sum by (node) (machine_memory_bytes{node="test-instance-e1"}) * 100

But this is showing 41%, which does not seem correct.

Can someone please guide me on whether there is a mistake in the Prometheus query, or whether I should use some other metric/query to get the current memory usage?

Thank You,
Anoop



Harkishen Singh

Sep 9, 2020, 1:26:51 PM
to Prometheus Users

You don't need the sum by (node) when you are already querying with exact labels, since I expect the output to be a single series.
BTW, try container_memory_usage_bytes instead of container_memory_working_set_bytes.
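For example, something like this (just a sketch; the `on(node)` matching assumes exactly one series per node on each side):

    container_memory_usage_bytes{id="/",node="test-instance-e1"}
      / on(node) machine_memory_bytes{node="test-instance-e1"}
      * 100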

Anoop Mohan

Sep 10, 2020, 3:35:24 AM
to Harkishen Singh, Prometheus Users
Thanks, Harkishen Singh, for your suggestion.
But when we use container_memory_usage_bytes instead of container_memory_working_set_bytes, it shows above 55%.
I also referred to some blogs, which recommended using container_memory_working_set_bytes instead of container_memory_usage_bytes to get an accurate result.
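If it helps narrow this down: the gap between the two metrics should mostly be file cache, so assuming our cadvisor exposes the standard cache metric, I can check that part directly with:

    container_memory_cache{id="/",node="test-instance-e1"}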


Thanks & Regards,

Anoop Mohan




Anoop Mohan

Sep 11, 2020, 5:26:19 AM
to Harkishen Singh, Prometheus Users
Hi,

Does anyone have any suggestions on this?



Thanks & Regards,

Anoop Mohan


Harkishen@Timescale

Sep 12, 2020, 4:31:44 AM
to Prometheus Users
How about using node:node_memory_utilisation:ratio * 100?
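That recording rule comes from the kubernetes-mixin rules, if I remember right, so it will only exist if those rules are loaded. The raw node_exporter equivalent of what `free -h` reports would be something like this (the instance label value here is just a guess at your setup):

    (1 - node_memory_MemAvailable_bytes{instance="test-instance-e1"}
       / node_memory_MemTotal_bytes{instance="test-instance-e1"}) * 100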

mspr...@us.ibm.com

Sep 12, 2020, 1:51:32 PM
to Prometheus Users

Bartłomiej Płotka

Sep 12, 2020, 5:27:09 PM
to mspr...@us.ibm.com, Prometheus Users
Hey, I wrote about this here https://www.bwplotka.dev/2019/golang-memory-monitoring/ at some point as well.

Unless something changed in the exporter that actually gives you this metric (it's not a Prometheus question; Prometheus merely collects these series from exporters), container_memory_working_set_bytes is indeed still the best bet. And what you see is somewhat expected.

Even "container_memory_working_set_bytes" is not exactly 1:1 to `Total - Available` for node `free -h` as there are so many caches that kernel uses memory for, that there will be some differences. One is that free is just some utility on the container, vs working set are (if we trust cadvisor doing it well) is what cgroup is showing from outside, on the host.  (see: https://github.com/google/cadvisor/issues/638)

What you see is quite a big difference, though. What is the exact value of just container_memory_working_set_bytes{id="/",node="test-instance-e1"}? Don't use `sum`; you might accidentally sum in something irrelevant. Since you expect one series, why sum at all?

As for Anoop's findings: they are quite interesting indeed. I have seen similar inconsistencies, but we don't have anything more accurate. The memory strategy changes with every Go and kernel version (and for the better!). The kernel's memory bookkeeping is so dynamic that these things can happen. Ideally, you always leave some headroom below your memory limit so you don't have to worry about these inconsistencies.
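For example, a simple expression that flags when the root cgroup's working set crosses 80% of machine memory (the 80% threshold is just an illustration):

    container_memory_working_set_bytes{id="/"}
      / on(node) machine_memory_bytes
      > 0.8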

Kind Regards,
Bartek Płotka (@bwplotka)

