Hello Prometheus Community,
I’m encountering a memory usage issue with my Prometheus server, particularly during and after startup, and I’m hoping to get some insights on optimizing it.
Problem:
Upon startup, my Prometheus instance consumes a large amount of memory, primarily due to WAL (Write-Ahead Log) replay. To address this, I enabled --enable-feature=memory-snapshot-on-shutdown, expecting it to reduce the startup memory spike by eliminating the need for a full WAL replay. However, I'm still seeing memory usage spike to around 5GB on startup. Once started, Prometheus keeps holding this memory without releasing it back to the system.
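For context, the feature flag is enabled on the command line like this (the config path, storage path, and the GOMEMLIMIT value are placeholders, not my exact setup; GOMEMLIMIT is a Go 1.19+ runtime soft memory limit, not a Prometheus flag):

```shell
# Enable the snapshot feature: on clean shutdown Prometheus writes an
# in-memory snapshot that startup can restore instead of replaying the
# whole WAL. GOMEMLIMIT asks the Go runtime to try to stay under a soft
# memory limit (value here is illustrative).
GOMEMLIMIT=4GiB ./prometheus \
  --config.file=prometheus.yml \
  --storage.tsdb.path=data/ \
  --enable-feature=memory-snapshot-on-shutdown
```

Note that the snapshot is only written on a clean shutdown, so an OOM-killed or crashed instance still falls back to full WAL replay on restart.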
Is there a recommended way to configure Prometheus to release memory post-startup?
Are there additional configurations or optimizations for large WAL files or memory management that could help?
Any guidance or suggestions would be greatly appreciated!
Thank you,
BhanuPrakash.
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/prometheus-users/99e6725f-f757-45f4-80a4-98c60e2b5063n%40googlegroups.com.
Hi Ben,
I've attached a screenshot of the graph for process_resident_memory_bytes{job="prometheus"}.
Using the top command, I'm seeing about 1000 MB of memory usage, as shown in the screenshot below.
The graph shows Prometheus using 573 MB, but the pod reports 1009 MB. I'd expect memory to be released back after use, but it has held steady at around 1000 MB for the past 4 hours without decreasing.
We're using this pod in AKS specifically to store node and pod metrics.
I'm not exactly sure what's causing this behavior inside the pod.
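For anyone comparing the numbers, these are the kinds of queries I'm looking at (the job and pod label values are from my setup; yours may differ). The Go runtime metrics help distinguish memory the process is actively using from memory it has obtained from the OS but not yet returned:

```promql
# Resident set size as the OS sees it (what roughly matches `top`)
process_resident_memory_bytes{job="prometheus"}

# Heap memory actively in use by the Go runtime
go_memstats_heap_inuse_bytes{job="prometheus"}

# Heap memory the runtime has marked as returnable to the OS
go_memstats_heap_released_bytes{job="prometheus"}

# In Kubernetes, cAdvisor's working-set metric is what
# `kubectl top pod` reports and can include some page cache
container_memory_working_set_bytes{pod=~"prometheus.*"}
```

If heap-in-use is well below RSS, the gap is likely memory the Go runtime is holding for reuse rather than an actual leak, since Go returns freed pages to the OS lazily.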
Thanks,
Bhanu.