Kubernetes pod memory limit and Prometheus mmap data

Shadi Abdelfatah

May 15, 2020, 5:28:58 AM

to Prometheus Users

Hi,

I'm trying to find more in-depth information about how Prometheus manages memory and disk space. My understanding so far is that, long term, the data is written to disk in chunks and mmap-ed, and that managing it is left to the OS. What I can't find is how this mmap-ed data is treated when Prometheus runs in a Kubernetes pod. If the data stored on disk is 16GB, for example, and the pod has a memory limit of 8GB, could a query at any point cause the pod to be OOM-killed due to the mapping being larger than the pod's memory limit?

I found this old Prometheus issue about the topic: https://github.com/prometheus/prometheus/issues/3005, but it was closed without any final comments.

Brian Candler

May 15, 2020, 6:09:12 AM

to Prometheus Users

I can't answer the specific question about Prometheus, but I can tell you that you can mmap() any file you like, and it can be larger than the amount of RAM in the system. As you access it, pages are transparently read in from disk as required. Pages you have read are cached, but if there is memory pressure, those cached pages will be dropped.
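A minimal sketch of that behaviour in Python, for illustration only. The helper names `make_demo_file` and `sum_first_bytes` are made up for this example, and the demo file is tiny; the same calls work on a file larger than RAM, because only the pages actually touched need to be resident.

```python
import mmap
import os
import tempfile


def make_demo_file(size):
    """Create a temporary file of `size` bytes, each byte set to 1 (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x01" * size)
    return path


def sum_first_bytes(path, page_size=mmap.PAGESIZE):
    """Map the whole file read-only and read one byte from each page.

    Mapping reserves address space only; the kernel reads a page from disk
    the first time it is accessed and may drop it again under memory pressure.
    """
    total = 0
    with open(path, "rb") as f:
        # Length 0 means "map the entire file".
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, len(m), page_size):
                total += m[offset]  # indexing an mmap yields an int
    return total
```

Calling `sum_first_bytes(make_demo_file(4 * mmap.PAGESIZE))` touches four pages. Nothing in the program decides which pages stay in memory; that is left to the kernel.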

Shadi Abdelfatah

May 15, 2020, 6:17:02 AM

to Prometheus Users

Thanks, but I'm looking more for information on how this interacts with memory limits on Kubernetes pods, or cgroup limits. Is the mmap() function or Prometheus aware of this limit, or could it unintentionally cause the process to be killed?

Bjoern Rabenstein

May 22, 2020, 11:17:07 AM

to Shadi Abdelfatah, Prometheus Users

On 15.05.20 03:17, Shadi Abdelfatah wrote:
> Thanks, but I'm looking more for information on how this interacts with
> memory limits on Kubernetes pods, or cgroup limits. Is the mmap() function
> or Prometheus aware of this limit, or could it unintentionally cause the
> process to be killed?

In my understanding (which might be wrong, memory management is
complicated), it works like the following:

Prometheus itself isn't aware of any K8s memory limits, but by
mmap'ing, Prometheus has delegated the management of that part of the
memory to the OS anyway. And the OS is certainly aware of the K8s
memory limits (it is, in fact, managing those). So the OS should first
evict mmap'd data from memory before it OOM-kills Prometheus. In
practice, Prometheus will therefore often appear to be using memory
right up to its limit, but it should not get OOM-killed unless all the
mmap'd memory has already been evicted and the normal userspace
memory requirement of Prometheus alone exceeds the limit.
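To check this on a running pod, one option is to split the cgroup's memory usage into file-backed pages (page cache, which includes mmap'd file data) and anonymous memory (the normal heap and stacks). The sketch below assumes the kernel's `memory.stat` format, with the key names `anon` and `file` on cgroup v2 and `rss` and `cache` on cgroup v1. The paths are the usual mount points inside a container and may differ on your system; the function names are invented for this example.

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def split_usage(stats):
    """Return anonymous vs. file-backed bytes from parsed memory.stat data.

    cgroup v2 names these counters 'anon' and 'file'; cgroup v1 names them
    'rss' and 'cache'. Clean file-backed pages can be dropped by the kernel,
    whereas anonymous memory cannot be dropped (only swapped, if swap exists).
    """
    anon = stats.get("anon", stats.get("rss", 0))
    file_backed = stats.get("file", stats.get("cache", 0))
    return {"anon_bytes": anon, "file_backed_bytes": file_backed}


def read_memory_stat(paths=("/sys/fs/cgroup/memory.stat",
                            "/sys/fs/cgroup/memory/memory.stat")):
    """Try the assumed cgroup v2 and v1 locations; return parsed stats or None."""
    for path in paths:
        try:
            with open(path) as f:
                return parse_memory_stat(f.read())
        except OSError:
            continue
    return None
```

If `file_backed_bytes` makes up most of the usage, the pod is near its limit mainly because of cache that the kernel can reclaim. If `anon_bytes` alone approaches the limit, an OOM kill becomes likely.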

--
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in

Shadi Abdelfatah

May 23, 2020, 8:23:37 AM

to Prometheus Users

Thanks for the explanation. It matches my observation that Prometheus often runs at almost the memory limit in a cluster with a high churn rate, but I was looking for whether this is documented or explained anywhere, to avoid any surprises.

Shadi Abdelfatah

Jun 4, 2020, 6:13:21 AM

to Prometheus Users

Aliaksandr Valialkin

Jun 4, 2020, 12:02:19 PM

to Shadi Abdelfatah, Prometheus Users

FYI, the following program may be useful in this case: https://github.com/linchpiner/cgroup-memory-manager

On Thu, Jun 4, 2020 at 1:13 PM Shadi Abdelfatah <shadi.ab...@gmail.com> wrote:
> Related blog post: https://medium.com/faun/how-much-is-too-much-the-linux-oomkiller-and-used-memory-d32186f29c9d
