CPU Resource Requests + Limits


Julian Modesto

Jul 21, 2017, 2:37:56 PM
to Prometheus Users
Hi,

I'm running Prometheus with Prometheus Operator in Kubernetes.

The Prometheus documentation states here that the number of OS threads is set by GOMAXPROCS.

The Prometheus Operator uses resources.memory.requests to try to set -storage.local.target-heap-size appropriately.

What about CPU? How should I set resources.cpu.requests? I don't see GOMAXPROCS being set with the Prometheus Operator.

Julian

Julius Volz

Jul 21, 2017, 5:33:42 PM
to Julian Modesto, Prometheus Users
See this sentence from the doc you linked: "As of Go 1.5 the default value is the number of cores available." So you don't normally need to set GOMAXPROCS anymore; Go will automatically spawn as many threads as it thinks there are cores. Whatever CPU limits you apply to the container (not sure if/how the Operator handles configuring that) will still apply and may scale down your actual CPU usage, of course.
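
A quick way to see this from inside a pod is to compare what the kernel exposes with what the Go runtime chose at startup. A minimal, standalone sketch (an illustration, not Prometheus code):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// NumCPU is the number of logical CPUs the kernel exposes; inside a
	// container this is still the host's core count, regardless of limits.
	fmt.Println("NumCPU:    ", runtime.NumCPU())

	// GOMAXPROCS(0) queries the current value without changing it. Since
	// Go 1.5 it defaults to NumCPU unless the GOMAXPROCS environment
	// variable (or an explicit call) overrides it.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```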


Matthias Rampke

Jul 22, 2017, 7:44:22 AM
to Julius Volz, Julian Modesto, Prometheus Users
CPU requests don't imply limits; requests express the "expected" usage and are mostly used by the scheduler.

With limits, a problem is that even if you limit something to "2 cores" in Kubernetes (= Docker = cgroups), the machine still reports however many cores it has to Go. The CPU limits are applied in the time dimension: a 2-core limit means 200 ms of CPU time per 100 ms cgroup quota period. So in an extreme case (using all available threads all the time) on a 32-core machine (GOMAXPROCS=32 by default), the process burns its 200 ms across 32 threads in 200/32 = 6.25 ms, is then stalled (out of CPU slices) for 93.75 ms, runs at full steam for another 6.25 ms, is stalled again, and so on.

So yes, adjusting GOMAXPROCS to make the process self-limit more gracefully can be beneficial. I suppose this should be a feature request on https://github.com/coreos/prometheus-operator?
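
For illustration, a minimal sketch of such self-limiting, assuming the cgroup v1 CFS control files are mounted at their usual paths; this is not something Prometheus or the Operator does today, and the paths and rounding policy here are assumptions:

```go
package main

import (
	"fmt"
	"io/ioutil"
	"runtime"
	"strconv"
	"strings"
)

// readCgroupInt reads a single integer value from a cgroup control file.
func readCgroupInt(path string) (int64, error) {
	b, err := ioutil.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(b)), 10, 64)
}

// applyCPUQuota lowers GOMAXPROCS to roughly match the container's CFS quota,
// e.g. quota=200000µs per period=100000µs (a "2 core" limit) -> GOMAXPROCS=2.
func applyCPUQuota() {
	quota, qerr := readCgroupInt("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
	period, perr := readCgroupInt("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
	if qerr != nil || perr != nil || quota <= 0 || period <= 0 {
		return // quota is -1 (unlimited) or unreadable: keep the NumCPU default
	}
	procs := int(quota / period)
	if procs < 1 {
		procs = 1
	}
	runtime.GOMAXPROCS(procs)
}

func main() {
	applyCPUQuota()
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```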

/MR


frederic...@coreos.com

Jul 24, 2017, 4:11:46 AM
to Prometheus Users, juliu...@gmail.com, julianv...@gmail.com
This sounds reasonable, Matthias. I'm just wondering how we can make an assumption about the number of cores the Prometheus instance will be running on when it has yet to be scheduled. It seems like we might want something like `cpuset`, but it seems like we're not quite there in Kubernetes yet:

https://github.com/kubernetes/community/pull/171 (the proposal that resulted from a first attempt at this)

Matthias Rampke

Jul 24, 2017, 7:31:44 AM
to frederic...@coreos.com, Prometheus Users, juliu...@gmail.com, julianv...@gmail.com
I think "how many cores should it use" is, in a way, already a high-level configuration option? Or what is the CPU request set from? So GOMAXPROCS could just be another effect of that, decoupled from the physical machine because we're trying to abstract that away anyway. The desire, as I understand it, is to not rely on the physical characteristics to determine GOMAXPROCS, so it doesn't matter that it hasn't been scheduled yet.

Frederic Branczyk

Jul 24, 2017, 10:49:21 AM
to Matthias Rampke, Prometheus Users, Julius Volz, julianv...@gmail.com
Yes, I think you're right. Not sure why, but I was thinking of CPU time, when in fact CPU requests are expressed as CPU shares. Still, I'm unsure whether running Prometheus with GOMAXPROCS=1 is such a good idea, as that is what most people would realistically end up with (right now we're not seeing people assign more than `500m` of CPU). I still feel like we need to treat parallelization separately from CPU shares. Maybe we should rather just add a field to the Prometheus objects to set GOMAXPROCS directly, and keep the current behavior if it is unset.


emba...@gmail.com

Nov 7, 2017, 4:09:16 AM
to Prometheus Users
Hi,

I'm in a similar situation, where I want to better align Kubernetes CPU limits (i.e. Docker quotas) with the environment of my Go binaries. I found some hints that Google and Uber also do this.
Monzo ran into an issue with long GC pauses due to misaligned cgroup and GOMAXPROCS settings.

For Kubernetes, one could use "ResourceFieldRef" [1] (part of the Downward API) as a template mechanism to dynamically set GOMAXPROCS before execution.
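
For illustration, one way this could look from the Go side, assuming the CPU limit is injected as an environment variable via the Downward API with divisor `1m` (i.e. in millicores); the variable name `CPU_LIMIT_MILLICORES` is made up for this sketch. Injecting `limits.cpu` with divisor `1` straight into a `GOMAXPROCS` environment variable should also work with no code at all, since the Go runtime reads that variable at startup:

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// maxProcsFromMillicores turns a CPU limit in millicores (e.g. 2500 for
// "2500m") into a GOMAXPROCS value, rounding up so a 500m limit still
// gets one thread.
func maxProcsFromMillicores(milli int) int {
	procs := (milli + 999) / 1000
	if procs < 1 {
		procs = 1
	}
	return procs
}

func main() {
	// CPU_LIMIT_MILLICORES is a hypothetical variable name; it would be
	// filled from resourceFieldRef{resource: limits.cpu, divisor: "1m"}.
	if v := os.Getenv("CPU_LIMIT_MILLICORES"); v != "" {
		if milli, err := strconv.Atoi(v); err == nil && milli > 0 {
			runtime.GOMAXPROCS(maxProcsFromMillicores(milli))
		}
	}
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```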

Frederic Branczyk

Nov 7, 2017, 4:25:54 AM
to emba...@gmail.com, Prometheus Users
Right. The problem I wanted to highlight is that a CPU quota in millicores is not easily translatable into the maximum number of threads (GOMAXPROCS) a process should utilize. That number should ultimately represent the process's fraction of the host's cores, relative to the number of available threads (which is what Go defaults to). This would require knowledge of the particular host the Pod is scheduled onto.


Matthias Rampke

Nov 8, 2017, 3:32:13 AM
to Frederic Branczyk, emba...@gmail.com, Prometheus Users

What do you expect to happen at GOMAXPROCS=1? Deadlocks? Has anyone tried this recently?

My understanding is that this variable tells the Go scheduler how many OS threads it can schedule goroutines onto. If I ask for exactly 1 CPU to be used, what GOMAXPROCS should I set? Maybe there needs to be a multiplier?

/MR



emba...@gmail.com

Nov 8, 2017, 2:43:39 PM
to Prometheus Users
Hi Matthias,

I did some benchmarking (with CPU-intensive and concurrent code, though not Prometheus in particular), and aligning GOMAXPROCS with the cgroup settings should be considered a best practice, at least until the Go runtime, like Java's, eventually becomes more cgroup-aware (for memory as well). The main problem with not aligning GOMAXPROCS with the cgroup quota (especially when the gap between runtime.NumCPU() and the quota is large, e.g. a 1 CPU quota with GOMAXPROCS >= 8) is that the runtime threads get throttled too much, hurting the performance and SLOs of the process. Another issue is that the GC can be affected (as Monzo described), which is even more serious.

Now, is this a huge problem? Maybe, maybe not. Cloud pioneers like Google and Uber mention that they do it, as do others, especially folks in the Java community. If it does not hurt you, good; don't worry too much, IMHO.
For production systems, from an SRE perspective, I'd benchmark and adjust the settings accordingly for optimal performance.

Just need to find the time to put my findings into a blog post...

Matthias Rampke

Nov 9, 2017, 3:37:08 AM
to emba...@gmail.com, Prometheus Users

Thank you! That matches my (much less well-founded) suspicion.


emba...@gmail.com

Nov 9, 2017, 3:40:40 AM
to Prometheus Users
You're welcome, I will post the link to the blog here as well for discussion once it's written :)

emba...@gmail.com

Jul 20, 2018, 12:19:01 PM
to Prometheus Users
Not quite the blog post yet, sorry... but some progress, as described in this issue :)

Matthias Rampke

Jul 23, 2018, 3:54:52 AM
to emba...@gmail.com, Prometheus Users
This is super interesting, thank you for writing it up!

/MR
