Hi y'all,
I have a question about improving the overall CPU utilization efficiency of our CPU-intensive distributed workloads. I've also posted this on the Gophers Slack, but as Dave Cheney pointed out recently, it's not the best medium for long-winded questions. I haven't been able to find much written on this topic, so I appreciate any advice!
I'd like to improve the CPU utilization efficiency of our Go processes by over-scheduling pods on Kubernetes nodes. The issue is that this workload is variable and heterogeneous, so it's very rare that all processes consume all the available resources. Allowing specific processes/Pods to burst would improve compute resource efficiency without impairing our processing performance.
e.g.:
Turn 4x Go processes running on 16-CPU VMs at an average of 60% CPU utilization
into 1x 64-CPU Kubernetes node running 6 pods at ~96% CPU utilization.
Theoretically this can be achieved by setting each Pod's scheduler Request to 9600 milliCPU and its cgroups Limit to 16000 milliCPU (via the Pod specification).
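For concreteness, the resources stanza I have in mind would look roughly like this (values are just the numbers above; a sketch, not a complete Pod spec):

resources:
  requests:
    cpu: "9600m"    # what the Kubernetes scheduler packs nodes against
  limits:
    cpu: "16000m"   # the cgroups CFS quota each Pod can burst up to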
My question is: what is a safe GOMAXPROCS setting for each Pod that won't degrade Go runtime performance? If it were set to 10, I don't think the process would be able to burst and take advantage of the extra cgroups Limit.
Conversely, would setting GOMAXPROCS to 16 cause runtime instability if all 6 Pods burst and attempted to utilize their full cgroups Limit? Or would the kernel simply throttle the performance (and the over-threading) in a safe manner?
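For reference, this is roughly the startup shim I'm picturing for the "size to the Limit" option. It's just a sketch, assuming cgroup v1 with the CFS files at their usual paths (cgroup v2 exposes this differently, via cpu.max); I believe go.uber.org/automaxprocs does something similar out of the box.

package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// readInt reads a single integer from a cgroup file, returning -1 on any error.
func readInt(path string) int {
	b, err := os.ReadFile(path)
	if err != nil {
		return -1
	}
	n, err := strconv.Atoi(strings.TrimSpace(string(b)))
	if err != nil {
		return -1
	}
	return n
}

func main() {
	// cgroup v1 CFS quota/period; a 16000m Limit shows up as quota/period == 16.
	quota := readInt("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")
	period := readInt("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
	if quota > 0 && period > 0 {
		// Size the runtime to the burst ceiling (the Limit), not the Request.
		runtime.GOMAXPROCS(quota / period)
	}
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}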
Stating why this is a horrible idea is also a helpful response.
Thanks for any advice,
Josh Roppo
<3 Go Community