How can I debug high CPU usage caused by a high garbage collection rate in Go?


Joseph Wang

Jun 4, 2019, 8:55:39 AM
to golang-nuts
Hello everyone

I have a question about debugging my Go code. It's not a question about specific code, but I've never run into this issue before.

The problem is this: I replaced our back-end system's cache, moving from a single-node cache to groupcache, which is a kind of memory cache. At first I ran into high memory usage. Using pprof and some online resources, I set the GC rate to 10%, which brought memory usage back down to normal. But that in turn caused high CPU usage because of heavy GC activity. pprof shows the hot spot is runtime.scanobject(), which is outside my code base.

So I'd appreciate any suggestions on how to fix this kind of issue. I know the problem must come from my code base, but I don't know how to find it and fix it quickly.

Best,

Joseph

Robert Engels

Jun 4, 2019, 9:53:30 AM
to Joseph Wang, golang-nuts
You really can't.

You control the GC's CPU usage through the allocation rate, the number of live objects, and the collection rate. If you slow the collector down (lower CPU), you will get higher memory usage because the GC won't have time to collect the garbage.

In general, the only things you are in control of are 1) how many objects you allocate, 2) how large those objects are, 3) how often you allocate them, and 4) how long you keep them alive (i.e., keep references to them).

There is always a trade-off between CPU usage (by GC) and memory usage.

The truth, though, is that you pay similar costs even if you manage the memory yourself, and doing so typically requires a lot more error-prone code.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/5c977af4-074e-4eaf-85c2-2a284595b572%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Michael Jones

Jun 4, 2019, 10:07:51 AM
to Joseph Wang, golang-nuts
Perhaps this is obvious, but suddenly heavy garbage collection suggests suddenly heavy garbage creation; the new code or structure is allocating and discarding more than the old one did. You've gone from a single cache to multiple caches, so my thoughts go to questions like: Is data cached redundantly (is the n-cache version doing n times as much GC, or using n times the memory)? Are cached elements migrating between caches without memory reuse? Is the implied n-way concurrency causing simultaneous caching as a temporal artifact? These seem like the right kinds of questions to consider. Robert Engels already shared the fundamental insight: GC cost is a reflection of your application's "spending habits" and not generally something to control in the collector itself.



--
Michael T. Jones
michae...@gmail.com

Inada Naoki

Jun 5, 2019, 3:06:05 AM
to golang-nuts
There are two reasons GC can be heavy:

a) Many garbage objects are created very frequently.
b) There are many live objects the GC needs to scan on every cycle (because Go's GC is not generational).

Finding the bottleneck for (a) is easy: `pprof -alloc_objects ...` tells you where many objects are allocated.

(b) is more difficult than (a). First of all, you should understand that Go's GC only scans pointers.

For example, with `times := make([]time.Time, 10000)`, the GC needs to scan inside `times`,
because time.Time contains a pointer.
You can avoid the scanning by converting each Time to an int64 with Time.UnixNano();
the GC doesn't scan inside an []int64.

Then you need to find where many pointers are kept alive.
pprof doesn't give you a direct answer, but `pprof -inuse_space` and `pprof -inuse_objects`
give you great hints.

Best,

Rick Hudson

Jun 5, 2019, 4:14:33 PM
to golang-nuts
When you say you "set up GC rate (10%) to reduce memory usage down to normal," what exactly did the program do?
 
Compute (CPU) costs money and heap memory (DRAM) costs money; minimizing the sum should be the goal. This requires a model of the relative costs of CPU vs. RAM. HW folks balance these costs when they spec a machine, so you could steal that model. Right now I'm on a 4-core (8 HW thread) machine with 16 GBytes of memory, so that's 2 GBytes per P. Cloud providers' price lists can also be used to build a model. Once there is a model, set GODEBUG=gctrace=1 and adjust GOGC to find the balance that minimizes the sum of CPU and heap-size cost according to your model.

Alternatively, your manager may give out bonuses for improving benchmarks that only look at CPU time. That produces a model where memory is free as long as you don't OOM. I've used that model, but it brought little job satisfaction.