First time posting on this list, but I've gone through a very similar exercise over the last few months and might have some useful insight for you.
Learning how to interpret profiles is extremely useful here. Capturing a CPU profile, a heap profile, and an execution trace will each show you a different facet of what's going on under the hood. Luckily, since you're seeing high CPU usage, the CPU profile alone will already tell you a lot.
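If you aren't already exposing profiles, the simplest route for a long-running service is net/http/pprof. A minimal sketch, assuming your program can afford a side HTTP listener (the localhost:6060 address is just a convention, pick whatever fits your setup):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Side listener used only for profiling; keep it off public interfaces.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... your actual workload runs here ...
	select {} // placeholder so this sketch keeps running
}
```

With that in place, `go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30` grabs a CPU profile, the same command against `/debug/pprof/heap` grabs the heap, and you can download `/debug/pprof/trace?seconds=5` and open the result with `go tool trace`.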
For a couple of pieces of low-hanging fruit: armed with a CPU and heap profile, take a look at both. You say the runtime and GC dominate the CPU profile, which likely points to memory issues as you mentioned. Open up a heap profile, switch the `sample_index` to `alloc_space` or `alloc_objects`, and see who the largest offenders are. For a clearer pointer to the offending code's call stack, set `call_tree`, then take another look.
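If attaching over HTTP isn't an option (say it's a batch job that exits before you can connect), you can also write the profiles out from inside the program with runtime/pprof. A rough sketch, with the file names being my own placeholders:

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	// CPU profile covering the interesting part of the run.
	cpuFile, err := os.Create("cpu.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer cpuFile.Close()
	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// ... workload ...

	// Heap profile written at the end of the run. Open it with
	// `go tool pprof heap.pprof`, then `sample_index = alloc_space`
	// (or alloc_objects) and `call_tree = true` as described above.
	heapFile, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer heapFile.Close()
	if err := pprof.Lookup("heap").WriteTo(heapFile, 0); err != nil {
		log.Fatal(err)
	}
}
```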
I believe that spending a few hours or days learning the pprof and trace tools would pay dividends given the scope of your task; it's hard to give more detailed performance suggestions while flying blind. Personally, when I had a similar-looking profile, the two pieces of low-hanging fruit were tighter goroutine management (some goroutines were living longer than expected) and reduced memory usage (work and allocation were needlessly duplicated in several places).
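On the goroutine side, the usual culprit for me was goroutines without a clear owner or cancellation path. Purely as an illustration (nothing here is taken from your code), tying workers to a context and a WaitGroup makes "lasting longer than expected" much harder to do by accident:

```go
package main

import (
	"context"
	"sync"
	"time"
)

// worker is a hypothetical stand-in for whatever your goroutines do. The
// important part is that it watches ctx and returns promptly when its owner
// cancels, instead of living for the life of the process.
func worker(ctx context.Context) {
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return // owner said stop; don't linger
		case <-ticker.C:
			// ... do one unit of work ...
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			worker(ctx)
		}()
	}
	wg.Wait() // every goroutine has a defined end, which shows up nicely in the goroutine profile
}
```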
If you truly have that many in-use objects (shown by `inuse_space` / `inuse_objects` in the heap profile), then I agree some form of `sync.Pool` or memory reuse may be beneficial.
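The classic `sync.Pool` win is reusing buffers in a hot path instead of allocating per call. A rough sketch, assuming a hypothetical encode step (the names are mine, not from your code):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values so the hot path stops
// allocating a fresh buffer (and backing array) on every call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// encode is a hypothetical hot-path function; the pattern is what matters:
// Get, Reset, use, Put.
func encode(msg string) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "payload=%s", msg)

	// Copy the result out before the buffer goes back to the pool, since
	// the pool may hand it to another goroutine immediately.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	fmt.Printf("%s\n", encode("hello"))
}
```

Whichever route you take, re-profile afterwards; `sync.Pool` can be a wash if the objects are small or the allocations you pooled weren't the ones actually dominating GC time.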