There are two reasons GC can be heavy:
a) A lot of garbage is created very frequently.
b) There are many live objects that the GC needs to scan on every cycle (because Go's GC is not generational).
Finding the bottleneck for (a) is easy. `pprof -alloc_objects ...` tells you where many objects are allocated.
(b) is more difficult than (a). First of all, you should understand that Go's GC only scans memory that can contain pointers.
For example, when there is `times := make([]time.Time, 10000)`, the GC needs to scan the contents of times,
because time.Time contains a pointer.
You can avoid that scanning by converting each Time to an int64 using Time.UnixNano().
The GC doesn't scan the contents of a []int64.
Then, you need to find where many pointers are kept alive.
pprof doesn't give you a direct answer, but `pprof -inuse_space` and `pprof -inuse_objects`
give you a good hint.
Best,