I swear I must be working with Go incorrectly because I'm still seeing significant GC latency (my exact environment is listed below). Here's a barebones Go app to demonstrate the problem:
http://play.golang.org/p/q1iGjFUB6v
I have a latency plotting tool performing HTTP requests against the application. I run this plotting tool multiple times over a 1-minute period with an increasing number of requests per second, starting from 1 request per second and going up to 128K requests per second.
When I run the application as a barebones HTTP server with no behavior, I easily get 99% of requests finishing in under a millisecond. The remaining 1% finish around 2-3ms.
By creating a map[int]int filled with 10 million items (350+ MB of memory), I have seen massive latency spikes during GC. It's not uncommon to see spikes of 70ms during GC. I have tried varying the GC collection percentage from 1-200 but I don't see that much variance in the GC latency. Turning off GC completely eliminates any GC spikes (of course), but that's not really a solution.
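For anyone who wants to reproduce the pause without the HTTP layer, here is a minimal sketch (not the app from the playground link). It assumes that forcing a collection with runtime.GC() while the map is live, then reading the most recent entry in runtime.MemStats.PauseNs, is a fair stand-in for what the server sees. The helper names buildMap, lastPause and pauseWithLiveMap are made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// buildMap returns a map[int]int holding n entries.
func buildMap(n int) map[int]int {
	m := make(map[int]int, n)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	return m
}

// lastPause returns the most recent GC pause recorded by the runtime,
// or 0 if no collection has run yet.
func lastPause() time.Duration {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	if ms.NumGC == 0 {
		return 0
	}
	// PauseNs is a circular buffer; the latest entry is at (NumGC+255)%256.
	return time.Duration(ms.PauseNs[(ms.NumGC+255)%256])
}

// pauseWithLiveMap builds a map of n entries, forces one collection while
// the map is still reachable, and returns the map size and recorded pause.
func pauseWithLiveMap(n int) (int, time.Duration) {
	m := buildMap(n)
	runtime.GC()
	p := lastPause()
	return len(m), p // len(m) is read after the GC, so m stays reachable
}

func main() {
	// Assumption: 100 is the default percentage; vary it to compare runs.
	debug.SetGCPercent(100)
	size, pause := pauseWithLiveMap(10000000) // 10 million items, as above
	fmt.Printf("entries=%d last GC pause=%v\n", size, pause)
}
```

The number this prints is the pause the runtime itself recorded, so it leaves out network and scheduling effects that the latency plots include.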
Note that I have also tried creating a simple byte buffer of 1 GB+ of memory, and there I was able to limit the GC latency by adjusting the collection percentage. With a 1 GB heap, the time between GC invocations increases greatly, which means there's more to clean each time. By increasing the collection rate (decreasing the time between GCs), I was able to see the exact same numbers as the barebones example--about 2ms per request.
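A rough sketch of how the collection percentage changes GC frequency when a large byte buffer is live. This is not the code I benchmarked with; countGCs, the chunk size and the percentages in main are assumptions picked for illustration, and the sizes in main assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// sink holds the latest short-lived allocation so the compiler cannot
// drop the allocation entirely.
var sink []byte

// countGCs keeps liveBytes of pointer-free memory reachable, sets the
// collection percentage to percent, allocates roughly churnBytes of
// garbage, and returns how many collections ran during the churn.
// liveBytes must be greater than zero.
func countGCs(liveBytes, churnBytes, percent int) uint32 {
	old := debug.SetGCPercent(percent)
	defer debug.SetGCPercent(old) // restore the previous setting

	live := make([]byte, liveBytes)
	runtime.GC() // start counting from a settled heap

	const chunk = 64 << 10 // 64 KB per short-lived allocation
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for allocated := 0; allocated < churnBytes; allocated += chunk {
		sink = make([]byte, chunk)
	}
	runtime.ReadMemStats(&after)

	live[0] = 1 // touch the buffer so it stays reachable through the loop
	return after.NumGC - before.NumGC
}

func main() {
	// 1 GB live buffer, 4 GB of garbage; needs a 64-bit int.
	for _, p := range []int{50, 100, 200} {
		fmt.Printf("percent=%d collections=%d\n", p, countGCs(1<<30, 4<<30, p))
	}
}
```

A lower percentage should produce more collections for the same amount of garbage, which is the "decreasing the time between GCs" effect described above.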
Here's the trouble: A lot of advice I'm seeing for reducing GC latency is about reducing pointers, etc., but what I'm measuring shows that a map, regardless of the contents (int, string, etc.), appears to be scanned completely during collection and it is slowing things down significantly. The results seem to indicate that the current GC implementation has a difficult time where instances of large maps are concerned.
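To separate "lots of memory" from "a large map", one option is to time a forced collection with a map live and again with a flat slice of the same element count. This is only a sketch of such a comparison: withMap, withSlice and forcedGC are hypothetical helpers, and on a stop-the-world collector like the ones in the Go versions listed below, the wall-clock time of runtime.GC() is used as an approximation of the pause.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// forcedGC runs one full collection and returns its wall-clock duration.
func forcedGC() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

// withMap times a forced collection while a map[int]int of n entries is live.
func withMap(n int) time.Duration {
	m := make(map[int]int, n)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	d := forcedGC()
	if len(m) != n { // reading the map after the GC keeps it reachable
		panic("map lost entries")
	}
	return d
}

// withSlice times a forced collection while a []int of n elements is live.
// n must be greater than zero.
func withSlice(n int) time.Duration {
	s := make([]int, n)
	for i := range s {
		s[i] = i
	}
	d := forcedGC()
	if s[n-1] != n-1 { // reading an element keeps the backing array reachable
		panic("slice contents changed")
	}
	return d
}

func main() {
	const n = 10000000 // 10 million items, matching the map in the test app
	fmt.Printf("map:   %v\n", withMap(n))
	fmt.Printf("slice: %v\n", withSlice(n))
}
```

If the map run is much slower than the slice run at the same n, that points at the map itself being scanned rather than at heap size alone.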
Am I completely off base here, or is there some obvious bit of context that I'm missing?
My environment:
Two dedicated boxes running Intel 1270v3 (Haswell) CPUs with 16 GB of RAM.
Ubuntu 14.04 x64 LTS
Trying with Go 1.3.3 and Go 1.4rc1
Dedicated gigabit link between the boxes
On a side note, I thought Dmitry's initial response to my post of creating a single string and then doing string slices from there into a map was a really creative and unique solution to the problem, but the numbers didn't show any kind of improvement. I therefore decided to more fully investigate how maps of integers, booleans, etc. affected latency. The finding (as noted above) shows that the size of the map is what matters most--not the contents in my use case.