Sub-millisecond GC pauses

Austin Clements

Oct 28, 2016, 6:17:24 PM
to golang-dev, Rick Hudson
10 millisecond pauses are so 2015.

Today I submitted changes to the garbage collector that make typical worst-case stop-the-world times less than 100 microseconds. This should particularly improve pauses for applications with many active goroutines, which could previously inflate pause times significantly.

If you're already experimenting with master, we would love it if you could try this out and hammer on the new algorithm. We're particularly interested in any significant performance regressions, pause times significantly longer than 100 microseconds, or any crashes. And, of course, we'd love to hear about any improvements you see. :)

If you do encounter problems, please also test at commit d70b0fe or earlier, which is just before the algorithm changes started going in.

There are still known issues with the garbage collector interrupting individual goroutines for too long. We've made improvements to this for Go 1.8 and will continue to improve it, but this change is specifically about the garbage collector interrupting the whole application.

jfcg...@gmail.com

Oct 29, 2016, 4:23:16 AM
to golang-dev
This brings Go well into the real-time realm.
I think avionics/mission-critical apps will/can switch to Go very soon.

Mikael Gustavsson

Oct 29, 2016, 7:42:35 AM
to golang-dev, r...@golang.org
Wow, that sounds amazing! For realtime graphics 10ms means you'll likely miss a frame, while 100µs isn't even worth caring about.

the...@gmail.com

Oct 29, 2016, 7:42:35 AM
to golang-dev
Avionics? C'mon now..

Nice efficiency gain though

loui...@gmail.com

Oct 29, 2016, 7:42:51 AM
to golang-dev, jfcg...@gmail.com
This is very exciting for the MEG/EEG field too, I think. Typical experiments require ms-level control of events (a serious sore spot for the Python scientific ecosystem).

This is pretty exciting!  Kudos and thanks to all involved!

lini.g...@gmail.com

Oct 29, 2016, 10:02:42 AM
to golang-dev, r...@golang.org
Sounds cool! I am pulling git master right now and I am going to test it against my worst-case scenario with very large in-memory maps.
Thanks for clarifying this point: "...There are still known issues with the garbage collector interrupting individual goroutines for too long..."
I think I am hitting this barrier again, but it is worth trying.

val...@gmail.com

Oct 29, 2016, 3:29:32 PM
to golang-dev, r...@golang.org
Below are our results before and after the switch to new GC in production serving up to 200K qps per host:


Service 1 allocates more than Service 2, so STW pauses are higher there. But STW pause duration dropped by an order of magnitude on both services.
We see a ~20% increase in CPU time spent in GC after the switch on both services.

Austin Clements

Oct 29, 2016, 6:19:33 PM
to val...@gmail.com, golang-dev, Rick Hudson
On Sat, Oct 29, 2016 at 3:29 PM, <val...@gmail.com> wrote:
> Below are our results before and after the switch to new GC in production serving up to 200K qps per host:
>
> Service 1 allocates more than Service 2, so STW pauses are higher there. But STW pause duration dropped by an order of magnitude on both services.
> We see a ~20% increase in CPU time spent in GC after the switch on both services.

Those STW times look great, but that's much more CPU than I would have expected. Could you file an issue, preferably with more details on where you're seeing the increase and before/after profiles if you can, and cc me (GitHub: aclements)? Thanks!

Damian Gryski

Dec 21, 2016, 8:43:45 AM
to golang-dev, val...@gmail.com, r...@golang.org
I've seen a bunch of references to this "20% CPU increase" as the cost of the new low-latency GC, even though it was clearly identified as a bug.

Did an issue ever get filed for this?

Damian

Austin Clements

Jan 4, 2017, 3:46:59 PM
to Damian Gryski, golang-dev, Aliaksandr Valialkin, Rick Hudson
Hi Damian. Aliaksandr's 20% relative increase turned out to be a 2% absolute increase in GC CPU, so we decided an issue wasn't needed.
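
To make the relative versus absolute distinction concrete, here is a small sketch. The thread does not give the underlying before and after figures, so the 10% and 12% values below are hypothetical, chosen only because they are consistent with a 20% relative increase and a 2 percentage point absolute increase. The function names are also made up for illustration.

```go
package main

import "fmt"

// relativeIncrease returns the change from before to after as a fraction
// of the starting value (0.2 means 20%).
func relativeIncrease(before, after float64) float64 {
	return (after - before) / before
}

// absolutePoints returns the change from before to after in percentage
// points, given both inputs as percentages of total CPU.
func absolutePoints(before, after float64) float64 {
	return after - before
}

func main() {
	// Hypothetical share of total CPU spent in GC, in percent.
	// These are NOT figures from the thread.
	before, after := 10.0, 12.0

	fmt.Printf("relative increase: %.0f%%\n", 100*relativeIncrease(before, after))
	fmt.Printf("absolute increase: %.0f percentage points\n", absolutePoints(before, after))
}
```

Under these assumed figures, the same change reads as large when quoted relative to the old GC share and small when quoted against total CPU.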
