Can concurrent part of GC still block/stall certain parts of program execution?

114 views
Skip to first unread message

zv...@dropbox.com

unread,
Sep 23, 2016, 1:24:21 PM9/23/16
to golang-nuts
Hello,
As subject says, I was wondering if it is possible for concurrent part of GC process to still stall program execution. And if there are certain type of
things that may still block.

I am wondering about this, because we see this behavior in production both with Go 1.5 and 1.6. 

Here is example of GODEBUG=gctrace=1 output of Go 1.5 program:
gc 27004 @74856.698s 1%: 0.66+115+0.15+0.27+8.1 ms clock, 5.3+115+0+47/0.016/101+65 ms cpu, 272->277->144 MB, 279 MB goal, 8 P
gc
27005 @74859.536s 1%: 0.77+120+0.051+0.26+8.7 ms clock, 6.1+120+0+30/0.011/117+69 ms cpu, 272->277->144 MB, 279 MB goal, 8 P
gc
27006 @74862.400s 1%: 0.92+118+0.15+0.26+7.8 ms clock, 7.3+118+0+29/0.008/119+62 ms cpu, 272->276->144 MB, 279 MB goal, 8 P
gc
27007 @74864.392s 1%: 0.73+122+0.17+0.41+8.2 ms clock, 5.9+122+0+59/0/103+66 ms cpu, 272->277->144 MB, 279 MB goal, 8 P
gc
27008 @74866.865s 1%: 0.72+117+0.064+0.19+7.8 ms clock, 5.7+117+0+35/0.010/109+63 ms cpu, 272->277->144 MB, 279 MB goal, 8 P

Theoretically the STW part shouldn't exceed 10ms, however we definitely can get requests stall for 100-130ms.
To confirm this, I ran a very simple test against running process:
for x in xrange(1000000):
  t0
= time.time()
  c
.perform("GET", "/noop_route")
  tdelta
= time.time() - t0
 
if tdelta > 0.02:
     
print time.time(), "Large TDELTA:", tdelta

And saw output like this:
1468438989.77 Large TDELTA: 0.12552690506
1468438992.1 Large TDELTA: 0.137628078461
1468438994.33 Large TDELTA: 0.115905046463
1468438996.6 Large TDELTA: 0.143169164658
1468438998.83 Large TDELTA: 0.13204908371
1468439000.73 Large TDELTA: 0.10854101181


This matches GC cycles very closely, happens every ~3seconds and we see stalls of >100ms. Also note that the '/noop_route' just goes through GOLang standard http stack, there should be pretty much no extra or custom code in there. 
---------

We do see same behavior with Go 1.6 too (but overall GC times are lower, but still looks like things stall during concurrent phase too). Do you have any potential ideas what might be happening here? 

Thanks
-Zviad

Caleb Spare

unread,
Sep 23, 2016, 1:31:23 PM9/23/16
to zv...@dropbox.com, golang-nuts
You could be experiencing https://github.com/golang/go/issues/16293.

You can test this hypothesis by patching in the fix
(https://github.com/golang/go/commit/cf4f1d07a189125a8774a923a3259126599e942b)
yourself. That change will be released with Go 1.8.

-Caleb
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Zviad Metreveli

unread,
Sep 23, 2016, 10:04:19 PM9/23/16
to golang-nuts, zv...@dropbox.com
Thanks Caleb. I just tried building the binary using GO toolchain at 'master'.
Unfortunately looks like the issue still persists. 

I will try to get a smaller repro of this that I can share with full code, 
meanwhile if there any other potential ideas as to how concurrent phase can stall whole GO program just let me know.

Cheers,
-Zviad
Reply all
Reply to author
Forward
0 new messages