Go's scheduler always gets “stuck” after a certain time when executing the code below


Ronald

Jun 20, 2017, 11:40:12 PM
to golang-nuts
Hi, I found that when running the code below (play):

package main

import "fmt"
import "runtime"
import "time"

type callRet struct {
	ret int
}

type callIn struct {
	ret_chan chan *callRet
	arg1     int
}

func caller(call_in_c chan *callIn, arg1 int) int {
	ret_c := make(chan *callRet, 1)
	ci := callIn{ret_c, arg1}
	t1 := time.Now().UnixNano()
	call_in_c <- &ci
	ret := <-ret_c
	t2 := time.Now().UnixNano()
	_, _ = t1, t2
	//fmt.Println(t2, t1, float64(t2-t1)/1e6)
	return ret.ret
}

func call_srv(call_in_c chan *callIn) {
	//runtime.LockOSThread()
	ct := 0
	for {
		select {
		case in := <-call_in_c:
			ret_c := in.ret_chan
			ret := callRet{3 + in.arg1}
			ret_c <- &ret
			ct++
			if ct%1000 == 0 {
				fmt.Println(ct)
			}
			//default:
			//	time.Sleep(1 * time.Millisecond)
		}
	}
	//runtime.UnlockOSThread()
}

func init() {
	//runtime.LockOSThread()
}

func main() {
	p := fmt.Println
	runtime.GOMAXPROCS(5)
	call_in_c := make(chan *callIn, 2000)
	fp := func(call_in_c chan *callIn) {
		ct := 0
		for ; ct < 2000000; ct = ct + 1 {
			caller(call_in_c, 1)
			time.Sleep(100 * time.Nanosecond)
		}
		p("done:)")
	}
	go fp(call_in_c)
	go fp(call_in_c)
	go func() {
		for {
			fmt.Println("heartbeat...")
			time.Sleep(10 * 1000 * time.Microsecond)
		}
	}()
	// After deleting this little piece of code, it will be OK:
	go func() {
		for {
		}
	}()
	// -^
	call_srv(call_in_c)
	return
}
It always gets “stuck” some time after it starts executing (tested in Go 1.7, 1.8.1 and 1.8.3;
you should run it on your own computer, not on the playground, because it takes several seconds).
I have checked the code above dozens of times at least and am sure it is all right.

I attached gdb to the Go process while it was stuck, and `bt` yielded this:


(gdb) info goroutines
  1 waiting  runtime.gopark
  2 waiting  runtime.gopark
  3 waiting  runtime.gopark
  4 waiting  runtime.gopark
  5 waiting  runtime.gopark
* 6 running  runtime.systemstack_switch
  7 waiting  runtime.gopark
* 8 running  main.main.func3
  17 runnable runtime.exitsyscall
  33 waiting  runtime.gopark
  18 waiting  runtime.gopark
  49 waiting  runtime.gopark
  50 waiting  runtime.gopark
  51 waiting  runtime.gopark
(gdb) goroutine 6 bt
#0  runtime.systemstack_switch () at /usr/local/go/src/runtime/asm_amd64.s:281
#1  0x0000000000415088 in runtime.gcStart (mode=0, forceTrigger=false)
    at /usr/local/go/src/runtime/mgc.go:1010
#2  0x000000000040d841 in runtime.mallocgc (size=96, typ=0x4a7fa0, needzero=true,
    ~r3=0x507998 <runtime.timers+24>) at /usr/local/go/src/runtime/malloc.go:785
#3  0x000000000040df68 in runtime.newobject (typ=0x4a7fa0, ~r1=0x0)
    at /usr/local/go/src/runtime/malloc.go:820
#4  0x00000000004035d2 in runtime.makechan (t=0x492960, size=1, ~r2=0xc4200016c0)
    at /usr/local/go/src/runtime/chan.go:92
#5  0x00000000004840e4 in main.caller (call_in_c=0xc420056060, arg1=1, ~r2=0)
    at /root/go/src/chan/x.go:18
#6  0x00000000004846ac in main.main.func1 (call_in_c=0xc420056060) at /root/go/src/chan/x.go:62
#7  0x000000000044da21 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:2197
#8  0x000000c420056060 in ?? ()
#9  0x0000000000000000 in ?? ()


Ronald

Jun 20, 2017, 11:45:20 PM
to golang-nuts
PS: 
    go func() {
        for {
        }
    }()

The reason for the code above is that I want to test the channel's latency
when some 'indecent' goroutine is seizing several whole OS threads.

Dave Cheney

Jun 21, 2017, 12:55:23 AM
to golang-nuts
Yes, this is expected. Don't write infinite loops like that.

Ronald

Jun 21, 2017, 1:12:12 AM
to golang-nuts
I found this: https://github.com/golang/go/issues/7190.

Then I replaced:

    go func() {
        for {
        }
    }()

with:
 
    go func() {
        ct := 0
        t1 := time.Now().UnixNano()
        for {
            ct++
            if ct%600000000 == 0 {
                t2 := time.Now().UnixNano()
                fmt.Println("hooray", t2, t1, float64(t2-t1)/1e6)
                t1 = t2
            }
        }
    }()

Thus the program does not get stuck anymore -- but it is thrashing
and churning! (With a delay of nearly 100-200 ms, depending on the
value '600000000' and the speed of your computer.)

So I 'upgraded' the code again:

    go func() {
        ct := 0
        t1 := time.Now().UnixNano()
        for {
            ct++
            if ct%100000 == 0 {
                runtime.Gosched()
            }
            if ct%600000000 == 0 {
                t2 := time.Now().UnixNano()
                fmt.Println("hooray", t2, t1, float64(t2-t1)/1e6)
                t1 = t2
            }
        }
    }()

Then it looks just fine. But a pretty serious question emerged:

    P1:

        No matter how large GOMAXPROCS is, if you run a very
        CPU-intensive goroutine, your system will definitely start
        thrashing and churning, and the delay depends on
        how intensive your Go code is.

Is proposition P1 always true in Go so far?

If P1 is true, since GOMAXPROCS is bigger than 1, can't the scheduler
move the waiting work (including pending goroutines and other core
business) to other idle threads when some 'indecent' goroutine is seizing
the few busiest threads?

Ronald

Jun 21, 2017, 1:15:59 AM
to golang-nuts
Hi Dave, thanks for your reply :)

Could you please have one more look at this?

Looking forward to hearing your answer,
thanks a lot.

Ronald

Jun 21, 2017, 1:30:03 AM
to golang-nuts
If proposition P1 is always true in Go, then we could
deduce that Go is not good at CPU-intensive applications.
Of course we could offload the jobs to cgo, but then there
would be no consistency at all.

Dave Cheney

Jun 21, 2017, 6:51:49 AM
to golang-nuts
Go is not good for functions which never call other functions, take locks, read or write data, or communicate with others. In practice, most code does one or more of these things, so it's rarely an issue.

Ronald

Jun 21, 2017, 7:42:14 AM
to golang-nuts
On Wednesday, 21 June 2017 12:55:23 UTC+8, Dave Cheney wrote:
Yes, this is expected. Don't write infinite loops like that.

Sorry Dave, but I still can't get why it would get 'stuck'
with the very first piece of code.

Of course it would be pretty reasonable for the Go
process to get 'stuck' if GOMAXPROCS is 1.

But if GOMAXPROCS is bigger than 1, it's unreasonable
for the process to get stuck, because we still have plenty of
idle 'GOMAXPROCS' left.

PS:

This is working as intended. 
time.Sleep is defined as it will fire _after_ the duration passed to the function, so
technically this is not a bug.

To be honest, I don't know what you exactly mean. "time.Sleep is defined as it will
fire _after_ the duration passed to the function" -- this is right of course. But how do
you reach the conclusion "so technically this is not a bug"?
 

Ronald

Jun 21, 2017, 7:50:35 AM
to golang-nuts

Replace:

    go func() {
        for {
        }
    }()

with:
 
    go func() {
        ct := 0
        t1 := time.Now().UnixNano()
        for {
            ct++
            if ct%600000000 == 0 {
                t2 := time.Now().UnixNano()
                fmt.Println("hooray", t2, t1, float64(t2-t1)/1e6)
                t1 = t2
            }
        }
    }()

But how would you explain this situation?
This piece of code is not a "dead spin"; it is just CPU-intensive.

Thus the program does not get stuck anymore -- but the whole process
is thrashing and churning! (With a delay of nearly 100-200 ms,
depending on the value '600000000' and the speed of your computer.)

Is this a big flaw in Go when fulfilling some CPU-intensive work?

Jan Mercl

Jun 21, 2017, 8:09:23 AM
to Ronald, golang-nuts
On Wed, Jun 21, 2017 at 1:50 PM Ronald <yunth...@gmail.com> wrote:

> for {
>         ct++
>         if ct%600000000 == 0 {
>                 t2 := time.Now().UnixNano()
>                 fmt.Println("hooray", t2, t1, float64(t2-t1)/1e6)
>                 t1 = t2
>         }
> }
> This piece code is not "dead spin", it is just cpu intensive. 

This piece of code prohibits cooperative scheduling in 599 999 999 out of 600 000 000 iterations. It's a nice example of how not to code in Go. In particular, delays should be implemented using the functions in the time package, not by burning CPU cycles at the expense of other useful work that could have been done instead.

--

-j

Sen Han

Jun 21, 2017, 8:49:59 AM
to Jan Mercl, golan...@googlegroups.com
This piece of code prohibits cooperative scheduling in 599 999 999 out of 
600 000 000 times. It's a nice example of how not to code in Go.

Thanks Jan for your reply :)
 
Particularly, delays should be implemented using the functions in the time
package, not by burning CPU cycles at the expense of other useful work 
that could have been done instead.
 
Sure, this kind of dead loop is just a simulation of some really CPU-intensive jobs.
And I think for those kinds of jobs, a delay in the range [20 ms, inf) is quite ordinary.
Thus I think that kind of writing is acceptable :D

Ronald

Jun 21, 2017, 9:06:12 AM
to golang-nuts, 0xj...@gmail.com
And I have written a test case as proof that:

the current Go scheduler is not
good at CPU-intensive applications

Here it is:

package main

import "fmt"
import "runtime"

//import "math/big"
import "time"

/*
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

int deadloop() {
	long long ct = 0;
	struct timespec start, stop;
	clock_gettime(CLOCK_MONOTONIC_RAW, &start);
	while (1) {
		ct++;
		if (ct % 800000000 == 0) {
			clock_gettime(CLOCK_MONOTONIC_RAW, &stop);
			printf(
				"cgo-> ct:%lld | c loop delay:%ld ms\n",
				ct,
				((stop.tv_sec - start.tv_sec) * 1000000 + (stop.tv_nsec - start.tv_nsec) / 1000) / 1000);
			clock_gettime(CLOCK_MONOTONIC_RAW, &start);
		}
	}
	return rand();
}
*/
import "C"

type callRet struct {
	ret int
}

type callIn struct {
	ret_chan chan *callRet
	arg1     int
}

var delta_t_ms float64

func caller(call_in_c chan *callIn, arg1 int) int {
	ret_c := make(chan *callRet, 1)
	ci := callIn{ret_c, arg1}
	t1 := time.Now().UnixNano()
	call_in_c <- &ci
	ret := <-ret_c
	t2 := time.Now().UnixNano()
	_, _ = t1, t2
	if float64(t2-t1)/1e6 > delta_t_ms {
		delta_t_ms = float64(t2-t1) / 1e6
	}
	//fmt.Println(t2, t1, float64(t2-t1)/1e6)
	return ret.ret
}

func call_srv(call_in_c chan *callIn) {
	//runtime.LockOSThread()
	ct := 0
	for {
		select {
		case in := <-call_in_c:
			ret_c := in.ret_chan
			ret := callRet{3 + in.arg1}
			ret_c <- &ret
			ct++
			if ct%10000 == 0 {
				fmt.Println("channel_op_ct:", ct, " max_channel_op_delay:", delta_t_ms, "ms")
			}
			//default:
			//	time.Sleep(1 * time.Millisecond)
		}
	}
	//runtime.UnlockOSThread()
}

func init() {
	//runtime.LockOSThread()
}

func main() {
	p := fmt.Println
	runtime.GOMAXPROCS(2)

	call_in_c := make(chan *callIn, 2000)

	fp := func(call_in_c chan *callIn) {
		ct := 0
		for ; ct < 10*60000; ct = ct + 1 {
			caller(call_in_c, 1)
			time.Sleep(time.Microsecond / 1000)
		}
		p("done:)")
	}
	go fp(call_in_c)
	//go fp(call_in_c)
	go func() {
		for {
			fmt.Println("heartbeat... ", "max_channel_op_delay:", delta_t_ms, "ms")
			time.Sleep(1000 * 1000 * time.Microsecond)
		}
	}()

	godeadloop := func() {
		ct := 0
		t1 := time.Now().UnixNano()
		for {
			ct++
			if ct%360000000 == 0 {
				t2 := time.Now().UnixNano()
				fmt.Println("hooray", t2, t1, float64(t2-t1)/1e6)
				t1 = t2
			}
			if ct%360000000 == 0 {
				runtime.Gosched()
			}
		}
	}
	_ = godeadloop
	_ = C.deadloop
	go C.deadloop()
	//go godeadloop()    // <<---------
	call_srv(call_in_c)
	return
}

After `go run`, it yields:

CPU intensive code in cgo:
 
heartbeat...  max_channel_op_delay: 0.81509 ms
channel_op_ct: 590000  max_channel_op_delay: 0.81509 ms
channel_op_ct: 600000  max_channel_op_delay: 0.81509 ms
done:)
cgo-> ct:3200000000 | c loop delay:1039 ms
heartbeat...  max_channel_op_delay: 0.81509 ms

And after changing 'go C.deadloop()' to 'go godeadloop()' (at the very
bottom of the code) it yields:

CPU intensive code in pure go code:
 
hooray 1498049095239314392 1498049094319306332 920.00806 ms (go dead loop)
channel_op_ct: 600000  max_channel_op_delay: 970.921145 ms
done:)
heartbeat...  max_channel_op_delay: 970.921145 ms
hooray 1498049096158299639 1498049095239314392 918.985247

You see! The latency jumps from 0.8 ms to 970 ms when the CPU-intensive
code is in Go, compared to offloading it to cgo directly. And this kind of penalty
could definitely cause instability in a Go application that has some really
massive work to handle in pure Go code.

Chris Hopkins

Jun 21, 2017, 9:12:49 AM
to golang-nuts, 0xj...@gmail.com

Particularly, delays should be implemented using the functions in the time
package, not by burning CPU cycles at the expense of other useful work 
that could have been done instead.
 
Sure this kinda of dead loop is just an simulation about some really CPU intensive jobs.
And I think for those kind of job, the delay about range [20 ms- inf ms) is quite ordinary.
Thus I think that kind of writing is acceptable :D

It seems really pointless, if you've got lots of work to do, to have to sit in a delay.
Would not a call to runtime.Gosched() be more efficient? That way you accept that you yield to other procs, but still allow yourself to be rescheduled immediately if there is nothing else in the queue.

Ronald

Jun 21, 2017, 9:37:59 AM
to golang-nuts, 0xj...@gmail.com


It seems really pointless if you've got lots of work to do to have to sit in a delay.

Some CPU-intensive applications do indeed sit there and do massive mathematical calculation and memory manipulation. There are too many examples; say, some de/en-coding worker.
 
 Would not a call to runtime.Gosched() be more efficient? That way you accept that you yield to other procs, but still allow yourself to be rescheduled immediately if there is nothing else in the queue.

That's the point.

The more runtime.Gosched() calls you put into your
CPU-intensive Go code, the lower the latency can be, but the
CPU cost of the core Go runtime increases unwantedly at the
same time (from the application's point of view, the cost of
the core runtime is the less the better).
So there is an unresolved conflict.

But as there are still many GOMAXPROCS left idle, I
cannot see any reason for the Go runtime to suspend all the
other goroutines just because one 'indecent' CPU-intensive
goroutine is fully occupying a single OS thread. There must be
something that could be done inside the Go scheduler to optimize
such scenarios.

Jesper Louis Andersen

Jun 21, 2017, 9:41:43 AM
to Ronald, golang-nuts
On Wed, Jun 21, 2017 at 5:40 AM Ronald <yunth...@gmail.com> wrote:
Hi, I found when running this code below (play):

You are looking at a property of a specific Go implementation, not a property of the language.

Any system must define a set of "preemption points" at which you can switch out one process for another. Historically Go only preempted on channel communication and (blocking) system calls. Later it also added preemption on function calls (it piggybacks on stack checks). Adding more preemption requires you to add checks inside loops so a loop eventually gets preempted, but the caveat is that this might slow your program down when you have a tight loop for which the compiler can prove no eventual bound.

The current best workaround is to make sure that lengthy work periodically hits a preemption point some way or the other. Later versions of Go might solve this problem by adding those checks itself automatically to your programs to improve their latencies.

Other concurrent languages use different methods:

Erlang preempts like Go does currently, but since loops are implemented with recursion, you are sure to hit a preemption point every time you go round the loop.

Haskell preempts on memory allocation and Haskell programs tend to allocate memory quite frequently. The exception is tight (fusioned) loop code on arrays.

Go is doing a *far* better job here than almost every other programming language out there when it comes to fair scheduling. And there is a somewhat simple workaround: just add a function call occasionally depending on your required latency bound.



Sen Han

Jun 21, 2017, 10:21:05 AM
to golan...@googlegroups.com
> just add a function call occasionally depending on your required latency bound

Thanks Jesper for all your detailed explanation. I quite
agree with your point of view and conclusion.

One thing left to discuss here is the Go
scheduler's unwanted behaviour.

Unwanted behaviour of the scheduler:

After running the code at the top of this thread with `go run`, we get:

$ time go run na.go
heartbeat...
1000
2000
3000
4000
5000
6000
7000
8000
9000
10000
11000
heartbeat...
12000
13000
heartbeat...
14000
15000
16000
17000
18000
19000
heartbeat...
^Csignal: interrupt
real    0m0.779s
user    0m0.720s
sys     0m0.152s

(The process is stuck here forever; no new output appears.)

As runtime.GOMAXPROCS is 5 now, why can't the scheduler just
move the several goroutines that are already pending to the 4
remaining idle 'slots' and let them run, instead of suspending
forever? I think there is probably some design flaw here.


Chris Hines

Jun 21, 2017, 3:30:05 PM
to golang-nuts

David Chase

Jun 21, 2017, 3:36:43 PM
to golang-nuts
Try compiling your code with -gcflags -d=ssa/insert_resched_checks/on .
You should see the scheduler thrashing stop, at the expense of some loop overhead.
(The simpler the loop, the worse the overhead, that is why this phase is not normally enabled.
We're working on the overhead issues, they've not been easy so far.)

Keith Randall

Jun 21, 2017, 8:37:41 PM
to golang-nuts
The reason why the other goroutines can't run is garbage collection.
Because there is one goroutine that's not responding to the runtime's stop request, the runtime can't finish the garbage collection. That outstanding goroutine might have pointers to otherwise unreachable objects.  The runtime can't look at the pointers a goroutine has until it can get the goroutine to stop.
The runtime can't let the other goroutines keep running at this point.  It needs to finish the garbage collection so that it can reclaim memory. Otherwise your program's pathological behavior would switch from hanging to OOMing.

Ronald

Jun 25, 2017, 8:34:22 AM
to golang-nuts
Thanks to all of you :)

Recently I implemented a sub-option in `go tool trace` named "diagreedy",
which diagnoses and finds the greediest goroutines. This tool has already
helped us track down several deeply hidden problems in our Go applications
and achieve more stable and shorter GC pause latency by fixing them afterwards.

If you are interested in this little tool, see the repo and the Go proposal issue.

All the best