CPU fairness in goroutine scheduling

Dmitry Orlov

Jun 16, 2016, 2:45:47 PM
to golang-nuts
Hello golang experts,

I am curious how the goroutine scheduler picks which goroutine to run when several are runnable. Does it optimize for fairness in any way?

I ran a quick experiment and found that goroutines that run for longer intervals between yield points receive a proportionally larger CPU share.
In the following code:
package main

import (
    "fmt"
    "runtime"
    "time"
)

func run(n int) int {
    ret := 1
    for i := 0; i < n; i++ {
        ret *= i
    }
    return ret
}

func foo(ch chan float64, n, m int) {
    f := 0
    var run_time, wait_time, start_time, end_time int64
    for i := 0; i < m; i++ {
        start_time = time.Now().UnixNano()
        if end_time != 0 {
            wait_time += start_time - end_time
        }
        f += run(n)
        end_time = time.Now().UnixNano()
        run_time += end_time - start_time
        runtime.Gosched()
    }
    ch <- float64(run_time) / float64(run_time+wait_time)
}

func main() {
    ch := make(chan float64)
    go foo(ch, 100000, 100)
    go foo(ch, 1000000, 100)

    v := <-ch
    fmt.Printf("first: %f\n", v)
    v = <-ch
    fmt.Printf("second: %f\n", v)
}

Here, one goroutine gets about 10x the CPU time of the other.
$ GOMAXPROCS=1 ./ss 
first: 0.911957
second: 0.091656

Does this test expose the scheduler's CPU policy correctly, or is it biased? What is the best reading on the scheduler's policies?

Thank you!
Dmitry.

Ian Lance Taylor

Jun 16, 2016, 3:13:56 PM
to Dmitry Orlov, golang-nuts
On Thu, Jun 16, 2016 at 11:27 AM, Dmitry Orlov
<dmitry...@mixpanel.com> wrote:
>
> I am curious how the goroutine scheduler picks which goroutine to run when
> several are runnable. Does it optimize for fairness in any way?

The current scheduler does not optimize for fairness. Of course, the
scheduler has changed in the past and it will change in the future.

The current scheduler (approximately) associates goroutines with
threads. When a thread has to choose a new goroutine to run, it will
preferentially choose one of its associated goroutines. If it doesn't
have any ready to run, it will steal one from another thread.


> I ran a quick experiment and found that goroutines that run for longer
> intervals between yield points receive a proportionally larger CPU share.

Yes, that is what I would have guessed from the current scheduler.


> Does this test expose the scheduler's CPU policy correctly, or is it biased?
> What is the best reading on the scheduler's policies?

The comment at the top of runtime/proc.go and https://golang.org/s/go11sched.


It's a more or less understood design goal that the goroutine
scheduler is optimized for network servers, where each goroutine
typically does some relatively small amount of work followed by
network or disk I/O. The scheduler is not optimized for goroutines
that do a lengthy CPU-bound computation. We leave the kernel
scheduler to handle those, and we expect that programs will set
GOMAXPROCS to a value larger than the number of lengthy CPU-bound
computations they expect to run in parallel. While I'm sure we'd all
be happy to see the scheduler do a better job of handling CPU-bound
goroutines, that would only be acceptable if there were no noticeable
cost to the normal case of I/O-bound goroutines.
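
To make that concrete, here is a minimal sketch of the kind of setup described above; the cpuBound count and the GOMAXPROCS margin are illustrative assumptions, not values from this thread:

package main

import "runtime"

// cpuBound is a hypothetical number of lengthy CPU-bound computations the
// program expects to run in parallel (an assumption, for illustration only).
const cpuBound = 4

func main() {
    // Give the runtime more OS threads than there are CPU-bound goroutines,
    // so I/O-bound goroutines can still find a free thread and the kernel
    // scheduler time-shares the busy ones.
    runtime.GOMAXPROCS(cpuBound + runtime.NumCPU())

    // ... start CPU-bound and I/O-bound goroutines here ...
}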

Ian

Dmitry Orlov

Jun 17, 2016, 3:24:20 PM
to Ian Lance Taylor, golang-nuts
Thanks Ian,

This is a good model for network servers. What we found about some of our traffic shapes is that most queries (to one server) have roughly the same number of I/O-related yields, but the length of the CPU runs between those yields can vary widely.
If I understand it correctly, the Go scheduler instruments code with additional yield points not related to I/O. That can help fairness too.
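
A minimal sketch of such an explicit yield point, written as a hypothetical drop-in replacement for run() from the first message (the chunk size is arbitrary and for illustration only):

// runFair is a hypothetical variant of run() that yields every chunk
// iterations, so a goroutine with a large n no longer holds the thread for
// its entire run between I/O points. It requires "runtime" to be imported.
func runFair(n int) int {
    const chunk = 10000 // arbitrary yield interval
    ret := 1
    for i := 0; i < n; i++ {
        ret *= i
        if i%chunk == 0 {
            runtime.Gosched() // explicit yield point
        }
    }
    return ret
}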
 


Sam Whited

Jun 17, 2016, 3:48:00 PM
to Dmitry Orlov, Ian Lance Taylor, golang-nuts
On Fri, Jun 17, 2016 at 2:23 PM, Dmitry Orlov <dmitry...@mixpanel.com> wrote:
> If I understand it correctly, the Go scheduler instruments code with
> additional yield points not related to I/O. That can help fairness too.

It will sometimes pre-empt on function calls, but this can get a bit
hairy since you can't always be sure which function calls will be
inlined. This has caused problems for me before: I was adding lots of
CPU-intensive goroutines to the scheduler's queue, and it was context
switching so much that it never finished any of them (which I suppose
shouldn't have surprised me, but it would be an easy mistake for a new
user who expects CPU-intensive work not to yield).
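
A minimal sketch of that effect under GOMAXPROCS=1 (the loop bound is arbitrary, and the behaviour described applies to Go versions without asynchronous preemption):

package main

import (
    "fmt"
    "runtime"
)

func main() {
    // Force a single OS thread so the two goroutines must share it.
    runtime.GOMAXPROCS(1)

    go fmt.Println("other goroutine ran")

    // A tight loop with no function calls: the scheduler can only preempt
    // at function calls, so the goroutine above cannot run until this loop
    // finishes.
    sum := 0
    for i := 0; i < 2000000000; i++ {
        sum += i
    }
    fmt.Println("loop finished, sum =", sum)
    runtime.Gosched() // let the other goroutine print before main exits
}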

For example, see this (not-very-scientific) analysis of some bcrypt
benchmarks (TL;DR: they get faster if inlining is made much more
aggressive), which involve many thousands of function calls in a loop in
the blowfish block cipher:
http://bl.ocks.org/SamWhited/ebe4f5923526c0d9220f1b5b23b56eba

—Sam


--
Sam Whited
pub 4096R/54083AE104EA7AD3
https://blog.samwhited.com