What happens under the hood when a goroutine is spawned? Will there be any difference with and without GOMAXPROCS being set?

DhilipKumar Sankaranarayanan

Feb 12, 2015, 3:11:45 AM
to golan...@googlegroups.com
Hi All,

I'm curious to know what happens under the hood when we start a goroutine.

For instance, take a Go program that simply creates goroutines, each of which sleeps:

package main

import (
	"fmt"
	"time"
	// "runtime"
)

// testgoroutine blocks in time.Sleep, so it is not runnable while it waits.
func testgoroutine() {
	time.Sleep(1000 * time.Second)
}

func main() {
	// runtime.GOMAXPROCS(32)

	for i := 0; i < 100; i++ {
		fmt.Printf("Creating goroutine %d\n", i)
		go testgoroutine()
		time.Sleep(1 * time.Millisecond)
	}
}

First, when I run it with the GOMAXPROCS call commented out, it appears to create only one thread (for GC?):

$strace ./test_goroutine  2>&1 | grep clone
clone(child_stack=0xc208030000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55559

On the other hand, when I uncomment it and run the program, I expect it to create 32 threads (matching my GOMAXPROCS setting):

$ strace ./test_goroutine 2>&1 | grep clone
clone(child_stack=0xc208030000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55692
clone(child_stack=0xc20802e000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55693
clone(child_stack=0xc208032000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55695
clone(child_stack=0xc20806e000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55696
clone(child_stack=0xc208070000, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD) = 55697


However, it creates only 4 threads (+1 for GC). So what happens to these 100 goroutines? Are they multiplexed among the 4 threads? And why only 4 threads instead of 32? The system I'm testing on does have 32 cores:

$cat /proc/cpuinfo | grep processor | wc -l
32

Regards,
Dhilip

Dmitry Vyukov

Feb 12, 2015, 3:30:07 AM
to DhilipKumar Sankaranarayanan, golang-nuts
First, you need to run strace with the -f flag so that it also traces the threads the process creates.
The runtime creates threads lazily, as necessary, and it can create additional threads when existing threads block in syscalls. So the number of threads can be either less than or greater than GOMAXPROCS.



Dave Cheney

Feb 13, 2015, 6:44:00 AM
to golan...@googlegroups.com
To give a roundabout answer to your question: the Go runtime schedules a (potentially) large number of goroutines onto a smaller number of system threads. If you were around in the '90s, before Java grew real threads and back when m:n user-space threading was still a thing on Linux, this was known as green threads.

The component of the runtime responsible for this is the scheduler. The scheduler knows the presence and state of all goroutines, and the state is the important part. Goroutines generally have two states: runnable and blocked. A goroutine is created in the runnable state and stays that way until it performs a blocking operation. Blocking operations are syscalls, channel sends and receives, and timer events like time.After or time.Sleep. As of Go 1.3, goroutines can also be blocked by the garbage collector when it is trying to stop the world.

In any significant Go program there are always more goroutines than threads, but usually the number of runnable goroutines is relatively small. Although this is an oversimplification, when a goroutine becomes blocked it calls into the scheduler, thereby yielding the thread that hosts it. The scheduler, operating in the context of the thread whose goroutine just blocked, then chooses a goroutine from the list of runnable goroutines, assumes its context, and continues executing (until that goroutine eventually blocks). If there is no runnable goroutine when a thread returns to the scheduler, the thread sits and waits.

So that's how running goroutines works. Now to your example.

You effectively have a program which starts one goroutine per millisecond. Because you have set GOMAXPROCS > 1, at the point that the go statement completes you have one goroutine executing the main loop and a new runnable goroutine waiting for a thread to pick it up. Whenever there is a runnable goroutine waiting in the scheduler _and_ there is no thread waiting to be assigned to it, the scheduler will create a new system thread to run that goroutine, up to a limit, and that limit is GOMAXPROCS.

So, you have a process which is spawning new goroutines at a relatively slow rate compared to the execution speed of your machine, and the work that each new goroutine does is very minor: it calls time.Sleep and then blocks, returning its thread to the scheduler. Remember, a goroutine calling time.Sleep is not runnable, so it occupies no system thread.

Once your main loop has run a few times there will be a few threads, as you see. But because the rate of spawning goroutines is relatively low, those threads can always run the new goroutine and then come back to the scheduler before the next goroutine is started, which is why I believe the number of threads observed from outside your program never grows.

You can experiment with this by doing two things, both of which you should not do in a real program.

1. Remove the time.Sleep in the main loop and see what effect that has on the number of threads. I suspect it will grow, but not to 32.
2. Replace time.Sleep in the child goroutine with something that consumes CPU time but does not block; for {} will do. Again, never do this in real code. You'll get 32 cores of heat, but if you kill the program with SIGQUIT you'll see that while 32 goroutines are runnable, there will probably be only 35-40 in total, not the 100 you requested. This is explainable: if all 32 (GOMAXPROCS) system threads are spinning, there is no thread free to run the main loop and create new goroutines.

DhilipKumar Sankaranarayanan

Feb 16, 2015, 12:55:10 AM
to golan...@googlegroups.com
Thanks Dave and Dmitry for the responses,

Special thanks to Dave for such an elaborate response. I had tried that experiment of creating 32 goroutines that simply do "for {}"; as you said, the process (scheduler) did not have any time for the 'main' function. One more thing: does that mean there is only one scheduler and one garbage collector per process, regardless of the number of operating system threads? I had somehow got the impression that we would have more than one scheduler and GC per process.

Regards,
Dhilip

Dmitry Vyukov

Feb 16, 2015, 2:13:51 AM
to DhilipKumar Sankaranarayanan, golang-nuts
On Mon, Feb 16, 2015 at 8:55 AM, DhilipKumar Sankaranarayanan
<s.dhil...@gmail.com> wrote:
> Thanks Dave and Dmitri for the response,
>
> Special thanks to Dave for such an elaborate response. I had tried that
> experiment of creating 32 goroutines simply doing "for {}" as you said
> process(scheduler) did not have any time for 'main' function. One more
> thing does that mean there is only one scheduler and one garbage collector
> per process regardless of the number operating system threads? I had
> somehow got an impression we will have more than one scheduler and GC /
> process.

There is no scheduler process per se (and having one is usually a bad idea).
Each goroutine executes scheduler code when it needs to (e.g. to block,
unblock somebody else, or create a new goroutine).
The GC has its own processes. We used to have a number of them (parallel GC).
I think we now have only one on tip.

DhilipKumar Sankaranarayanan

Feb 17, 2015, 1:41:33 AM
to golan...@googlegroups.com, s.dhil...@gmail.com
Thanks again. :-)