Intercepting goroutine context switching

mili...@gmail.com

Feb 9, 2019, 5:02:51 PM

to golang-nuts

I am looking at fine-grained calling-context collection for Go programs (for all goroutines) using binary instrumentation tools such as Intel Pin.

In a traditional language such as C/C++, intercepting CALL/RET instructions (plus some special handling for exceptions and setjmp/longjmp) suffices.

Go makes it significantly more complicated.

In Go, the scheduler can context-switch from one goroutine to another (including for garbage collection, etc.).

The scheduler adjusts the stack pointer and program counter during these events; the code for this (on x86) is mostly in this file: https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s

Is there a Go runtime expert who can authoritatively confirm whether all the goroutine context-switching code is in this file, or whether there are other places too?

It would also be great if somebody could confirm whether saving the current goroutine state into gobuf_sp, gobuf_pc, gobuf_g, gobuf_ctxt, and gobuf_ret, restoring a new one, and jumping to the new gobuf_pc is the standard context-switching idiom. Is any other idiom used, such as overwriting a caller's return address on the thread stack so that a return from a procedure jumps to a different context?

Thanks in advance for answering these details.

-Milind

robert engels

Feb 9, 2019, 5:17:45 PM

to mili...@gmail.com, golang-nuts

It is slightly more advanced than that, since there are multiple OS threads that the goroutines are multiplexed onto.

The easiest solution is to look at the ‘trace’ code, as it records the context switches.


robert engels

Feb 9, 2019, 5:29:35 PM

to Milind Chabbi, golang-nuts

It is runtime/trace/trace.go.

That is what the trace facility reports. It captures the assignment of goroutines to an OS thread (processor), and also the switching of goroutines on a logical processor (OS thread).

You will still need to use OS level facilities to determine the context switches of the OS threads.

The events are written to the trace file. You would need to modify the code and build your own runtime if you wanted to perform live intercepts.

On Feb 9, 2019, at 4:24 PM, Milind Chabbi <mili...@gmail.com> wrote:

Robert,
Pointers to the exact file would be quite useful. Also, is it procedure-level or callback-level interception?
I am looking at machine-instruction-level interception.

-Milind

Ian Lance Taylor

Feb 9, 2019, 5:34:36 PM

to mili...@gmail.com, golang-nuts

On Sat, Feb 9, 2019 at 2:02 PM <mili...@gmail.com> wrote:
>
> I am looking at fine-grained calling context collection for go lang programs (for all go routines) using binary instrumentation tools such as Intel Pin.
> In a traditional language such as C/C++ intercepting CALL/RET instructions (plus some special handling for exceptions setjmp/longjmp) suffices.
> Go makes it significantly more complicated.
>
> For Go, the scheduler can context switch from one goroutine to another (including garbage collection etc.).
> The scheduler adjusts the stack pointer and program counter during these events, which (for x86) is mostly in this file: https://github.com/golang/go/blob/master/src/runtime/asm_amd64.s
> Is there a go runtime expert, who can authoritatively confirm whether all the go routine context switching code is captured in this file or if there are other places too?

Yes, for amd64 all the goroutine switching code is in runtime/asm_amd64.s.

> It would also be great if somebody can confirm whether saving the current go routine state into gobuf_sp, gobuf_pc, gobuf_g, gobuf_ctxt, gobuf_ret and restoring a new one and jumping to the new gobuf_pc is the standard context switching idiom? Is there use of any other idiom such as overwriting the return address of a caller on the thread stack to jump to a different context during a return from a procedure?

Yes, that is the standard idiom for switching goroutines, as seen in
the gosave, gogo, and mcall functions. Also systemstack arguably
changes goroutines, though only to the g0 of the current thread.
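As a rough illustration of that idiom, here is a toy model in Go. It is not the runtime's code: the real routines are assembly in runtime/asm_amd64.s and also handle fields such as ctxt and ret. The `cpu` type and the `save` and `restore` helpers are made up for this sketch; only the field names sp, pc, and g follow the gobuf fields mentioned above.

```go
package main

import "fmt"

// gobuf is a toy stand-in for the saved goroutine state discussed above.
// The real runtime stores a g pointer; an integer id is used here instead.
type gobuf struct {
	sp uintptr // saved stack pointer
	pc uintptr // saved program counter
	g  int     // id of the goroutine this state belongs to
}

// cpu is a hypothetical model of the registers of one OS thread.
type cpu struct {
	sp, pc uintptr
	g      int
}

// save copies the running state into buf (the "save" half of the idiom).
func save(c *cpu, buf *gobuf) {
	buf.sp, buf.pc, buf.g = c.sp, c.pc, c.g
}

// restore loads a saved state and "jumps" to its pc (the "restore" half).
// Note that no CALL or RET is involved: sp and pc are simply replaced.
func restore(c *cpu, buf *gobuf) {
	c.sp, c.pc, c.g = buf.sp, buf.pc, buf.g
}

func main() {
	c := &cpu{sp: 0x1000, pc: 0x400100, g: 1}
	var g1 gobuf
	g2 := gobuf{sp: 0x2000, pc: 0x400200, g: 2}

	save(c, &g1)    // park goroutine 1
	restore(c, &g2) // resume goroutine 2
	fmt.Printf("running g%d at pc=%#x sp=%#x\n", c.g, c.pc, c.sp)

	restore(c, &g1) // later, resume goroutine 1 where it left off
	fmt.Printf("running g%d at pc=%#x sp=%#x\n", c.g, c.pc, c.sp)
}
```

For an instruction-level tool, the relevant point is that the switch shows up as a stack-pointer load followed by an indirect jump rather than as a CALL/RET pair.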

The runtime does overwrite the PC in order to panic from a signal
handler, in sigctxt.preparePanic, but not for goroutine switching.

As Robert Engels says, tracking traceGoStart might be useful, though
you do have to have tracing enabled.
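For reference, a minimal sketch of enabling the runtime's execution tracer from inside a program, using the public runtime/trace package. The `captureTrace` helper is a name invented for this example, and the trace is written to an in-memory buffer only to keep the sketch self-contained; a file is the more usual destination.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/trace"
	"sync"
)

// captureTrace runs work with the execution tracer enabled and returns
// the raw trace bytes. It is a hypothetical helper for this example.
func captureTrace(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err // for example, tracing is already enabled
	}
	work()
	trace.Stop() // returns only after all trace writes have completed
	return buf.Bytes(), nil
}

func main() {
	data, err := captureTrace(func() {
		// Start a few goroutines so the scheduler has something to record.
		var wg sync.WaitGroup
		for i := 0; i < 4; i++ {
			wg.Add(1)
			go func() { defer wg.Done() }()
		}
		wg.Wait()
	})
	if err != nil {
		fmt.Println("could not start tracing:", err)
		return
	}
	fmt.Printf("captured %d bytes of trace data\n", len(data))
}
```

When the trace is written to a file instead, it can be inspected afterwards with `go tool trace`. This requires changing or wrapping the program, so it does not help with unmodified binaries.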

Ian (please reply to the mailing list, not just to me; thanks)

robert engels

Feb 9, 2019, 5:37:56 PM

to Milind Chabbi, golang-nuts

You also need to look at internal/trace to see how the runtime logs the trace events related to the goroutines; that will show you where you need to intercept.

Milind Chabbi

Feb 9, 2019, 8:08:10 PM

to Ian Lance Taylor, golang-nuts

Ian,

Can you give me more details about these terms: "g0", "systemstack", and "mcall"?

Robert: the trace route does not work for me. I am profiling, and tracing is too expensive because of the logging. Furthermore, I don't want to recompile the application or change the Go runtime.

-Milind

robert engels

Feb 9, 2019, 8:39:59 PM

to Milind Chabbi, Ian Lance Taylor, golang-nuts

Milind,

Understood. A few things to think about: 1) The tracing seems pretty efficient in my tests, although if you don’t want a custom runtime, you’ll need to post-analyze the trace file. 2) I think you are going to have a hard time doing what you are trying to do because of the shared threads. You won’t know what is being executed just by recording the location of the context switch; you’ll need to inspect the associated ‘current g’ structure, which can differ depending on the version of Go the program was compiled with.

pprof faces a similar problem. Since it only records ‘time in function’ and reports for the combined process rather than per goroutine, a goroutine that locks an OS thread and makes a system call that blocks looks expensive from the outside, when in reality it is not doing anything.

Milind Chabbi

Feb 9, 2019, 9:11:29 PM

to robert engels, Ian Lance Taylor, golang-nuts

Robert,

Tracing is not possible: instruction-level tracing would log trillions of instructions.

Let me elaborate a bit more.

The kind of tool I have in mind intercepts every instruction executed on all OS threads in the Go process.

When my tool detects an "interesting event", I need to record the calling context, that is, main()->func1()->func2().

Call stack unwinding is too expensive in my case, since the "interesting events" happen too frequently.

It is much cheaper to maintain a running per-goroutine shadow stack by tracking call-ret sequences as the goroutine executes on an OS thread.

Since the Go scheduler switches from one goroutine stack to another, I need to do the following things (in addition to the call-ret tracking):

1. Precisely recognize the stack switch event on the current thread. Let the current Go stack be S1.

2. Save the shadow stack S1_shadow (analogous to the save sequence in the gosave routine) in metadata.

3. Precisely identify the other stack S2 the scheduler is switching to.

4. Search for S2_shadow in the metadata and switch to it (analogous to the restore sequence in the gogo routine).

-Milind

Robert Engels

Feb 9, 2019, 9:57:00 PM

to Milind Chabbi, Ian Lance Taylor, golang-nuts

I guess it depends on what you are logging/doing in response to the interesting event, and where/when these occur, and how you are detecting them, as the tracing of the context switch is very cheap.

Ian Lance Taylor

Feb 11, 2019, 4:41:23 PM

to Milind Chabbi, golang-nuts

On Sat, Feb 9, 2019 at 5:07 PM Milind Chabbi <mili...@gmail.com> wrote:
>
> Can you give me more details about these terms: "g0", "systemstack", and "mcall"?

In Go, goroutines are multiplexed on to threads, so there are
typically many more goroutines than there are threads. Each thread
has an associated goroutine that is used when entering the runtime
scheduler. That goroutine is called the g0 goroutine for the thread.
The g0 goroutine has a larger stack, namely the OS-allocated stack for
the thread. systemstack is a function in the runtime package which
invokes a closure on the current thread's g0 goroutine. mcall is a
function in the runtime package which calls a function on the current
thread's g0 goroutine, passing the current g; the invoked function
must never return, but must instead continue execution by calling
gogo.
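The many-goroutines-on-few-threads point can be observed from ordinary Go code with a small sketch like the one below. The g0 goroutines themselves are internal to the runtime and are not visible this way, and `goroutinesWhileBlocked` is a helper name made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// goroutinesWhileBlocked starts n goroutines that block on a channel,
// samples the number of live goroutines, then releases them and waits
// for them to finish.
func goroutinesWhileBlocked(n int) int {
	release := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // park until the channel is closed
		}()
	}
	live := runtime.NumGoroutine() // includes the calling goroutine
	close(release)
	wg.Wait()
	return live
}

func main() {
	fmt.Println("live goroutines:", goroutinesWhileBlocked(1000))
	// GOMAXPROCS(0) reports, without changing it, how many threads may
	// execute Go code at the same time.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}
```

On a typical machine the first number is in the thousands while the second is the number of CPU cores, which is why a per-thread view alone cannot attribute work to a goroutine.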

> Robert: taking the trace route is incorrect for me; I am profiling, tracing is too expensive because of logging, furthermore, I don't want to recompile the application or change the go runtime.

Note that the tracing we are talking about is the tracing built into
the Go runtime, not the tracing done by the perf tool.

Ian
