runtime: split stack overflow

martin....@gmail.com

Aug 18, 2016, 11:35:21 AM
to golang-nuts
Hi,

I'm trying to replace a Go signal handler with a C signal handler and then call the stored Go handler from inside that C handler. It works well on Linux, but on Darwin I get a "runtime: split stack overflow" fatal error. Is that supposed to work on OS X?

Thanks!
Martin

Example program:

package main

/*
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static struct sigaction go_handler;

static void signal_handler(int sig, siginfo_t* info, void* ctx) {
    printf("called %d\n", sig);
    if (go_handler.sa_flags & SA_SIGINFO) {
        go_handler.sa_sigaction(sig, info, ctx);
    } else {
        go_handler.sa_handler(sig);
    }
}

void install_signal_handler() {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = signal_handler;
    action.sa_flags = SA_SIGINFO | SA_RESTART | SA_ONSTACK;
    sigemptyset(&action.sa_mask);
    sigaction(SIGSEGV, &action, &go_handler);
}
*/
import "C"

func init() {
	C.install_signal_handler()
}

func main() {
	var x *int
	*x = 0
}


Output on OSX:

called 11
runtime: newstack sp=0xc420009a40 stack=[0xc42004c000, 0xc42004dfc0]
morebuf={pc:0x7fff8ef2c52a sp:0xc420009a50 lr:0x0}
sched={pc:0x40340e0 sp:0xc420009a48 lr:0x0 ctxt:0x0}
runtime: gp=0xc4200001a0, gp->status=0x2
 runtime: split stack overflow: 0xc420009a40 < 0xc42004c000
fatal error: runtime: split stack overflow

runtime stack:
runtime.throw(0x406bdf3, 0x1d)
	/usr/local/go/src/runtime/panic.go:566 +0x95
runtime.newstack()
	/usr/local/go/src/runtime/stack.go:1004 +0x5f4
runtime.morestack()
	/usr/local/go/src/runtime/asm_amd64.s:366 +0x7f

goroutine 1 [running]:
runtime.sighandler(0xc42004df90, 0x0, 0x0, 0x0)
	/usr/local/go/src/runtime/signal_amd64x.go:44 fp=0xc420009a50 sp=0xc420009a48

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1



Ian Lance Taylor

Aug 18, 2016, 7:54:35 PM
to martin....@gmail.com, golang-nuts
On Thu, Aug 18, 2016 at 6:56 AM, <martin....@gmail.com> wrote:
>
> I'm trying to replace a Go signal handler by a C signal handler and try to
> call the stored Go handler inside that C handler. It works well on Linux,
> but on Darwin I receive a "runtime: split stack overflow" exception. Is that
> supposed to work on OSX?

It appears as though the signal handler is somehow not running on the
alternate signal stack. I don't know why that would be, though. I
have no other explanation. You'll have to debug it.

What you are doing seems rather dubious but I can't think of any
reason why it shouldn't work. I'll note that calling printf from a
signal handler is completely unsafe, but for this small program it
shouldn't matter.

Ian
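
To illustrate the remark about printf above: a minimal sketch of how the "called N" message could be produced using only write(2), which POSIX lists as async-signal-safe. The helper names (format_uint, build_message, safe_handler) are hypothetical and not part of the original program.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Write the decimal digits of v into buf (at most cap bytes), without stdio.
   Returns the number of bytes written. */
size_t format_uint(char *buf, size_t cap, unsigned int v) {
    char tmp[16];
    size_t n = 0;
    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0 && n < sizeof tmp);
    size_t len = n < cap ? n : cap;
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[n - 1 - i];
    return len;
}

/* Build "called <sig>\n" in msg and return its length. */
size_t build_message(char *msg, size_t cap, int sig) {
    static const char prefix[] = "called ";
    size_t n = 0;
    for (size_t i = 0; i < sizeof prefix - 1 && n < cap; i++)
        msg[n++] = prefix[i];
    n += format_uint(msg + n, cap - n, (unsigned int)sig);
    if (n < cap)
        msg[n++] = '\n';
    return n;
}

/* Handler that reports the signal using only async-signal-safe calls. */
void safe_handler(int sig, siginfo_t *info, void *ctx) {
    (void)info;
    (void)ctx;
    char msg[32];
    size_t len = build_message(msg, sizeof msg, sig);
    ssize_t ignored = write(STDERR_FILENO, msg, len);
    (void)ignored; /* nothing useful can be done about a failed write here */
}
```

The message is assembled in a stack buffer, so the handler touches no stdio state that the interrupted code might have been in the middle of updating.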

martin....@gmail.com

Aug 19, 2016, 5:34:14 AM
to golang-nuts, martin....@gmail.com
> It appears as though the signal handler is somehow not running on the
> alternate signal stack. I don't know why that would be, though. I
> have no other explanation. You'll have to debug it.

I can see from the stack pointers that the alternate stack is used on both Linux and Darwin. But I'm not familiar enough with the internals to debug why it calls morestack/newstack.

> What you are doing seems rather dubious but I can't think of any
> reason why it shouldn't work.

Could you briefly explain why you think it's dubious? I am calling a shared C library from Go that spawns its own threads and also calls back into Go. According to the signal package documentation, registering a C signal handler seems to be the only way to handle crashes in foreign threads. Since I want all non-Go threads to be dumped as well when a signal is raised in Go, I register the C handlers after the Go handlers. And instead of raising the signal again, I call the Go handler directly to keep the original mcontext. Do you think there's a better way to do it?

Thanks!
Martin
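
The approach described above (install a handler after an existing one, then forward to the saved handler) can be sketched in plain C as follows. This is only an illustration under assumptions: inner_handler stands in for the previously installed (Go) handler, SIGUSR1 is used instead of SIGSEGV so the demo can return normally, and all function names are hypothetical. Unlike the original example, the sketch also guards against the saved handler being SIG_DFL or SIG_IGN, which are not callable function pointers.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

static struct sigaction previous;          /* handler installed before ours */
static volatile sig_atomic_t outer_calls;  /* calls to our chaining handler */
static volatile sig_atomic_t inner_calls;  /* calls to the earlier handler */

/* Stand-in for the handler that was installed first. */
void inner_handler(int sig) {
    (void)sig;
    inner_calls++;
}

/* Our handler: do our own work, then forward to the saved handler. */
void chaining_handler(int sig, siginfo_t *info, void *ctx) {
    outer_calls++;
    if (previous.sa_flags & SA_SIGINFO) {
        previous.sa_sigaction(sig, info, ctx);
    } else if (previous.sa_handler != SIG_DFL && previous.sa_handler != SIG_IGN) {
        previous.sa_handler(sig);
    }
    /* For SIG_DFL a real program would typically restore the default
       disposition and re-raise; that is omitted in this sketch. */
}

/* Install chaining_handler for sig and remember what was there before. */
int install_chaining_handler(int sig) {
    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_sigaction = chaining_handler;
    action.sa_flags = SA_SIGINFO | SA_RESTART | SA_ONSTACK;
    sigemptyset(&action.sa_mask);
    return sigaction(sig, &action, &previous);
}

/* Returns outer_calls * 10 + inner_calls, or -1 on setup failure. */
int chaining_demo(void) {
    struct sigaction first;
    memset(&first, 0, sizeof first);
    first.sa_handler = inner_handler;
    sigemptyset(&first.sa_mask);
    if (sigaction(SIGUSR1, &first, NULL) != 0) return -1;
    if (install_chaining_handler(SIGUSR1) != 0) return -1;
    if (raise(SIGUSR1) != 0) return -1;
    return (int)(outer_calls * 10 + inner_calls);
}
```

Note that this only exercises the portable sigaction interface; it does not reproduce the Darwin-specific trampoline behaviour discussed later in the thread.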

Ian Lance Taylor

Aug 19, 2016, 11:10:59 AM
to Martin Strenge, golang-nuts
On Fri, Aug 19, 2016 at 2:33 AM, <martin....@gmail.com> wrote:
>> It appears as though the signal handler is somehow not running on the
>> alternate signal stack. I don't know why that would be, though. I
>> have no other explanation. You'll have to debug it.
>>
> I can see by the stack pointers, that the alt stack is used on both, Linux
> and Darwin. But I'm not familiar enough with the internals to debug why it
> calls morestack/newstack.

The call to `morestack` is happening because the function `sighandler`
thinks that the value of the stack pointer register is too close to
the end of the current stack segment. In this case, the error message
shows that the stack pointer is not in the current stack segment at
all. The stack pointer at that point should be the alternate signal
stack. The function `sigtrampgo` should have switched to the
`g.m.gsignal` goroutine, whose stack should have been set up to be the
alternate signal stack. So `sighandler` should never call into
`morestack`. Something has gone wrong but I don't know what.


>> What you are doing seems rather dubious but I can't think of any
>> reason why it shouldn't work.
>
> Maybe you can briefly explain why you think it's dubious. I am calling a
> shared C lib from Go, that spawns its own threads, and also calls back into
> Go. According to the signal package documentation, registering a C signal
> handler seems to be the only way to handle crashes in foreign threads. As I
> want all non-Go threads to be dumped as well in case of a signal raised in
> Go, I register the C handlers after the Go handlers. And instead of raising
> the signal again, I call the Go handler directly to keep the original
> mcontext. Do you think there's a better way to do it?

I'm not sure which part of the os/signal docs you are thinking of.

Crashes in Go code will work regardless of whether they are running on
threads started by C or not. So I assume you are talking about
crashes in C. How do you want to handle those crashes? Do you just
want to try to dump the stack? How do you want to handle other C
threads when one C thread crashes?

I do agree that your code should work in principle, and I'm not sure
why it doesn't.

If all you want to do is handle SIGSEGV when it occurs in a C thread,
it may work to call signal.Notify(c, syscall.SIGSEGV). The channel
will receive a signal when a SIGSEGV occurs in C code. At that point
it's not safe to continue, but it is safe to take whatever action you
like to dump C threads. I'm not sure this will work, because it
depends on what happens when the SIGSEGV signal handler returns to the
C code that triggered the SIGSEGV.

Ian
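
The question of whether a handler really runs on the alternate signal stack can be checked outside the Go runtime with a small C probe. The sketch below is an assumption-laden illustration, not the runtime's own check: it registers a 64 KiB alternate stack (ALT_STACK_SIZE is an arbitrary choice), installs a handler with SA_ONSTACK, and has the handler compare the address of a local variable with the registered range. All names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024) /* arbitrary size chosen for this sketch */

static stack_t alt_stack;
static volatile sig_atomic_t ran_on_alt_stack = -1; /* -1: handler not run yet */

/* A local variable lives on whichever stack the handler is running on,
   so its address tells us which stack was used. */
void probe_handler(int sig) {
    (void)sig;
    char marker;
    uintptr_t p = (uintptr_t)&marker;
    uintptr_t lo = (uintptr_t)alt_stack.ss_sp;
    uintptr_t hi = lo + alt_stack.ss_size;
    ran_on_alt_stack = (p >= lo && p < hi);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
   and -1 on setup failure. */
int alt_stack_demo(void) {
    alt_stack.ss_sp = malloc(ALT_STACK_SIZE);
    if (alt_stack.ss_sp == NULL) return -1;
    alt_stack.ss_size = ALT_STACK_SIZE;
    alt_stack.ss_flags = 0;
    if (sigaltstack(&alt_stack, NULL) != 0) return -1;

    struct sigaction action;
    memset(&action, 0, sizeof action);
    action.sa_handler = probe_handler;
    action.sa_flags = SA_ONSTACK; /* request delivery on the alternate stack */
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGUSR1, &action, NULL) != 0) return -1;
    if (raise(SIGUSR1) != 0) return -1;
    return ran_on_alt_stack;
}
```

The same address comparison is what reading the sp and stack=[...] values in the crash output amounts to: a stack pointer outside the expected range means the code is running on a different stack than the one the runtime believes is current.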

martin....@gmail.com

Aug 26, 2016, 9:35:34 AM
to golang-nuts, martin....@gmail.com
Hi,

It took a while to understand what's going on.
 
> I'm not sure which part of the os/signal docs you are thinking of.

I'm referring to the section "Go programs that use cgo or SWIG", last paragraph ("If the Go signal handler is invoked on a non-Go thread not running Go code [...]"). I couldn't get any information about the crash in C from the Go signal handler. It stayed silent and just quit the program.

> Crashes in Go code will work regardless of whether they are running on
> threads started by C or not.  So I assume you are talking about
> crashes in C.  How do you want to handle those crashes?  Do you just
> want to try to dump the stack?  How do you want to handle other C
> threads when one C thread crashes?

Yes, I meant crashes in C. I want to create a crash dump file and dump all C threads and goroutines.

> I do agree that your code should work in principle, and I'm not sure
> why it doesn't.
>
> If all you want to do is handle SIGSEGV when it occurs in a C thread,
> it may work to call signal.Notify(c, syscall.SIGSEGV).  The channel
> will receive a signal when a SIGSEGV occurs in C code.  At that point
> it's not safe to continue, but it is safe to take whatever action you
> like to dump C threads.  I'm not sure this will work, because it
> depends on what happens when the SIGSEGV signal handler returns to the
> C code that triggered the SIGSEGV.

The reason for the morestack call is that sigtramp is not called in my code example. The sa_tramp field seems to be overwritten by my call to sigaction(int, struct sigaction*, struct sigaction*), and I cannot retrieve the original trampoline function via __sigaction(int, struct __sigaction*, struct sigaction*) to call it in my handler. Calling it directly would probably not work anyway, as sigtrampgo already calls sigreturn, but I want to be able to do calculations in the C handler after the Go handler has returned. So I'm quite stuck here. Do you have any ideas?

I will open another thread to discuss the "correct way" to create a crash dump file for Go and C, and discuss my experiments. I'll leave this thread for the signal handler replacement on Darwin.

Thanks!
Martin

Ian Lance Taylor

Aug 26, 2016, 10:16:04 AM
to Martin Strenge, golang-nuts
On Fri, Aug 26, 2016 at 6:35 AM, <martin....@gmail.com> wrote:
>
> The reason for the morestack call is, that sigtramp is not called in my code
> example. The sa_tramp seems to be overwritten in my call to
> sigaction(int,struct sigaction*,struct sigaction*) and I cannot retrieve the
> original trampoline function via __sigaction(int, struct __sigaction*,
> struct sigaction*), to call it in my handler. Calling it directly would
> probably not work anyway, as sigtrampgo already calls sigreturn, but I want
> to be able to do calculations in the C handler if the Go handler has
> returned. So, I'm quite stuck here. Do you have any ideas?

Interesting. Maybe we need to change this line in setsig in
runtime/os_darwin.go
    *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = fn
to be
    *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = unsafe.Pointer(funcPC(sigtramp))

We really don't ever want to call fn here.

Ian

martin....@gmail.com

Aug 26, 2016, 10:36:37 AM
to golang-nuts, martin....@gmail.com
> Interesting.  Maybe we need to change this line in setsig in
> runtime/os_darwin.go
>     *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = fn
> to be
>     *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = unsafe.Pointer(funcPC(sigtramp))

That's not possible; the signatures of sa_tramp and sa_sigaction do not match:

/* union for signal handlers */ 
union __sigaction_u { 
    void (*__sa_handler)(int);
    void (*__sa_sigaction)(int, struct __siginfo *, void *);
}; 

/* Signal vector template for Kernel user boundary */
struct __sigaction {
    union __sigaction_u __sigaction_u; /* signal handler */ 
    void (*sa_tramp)(void *, int, int, siginfo_t *, void *);
    sigset_t sa_mask; /* signal mask to apply */
    int sa_flags; /* see signal options below */
};

Martin 

Ian Lance Taylor

Aug 26, 2016, 10:49:02 AM
to Martin Strenge, golang-nuts
Good point. OK, then perhaps we need setsig in os_darwin.go to set
the sigaction field to a new function, written in assembler, like
sigtramp, but taking just the sigaction arguments. And presumably not
calling sigreturn.

Ian
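
The signature mismatch and the proposed fix can be pictured with a small C model. This is purely a sketch of the argument shapes: the real change discussed here would be written in Go assembly and would also have to switch to the signal goroutine, which C cannot show. The typedefs mirror the two function-pointer types quoted above; what the first two trampoline arguments carry is an assumption, and every name below is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

/* Shape of a three-argument SA_SIGINFO handler (the sa_sigaction slot). */
typedef void (*sigaction_fn)(int, siginfo_t *, void *);

/* Shape of the five-argument sa_tramp slot quoted above. What the first
   two arguments carry is an assumption of this sketch. */
typedef void (*tramp_fn)(void *, int, int, siginfo_t *, void *);

static int handled_sig;       /* last signal number seen by the shared logic */
static int sigaction_entries; /* calls that arrived with three arguments */
static int tramp_entries;     /* calls that arrived with five arguments */

/* Shared logic, standing in for the real handler body. */
static void handle_common(int sig, siginfo_t *info, void *ctx) {
    (void)info;
    (void)ctx;
    handled_sig = sig;
}

/* Entry point with plain sigaction arguments: this is what a C handler
   would reach when it calls the saved sa_sigaction pointer. */
void entry_sigaction(int sig, siginfo_t *info, void *ctx) {
    sigaction_entries++;
    handle_common(sig, info, ctx);
}

/* Entry point with trampoline arguments: drops the two leading arguments
   and forwards the rest to the same logic. */
void entry_tramp(void *handler, int style, int sig, siginfo_t *info, void *ctx) {
    (void)handler;
    (void)style;
    tramp_entries++;
    handle_common(sig, info, ctx);
}

/* Assigning to the typedefs makes the compiler check each signature;
   swapping the two assignments would be rejected as incompatible. */
sigaction_fn sigaction_slot = entry_sigaction;
tramp_fn tramp_slot = entry_tramp;
```

The point of having two entry points is that each slot gets a function of the matching type, while both paths end up in the same handler logic.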

martin....@gmail.com

Aug 29, 2016, 3:19:12 AM
to golang-nuts, martin....@gmail.com
> OK, then perhaps we need setsig in os_darwin.go to set
> the sigaction field to a new function, written in assembler, like
> sigtramp, but taking just the sigaction arguments.

Could you please explain why it is necessary to use sa_tramp with a custom function instead of using the default one, as is done on Linux?

> And presumably not
> calling sigreturn.

... and save all registers in the plan9 sigtramp functions.

I'll open an issue.

Thanks,
Martin

Ian Lance Taylor

Aug 29, 2016, 10:43:42 AM
to Martin Strenge, golang-nuts
On Mon, Aug 29, 2016 at 12:18 AM, <martin....@gmail.com> wrote:
>> OK, then perhaps we need setsig in os_darwin.go to set
>> the sigaction field to a new function, written in assembler, like
>> sigtramp, but taking just the sigaction arguments.
>
>
> Could you please explain, why it is necessary to use sa_tramp with a custom
> function instead of using the default one and do it like in Linux?

When doing an internal link we can't access the sa_tramp function,
since in that case we don't link against libc at all.

That said there is a possibility that we will have to change to always
using external linking on Darwin.


>> And presumably not
>> calling sigreturn.
>
>
> ... and save all registers in plan9 sigtramp functions.

I'm only talking about changing Darwin.

Ian

martin....@gmail.com

Aug 29, 2016, 11:20:12 AM
to golang-nuts, martin....@gmail.com
> I'm only talking about changing Darwin.

Sorry, I meant pushing the gcc callee-saved registers in the assembly sigtramp functions (for example in sys_darwin_amd64.s), as is done in crosscall2. I ran into the issue that the contents of the rbx register were overwritten by the sigfwdgo() function, because plan9 does not follow the same register-saving rules.

Martin

Ian Lance Taylor

Aug 29, 2016, 12:47:00 PM
to Martin Strenge, golang-nuts
Oh, I see, sorry. I don't usually call that "plan9", I call it the Go
ABI or Go calling convention. Go also separately supports Plan 9, so
references to Plan 9 can be confusing.

Ian