I’d like to do crash-reporting for programs that run in environments I don’t control (e.g. your laptop). The behavior I want is similar to what many production-grade desktop applications do when they crash: capture process state information, optionally prompt the user for permission, and send the crash report to a secure server.
Implementing this for Go programs is tricky without cooperation from the runtime. The options I’m considering:
1. The strawman is to wrap every goroutine that I spawn with a function that defers a call to the panic handler. It would have no effect on goroutines spawned by third-party libraries or by the standard library, which makes it pretty much a non-starter. It’s also extremely onerous and unidiomatic to write all of your code this way.
2. Automate the above behavior by parsing all of the Go code (including 3rd party libs) to rewrite all statements which spawn goroutines to wrap each goroutine with a panic handler. It’s messy, adds another stage to my build process, but could work well for all of my code and 3rd party code and possibly for the
3. Set GOTRACEBACK=crash and then use the operating system’s native interfaces to recover the state of the program. This is a lot of work, because the interface is defined differently on each OS. Recovering the state from these crash handlers would also be challenging because it would happen outside the runtime, and the existing tools for this, like Google’s Breakpad, are built for C applications. A minor point, but GOTRACEBACK=crash also isn’t implemented on some OSes yet (notably Windows).
4. Fork immediately after startup and use the parent process to monitor the child for exit code 2 and a panic traceback on stderr. This is the approach taken by panicwrap[0], which is known to work but has two issues. First, dealing with signals becomes especially tricky. Any number of supervisor programs and system administration tools rely on sending signals to manipulate processes in production, and the crash-handling parent process would need to handle these signals appropriately. Should it forward them to the child? Or rely on the signaling process to signal the whole process tree? Signal handling behavior is not consistent across platforms, which makes this difficult to get right. For example, Windows apparently sends CTRL+BREAK to the whole tree, but not CTRL+C. Second, this approach fails on systems that disallow spawning additional processes (NaCl, maybe AppEngine, I’m unsure).
5. Fork immediately after startup and dup stderr through to the child process, so that the child does the monitoring while the original process keeps running the program. This avoids all of the signal handling conundrums of approach #4, but it means that you can no longer check the exit status of the program and have to fall back to looking for the ‘panic:’ header alone. It still doesn’t work if you can’t spawn processes.
6. I’d like to modify the Go runtime to simply add an API which allows a developer to intercept runtime panics and choose how to handle them. Ideally, I would like to do this with an API like:
    runtime.OnPanic(func(state *ProcessState) {
        // send to crash report server
    })
I’d even settle for:
    runtime.OnPanic(func() {
        buf := make([]byte, 1<<20)
        n := runtime.Stack(buf, true) // true: include all goroutines
        // send buf[:n] to crash report server
    })
What is the best option among these? Would this be an API the Go team would consider adding to the language?
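For reference, the monitoring side of option 4 could be sketched roughly as below. This is my own illustration, not panicwrap’s actual code: the CRASHWRAP_CHILD environment variable and the helpers supervise and looksLikePanic are made-up names, and the sketch deliberately does no signal forwarding, which is the hard part.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// childEnv is a hypothetical marker telling the re-executed binary it is the child.
const childEnv = "CRASHWRAP_CHILD"

// looksLikePanic applies the heuristic from option 4: exit code 2 plus a
// "panic: " header at the start of a line on stderr.
func looksLikePanic(exitCode int, stderr []byte) bool {
	if exitCode != 2 {
		return false
	}
	return bytes.HasPrefix(stderr, []byte("panic: ")) ||
		bytes.Contains(stderr, []byte("\npanic: "))
}

// supervise runs cmd, mirrors its stderr to ours while keeping a copy,
// and returns the child's exit code together with the captured stderr.
func supervise(cmd *exec.Cmd) (int, []byte, error) {
	var buf bytes.Buffer
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = io.MultiWriter(os.Stderr, &buf)
	err := cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), buf.Bytes(), nil
	}
	if err != nil {
		return -1, buf.Bytes(), err
	}
	return 0, buf.Bytes(), nil
}

func main() {
	if os.Getenv(childEnv) == "1" {
		panic("boom") // stands in for the real program crashing
	}
	self, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	cmd := exec.Command(self, os.Args[1:]...)
	cmd.Env = append(os.Environ(), childEnv+"=1")
	code, stderr, err := supervise(cmd)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if looksLikePanic(code, stderr) {
		fmt.Fprintf(os.Stderr, "would send crash report (%d bytes of traceback)\n", len(stderr))
	}
	os.Exit(code)
}
```

Everything the parent knows here comes from the exit status and the text on stderr, so it sees only what the runtime chose to print.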
[0] github.com/mitchellh/panicwrap