Catching panics in multiple goroutines


ericce...@gmail.com

Mar 4, 2014, 7:34:23 PM
to golan...@googlegroups.com
So, I have a network server. Each connection is handled in a goroutine, which may spawn many other goroutines.

Any of these goroutines might panic, so I need to put recover calls all over the place. If I miss one, that goroutine panicking will lead to the entire program crashing.

So the simplest demonstration would be goroutine A spawning goroutine B, which panics. Can I somehow catch the panic with code in A?

The thing is that all the state my program keeps is connection-local basically, and I want a goroutine panic to kill the goroutines of one connection, instead of the whole program. This is feasible if each connection only has one goroutine (using recover), but not if there are multiple goroutines doing things in the background for one connection.

Thanks!

Ian Lance Taylor

Mar 4, 2014, 10:06:40 PM
to ericce...@gmail.com, golang-nuts
On Tue, Mar 4, 2014 at 4:34 PM, <ericce...@gmail.com> wrote:
>
> So, I have a network server. Each connection is handled in a goroutine,
> which may spawn many other goroutines.
>
> Any of these goroutines might panic, so I need to put recover calls all over
> the place. If I miss one, that goroutine panicking will lead to the entire
> program crashing.
>
> So the simplest demonstration would be goroutine A spawning goroutine B,
> which panics. Can I somehow catch the panic with code in A?

Not directly, but you can always use a mygo function instead of the go
statement.

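    // mygo runs f in a new goroutine and recovers any panic raised by f,
    // so that a panic there cannot crash the whole program.
    // Logging uses the standard library "log" package.
    func mygo(f func()) {
        go func() {
            defer func() {
                if r := recover(); r != nil {
                    log.Printf("recovered panic: %v", r)
                }
            }()
            f()
        }()
    }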

Now change all cases of
    go f()
to
    mygo(f)
and change all cases of
    go f(a)
to
    mygo(func() { f(a) })

Ian

Dave Cheney

Mar 5, 2014, 4:53:41 AM
to golan...@googlegroups.com, ericce...@gmail.com
Why does your program panic so much? It sounds like if you solved all these panics you wouldn't need to invent a mechanism to recover from them.

Panics are neither commonplace nor expected in Java; if your software is causing runtime panics you should address that root cause.

Dave Cheney

Mar 5, 2014, 4:55:48 AM
to golang-nuts, ericce...@gmail.com
Sorry, what I meant to write was "Panics are neither commonplace nor expected in _Go_ (this isn't Java)."
 


--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/Bi43JRE-c6w/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Nate Finch

Mar 5, 2014, 5:57:30 AM
to golan...@googlegroups.com, ericce...@gmail.com
Yeah, that was my first reaction. Don't panic unless you want your program to crash. They're not exceptions and shouldn't be used as exceptions. They're intentionally clumsy because they are discouraged for exactly this reason. Panics / exceptions just don't work well in multithreaded applications. The common Go idiom is to return errors. With multiple return values you can always return your actual data plus an error (which may be nil). Errors are just data, so they're easy to pass around across goroutines, down through channels, etc.
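
A minimal sketch of what "errors are just data" can look like in practice: a worker returns a value plus an error, and both travel to the parent goroutine over one channel. The `result` type and the `parseLen` function are hypothetical and exist only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// result pairs a value with an error so both can travel over one channel.
type result struct {
	n   int
	err error
}

// parseLen is a hypothetical worker step: it returns data plus an error
// instead of panicking on bad input.
func parseLen(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, errors.New("empty message")
	}
	return int(b[0]), nil
}

func main() {
	inputs := [][]byte{{3, 'a', 'b', 'c'}, {}}
	results := make(chan result, len(inputs))

	// Each worker reports back over the channel; nothing escapes as a panic.
	for _, in := range inputs {
		go func(b []byte) {
			n, err := parseLen(b)
			results <- result{n, err}
		}(in)
	}

	// The parent decides what to do with each failure.
	for range inputs {
		if r := <-results; r.err != nil {
			fmt.Println("error:", r.err)
		} else {
			fmt.Println("length:", r.n)
		}
	}
}
```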

Eric Dong

Mar 5, 2014, 6:57:56 AM
to golan...@googlegroups.com
Oh of course most of my "unexceptional" errors are handled by returning error codes.

The thing is, things such as protocol violations throw panics, a network connection could unexpectedly drop at any time, and I want my daemon to stay up until the network comes back up. Malicious people can also send malformed protocol messages and try to crash the server with out-of-bounds buffer errors etc., which are panics.

On Wednesday, March 5, 2014, Jsor <jrago...@gmail.com> wrote:
I do think it could be useful to have something like runtime.PanicCallback which would register a func(interface{}) that would be called with the panic info. I'd make the definition something like "if a panic reaches the top of any goroutine stack (and is not recovered from), the global panic callback is called if registered. Before the callback is called, the rest of the world is stopped and the thread is locked. No goroutines may be spawned from the panic callback, nor may you read from or write to a channel, nor will any garbage be collected. After the function returns (or panics again), the stack trace will be printed as normal and the program will exit. You cannot recover in the panic callback."

I think it could be very useful for scenarios with large numbers of different-acting Goroutines where you just want to do One Thing (save any program state you can verify sane, write a log file, display an error message in a new window, scream and cry) before your program dies from a stupid divide by zero error that only happens if you seed your RNG on the third Friday in August or something.

Péter Szilágyi

Mar 5, 2014, 7:05:08 AM
to Eric Dong, golan...@googlegroups.com
Hi,

You shouldn't consider protocol violations and connection drops as exceptional things. In a networked environment a link failure is the norm, not something unexpected. Simply close the connection, and report a failure upstream to an overseer process. As for malicious messages, if anyone's allowed to connect, then treat inbound network traffic as user input, i.e. untrusted and in need of validation. All of these can be handled nicely with errors and events; there is no need for panics.
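
As a rough illustration of treating inbound bytes as untrusted input, the sketch below decodes a made-up wire format (a 4-byte big-endian length followed by the payload) and checks every bound before slicing, so a hostile message produces an error rather than an out-of-range panic. The format, `maxPayload`, `errMalformed` and `decodeFrame` are assumptions for this example, not the poster's protocol.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const maxPayload = 1 << 16 // assumed upper limit for this sketch

var errMalformed = errors.New("malformed frame")

// decodeFrame parses a hypothetical frame: a 4-byte big-endian length
// followed by that many payload bytes. All bounds are validated first.
func decodeFrame(buf []byte) ([]byte, error) {
	if len(buf) < 4 {
		return nil, errMalformed
	}
	n := binary.BigEndian.Uint32(buf[:4])
	if n > maxPayload || int(n) > len(buf)-4 {
		return nil, errMalformed
	}
	return buf[4 : 4+int(n)], nil
}

func main() {
	good := []byte{0, 0, 0, 2, 'h', 'i'}
	bad := []byte{0, 0, 0, 9, 'h', 'i'} // claims 9 bytes but carries 2
	for _, b := range [][]byte{good, bad} {
		payload, err := decodeFrame(b)
		if err != nil {
			fmt.Println("closing connection:", err)
			continue
		}
		fmt.Printf("payload: %q\n", payload)
	}
}
```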

Cheers,
  Peter



Eric Dong

Mar 5, 2014, 8:37:11 AM
to golan...@googlegroups.com
The thing is, returning errors for these things leads to repetitive code, since after every operation I need to check whether the connection closed suddenly, etc. If I forget it in even a single place, a protocol violation can silently put my program into an undefined state.

What would be an "overseer process" though? I am curious.


Dave Cheney

Mar 5, 2014, 8:39:11 AM
to Eric Dong, golan...@googlegroups.com
That is why you write tests.



egon

Mar 5, 2014, 8:53:00 AM
to golan...@googlegroups.com, yd2...@uwaterloo.ca
On Wednesday, March 5, 2014 3:37:11 PM UTC+2, Eric Dong wrote:
> The thing is, returning errors for these things leads to repetitive code, since after every operation I need to check whether the connection closed suddenly, etc. If I forget it in even a single place, a protocol violation can silently put my program into an undefined state.

You could separate those... one goroutine that reads in the commands, one that sanitizes, one that eventually executes. And also add an extra command "ConnectionDrop" that will be sent to the program when the connection dies.

Of course it'll still fail if you wrote buggy code, but it'll get rid of some of the checks.

+ egon

Eric Dong

Mar 5, 2014, 9:44:41 AM
to egon, golan...@googlegroups.com, yd2...@uwaterloo.ca
Wouldn't that be somewhat expensive? I know goroutines are super cheap, but still, all that implicit locking around channels, constructing structures to pass around in channels, etc.? IDK though, I might try that approach.

egon

Mar 5, 2014, 10:01:40 AM
to golan...@googlegroups.com, egon, yd2...@uwaterloo.ca


On Wednesday, March 5, 2014 4:44:41 PM UTC+2, Eric Dong wrote:
> Wouldn't that be somewhat expensive? I know goroutines are super cheap, but still, all that implicit locking around channels, constructing structures to pass around in channels, etc.? IDK though, I might try that approach.

Yes, more goroutines introduce context switching, but it's quite cheap. The locking inside channels indeed isn't free, but it is quite cheap as well. Constructing structures won't use more memory than you currently use, unless you use buffered channels. I guess the bigger question is: what are your requirements for latency and memory usage? You need to decide based on that.

Of course, if some part of the system becomes a bottleneck, it'll be much easier to throw multiple machines at it, because you already have the communication part built, e.g. one machine that deserializes, one that validates, and multiple machines that handle the messages.

tl;dr: yes, you'll lose 5% (randomly guessing) of performance, but gain cleaner/simpler code.

(Also think how difficult it would be to handle "reconnects" when you use structures over channels)

+ egon

Nate Finch

Mar 5, 2014, 10:54:45 AM
to golan...@googlegroups.com, yd2...@uwaterloo.ca
Welcome to Go code ;) This is very common. Yes, you need to check each time, but your code will be a *lot* more robust than it would be with exceptions, and it makes it a *lot* easier to handle errors across multiple goroutines. As you can see, panicking on many different goroutines is very difficult to handle correctly. If they're just errors, however, you can handle them where they occur and/or pass them down an error channel, etc.