Go - Why not recover() in a service handler

Amit Saha

unread,

Jul 3, 2021, 7:20:12 PM7/3/21

to grpc.io

Hi all,

When writing HTTP servers in Go, a panic() in a request handler doesn't abort the entire process as it has a recovery mechanism setup (https://golang.org/src/net/http/server.go?s=99233:99288#L81). With gRPC it seems that's not the case - so i assume there is a good reason we don't have a recover() in the goroutine executing the service handler. I was curious if anyone knew what the reason was?

Thanks,

Amit.

Lars Seipel

unread,

Jul 3, 2021, 8:31:46 PM7/3/21

to Amit Saha, grpc.io

The described behaviour in the net/http package is widely considered
(including by its original author) to be a misfeature. Therefore, it
only makes sense to not repeat the same mistake in gRPC.

Amit Saha

unread,

Jul 4, 2021, 6:20:30 AM7/4/21

to Lars Seipel, grpc.io

Thanks - do you have more details on this? I am interested to learn
why that's the case.

>

aar...@arista.com

unread,

Jul 9, 2021, 8:17:18 PM7/9/21

to grpc.io

I think the biggest issue is that you might not notice that your program was panicing because gRPC was handling it for you. I definitely want to know if panics are occurring in my code, because they are not expected.

Also by not catching the panic, you as the user gets to decide how to handle it. It's up to you if you want to put a 'recover()' in your handler or let a panic crash your program. If grpc were to catch panics in handlers there isn't a simple way to opt out of that behavior (without adding another option to gRPC).

Aaron

Amit Saha

unread,

Jul 16, 2021, 7:16:30 PM7/16/21

to aar...@arista.com, grpc.io

On Sat, Jul 10, 2021 at 10:17 AM 'aar...@arista.com' via grpc.io
<grp...@googlegroups.com> wrote:
>
> I think the biggest issue is that you might not notice that your program was panicing because gRPC was handling it for you. I definitely want to know if panics are occurring in my code, because they are not expected.
>
> Also by not catching the panic, you as the user gets to decide how to handle it. It's up to you if you want to put a 'recover()' in your handler or let a panic crash your program. If grpc were to catch panics in handlers there isn't a simple way to opt out of that behavior (without adding another option to gRPC).

Thanks for sharing your thoughts. I created a test HTTP server and
added demonstration of the behavior here:
https://gist.github.com/amitsaha/ec7ce96e47bbfca0e7a52d5bff53485b to
better illustrate what i am referring to and in case it becomes useful
to somebody else.

So my understanding is that even with the recovery() mechanism in
place, the user will get to know when a panic has occurred unless the
ErrAbortHandler value is used to panic. The user can still implement
a middleware and not let the stdlib's panic handling mechanism get
invoked at all.

>
> Aaron
>
> On Sunday, July 4, 2021 at 3:20:30 AM UTC-7 amits...@gmail.com wrote:
>>
>> On Sun, Jul 4, 2021 at 10:31 AM Lars Seipel <l...@slrz.net> wrote:
>> >
>> > On Sat, Jul 03, 2021 at 04:20:12PM -0700, Amit Saha wrote:
>> > >When writing HTTP servers in Go, a panic() in a request handler doesn't
>> > >abort the entire process as it has a recovery mechanism setup
>> > >(https://golang.org/src/net/http/server.go?s=99233:99288#L81). With gRPC
>> > >it seems that's not the case - so i assume there is a good reason we don't
>> > >have a recover() in the goroutine executing the service handler. I was
>> > >curious if anyone knew what the reason was?
>> >
>> > The described behaviour in the net/http package is widely considered
>> > (including by its original author) to be a misfeature. Therefore, it
>> > only makes sense to not repeat the same mistake in gRPC.
>>
>> Thanks - do you have more details on this? I am interested to learn
>> why that's the case.
>>
>> >
>

> --
> You received this message because you are subscribed to the Google Groups "grpc.io" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/80fb5b92-3f5b-438e-aa8d-2510c71bfce8n%40googlegroups.com.

Matt T. Proud

unread,

Jul 19, 2021, 4:32:25 AM7/19/21

to Amit Saha, aar...@arista.com, grpc.io

Our team maintains a framework that a bunch of client teams use to build RPC servers (it supports gRPC as a transport). We largely hold the view that a panic from foreign code can't meaningfully be recovered, because we don't know why it panicked, what the state of the code was in, etc. For instance, let's assume we want universal recovery behavior à la package HTTP (we decided that we don't, but more on that below):

was a lock held that later needs to be released (note: sometimes advanced code has locking behaviors that do not always match the semantics of defer, even if we avoid this type of advanced design by default)?
will some sort of state be corrupted silently as a result of the panic (e.g., a record that is to be serialized to a data store is only partially populated and with some measure of garbage)?
will panic recovery put the system into an any other invalid state?

For any of the considerations above (non-exhaustive), can we be reasonably sure that code external to the user's directly-written code doesn't have a problem area with above? So from our perspective, it is better to play it safe and let the code panic and crash (it elicits rapid operator and developer response).

A small number of users do ask for some measure of "graceful" panic handling in our framework as a middleware. We've considered offering it as a user *opt-in* foot cannon with a clear warning about risks. The one place I would consider it is potentially using it with a query of death reporting mechanism that exports the failure to telemetry and logs and then promptly crashes.

Hope this helps. My opinion is by no way official. Just providing some perspective about large scale code stewardship.

Amit Saha

unread,

Jul 19, 2021, 6:09:36 AM7/19/21

to Matt T. Proud, aar...@arista.com, grpc.io

Thank you. I found it insightful.

Reply all

Reply to author

Forward