grpc-java graceful channel shutdown


dan....@rach.io

Apr 4, 2017, 10:05:36 PM
to grpc.io
Is there a way to command a channel to gracefully shutdown, waiting for any active calls to complete before terminating the link?
I expected channel.shutdown() to work this way.  However, it appears to forcefully kill all active calls after 5 seconds with UNAVAILABLE (Channel requested transport to shut down) exceptions.
The 5 second delay appears to be due to a shutdown delay hack in SubchannelImplImpl::shutdown.
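
For context, the shutdown sequence I'm using is essentially the standard one (a minimal sketch; the target address and timeout are placeholders, and in my case the channel is built with a custom LoadBalancer):

    import io.grpc.ManagedChannel;
    import io.grpc.ManagedChannelBuilder;
    import java.util.concurrent.TimeUnit;

    public final class GracefulShutdownExample {
      public static void main(String[] args) throws InterruptedException {
        // Placeholder target; real code configures TLS and a custom LoadBalancer.
        ManagedChannel channel =
            ManagedChannelBuilder.forAddress("example.com", 443).build();
        // ... issue RPCs ...
        channel.shutdown();                                     // stop accepting new calls
        if (!channel.awaitTermination(30, TimeUnit.SECONDS)) {  // wait for in-flight calls
          channel.shutdownNow();                                // cancel whatever remains
        }
      }
    }

I expected in-flight calls to run to completion inside awaitTermination rather than being cancelled after 5 seconds.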

Is this expected behavior?

I am writing a custom LoadBalancer and I need a way to evict subchannels cleanly.
Currently, my solution is to remove the subchannel from the Picker and then shut it down once its state changes to IDLE in handleSubchannelState.  I suspect this isn't the intended way to go about it?
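
In code, the workaround looks roughly like this (a minimal sketch against the LoadBalancer API; the DrainingLoadBalancer class and drainingSubchannels set are illustrative names, not part of gRPC):

    import io.grpc.ConnectivityState;
    import io.grpc.ConnectivityStateInfo;
    import io.grpc.LoadBalancer;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch only: subchannels to evict are first dropped from the picker and
    // added to drainingSubchannels, then shut down once they report IDLE.
    public abstract class DrainingLoadBalancer extends LoadBalancer {
      private final Set<Subchannel> drainingSubchannels = ConcurrentHashMap.newKeySet();

      @Override
      public void handleSubchannelState(Subchannel subchannel, ConnectivityStateInfo stateInfo) {
        if (stateInfo.getState() == ConnectivityState.IDLE
            && drainingSubchannels.remove(subchannel)) {
          // The transport went idle, so its active calls have finished; release it now.
          subchannel.shutdown();
        }
        // ... normal state handling for subchannels still in the picker ...
      }
    }

It works, but it feels like I'm relying on a side effect of the connectivity state machine rather than a supported drain mechanism.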

Also, from what I can tell, this may manifest in the RoundRobinLoadBalancer.  When the resolved address list changes, any active calls that last longer than 5 seconds will be killed.

Eric Anderson

Apr 13, 2017, 1:38:45 PM
to dan....@rach.io, grpc.io, Kun Zhang
+zhangkun, as FYI for RoundRobinLoadBalancer

On Tue, Apr 4, 2017 at 7:05 PM, <dan....@rach.io> wrote:
Is there a way to command a channel to gracefully shutdown, waiting for any active calls to complete before terminating the link?
I expected channel.shutdown() to work this way.

That's how it's intended.

However, it appears to forcefully kill all active calls after 5 seconds with UNAVAILABLE (Channel requested transport to shut down) exceptions.
The 5 second delay appears to be due to a shutdown delay hack in SubchannelImplImpl::shutdown.

So the subchannel shutdown should only prevent new RPCs from being created. The 5 second delay is just to avoid a race with new RPCs. It used to have zero delay.

There is a bit of an issue here in that our error handling can return the wrong error during shutdown, because it can be quite unclear what causes what.

It may be caused by Netty killing the connection earlier than we expected, since it seems we don't set gracefulShutdownTimeoutMillis to infinity. The current graceful shutdown logic in Netty seems server-oriented. We may want to just avoid calling Http2ConnectionHandler.close() until all the streams are closed.

I've opened issue 2907 to track this.

Is this expected behavior?

No, it's not.

I am writing a custom LoadBalancer and I need a way to evict subchannels cleanly.
Currently, my solution is to remove the subchannel from the Picker and then shut it down once its state changes to IDLE in handleSubchannelState.  I suspect this isn't the intended way to go about it?

No, and I would think it can take quite a long time for it to transition to IDLE, since that only happens after the connection starts shutting down, which may not happen naturally unless the server initiates it.