Graceful shutdown of Java GRPC server with streams


Fedor Korotkov

Oct 18, 2017, 11:41:47 AM
to grp...@googlegroups.com
Hey there,

I've asked this question on StackOverflow, but haven't got a response so decided to duplicate it here. :-)

I'm trying to add graceful shutdown to my GRPC service, which has some streaming APIs. Basically, I want to wait for all in-flight GRPC calls to complete before shutting down my application. Streaming calls can take up to several minutes (big file uploads), and it seems the current GRPC Java implementation does not respect this use case.

My service is implemented in Java and uses GRPC 1.7.0. I simply call grpcServer.shutdown(), which according to the docs "Initiates an orderly shutdown in which preexisting calls continue but new calls are rejected." Immediately afterwards I call grpcServer.awaitTermination() to block until my service is in the TERMINATED state.

But I see that GRPC actually waits at most 5 seconds before moving the service to the TERMINATED state, even though there are active streams.

So it seems grpcServer.shutdown() is not doing what the docs say it does, and I wonder what I should do to support my use case. I think GRPC should support it (it seems like a pretty common use case). If not, I will need to track active streams manually, which is doable but looks more like a hack.
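For reference, here is a minimal sketch of what "tracking active streams manually" could look like. It uses only the JDK; `ActiveCallTracker` and its method names are hypothetical and not part of gRPC. The assumption is that `tryBegin()` and `end()` get called wherever a call starts and finishes (a server interceptor would be one possible place), and that the shutdown path calls `awaitIdle()` before stopping the server.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of gRPC): counts in-flight calls so shutdown can wait for them. */
final class ActiveCallTracker {
    private int active = 0;
    private boolean draining = false;

    /** Call when an RPC starts. Returns false once draining has begun, so the caller can reject it. */
    synchronized boolean tryBegin() {
        if (draining) {
            return false;
        }
        active++;
        return true;
    }

    /** Call exactly once when an admitted RPC completes, fails, or is cancelled. */
    synchronized void end() {
        active--;
        if (active == 0) {
            notifyAll(); // wake up any thread blocked in awaitIdle()
        }
    }

    /**
     * Stops admitting new calls and waits until none are active.
     * Returns false on timeout or if the waiting thread is interrupted.
     */
    synchronized boolean awaitIdle(long timeout, TimeUnit unit) {
        draining = true;
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (active > 0) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                // Releases the monitor while waiting; the loop guards against spurious wakeups.
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The timeout is there because an orchestrator will usually kill the process after its own grace period anyway, so an unbounded wait does not buy much.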

Best,
Fedor

Carl Mastrangelo

Oct 19, 2017, 1:28:23 PM
to grpc.io
Can you show your shutdown code? awaitTermination() should actually wait for shutdown to complete. In fact, it's expected that you call it even when you don't call shutdown():


Server s = ....

s.start();
s.awaitTermination();


That blocks the current thread (usually main) while the server serves. 

Fedor Korotkov

Oct 19, 2017, 3:24:34 PM
to Carl Mastrangelo, grpc.io
Hey Carl,

Thank you for your quick response! I've tried to extract the issue into a test case here. Note that if I change to an in-process server and `InProcessChannelBuilder`, everything works as expected and the test waits 20 seconds.

I suspect it's related to the ManagedChannelImpl#SUBCHANNEL_SHUTDOWN_DELAY_SECONDS constant, because it's the only 5-second constant and I see this magic number in the logs.

But in my case I'm running my services in Kubernetes, and Kubernetes sends SIGTERM before killing containers, so I have a shutdown hook like this:

Runtime.getRuntime().addShutdownHook(object : Thread() {
  override fun run() {
    println("Got shutdown hook!")
    val start =  System.currentTimeMillis()
    grpcServer?.shutdown()
    grpcServer?.awaitTermination()
    val duration = Duration.ofMillis(System.currentTimeMillis() - start)
    println("Shutdown GRPC service in ${duration.seconds} seconds.")
  }
})

And I call start() separately when the container starts.

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/f1225a84-ede0-40fb-867b-828a0ba78483%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Carl Mastrangelo

Oct 20, 2017, 1:55:23 PM
to grpc.io
Replies inline


On Thursday, October 19, 2017 at 12:24:34 PM UTC-7, Fedor Korotkov wrote:
> Hey Carl,
>
> Thank you for your quick response! I've tried to extract the issue in a test case here. Note that if I change to an in-process server and `InProcessChannelBuilder` everything works as expected and the test waits 20 seconds.

InProcess is a special case, because it doesn't use threads. In your example, you should shut down the client and await its termination first. Only after that should you close the server, because only then do you know all the RPCs are done.
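A rough sketch of that ordering, under some assumptions: `Stoppable` and `OrderedShutdown` are hypothetical names, not gRPC types. The interface only mirrors the `shutdown()` and `awaitTermination(long, TimeUnit)` methods that both the channel and the server expose, so thin adapters around the real objects would be needed.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical minimal view of something that can be drained; not a gRPC type. */
interface Stoppable {
    void shutdown();

    boolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException;
}

final class OrderedShutdown {
    private OrderedShutdown() {}

    /**
     * Shuts down each component and waits for it before touching the next one.
     * Returns false at the first component that does not terminate in time,
     * or if the calling thread is interrupted.
     */
    static boolean stopInOrder(long timeout, TimeUnit unit, Stoppable... components) {
        for (Stoppable component : components) {
            component.shutdown();
            try {
                if (!component.awaitTermination(timeout, unit)) {
                    return false;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In the test from this thread the call would look something like `OrderedShutdown.stopInOrder(30, TimeUnit.SECONDS, clientAdapter, serverAdapter)`, with the client listed first.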
 

> I suspect it's related to ManagedChannelImpl#SUBCHANNEL_SHUTDOWN_DELAY_SECONDS constants because it's the only 5 seconds constant and I see this magic number in logs.

I believe this is unlikely, since that affects client behavior rather than server.

 

> But in my case I'm basically running my services in Kubernetes and Kubernetes sends SIGTERM before killing containers so I have a shutdown hook like this:
>
> Runtime.getRuntime().addShutdownHook(object : Thread() {
>   override fun run() {
>     println("Got shutdown hook!")
>     val start =  System.currentTimeMillis()
>     grpcServer?.shutdown()
>     grpcServer?.awaitTermination()
>     val duration = Duration.ofMillis(System.currentTimeMillis() - start)
>     println("Shutdown GRPC service in ${duration.seconds} seconds.")
>   }
> })

You should probably use System.nanoTime(), since currentTimeMillis() isn't very accurate for measuring elapsed time.
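Something along these lines, as a small sketch (`Elapsed` is a made-up helper name). System.nanoTime() is monotonic and only meaningful as a difference between two readings, which is exactly what a duration measurement needs, whereas currentTimeMillis() follows the wall clock and can jump if the clock is adjusted.

```java
import java.time.Duration;

/** Hypothetical stopwatch: captures a monotonic start time when it is created. */
final class Elapsed {
    private final long startNanos = System.nanoTime();

    /** Time since construction; unaffected by wall-clock adjustments. */
    Duration elapsed() {
        return Duration.ofNanos(System.nanoTime() - startNanos);
    }
}
```

In the hook you would create an `Elapsed` before calling shutdown() and read `elapsed()` after awaitTermination() returns.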

Nit: Shutdown hooks race with each other. It would be safer to install a signal handler (sun.misc.SignalHandler), which can control shutdown behavior more tightly, since it runs serially.
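A sketch of the signal-handler approach, with caveats: sun.misc.Signal is an unsupported JDK-internal API (javac warns about it and it may not exist on every JVM), and `SerialShutdown` and its step list are hypothetical names for illustration. Installing the handler replaces the JVM's default reaction to the signal, so the handler has to exit the JVM itself once the steps are done.

```java
import java.util.ArrayList;
import java.util.List;
import sun.misc.Signal;

/** Hypothetical helper: runs shutdown steps one after another when a signal arrives. */
final class SerialShutdown {
    private final List<Runnable> steps = new ArrayList<>();

    /** Registers a step; steps run in the order they were added. Add all steps before install(). */
    SerialShutdown addStep(Runnable step) {
        steps.add(step);
        return this;
    }

    /** Runs every step on the calling thread, continuing past a step that throws. */
    void runAll() {
        for (Runnable step : steps) {
            try {
                step.run();
            } catch (RuntimeException e) {
                e.printStackTrace();
            }
        }
    }

    /** Installs a handler for the named signal, for example "TERM". */
    void install(String signalName) {
        Signal.handle(new Signal(signalName), sig -> {
            runAll();
            // The default handler is gone, so exit explicitly.
            // System.exit still runs any registered shutdown hooks afterwards.
            System.exit(0);
        });
    }
}
```

For the case in this thread, one step would shut down the server and a later one would await its termination, registered once at startup with `install("TERM")`.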
 

> And I call start() separately when the container starts.