I am using grpc-java 1.55.1 and am seeing an issue similar to the one discussed in the GitHub issue below.
I have set keepAliveTime on both the client and the server, as suggested in that issue.
I create the stub like this:
var channel = ManagedChannelBuilder.forAddress(network.getIp(), network.getPort())
.keepAliveTime(130, TimeUnit.SECONDS)
.maxInboundMessageSize(maxInboundMessageSize)
.maxInboundMetadataSize(maxInboundMetadataSize)
.enableRetry()
.build();
var stub = HelloServiceGrpc.newBlockingStub(channel).withDeadline(Deadline.after(115, TimeUnit.SECONDS));
stub.sayHello();
stub.sayHello();
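One thing worth checking in the snippet above: Deadline.after(...) fixes an absolute point in time when it is called, so a stub built once with withDeadline(...) shares that single deadline across every later call instead of giving each call a fresh 115 seconds. The sketch below models this with plain JDK code. FixedDeadline and DeadlineSketch are hypothetical stand-ins written for illustration and are not the io.grpc.Deadline class. The clock is simulated so the numbers are deterministic.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an absolute deadline; NOT io.grpc.Deadline.
final class FixedDeadline {
    private final long expiresAtNanos;

    private FixedDeadline(long expiresAtNanos) {
        this.expiresAtNanos = expiresAtNanos;
    }

    // The expiry instant is computed once, at creation time.
    static FixedDeadline after(long duration, TimeUnit unit, long nowNanos) {
        return new FixedDeadline(nowNanos + unit.toNanos(duration));
    }

    // Time left as seen at nowNanos; zero or negative means already expired.
    long remainingMillis(long nowNanos) {
        return TimeUnit.NANOSECONDS.toMillis(expiresAtNanos - nowNanos);
    }
}

class DeadlineSketch {
    public static void main(String[] args) {
        long t0 = 0L; // simulated clock, in nanoseconds
        FixedDeadline shared = FixedDeadline.after(115, TimeUnit.SECONDS, t0);

        // First call, issued immediately: the full budget is available.
        System.out.println(shared.remainingMillis(t0)); // 115000

        // Second call, issued 100 s later against the same deadline: 15 s left.
        long t1 = t0 + TimeUnit.SECONDS.toNanos(100);
        System.out.println(shared.remainingMillis(t1)); // 15000

        // A call issued 120 s after creation starts out already expired.
        long t2 = t0 + TimeUnit.SECONDS.toNanos(120);
        System.out.println(shared.remainingMillis(t2)); // -5000
    }
}
```

If each call is meant to get its own budget, the usual pattern is to derive the stub per call, for example stub.withDeadlineAfter(115, TimeUnit.SECONDS).sayHello(...), rather than storing a stub that already carries a deadline. This is separate from the hang described below and may not be related to it.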
On the server side, keepAliveTime is also set, as suggested in the GitHub issue above:
Grpc.newServerBuilderForPort(port, InsecureServerCredentials.create())
.addService(new GreeterImpl())
.keepAliveTime(130, TimeUnit.SECONDS)
.build()
.start();
In my case, the client calls server 1, and server 1 then acts as a client to server 2.
When the deadline is exceeded on server 2, the same deadline error propagates to server 1 and then to the client, as shown below. This is working as expected.
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: context timed out
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:271)
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:252)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:165)
But on rare occasions the client thread hangs as shown below and never receives the DEADLINE_EXCEEDED error:
at jdk.internal.misc.Unsafe.park(java...@17.0.9/Native Method)
- parking to wait for <0x0000000767a53a00> (a io.grpc.stub.ClientCalls$ThreadlessExecutor)
at java.util.concurrent.locks.LockSupport.park(java...@17.0.9/LockSupport.java:211)
at io.grpc.stub.ClientCalls$ThreadlessExecutor.waitAndDrain(ClientCalls.java:748)
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:157)
I waited for about 2 hours and it did not recover. The only way to recover is to restart the client application. This has happened 3-4 times in the last couple of months.
Can someone let me know:
- What am I doing wrong, or is there a known issue in grpc-java 1.55.1?
- Is there any timeout config I can set on the gRPC client side so that the client threads do not hang indefinitely?
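For the second question, one possible stopgap that does not depend on any gRPC setting is to run the blocking call on a worker thread and bound the wait with Future.get(timeout, unit). The sketch below uses only the JDK. BoundedCall and callWithTimeout are hypothetical names, and the sleeping lambda is a stand-in for a hung stub.sayHello() call. Whether interrupting the worker actually unblocks a gRPC call in the hung state described above has not been verified here. At minimum, the caller's own thread gets control back.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedCall {
    // Runs a blocking call on a worker thread and waits at most timeoutMillis for its result.
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the worker to stop by interrupting it, then report the timeout.
            future.cancel(true);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Fast call: completes well within the limit.
            String fast = callWithTimeout(pool, () -> "hello", 1000);
            System.out.println(fast);

            // Simulated hang: sleeps far longer than the limit (stand-in for a stuck RPC).
            try {
                callWithTimeout(pool, () -> {
                    Thread.sleep(60_000);
                    return "late";
                }, 200);
            } catch (TimeoutException e) {
                System.out.println("timed out");
            }
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With grpc-java specifically, the generated newFutureStub(channel) returns a ListenableFuture per call, so the same bounded get(timeout, unit) can be applied without an extra thread pool. Either way this only limits how long the caller waits and does not explain why the deadline failed to fire.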