serverCallStreamObserver.isCancelled() cannot detect client disconnection for gRPC version higher than 1.41.0


Bill Li

Apr 14, 2022, 1:31:52 PM
to grpc.io
Hi,

I have recently discovered that serverCallStreamObserver.isCancelled() cannot detect a client disconnection when the gRPC version is higher than 1.41.x. My server logic contains a while loop that runs forever; when the client disconnects from the server, the loop calls onCompleted() and then breaks. I am wondering what has changed in the new versions and what the new way to implement similar behaviour is.

Regards,
Bill

sanjay...@google.com

Apr 14, 2022, 5:21:43 PM
to grpc.io
I am assuming it is grpc-java you are talking about. Couple of questions:

- "when gRPC version is higher than 1.41.x" : can you pinpoint whether the client version or the server version causes this changed behavior?
- "...cannot detect disconnection of a client ..." do you mean that when a disconnection happens in the middle of a streaming RPC the server used to detect it as a cancellation and that doesn't happen anymore?

Same question about "...if the client disconnects from the server,...": do you mean that a TCP disconnection should be detected inside your while loop?



Bill Li

Apr 15, 2022, 4:47:33 PM
to grpc.io
Yes, it is grpc-java.

I detected this because google-cloud-pubsub is one of our dependencies, and I noticed that 1.114.6 works but 1.114.7 does not. After digging deeper, I found that the grpc version was upgraded from 1.40.x to 1.41.x.

The server is written in Java and the client is written in Python. The server version is currently at 1.42.2. I am not sure what the client grpc version is. I am using Python 3.8 and protoc version is 3.6.1.

The disconnection happens in the middle of a streaming RPC and the cancellation is detected inside a while loop. Here is a sample code snippet:

Greeting greeting = request.getGreeting();
String firstName = greeting.getFirstName();
String lastName = greeting.getLastName();

int i = 0;

ServerCallStreamObserver<GreetResponse> serverCallStreamObserver =
        (ServerCallStreamObserver<GreetResponse>) responseObserver;

try {
    while (true) {
        if (serverCallStreamObserver.isCancelled()) {
            System.out.println("cancelled");
            serverCallStreamObserver.onCompleted();
            break;
        }

        String result = "Hello " + firstName + " " + lastName + ", response number: " + i;
        GreetResponse response = GreetResponse.newBuilder()
                .setResult(result)
                .build();
        responseObserver.onNext(response);

        Thread.sleep(1000L);
        i++;
    }
} catch (InterruptedException e) {
    e.printStackTrace();
}

Bill Li

Apr 15, 2022, 9:55:37 PM
to grpc.io
I have just found that it's grpc-netty-shaded that is causing the behaviour described above. Its last working version is 1.40.2. With grpc-netty-shaded at that version, everything works fine even when grpc-protobuf, grpc-stub and grpc-testing are set to the latest release version.

Eric Anderson

Apr 16, 2022, 12:27:10 AM
to Bill Li, grpc.io
I've reproduced similar behavior, but I'm missing a piece to get it working. I am getting the same (apparently broken) behavior on v1.30.0, v1.40.0, and v1.41.0. So I'll need to futz with it more. I've attached my seems-broken-all-the-time "reproduction."

v1.41.0 has in its release notes:
  • core: ServerCall.isCancelled() and ServerCallStreamObserver.isCancelled() implementations no longer incorrectly return true at the end of every RPC (#8408)
So your behavior change is probably related to that. As a quick workaround, you can use Context.current().isCancelled() instead of serverCallStreamObserver.isCancelled().
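
A minimal sketch of that workaround, for illustration only. It polls io.grpc.Context instead of the stream observer. The class name, the emitUntilCancelled helper and the demo method are hypothetical; in a real service method the emit callback would wrap responseObserver.onNext(...), and the cancellation would come from the transport rather than from the callback itself.

```java
import io.grpc.Context;
import java.util.function.IntConsumer;

public class ContextCancelLoop {

    /**
     * Calls emit with 0, 1, 2, ... until the current io.grpc.Context is
     * cancelled, pausing intervalMillis between calls. Returns the number
     * of values emitted.
     */
    public static int emitUntilCancelled(IntConsumer emit, long intervalMillis)
            throws InterruptedException {
        int i = 0;
        // Workaround from the thread: poll the Context, not the observer.
        while (!Context.current().isCancelled()) {
            emit.accept(i);
            i++;
            Thread.sleep(intervalMillis);
        }
        return i;
    }

    /**
     * Hypothetical stand-in for a real RPC: runs the loop inside a
     * cancellable Context and cancels it once cancelAfter values were emitted.
     */
    public static int demo(int cancelAfter) {
        Context.CancellableContext ctx = Context.ROOT.withCancellation();
        int[] emitted = {0};
        ctx.run(() -> {
            try {
                emitted[0] = emitUntilCancelled(i -> {
                    if (i + 1 >= cancelAfter) {
                        ctx.cancel(null); // simulates the client going away
                    }
                }, 1L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        return emitted[0];
    }
}
```

What the handler should do once the loop exits (for example, whether to call onCompleted()) is left to the caller here.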

I think I see the problem in the code; I believe we are setting the cancelled flag within the application's callback thread, which means if you block that thread with your while loop it won't ever be able to be set. Before it was also getting set from another thread, but that set it too often. But it also looks like we are doing some of that on purpose. There was a lot to the discussion when the change was made, and we'll need to reread it to remember what all was going on.
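
The effect described here can be illustrated without gRPC. The following is a plain-JDK simulation, not grpc-java's actual implementation: a single-threaded executor stands in for the application's callback thread, and every name in it is made up for the example. When the "cancelled" notification is queued on the same thread that the handler loop is occupying, the loop never observes it; when the flag is set from another thread, it does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockedCallbackDemo {

    /**
     * Runs a polling loop on a single "callback" thread, then delivers a
     * cancel notification. Returns how many iterations the loop completed
     * before it stopped (it also stops after maxIterations).
     */
    public static int loopIterations(boolean notifyOnCallbackThread, int maxIterations) {
        ExecutorService callbackThread = Executors.newSingleThreadExecutor();
        AtomicBoolean cancelled = new AtomicBoolean(false);
        AtomicInteger iterations = new AtomicInteger();
        CountDownLatch started = new CountDownLatch(1);
        try {
            // The "handler": occupies the callback thread with a loop.
            Future<?> handler = callbackThread.submit(() -> {
                started.countDown();
                while (!cancelled.get() && iterations.get() < maxIterations) {
                    iterations.incrementAndGet();
                    try {
                        Thread.sleep(10L);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
            started.await();
            if (notifyOnCallbackThread) {
                // Queued behind the loop: cannot run until the loop returns.
                callbackThread.execute(() -> cancelled.set(true));
            } else {
                // Set from a different thread: visible to the loop right away.
                cancelled.set(true);
            }
            handler.get();
            return iterations.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            callbackThread.shutdown();
        }
    }
}
```

With notifyOnCallbackThread set to true the loop always runs to its iteration cap, which mirrors a handler that blocks its own callback thread.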

If you wouldn't mind, could you create an issue on github for grpc-java for this? If not, we can, but having the original reporter create the issue generally works best. Either way I'll comment on it with what I've discovered.

why-no-cancel.diff

Wen Bo (Bill) Li

Apr 16, 2022, 1:09:14 AM
to grpc.io
Sure, I wouldn't mind creating an issue on github and here is the link: https://github.com/grpc/grpc-java/issues/9087

I have tested the code with the suggestion you provided above and it works. Based on the description you provided above, the design of setting the cancelled flag is not finalized yet. Am I correct?

If this is the new approach, what would be the new best practice in terms of code implementation on detecting client disconnection (using the workaround you provided above)?

Eric Anderson

Apr 18, 2022, 10:53:27 AM
to Wen Bo (Bill) Li, grpc.io
On Fri, Apr 15, 2022 at 10:09 PM 'Wen Bo (Bill) Li' via grpc.io <grp...@googlegroups.com> wrote:
> If this is the new approach, what would be the new best practice in terms of code implementation on detecting client disconnection (using the workaround you provided above)?

It's not really a new vs old approach. Both approaches co-exist. The ServerCall-based one has a callback that won't run concurrently with other ServerCall.Listener/StreamObserver callbacks. That makes it very convenient for non-thread-safe usage. The Context one is good for code needing a callback via another thread, and is more generic and can apply to non-gRPC and nested usages. It shouldn't ordinarily matter which of the two isCancelled() methods you use in your specific code, with two exceptions: this bug, and the fact that Context always becomes cancelled at the end of the RPC, while the ServerCall one only becomes cancelled if the RPC was actually cancelled.
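
As a rough sketch of the Context-side callback mentioned above: io.grpc.Context lets code register a CancellationListener together with the Executor it should run on. The class and method names below are hypothetical, and the Context is created and cancelled by hand purely for demonstration; inside a service method one would register on Context.current() instead. The ServerCall-based callback is presumably ServerCallStreamObserver.setOnCancelHandler(Runnable) or ServerCall.Listener.onCancel(), which is not shown here.

```java
import io.grpc.Context;
import java.util.concurrent.atomic.AtomicBoolean;

public class ContextListenerDemo {

    /** Returns true if the listener fired when the Context was cancelled. */
    public static boolean listenerFiresOnCancel() {
        // Hand-made cancellable Context; in an RPC, gRPC creates and cancels it.
        Context.CancellableContext ctx = Context.ROOT.withCancellation();
        AtomicBoolean notified = new AtomicBoolean(false);

        // Runnable::run acts as a direct executor, so the listener runs on
        // whichever thread performs the cancellation. A real service could
        // pass its own executor to move the callback to another thread.
        ctx.addListener(context -> notified.set(true), Runnable::run);

        ctx.cancel(null); // stands in for the RPC ending or being cancelled
        return notified.get();
    }
}
```

Because the listener runs on the supplied executor, it is not serialized with the StreamObserver callbacks, so any state it shares with the handler needs to be thread-safe.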

Wen Bo (Bill) Li

Apr 18, 2022, 1:31:26 PM
to grpc.io
Okay, thanks. It looks like Context is the more generic way to detect a client disconnecting from the server, compared to ServerCall. In this case, I will use Context for our service.