InvalidOperationException on Server with Client Streaming gRPC

scf...@microsoft.com

unread,

Dec 20, 2017, 2:50:10 PM12/20/17

to grpc.io

Hi gRPC experts,

I have a Client Streaming gRPC (C# client, C# server) set up, and I ran into an issue where about 20% of requests would fail to complete. On the client, the error was DEADLINE_EXCEEDED, but on the server, our injected gRPC logger was seeing the following warnings. The server was successfully fulfilling the request - it just couldn't send a reply back to the client.

[gRPC Warning] Exception while handling RPC. System.InvalidOperationException: Operation is not valid due to the current state of the object. at Grpc.Core.Internal.AsyncCallServer`2.SendStatusFromServerAsync(Status status, Metadata trailers, Tuple`2 optionalWrite) at Grpc.Core.Internal.ClientStreamingServerCallHandler`2.<HandleCall>d__4.MoveNext() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw() at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Grpc.Core.Server.<HandleCallAsync>d__39.MoveNext()

I re-deployed the app with more logging, but lost the repro. I was going to start with capturing traces as noted in this SO question; is there anything else I can do?

I'm wondering if you have any pointers on what might be happening?

Thanks!

Scott

Jan Tattermusch

unread,

Jan 8, 2018, 9:17:42 AM1/8/18

to grpc.io

When deadlines are set for a call, there is an inherent race condition in which party first sees the deadline was exceeded - it can be either the client or the server (they see the same deadline for a call so the client side part of the call will expire at almost the same time as the server-side of the call).

I suspect in your case, the client-side of the call realizes the call has timed out first (and resets the HTTP2 stream to the server). Right after that, the server sees the timeout as well and tries to send the "Cancelled" status to the client - but the connection has just been reset by the client and it results in an error.

As said, this race is inherent and it is quite natural that it happens. On the other hand, we shouldn't really print a somewhat misleading error when that happens (= i.e. the problem is with too verbose logging in this situation, not with the RPC handling logic itself).

Specifically for your applications, if you have that many (>20%) or requests timing out, you should probably increase the call deadline.

Jan Tattermusch

unread,

Jan 8, 2018, 9:21:29 AM1/8/18

to grpc.io

I filed https://github.com/grpc/grpc/issues/13943 the excessive logging.

Reply all

Reply to author

Forward