Can't connect after server disconnects during streaming call


rSam

Apr 2, 2018, 5:39:30 PM
to grpc.io
Hi,

I need some help figuring out whether this is a bug in gRPC or a mistake in my code.
Thanks!

I have this scenario between Python and C#:
  • A Python server with a streaming RPC
  • A C# client that constantly queries that streaming RPC
  • Problem: if the server goes down while the client is in the middle of a call (I'm not sure of the exact timing), recreating the channel in the client does not work against a new server
    • The client keeps creating channels that stay Idle, and every call times out with DeadlineExceeded
  • Output and simplified code below

Normally it works fine, but if I terminate the server manually (Ctrl+C) while the client is in the streaming call,
spawning a new server and recreating the channel in the client won't work. Whether or not I let the client wait out
the deadline while the server is down seems to make no difference to reconnecting later.

Calls to call.Dispose() or channel.ShutdownAsync() seem to have no effect. Passing a grace period to server.stop() had no effect either.

When reconnection works, the channel state is TransientFailure while the server is down. If the channel somehow reaches the 'Idle' state during the downtime, it never connects again.

// client loop after failure ...
[error] Status(StatusCode=DeadlineExceeded, Detail="Deadline Exceeded")
[info] CHANNEL STATE 'Idle'

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
CODE
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Server.py
from concurrent import futures
import grpc

    # streaming response (method of the servicer class)
    def GetItemUpdates(self, request, context):
        for item in my_db.items():
            ...
            yield response

# server listening
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
server.start()
try:
    ...
# interrupted by Ctrl+C
except KeyboardInterrupt:
    server.stop(0)  # grace of 0: stop without waiting for in-flight calls

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Client.cs

    // called from a thread periodically
    private async Task RefreshStudyStatusFromController()
    {
        Channel channel = new Channel("localhost:myport", ChannelCredentials.Insecure);
        var controller = new P.PClient(channel);
        try {
            // call the streaming rpc
            call = controller.GetItemUpdates(input, deadline: deadline);
            while (await call.ResponseStream.MoveNext())
            { ... }
        }
        catch (RpcException rpcEx) { ...
            call.Dispose();
            await channel.ShutdownAsync();
        }
    }

Raul Sampedro

Apr 2, 2018, 6:03:29 PM
to grpc.io
Adding some logs for more information. The behavior seems to depend on whether gRPC reports the outage as a connection failure or as a timeout.

- - - - - - - - - - - - - - - - - -
When re-connection works:

- - - - - - - - - - - - - - - - - -  
// updates working ...
[info] Asking at 21:54:39.0502473
[info]  Done at 21:54:39.0535079
// SERVER GOES DOWN HERE
[info] Asking at 21:54:40.0454472
Exception thrown: 'Grpc.Core.RpcException' in mscorlib.dll
[info]  Failed at 21:54:41.1077726
[error] Status(StatusCode=Unavailable, Detail="Connect Failed")
[info] CHANNEL STATE 'Connecting'
[info] Asking at 21:54:41.1107742
Exception thrown: 'Grpc.Core.RpcException' in mscorlib.dll
[info]  Failed at 21:54:42.0996787
[info] CHANNEL STATE 'TransientFailure'
// ... 
// eventually spawning a new server will make the client work again
[info] Asking at 21:54:50.0455668
[info]  Done at 21:54:50.0535684

- - - - - - - - - - - - - - - - - -
When re-connection fails:

- - - - - - - - - - - - - - - - - -  
// updates working ...
[info] Asking at 21:54:53.0486147
[info]  Done at 21:54:53.0536107
// SERVER GOES DOWN HERE
[info] Asking at 21:54:54.0490487
Exception thrown: 'Grpc.Core.RpcException' in mscorlib.dll
[info]  Failed at 21:55:04.0758809
[error] Status(StatusCode=DeadlineExceeded, Detail="Deadline Exceeded")
[info] CHANNEL STATE 'Idle'
[info] Asking at 21:55:04.0774122
Exception thrown: 'Grpc.Core.RpcException' in mscorlib.dll
[info]  Failed at 21:55:14.1062472
[error] Status(StatusCode=DeadlineExceeded, Detail="Deadline Exceeded")
[info] RS CHANNEL 'Idle'
// ... will never reconnect
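
One way to make the difference between the two runs explicit is to measure how long each failed attempt took. The Python sketch below is only an illustration built on the log lines quoted above: the helper names (parse_time, failed_attempts) are made up for this example, and it assumes all timestamps fall on the same day.

```python
import re
from datetime import datetime

# Log excerpts copied from the two runs above.
RECONNECT_OK = """\
[info] Asking at 21:54:40.0454472
[info]  Failed at 21:54:41.1077726
[error] Status(StatusCode=Unavailable, Detail="Connect Failed")
[info] CHANNEL STATE 'Connecting'
"""

RECONNECT_STUCK = """\
[info] Asking at 21:54:54.0490487
[info]  Failed at 21:55:04.0758809
[error] Status(StatusCode=DeadlineExceeded, Detail="Deadline Exceeded")
[info] CHANNEL STATE 'Idle'
"""

EVENT = re.compile(r"\[info\]\s+(Asking|Failed|Done) at (\d{2}:\d{2}:\d{2}\.\d+)")
STATUS = re.compile(r"StatusCode=(\w+)")


def parse_time(text):
    # The logs carry 7 fractional digits; datetime keeps at most 6.
    head, frac = text.split(".")
    return datetime.strptime(f"{head}.{frac[:6]}", "%H:%M:%S.%f")


def failed_attempts(log):
    """Return a list of (seconds_until_failure, status_code) pairs.

    status_code is None when no Status(...) line follows the failure.
    Assumes all timestamps are on the same day (no midnight wrap).
    """
    results = []
    asked = None
    for line in log.splitlines():
        event = EVENT.search(line)
        if event:
            kind, stamp = event.groups()
            if kind == "Asking":
                asked = parse_time(stamp)
            elif kind == "Failed" and asked is not None:
                elapsed = (parse_time(stamp) - asked).total_seconds()
                results.append((elapsed, None))
                asked = None
            continue
        status = STATUS.search(line)
        if status and results and results[-1][1] is None:
            # Attach the status code to the most recent failure.
            results[-1] = (results[-1][0], status.group(1))
    return results


if __name__ == "__main__":
    for name, log in (("reconnects", RECONNECT_OK), ("stuck", RECONNECT_STUCK)):
        for elapsed, code in failed_attempts(log):
            print(f"{name}: failed after {elapsed:.2f}s with {code}")
```

On these excerpts the recoverable run fails after about 1.06 s with Unavailable, while the stuck run fails after about 10.03 s with DeadlineExceeded, so the stuck client appears to sit out its whole deadline instead of noticing the refused connection.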




--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/cb1b4560-765b-44cc-a95b-7913f0adaf91%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jan Tattermusch

Apr 6, 2018, 4:59:29 AM
to grpc.io
What gRPC version are you using? This looks identical to https://github.com/grpc/grpc/issues/14014, which appears to be resolved as of gRPC v1.10.0. Can you please retest with the latest gRPC version and confirm?

Raul Sampedro

Apr 7, 2018, 12:00:19 AM
to Jan Tattermusch, grpc.io
Thanks for the reply. I'm running:
  • 1.9.1 Python
  • 1.9.0 C# (via NuGet)
I will try to reproduce it consistently with an earlier version of my code, then retest with 1.10.

Thanks.
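
For anyone checking their own setup against the v1.10.0 threshold mentioned in this thread, here is a small Python sketch. The helper names (parse_version, has_fix) are hypothetical and it assumes plain dotted-integer version strings, so pre-release suffixes are not handled.

```python
def parse_version(text):
    """Turn '1.9.1' into (1, 9, 1). Assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


# Version from which the linked issue is reported as resolved in this thread.
FIXED_IN = parse_version("1.10.0")


def has_fix(version):
    """True if `version` is at or above the version reported as fixed."""
    return parse_version(version) >= FIXED_IN


if __name__ == "__main__":
    for version in ("1.9.1", "1.9.0", "1.10.0"):
        print(version, has_fix(version))
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.1" sorts after "1.10.0", which would wrongly report the 1.9.x installs as already containing the fix.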

