Golang gRPC Client Invoke Hangs If Handling Server Crashes While Processing It


brandon...@gmail.com

Jul 28, 2015, 11:14:45 AM
to grpc.io
Folks,

I've been combing through the code in call.go and clientconn.go. The scenario we're seeing is this: if we use grpc.Invoke to call a gRPC method on a server, and that server crashes and restarts during the invocation, grpc.Invoke never returns. I used kill -6 to get the call stack:

goroutine 20 [select, 13 minutes]:
google.golang.org/grpc.(*ClientConn).wait(0xc208137040, 0x7f72290d6180, 0xc208156330, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
/vagrant/code/go/src/google.golang.org/grpc/clientconn.go:282 +0x3c4
google.golang.org/grpc.Invoke(0x7f72290d6180, 0xc208156330, 0xc2080f80c0, 0x33, 0x6ef6c0, 0xc20813acc0, 0x6ef720, 0xc20813acd0, 0xc208137040, 0x0, ...)
/vagrant/code/go/src/google.golang.org/grpc/call.go:159 +0x6d2
main.(*adapter).Call(0xc20805c360, 0x7f72290d6180, 0xc208156330, 0xc2081565d0, 0x22, 0xa97eb0, 0x0, 0x0, 0x0, 0x0, ...)

It seems that while the client is waiting for the response, the connection terminates, which triggers a connection error here


It then continues and gets stuck waiting here (the ClientConn.wait call shown in the stack trace above):


The connection is dialed and connects, but Invoke never proceeds past that point. I noticed that the failFast CallOption doesn't seem to be available yet. What is the correct way to ensure that Invoke returns, rather than hanging indefinitely, if the first attempt results in a terminated connection?

Qi Zhao

Jul 28, 2015, 2:08:29 PM
to brandon...@gmail.com, grpc.io
A connection error is treated as a transient error, and gRPC keeps retrying to complete the pending RPC. Please set a deadline for the RPC, or cancel that RPC directly.


--
Thanks,
-Qi

brandon...@gmail.com

Jul 28, 2015, 2:44:51 PM
to grpc.io, zh...@google.com
Thanks Qi. In my experience, if the connection dies during an earlier attempt, the client does reconnect, but it hangs before making another attempt and just waits endlessly. I have started to implement my own Dialer using WithDialer, which explicitly cancels the context (by making use of the CancellableContext) if a redial happens during an evaluation. This seems to help in my case.

Qi Zhao

Jul 28, 2015, 3:21:00 PM
to brandon...@gmail.com, grpc.io
On Tue, Jul 28, 2015 at 11:44 AM, <brandon...@gmail.com> wrote:
Thanks Qi.  From my experience, if the connection dies during an earlier attempt, it does reconnect but it hangs before making another attempt and
I am not sure what exactly happened in your case. It would be good if you could write a test that reproduces your observation.



--
Thanks,
-Qi