connection retry policy in 1.2


Yaz Saito

Apr 14, 2017, 3:25:11 PM
to grpc-io
I'm trying out 1.2.4 (I was using 1.0.1), and I noticed that the channel connection retry policy has changed. It looks like the newest code retries the connection at intervals of 1s, 20s, 32s, 51.2s, etc., and I'm not happy with the huge jump from 1s to 20s. This means that when I start a server and then try to connect to it (which happens often in unit tests), I often wait for 20s. Am I understanding the code right, and is there a way to force the channel to reconnect more quickly?

--
yaz

Penn (Dapeng) Zhang

Apr 18, 2017, 6:54:17 PM
to grpc.io
In what language? There are (at least) two kinds of retries: connection attempt retry, and high-level RPC call retry. I assume your question is about the former.

Eric Anderson

Apr 18, 2017, 7:01:37 PM
to Penn (Dapeng) Zhang, grpc.io
Also, how did you get those reconnect attempt times?

All languages should be following our standard connection backoff algorithm. If one's not, that would be a bug. Note that 20s is permitted as a minimum connect time. Basically, we're willing to wait 20 seconds for DNS, TCP, TLS, and similar to complete. If they fail though, then there may be a delay until the next attempt.
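
For reference, the nominal backoff sequence can be sketched as below. This is a minimal illustration, not gRPC's actual implementation. It assumes the constants listed in the connection backoff document in the gRPC repository (initial backoff 1s, multiplier 1.6, maximum backoff 120s) and leaves out jitter. The function name NominalBackoffs is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumed constants, taken from gRPC's connection backoff document.
constexpr double kInitialBackoffSec = 1.0;
constexpr double kMultiplier = 1.6;
constexpr double kMaxBackoffSec = 120.0;

// Hypothetical helper: returns the first `n` nominal backoff values in
// seconds. Jitter is deliberately omitted to keep the sequence deterministic.
std::vector<double> NominalBackoffs(int n) {
  assert(n >= 0);
  std::vector<double> out;
  double backoff = kInitialBackoffSec;
  for (int i = 0; i < n; ++i) {
    out.push_back(backoff);
    // Grow geometrically, but never beyond the cap.
    backoff = std::min(backoff * kMultiplier, kMaxBackoffSec);
  }
  return out;
}
```

Under these assumptions the sequence starts at 1s, then 1.6s, then 2.56s, and stays at 120s once the cap is reached.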

--
You received this message because you are subscribed to the Google Groups "grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email to grpc-io+unsubscribe@googlegroups.com.
To post to this group, send email to grp...@googlegroups.com.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit https://groups.google.com/d/msgid/grpc-io/a4b82f09-4eb7-4e6b-aabd-71198e38b558%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Yaz Saito

Apr 21, 2017, 4:29:01 PM
to Eric Anderson, Penn (Dapeng) Zhang, grpc.io
The client and servers are C++, and I'm talking about the connection retry, not an RPC retry. Also, the server is on a UNIX socket (unix:/path).

--
yaz

ncte...@google.com

Apr 25, 2017, 8:34:58 PM
to grpc.io, ej...@google.com, zda...@google.com
The behavior you are seeing is correct. The default values being used can be found at https://github.com/grpc/grpc/blob/master/src/core/ext/filters/client_channel/subchannel.c#L62.

If you would like to override the default behavior, you can use three channel args:
- GRPC_ARG_MIN_RECONNECT_BACKOFF_MS
- GRPC_ARG_MAX_RECONNECT_BACKOFF_MS
- GRPC_ARG_INITIAL_RECONNECT_BACKOFF_MS

Setting those to lower values should cause the reconnection to happen faster.


Eric Anderson

Apr 26, 2017, 11:31:47 AM
to Noah Eisen, grpc.io, Dapeng Zhang
That behavior is the opposite of the spec. The initial backoff occurs during the 1 second after the connection attempt started, but the initial attempt should be given up to 20 seconds before it times out. If the initial attempt takes 100 ms, then the next attempt would happen 900 ms later. If the initial attempt takes 20 seconds, then the next attempt would happen immediately.
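
The timing rule described above can be sketched as follows. This is only an illustration of the arithmetic in this message, not code from gRPC, and the name DelayBeforeNextAttempt is hypothetical: the wait before the next attempt is the backoff minus the time the failed attempt already consumed, never less than zero.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical helper: the backoff is measured from the start of the failed
// attempt, so time already spent connecting counts against it.
milliseconds DelayBeforeNextAttempt(milliseconds attempt_duration,
                                    milliseconds backoff) {
  assert(attempt_duration.count() >= 0 && backoff.count() >= 0);
  return std::max(milliseconds(0), backoff - attempt_duration);
}
```

With a 1 second backoff, an attempt that fails after 100 ms gives a 900 ms wait, and an attempt that uses the full 20 seconds gives no wait at all.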
