Why keepAliveTime is infinite by default?


Grpc learner

Jul 16, 2018, 9:53:58 PM
to grpc.io
Why is keepAliveTime infinite by default?
Doesn't that make it easy for connections to break?

Eric Anderson

Jul 17, 2018, 10:33:18 AM
to x...@autonomic.ai, grpc-io
I made the GRFC for keepalive. Yes, keepalive is disabled by default, and lots of people could benefit from it being enabled. Today those users have to enable it manually.

There are two parts to why.

First, I received a lot of push-back (in various forms) about keepalive in general, the costs it incurs, and the DDoS risk. Keeping it disabled by default avoided some of those concerns, so we could get the feature out to users sooner. The interoperability landscape has changed some here as well; previously we would have needed some hacks in the spec because I was aware of proxies that didn't allow keepalive at all.

Second, the solution for "defaults" is currently incomplete, as IDLE_TIMEOUT is not implemented cross-language (only Java implements it today). For a full solution you need to handle the case when there are no RPCs, but trying to enable KEEPALIVE_WITHOUT_CALLS by default is a non-starter; IDLE_TIMEOUT is a more efficient solution for periods of inactivity. I was waiting for IDLE_TIMEOUT to be implemented before coming back around and arguing that we should have some of this stuff on by default. The lack of IDLE_TIMEOUT in other languages has been causing enough of a problem (there are workarounds, but...) that I was able to get agreement that we need to make progress on this this quarter.
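For concreteness, here is a minimal sketch of what opting in looks like in grpc-java. The `keepAliveTime` and `idleTimeout` methods are the real `ManagedChannelBuilder` API; the target address and the durations are made-up illustrative values, not recommendations.

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import java.util.concurrent.TimeUnit;

public class KeepaliveExample {
    public static void main(String[] args) {
        ManagedChannel channel = ManagedChannelBuilder
                .forAddress("example.com", 443)      // placeholder target
                .keepAliveTime(5, TimeUnit.MINUTES)  // enable keepalive pings (infinite, i.e. off, by default)
                .idleTimeout(30, TimeUnit.MINUTES)   // tear down the transport after 30 min with no RPCs
                .build();
        channel.shutdownNow();
    }
}
```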


Grpc learner

Jul 17, 2018, 4:42:34 PM
to grpc.io
Hi Eric,

Thank you for your reply!
Yes, keepalive is disabled by default. But I also noticed `keepAliveTimeoutNanos`, which defaults to 20 seconds.
Does `keepAliveTimeoutNanos` have any effect if keepalive is not enabled?

Eric Anderson

Jul 18, 2018, 1:13:52 PM
to x...@autonomic.ai, grpc-io
On Tue, Jul 17, 2018 at 1:42 PM Grpc learner <x...@autonomic.ai> wrote:
Yes, keepalive is disabled by default. But I also noticed `keepAliveTimeoutNanos`, which defaults to 20 seconds.
Does `keepAliveTimeoutNanos` have any effect if keepalive is not enabled?

No, it doesn't. The 20-second default just means you don't need to specify a timeout when you enable keepalive.
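In grpc-java terms, `keepAliveTimeout` only comes into play once `keepAliveTime` is set to something finite. A sketch (assuming `ManagedChannel`, `ManagedChannelBuilder`, and `TimeUnit` are imported; the target and the 1-minute interval are placeholders):

```java
// keepAliveTimeout only matters once keepAliveTime is finite (keepalive enabled).
ManagedChannel channel = ManagedChannelBuilder
        .forAddress("example.com", 443)          // placeholder target
        .keepAliveTime(1, TimeUnit.MINUTES)      // enables keepalive pings
        .keepAliveTimeout(20, TimeUnit.SECONDS)  // how long to wait for a ping ack (the default)
        .build();
```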

Grpc learner

Jul 18, 2018, 3:11:24 PM
to grpc.io
Hi Eric,
What value of keepAliveTime do you recommend?

Eric Anderson

Jul 18, 2018, 5:35:25 PM
to x...@autonomic.ai, grpc-io
On Wed, Jul 18, 2018 at 12:11 PM Grpc learner <x...@autonomic.ai> wrote:
What value of keepAliveTime do you recommend?

As high as you can tolerate. It varies a lot per network, though, and is a cost/benefit trade-off. In a data center, 2 hours could be fine; you generally don't expect those networks to fail. Some networks will disconnect after 1 minute of inactivity; if you're on one of those, use 55 seconds or similar, after getting agreement from the service owner. If going over the Internet, you may copy web browsers and use 30-45 seconds (I think Firefox is somewhere near 30 seconds and Chrome near 45, but that may be completely wrong or out of date).

Grpc learner

Jul 19, 2018, 3:57:29 PM
to grpc.io
Hi Eric,

Thanks!
Another question: does `keepAlive` affect connection re-use?
I noticed that `keepAlive` is set on the channel builder, but it is applied to a connection (not a channel).
I was wondering: what is the difference/relationship between a connection and a channel?

Eric Anderson

Jul 20, 2018, 5:29:12 PM
to x...@autonomic.ai, grpc-io
On Thu, Jul 19, 2018 at 12:57 PM Grpc learner <x...@autonomic.ai> wrote:
Another question: does `keepAlive` affect connection re-use?
I noticed that `keepAlive` is set on the channel builder, but it is applied to a connection (not a channel).
I was wondering: what is the difference/relationship between a connection and a channel?

Keepalive is connection-specific; it is managed separately for each connection. All connections have the same keepalive settings, but the timers and activity of each are independent. That also means that if one connection fails because of keepalive, other connections will remain running if they are still healthy.

Grpc learner

Jul 20, 2018, 5:50:32 PM
to grpc.io
Thanks Eric!
If a connection is broken, what is gRPC's policy for creating a new connection? And how does gRPC re-use/share a connection?

Eric Anderson

Jul 27, 2018, 3:59:12 PM
to x...@autonomic.ai, grpc-io
On Fri, Jul 20, 2018 at 2:50 PM Grpc learner <x...@autonomic.ai> wrote:
If a connection is broken, what is gRPC's policy for creating a new connection?

gRPC will reconnect when needed. More details can be found at https://github.com/grpc/grpc/blob/master/doc/connectivity-semantics-and-api.md . The implementations aren't entirely in agreement with some details of that document, but it should be accurate from a user's perspective.

Also, the load balancer does get to influence some of those decisions, so things can vary depending on the load balancer. 

And how does gRPC re-use/share a connection?

When using the pick-first load balancer (the default), all RPCs on a channel use the same connection. C-based languages may also share connections across channels. The round-robin load balancer and gRPC-LB will distribute RPCs over more connections.

HTTP/2 allows multiple "streams" over the same connection. gRPC uses that to multiplex multiple RPCs on a single connection. Servers can limit the number of concurrent streams/RPCs.
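The last point, a server capping concurrent streams (and thus concurrent RPCs) per connection, can be sketched in grpc-java with grpc-netty (assuming `Server` and `NettyServerBuilder` are imported; the port and the limit of 100 are placeholder values):

```java
// Sketch: limit the number of concurrent HTTP/2 streams (RPCs) per connection.
Server server = NettyServerBuilder
        .forPort(8443)                         // placeholder port
        .maxConcurrentCallsPerConnection(100)  // extra RPCs queue until a stream frees up
        .build();
```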

Grpc learner

Jul 27, 2018, 4:06:25 PM
to grpc.io
Thanks Eric! I will read the document.

Does gRPC have something like a connection pool or channel pool to limit resource usage?

Srini Polavarapu

Jul 29, 2018, 12:28:05 AM
to grpc.io
Yes, the languages that wrap the C-core implementation (C++, Python, Ruby, etc.) pool connections across channels if the channel args match.