Ping rate limiting is too aggressive

Damien Neil

Dec 2, 2024, 5:19:10 PM
to grpc.io
(I'd file this as an issue, but so far as I can tell this spans all gRPC implementations and I can't figure out which GitHub tracker to use in that case.)

gRPC servers set a fairly aggressive limit on the number of pings clients can send. The algorithm is detailed here:

In essence, a client can send two PINGs per HEADERS or DATA frame sent by the server. Any more, and the server closes the connection with an ENHANCE_YOUR_CALM error. Technically, the limit is two pings per 5 minutes or per 2 hours, depending on whether there's an outstanding call. However, the client can't tell whether there's an outstanding call, because its PING frame can race with the server finishing a call, and even 5 minutes is essentially forever in computer terms. So it's effectively two PINGs per HEADERS/DATA frame sent by the server.
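
For concreteness, here is a minimal sketch of that enforcement in Go. The names (pingPolicy, onPing, onFrameSent) are mine, not taken from any gRPC implementation, and the details are simplified:

package pingpolicy

import "time"

// pingPolicy is a hypothetical model of the server-side ping strike rule
// described above, not code from any gRPC implementation.
type pingPolicy struct {
	minInterval time.Duration // 5 minutes with an active call, 2 hours without
	strikes     int           // pings that arrived "too soon"
	lastPing    time.Time
}

// onPing is called when the server receives a PING frame. It reports whether
// the server should close the connection with ENHANCE_YOUR_CALM.
func (p *pingPolicy) onPing(now time.Time) (closeConn bool) {
	if now.Sub(p.lastPing) < p.minInterval {
		p.strikes++
	}
	p.lastPing = now
	return p.strikes > 2 // a third "too soon" ping closes the connection
}

// onFrameSent is called when the server sends a HEADERS or DATA frame;
// sending real frames is what resets the strike count today.
func (p *pingPolicy) onFrameSent() { p.strikes = 0 }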

I learned of this in https://go.dev/issue/70575, which is an issue filed against Go's HTTP/2 client, caused by a new health check we'd added: When a request times out or is canceled, we send a RST_STREAM frame for it. Servers don't respond to RST_STREAM, so we bundle the RST_STREAM with a PING frame to confirm that the server is still alive and responsive. In the event many requests are canceled at once, we send only one PING for the batch.
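
A simplified sketch of that client behavior, written against golang.org/x/net/http2's Framer; this is an illustration rather than the actual x/net code, and clientConn is a hypothetical wrapper:

package cancelping

import "golang.org/x/net/http2"

// clientConn is a hypothetical stand-in for a client connection wrapper.
type clientConn struct {
	framer *http2.Framer
}

// cancelStreams cancels a batch of streams, bundling a single PING with the
// batch so the client can confirm the server is still alive and has processed
// everything sent before the PING.
func (cc *clientConn) cancelStreams(streamIDs []uint32) error {
	pingSent := false
	for _, id := range streamIDs {
		if err := cc.framer.WriteRSTStream(id, http2.ErrCodeCancel); err != nil {
			return err
		}
		if !pingSent {
			// One PING covers the whole batch of cancellations.
			if err := cc.framer.WritePing(false, [8]byte{}); err != nil {
				return err
			}
			pingSent = true
		}
	}
	return nil
}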

This triggers gRPC servers' rate limiting when several requests are canceled in short succession.

Unfortunately, there's no good way for the client to avoid this: Consider the case where we send three requests, one minute apart, and cancel each request before the server begins responding. The third PING triggers the rate limit, and the server closes the connection.

I think that gRPC servers should reset the ping strike count when they *receive* a HEADERS or DATA frame. This limits clients to at most two pings per real frame sent, and essentially places pings under the umbrella of whatever rate limiting is being applied to HEADERS/DATA. PING frames should be cheap to process compared to HEADERS/DATA, so limiting them to a small multiple of the more expensive frames renders them ineffective as a DoS vector.

This approach would ensure that a client waiting for a response to a request may always send at least one PING frame to confirm that the server is still alive.
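
In terms of the sketch above, the proposal amounts to adding one more reset hook (again hypothetical, not any implementation's code):

// onFrameReceived would be called when the server receives a HEADERS or DATA
// frame. Resetting here means a client that is sending real traffic can
// always get at least one PING through while it waits for a response.
func (p *pingPolicy) onFrameReceived() { p.strikes = 0 }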

- Damien

Craig Tiller

Dec 2, 2024, 5:53:59 PM
to Damien Neil, grpc.io
I forget the exact history here, but I'll note that a design whereby receiving HEADERS or DATA resets the counter allows the client to control resource usage, and that (at least in C++) the ping response path forces writes to be scheduled immediately, meaning that a repeated HEADERS/PING pattern blasted at line rate could very easily keep a thread busy doing nothing but responding to it.


Eric Anderson

Dec 2, 2024, 5:57:43 PM
to Damien Neil, grpc.io, Craig Tiller, Doug Fawley
On Mon, Dec 2, 2024 at 2:19 PM 'Damien Neil' via grpc.io <grp...@googlegroups.com> wrote:
I learned of this in https://go.dev/issue/70575, which is an issue filed against Go's HTTP/2 client, caused by a new health check we'd added: When a request times out or is canceled, we send a RST_STREAM frame for it. Servers don't respond to RST_STREAM, so we bundle the RST_STREAM with a PING frame to confirm that the server is still alive and responsive. In the event many requests are canceled at once, we send only one PING for the batch.

Our keepalive does something similar, but it is time-based: if it has been X amount of time since the last receipt, then a PING checking the connection is fair. The only problem is the "aggressive" PING rate from the client. The client is doing exactly what the server is trying to prevent: "overzealous" connection checking. I do think it is more appropriate to base this on a connection-level time instead of a per-request time, although you probably don't have a connection-level time to auto-tune to, whereas you do get feedback from requests timing out.

I'm wary of tying keepalive checks to resets/deadlines, as those are load-shedding operations, and people can have aggressive deadlines or cancel aggressively in the normal course of operation. In addition, TCP_USER_TIMEOUT with the RST_STREAM gets you a lot of the same value without requiring additional ACK packets.

Note that I do think the 5 minutes is too long, but that's all I was able to get agreement on. Compared to 2 hours it is short... I really wanted a bit shy of 1 minute, as 1 minute is the magic inactivity timeout for many home NATs and some cloud LBs.
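
For reference, in grpc-go this server-side limit is configurable per server through the keepalive enforcement policy, so an operator who wants something shorter than the 5-minute default can set it; the values below are illustrative:

package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// MinTime is the minimum interval the server will tolerate between
	// client pings before counting strikes; the default is 5 minutes.
	srv := grpc.NewServer(grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
		MinTime:             50 * time.Second, // illustrative: a bit shy of 1 minute
		PermitWithoutStream: true,             // tolerate pings when no RPC is active
	}))
	_ = srv
}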

I think that gRPC servers should reset the ping strike count when they *receive* a HEADERS or DATA frame.

I'm biased against the idea, as that's roughly the behavior of a certain server, and it was nothing but useless and a pain. HEADERS and DATA really have nothing to do with monitoring the connection, so it seems strange to let the client choose when to reset the counter. For BDP monitoring, we need the count to be reset when the server sends DATA, so the client can use PINGs to adjust its receive window size. And I know of an implementation that sent unnecessary frames just to reset the counter so it could send PINGs.

I question whether that gets you what you need. If you start three requests at the same time with timeouts of 1s, 2s, and 3s, you'll still run afoul of the limit.

Damien Neil

Dec 2, 2024, 6:51:19 PM
to Eric Anderson, grpc.io, Craig Tiller, Doug Fawley
Even one minute is really too long.

A common connection failure mode is for a server to become entirely unresponsive, due to a backend restarting or load balancing shifting traffic off a cluster entirely. For HTTP/1 traffic, this results in a single failed request on a connection. Abandoning an HTTP/1 request renders the connection unusable for future requests, so the connection is discarded and replaced with a new one. For HTTP/2 traffic, however, there is no natural limit to the number of requests which can be sent to a dead/unresponsive connection: When a request times out, the client sends an RST_STREAM, and the connection becomes immediately available to take an additional request. There's no acknowledgement of RST_STREAM frames, so sending one doesn't provide any information about whether the lack of response to a request is because the server is generally unresponsive, or because the request is still being processed.

Sending a PING frame along with an RST_STREAM allows a client to distinguish between an unresponsive server and a slow response.

Delay that check by one minute, and we have a one-minute period during which we might be directing traffic to a dead server. That's an eternity.

I question whether that gets you what you need. If you start three requests at the same time with timeouts of 1s, 2s, and 3s, you'll still run afoul of the limit.

Send a PING along with the RST_STREAM for the first request to be canceled, and the ping response confirms that all three requests have arrived at the server. We can then skip sending a PING when canceling the remaining requests.
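
A sketch of that refinement, with hypothetical names; it sends at most one liveness PING at a time and lets later cancellations reuse the result of an earlier ack:

package cancelping

import "sync"

// cancellationPinger decides whether a stream cancellation needs to carry a
// PING. A PING's ack confirms the server received every frame sent before it,
// so requests opened before that PING don't need their own check.
type cancellationPinger struct {
	mu        sync.Mutex
	inFlight  bool // a liveness PING has been sent and not yet acked
	confirmed bool // an ack has confirmed the current requests reached the server
}

// shouldPing reports whether this cancellation should carry a PING.
func (c *cancellationPinger) shouldPing() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.inFlight || c.confirmed {
		return false // an earlier PING (or its ack) already covers this cancellation
	}
	c.inFlight = true
	return true
}

// onPingAck records that the server answered: it is alive, and it has seen
// every frame sent before the PING, including the earlier requests.
func (c *cancellationPinger) onPingAck() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.inFlight, c.confirmed = false, true
}

// onNewRequest notes that a request sent after the last ack is not covered by
// it, so a later cancellation of that request may need a fresh PING.
func (c *cancellationPinger) onNewRequest() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.confirmed = false
}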

Yuri Golobokov

Dec 2, 2024, 7:02:32 PM
to Damien Neil, Eric Anderson, grpc.io, Craig Tiller, Doug Fawley

A common connection failure mode is for a server to become entirely unresponsive
This should be caught by TCP_USER_TIMEOUT. If you enable gRPC keep-alive, then normally TCP_USER_TIMEOUT will be set to the value of keepAliveTimeout (at least in Java/Go AFAIK). Then you can set keepAliveTimeout to, say, 10 seconds to detect unresponsive connections within 10 seconds of sending any frame. But please note it is not recommended to set TCP_USER_TIMEOUT to such low values in an unreliable (e.g. mobile) network environment.
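
For a grpc-go client, that configuration looks roughly like the following; the target address and values are illustrative, and Time must still respect the server's enforcement policy or the keepalive pings themselves will draw ENHANCE_YOUR_CALM:

package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Keepalive pings are sent after Time of inactivity; Timeout is how long
	// to wait for the ack, and (per the note above) is also what gets applied
	// as TCP_USER_TIMEOUT. The target here is a placeholder.
	conn, err := grpc.NewClient("dns:///example.invalid:443",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                5 * time.Minute,  // must not be shorter than the server's enforcement MinTime
			Timeout:             10 * time.Second, // close the connection if the ping isn't acked within 10s
			PermitWithoutStream: false,            // only ping while there are active RPCs
		}))
	if err != nil {
		panic(err)
	}
	defer conn.Close()
}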


Damien Neil

Dec 2, 2024, 8:04:14 PM
to Yuri Golobokov, Eric Anderson, grpc.io, Craig Tiller, Doug Fawley
TCP_USER_TIMEOUT hard-closes the connection when it has been unresponsive for too long, aborting all in-progress requests. That makes it a balancing act: set the timeout too short and you abort valid requests; set it too long and you keep sending to a dead connection. You allude to this when you say that TCP_USER_TIMEOUT must not be set to a low value in an unreliable environment.
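
To make the knob concrete, here is roughly how TCP_USER_TIMEOUT gets set on a Linux client socket in Go; this is a sketch, not gRPC's code:

package main

import (
	"net"
	"syscall"
	"time"

	"golang.org/x/sys/unix"
)

// dialerWithUserTimeout returns a net.Dialer that sets TCP_USER_TIMEOUT
// (Linux-only) on outgoing connections. If transmitted data stays
// unacknowledged for longer than d, the kernel hard-closes the connection,
// aborting every in-flight request on it.
func dialerWithUserTimeout(d time.Duration) *net.Dialer {
	return &net.Dialer{
		Control: func(network, address string, c syscall.RawConn) error {
			var serr error
			err := c.Control(func(fd uintptr) {
				serr = unix.SetsockoptInt(int(fd), unix.IPPROTO_TCP,
					unix.TCP_USER_TIMEOUT, int(d.Milliseconds()))
			})
			if err != nil {
				return err
			}
			return serr
		},
	}
}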

We don't want to close a connection with an in-flight request on it. If we have N long-running requests with no response and the user cancels N-1 of them, we should maintain the connection until the final request receives a response or is canceled by the user. However, we might want to create a new connection for new requests if the existing connection appears to be unresponsive. TCP_USER_TIMEOUT does not provide a mechanism to do that.