Hi team,
I’m encountering a problem with the Istio-based Envoy ingress
gateway that I use as the entry point to services in my cluster.
The issue is that this ingress gateway uses an HPA to scale pods up and down, and gRPC connections established while the pod count is at the minimum are not rebalanced across the additional pods after a scale-up.
Because of this, a few Envoy pods end up serving most of the gRPC requests, and client-side P99 latencies are negatively impacted during peak throughput. The gRPC server has a max_connection_age of 5 minutes, after which it sends a GOAWAY; my understanding is that Envoy does not propagate this to the gRPC client, so the client keeps its existing connection to Envoy.
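For context, the server-side keepalive setting I’m referring to looks roughly like this (Go sketch; the grace period value here is just illustrative, only the 5-minute max connection age matches my actual setup):

```go
package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func newServer() *grpc.Server {
	// Server-side keepalive: after 5 minutes the server sends a GOAWAY on the
	// HTTP/2 connection so the peer (Envoy, in this setup) re-establishes it.
	kp := keepalive.ServerParameters{
		MaxConnectionAge:      5 * time.Minute,
		MaxConnectionAgeGrace: 30 * time.Second, // illustrative grace period, not from my actual config
	}
	return grpc.NewServer(grpc.KeepaliveParams(kp))
}
```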
Is there a way to solve this rebalancing issue for gRPC connections when auto-scale events happen? Perhaps there is some configuration that can help.
PS: Attaching some metric screenshots as well.