Weighted

Ryan Li

Nov 20, 2023, 11:22:06 PM
to grpc.io
Hello grpc community,

We are trying to figure out how to load balance, from the client side, across weighted endpoints behind a primary domain. We use grpc-java 1.3

The context:
  • We have a single long-running client that establishes a gRPC channel to a cluster of gRPC services behind a single L7 application-level load balancer, which balances round-robin style.
  • Our client sends high-traffic streaming calls to the services behind each load balancer.
  • A single load balancer and its cluster can be considered a single unit. We want to spin up more units and map each load balancer behind a primary domain name, which the client would then call.
  • We have configured a weighted routing policy for this primary DNS name, with a weight assigned to each load balancer.
Example: 
  1. We have Loadbalancer A, Loadbalancer B, and DNS P
  2. We associate A and B behind P, each with a weight of 0.5
  3. If we query DNS P 10,000 times, then it resolves to LB A ~5000 times, and LB B ~5000 times
The issue arises when we actually send requests through the client, which establishes a long-running gRPC channel with the primary DNS name.

We found that the behavior is not what we expected:
  • The amount of traffic actually routed to each cluster is not proportional to the weights we assigned.
  • The client establishes a connection to a single cluster and never changes it, so all requests go to that one cluster rather than being balanced across the clusters according to the weights.
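
As a rough illustration of the mismatch, here is a toy simulation using only the JDK. It is not gRPC code: the class `WeightedDnsSim` and the names `lb-a` and `lb-b` are made up for this sketch, and the weighted DNS record is modeled with a seeded random pick. It compares resolving on every request with resolving once and reusing the connection, which is roughly what a long-lived channel does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Toy model of a weighted DNS record; an assumption for illustration, not a real resolver. */
final class WeightedDnsSim {
    private final String[] targets;
    private final double[] weights;
    private final Random random;

    WeightedDnsSim(String[] targets, double[] weights, long seed) {
        this.targets = targets;
        this.weights = weights;
        this.random = new Random(seed);
    }

    /** Returns one target, chosen with probability proportional to its weight. */
    String resolve() {
        double total = 0;
        for (double w : weights) {
            total += w;
        }
        double r = random.nextDouble() * total;
        for (int i = 0; i < targets.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return targets[i];
            }
        }
        return targets[targets.length - 1]; // guard against floating-point rounding
    }

    /** Resolves before every request: this is what repeated DNS queries measure. */
    static Map<String, Integer> resolvePerRequest(WeightedDnsSim dns, int requests) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < requests; i++) {
            counts.merge(dns.resolve(), 1, Integer::sum);
        }
        return counts;
    }

    /** Resolves once and sends everything to that target: a long-lived connection. */
    static Map<String, Integer> resolveOnce(WeightedDnsSim dns, int requests) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String pinned = dns.resolve();
        counts.put(pinned, requests);
        return counts;
    }

    public static void main(String[] args) {
        String[] lbs = {"lb-a", "lb-b"};
        double[] weights = {0.5, 0.5};
        System.out.println(resolvePerRequest(new WeightedDnsSim(lbs, weights, 42L), 10_000));
        System.out.println(resolveOnce(new WeightedDnsSim(lbs, weights, 42L), 10_000));
    }
}
```

The first map should come out close to an even split, while the second always contains a single entry holding all 10,000 requests. The DNS weights only influence which target gets picked at resolution time, not how requests are spread afterwards.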

Configuring MAX_CONNECTION_AGE on the server side looks like it would solve this, but for certain reasons we cannot modify that property on the server, so we want to investigate whether there are solutions that work purely from the client side.
Some things we have found or are planning to look into:
  1. defaultLoadBalancingPolicy("round_robin")
  2. Weighted round robin
    • https://github.com/grpc/proposal/pull/343/files
    • Our weighted routing policy is configured externally to the grpc service, server or client side, so this doesn't seem exactly usable for our scenario, although its naming sounds very similar to what we are trying to do.
    • This seems like a newer change, and it is infeasible for us to upgrade to a newer version of grpc-java for now
  3. Recreating the channel on some interval
    • This can be ruled out since it goes against grpc best practices and creating channels so often is expensive
Any ideas or suggestions would be greatly appreciated, thanks

Ivy Zhuang

Dec 13, 2023, 7:12:10 PM
to grpc.io
The plans you have all seem related.
1. It is true that "round_robin" requires your DNS to return a list of backend addresses.
2. WRR inherits from "round_robin". It also requires your DNS server to return a list of backend addresses to round-robin across, so it is not feasible in your case. (FWIW, WRR would also require the backend servers to report metrics to the clients, which determine the weights dynamically; it does not use a static weight distribution like the one you configured in DNS.)
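
To see why a single resolved address defeats round_robin, here is a minimal JDK-only sketch of the picking logic. `RoundRobinPicker` is a hypothetical class written for this example, not grpc-java's actual implementation, and the addresses are made up. The point is that the policy can only rotate over whatever list the resolver hands it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin picker over a fixed list of resolved addresses. */
final class RoundRobinPicker {
    private final List<String> addresses;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinPicker(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("need at least one address");
        }
        this.addresses = new ArrayList<>(addresses); // defensive copy
    }

    /** Returns the next address in rotation; floorMod keeps the index valid after overflow. */
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }

    public static void main(String[] args) {
        // DNS answered with two addresses: picks alternate between them.
        RoundRobinPicker two = new RoundRobinPicker(Arrays.asList("10.0.0.1:443", "10.0.0.2:443"));
        System.out.println(two.pick() + " " + two.pick() + " " + two.pick());

        // DNS answered with a single address: every pick is the same backend.
        RoundRobinPicker one = new RoundRobinPicker(Arrays.asList("10.0.0.1:443"));
        System.out.println(one.pick() + " " + one.pick() + " " + one.pick());
    }
}
```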

One solution I can think of is to either:
1. Write your own NameResolverProvider. It will periodically re-resolve DNS and update the addresses passed to the load balancer so that the new backend kicks in. The load balancer will then close the connection to the previous address.
It might not be as bad as you think, because you can delegate to gRPC's DnsNameResolverProvider, registering yours with the same scheme but a higher priority.
2. Write your own LoadBalancerProvider that periodically calls refreshNameResolution.
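
The core of either option is a timer that re-runs resolution and pushes changed addresses downstream. Below is a minimal sketch of that loop using only the JDK, so it runs without grpc-java on the classpath. `PeriodicReResolver`, `dnsLookup`, and `onAddresses` are hypothetical names for this example. In a real custom resolver, the lookup would presumably be delegated to the DNS resolver and the callback would be the listener that gRPC passes to the resolver.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: re-resolves on a timer and reports address changes. */
final class PeriodicReResolver {
    private final Supplier<List<String>> dnsLookup;       // assumed: performs one DNS query
    private final Consumer<List<String>> onAddresses;     // assumed: feeds the load balancer
    private List<String> lastAddresses;                   // null until the first resolution

    PeriodicReResolver(Supplier<List<String>> dnsLookup, Consumer<List<String>> onAddresses) {
        this.dnsLookup = dnsLookup;
        this.onAddresses = onAddresses;
    }

    /** Runs one resolution and notifies the listener only if the answer changed. */
    synchronized void refreshOnce() {
        List<String> current = dnsLookup.get();
        if (!current.equals(lastAddresses)) {
            lastAddresses = current;
            onAddresses.accept(current);
        }
    }

    /** Starts periodic refreshes; the caller must shut down the returned executor. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(() -> {
            try {
                refreshOnce();
            } catch (RuntimeException e) {
                // Keep the last known addresses; an uncaught exception would cancel the schedule.
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

One design consideration: because each refresh can move the whole channel to a different load balancer, the refresh period determines how coarsely traffic follows the DNS weights over time. In-flight streams on the old connection also need to be handled when the address changes.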
 
For a guide to writing your own custom name resolver or load balancer, see: