Hey.
My organization is working on implementing a generic sharding mechanism where teams can configure a mapping of keys to a set of shards, and of shards to a set of containers, for a given service. This is analogous to Facebook's Shard Manager [1]. As far as the routing layer is concerned, the idea is that:
- a control plane component provides the information needed to select a shard based on the contents of the incoming request (e.g. a mapping from keys, or from ranges of keys, to shard identifiers). Key-to-shard assignments do not change very frequently (e.g. monthly), but the explicit mapping can be quite large (e.g. millions of entries).
- a control plane component provides information on how to map shards to running containers (network endpoints).
- the data plane (gRPC or Envoy) uses that information to send requests to the appropriate endpoint.
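To make the control-plane side concrete, here is a rough sketch of the two mappings as Go types. All names here are illustrative, not part of any existing API:

    package sharding

    // ShardID identifies a shard; illustrative type only.
    type ShardID string

    // KeyRange assigns a contiguous range of keys to a shard.
    type KeyRange struct {
        Start, End string // e.g. inclusive start, exclusive end
        Shard      ShardID
    }

    // ShardMapping is what the first control plane component serves:
    // either an explicit key -> shard map (potentially millions of
    // entries) or a range-based assignment.
    type ShardMapping struct {
        ByKey  map[string]ShardID
        Ranges []KeyRange
    }

    // Placement is what the second component serves: for each shard,
    // the "host:port" endpoints of containers currently hosting it.
    type Placement map[ShardID][]string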
I'm looking for guidance on how to implement the routing layer in gRPC, potentially using xDS. We also use Envoy for the same purpose, so ideally the solution would be similar for both. Our initial design for Envoy uses:
1. A custom filter to map requests to shards. In gRPC, this could be done via a client interceptor that injects the shard identifier into the request context (a rough sketch follows this list).
2. Load Balancer Subsets [2] to route requests to individual endpoints within a CDS cluster. This assumes that all shards are part of the same CDS cluster, and that shard identifiers are communicated from the control plane to clients via the EDS config.endpoint.v3.LbEndpoint metadata field [3]. The problem is that, as far as I can see, Load Balancer Subsets are not implemented in gRPC.
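For 1., here is a minimal sketch of what the interceptor could look like in gRPC-Go. The ShardResolver interface and the "x-shard-id" header name are assumptions for illustration, not anything that exists today:

    package sharding

    import (
        "context"

        "google.golang.org/grpc"
        "google.golang.org/grpc/metadata"
    )

    // ShardResolver maps a request to a shard identifier, e.g. by
    // extracting a key from the request message and looking it up in
    // the control-plane mapping. Hypothetical interface.
    type ShardResolver interface {
        ShardFor(method string, req interface{}) (string, bool)
    }

    // ShardUnaryInterceptor attaches the shard id as outgoing metadata
    // so that the routing layer (an Envoy route, or a custom gRPC LB
    // policy) can match on it.
    func ShardUnaryInterceptor(r ShardResolver) grpc.UnaryClientInterceptor {
        return func(ctx context.Context, method string, req, reply interface{},
            cc *grpc.ClientConn, invoker grpc.UnaryInvoker,
            opts ...grpc.CallOption) error {
            if shard, ok := r.ShardFor(method, req); ok {
                ctx = metadata.AppendToOutgoingContext(ctx, "x-shard-id", shard)
            }
            return invoker(ctx, method, req, reply, cc, opts...)
        }
    }

Carrying the shard id in a header this way keeps the selection logic out of the routing layer: the same header can drive an Envoy header match or a custom gRPC picker.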
For 2., an alternative design would use a separate CDS cluster for each shard. This has some advantages, such as more configurable options and per-shard metrics, but it is more costly in terms of Envoy memory footprint, metrics volume (Envoy generates a lot of metrics per cluster, though we could work around that), and the number of CDS and EDS subscriptions. So first, I'd like to know whether support for Load Balancer Subsets is something the gRPC team has considered and rejected, or whether it simply has not been implemented because it is low priority (I see no mention of it in any of the xDS-related gRFCs).
If the team does not plan to support Load Balancer Subsets, would you have recommendations for an alternative way to implement sharding within or on top of gRPC? There are a few approaches I can think of:
1. Implement sharding entirely on top of the stack, using multiple gRPC channels. Exposing an interface that looks like a single gRPC client is some amount of work on the data plane/library side, but it is possible (see the first sketch after this list).
2. Put each shard in a different CDS cluster rather than using Load Balancer Subsets, and use the mechanisms described in A28 - gRPC xDS traffic splitting and routing [4] to route to the correct endpoint (see the second sketch after this list). This would increase the number of xDS subscriptions by a factor of 1000 for sharded services, which may become a problem. I think it would also increase the number of connections to each backend, since each container typically hosts multiple shards.
3. Implement a custom load balancing policy that does the mapping for gRPC, based on Load Balancer Subsets [5]. I haven't dug too much into what that would look like in practice (a rough sketch of the picker half appears after this list).
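For 1., a rough sketch of a sharded client wrapper over multiple channels. This is all application-level code; real code would watch the control plane for placement updates rather than dialing once, and the plaintext credentials are just for brevity:

    package sharding

    import (
        "fmt"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    // ShardedConns manages one ClientConn per shard, dialed from the
    // shard -> endpoint mapping provided by the control plane.
    type ShardedConns struct {
        conns map[string]*grpc.ClientConn // shard id -> channel
    }

    func DialShards(placement map[string]string) (*ShardedConns, error) {
        sc := &ShardedConns{conns: make(map[string]*grpc.ClientConn)}
        for shard, target := range placement {
            conn, err := grpc.Dial(target,
                grpc.WithTransportCredentials(insecure.NewCredentials()))
            if err != nil {
                return nil, fmt.Errorf("dial shard %s: %w", shard, err)
            }
            sc.conns[shard] = conn
        }
        return sc, nil
    }

    // ConnFor returns the channel for a shard; callers build their
    // generated stubs on top of it per request.
    func (sc *ShardedConns) ConnFor(shard string) (*grpc.ClientConn, bool) {
        c, ok := sc.conns[shard]
        return c, ok
    }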
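For 2., the routing itself would just be one xDS route per shard, matching on the header set by the interceptor sketched earlier. Expressed with go-control-plane types, and assuming the hypothetical "x-shard-id" header and a per-shard cluster naming scheme:

    package sharding

    import (
        routev3 "github.com/envoyproxy/go-control-plane/envoy/config/route/v3"
    )

    // shardRoute builds one route entry matching the shard id header
    // and forwarding to that shard's dedicated CDS cluster. A sharded
    // service would have one such route per shard, which is where the
    // 1000x growth in config size comes from.
    func shardRoute(shardID, cluster string) *routev3.Route {
        return &routev3.Route{
            Match: &routev3.RouteMatch{
                PathSpecifier: &routev3.RouteMatch_Prefix{Prefix: "/"},
                Headers: []*routev3.HeaderMatcher{{
                    Name: "x-shard-id", // hypothetical header name
                    HeaderMatchSpecifier: &routev3.HeaderMatcher_ExactMatch{
                        ExactMatch: shardID,
                    },
                }},
            },
            Action: &routev3.Route_Route{
                Route: &routev3.RouteAction{
                    ClusterSpecifier: &routev3.RouteAction_Cluster{
                        Cluster: cluster,
                    },
                },
            },
        }
    }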
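For 3., my understanding is that the core of it would be a custom balancer.Picker that selects a ready SubConn based on the shard id, read from the outgoing metadata set by the interceptor above. A very rough sketch of just the picker half (the balancer.Builder/Balancer plumbing that registers the policy, tracks SubConn state, and builds the shard-indexed map from EDS LbEndpoint metadata is omitted; "x-shard-id" is again an assumption):

    package sharding

    import (
        "google.golang.org/grpc/balancer"
        "google.golang.org/grpc/metadata"
    )

    // shardPicker picks the SubConn hosting the request's shard.
    // ready maps shard id -> a READY SubConn for a container hosting
    // that shard; the omitted balancer would maintain it from EDS
    // LbEndpoint metadata.
    type shardPicker struct {
        ready map[string]balancer.SubConn
    }

    func (p *shardPicker) Pick(info balancer.PickInfo) (balancer.PickResult, error) {
        md, _ := metadata.FromOutgoingContext(info.Ctx)
        if vals := md.Get("x-shard-id"); len(vals) > 0 {
            if sc, ok := p.ready[vals[0]]; ok {
                return balancer.PickResult{SubConn: sc}, nil
            }
        }
        // No shard id, or no ready endpoint for it: queue the pick
        // until the balancer produces an updated picker.
        return balancer.PickResult{}, balancer.ErrNoSubConnAvailable
    }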
In case it matters, we plan to use Go and Java throughout (we also use Python, but typically behind an egress Envoy proxy in that case). Has anyone successfully done any of the above? (Some teams here have done 1. to some extent, but we are aiming to make sharding more transparent and less work for users.)
Thanks,
Antoine.