Load balancing using HTTP cache and gRPC Executor

ulrik....@gmail.com

Sep 10, 2020, 6:17:24 AM9/10/20
to bazel-dev
Hi,

It is hard to load balance gRPC bazel caches, especially in remote execution scenarios:
  • Bazel client and Remote Executor must use the same cache instance/data.
  • Sharding based on CAS digest requires the load balancer to understand REv2 messages; operating at the HTTP/2 or generic gRPC level is not enough.
  • A proxy would have to spread FindMissingBlobs requests across multiple shards and merge the results together.
  • gRPC can cause imbalance with Link Aggregation, compared to multiple HTTP/1.1 TCP connections.
But using the HTTP cache protocol instead opens up the possibility of using generic load balancers that already support URI sharding, such as HAProxy.
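To illustrate why URI sharding is sufficient for the HTTP cache protocol: the digest is embedded in the request path, so a shard can be chosen from a prefix of the hex digest alone, with no need to parse REv2 messages. A minimal sketch in Java (all names are illustrative, not Bazel or HAProxy code):

```java
// Sketch: picking a CAS shard from the digest embedded in the cache URI.
// The HTTP cache protocol addresses blobs as /cas/<lowercase-hex-digest>.
import java.util.List;

public class DigestSharding {
    // Digests are uniformly distributed, so taking the first byte of the
    // hex digest modulo the shard count spreads load evenly across shards.
    static int shardFor(String uriPath, int shardCount) {
        String hex = uriPath.substring(uriPath.lastIndexOf('/') + 1);
        int firstByte = Integer.parseInt(hex.substring(0, 2), 16);
        return firstByte % shardCount;
    }

    public static void main(String[] args) {
        List<String> shards = List.of("cache-0:8080", "cache-1:8080", "cache-2:8080");
        String path = "/cas/8a3f";
        System.out.println(shards.get(shardFor(path, shards.size())));
    }
}
```

A load balancer with URI hashing (or a client-side table) can apply the same rule without understanding anything about the payload.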

Therefore I experimented with combining HTTP caches with gRPC executors. It works, and I get performance on par with gRPC caches when:
  • Pipelining individual HEAD requests for each digest from findMissingDigests on the same HTTP/1.1 connection, before waiting for responses.
  • Avoiding sending the same HEAD request more than once.
  • ...
My changes are local in the lib/remote/http package (except allowing the cache combination in RemoteModule.java).
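The pipelining trick described above can be sketched as follows: all HEAD requests for a batch of digests are serialized onto one HTTP/1.1 connection before any response is read, with duplicate digests removed first. This is a hypothetical sketch, not the actual lib/remote/http changes:

```java
// Sketch of pipelining HEAD requests on one HTTP/1.1 connection: every
// request in the batch is written before any response is read, and no
// digest is asked about twice. Names are illustrative only.
import java.util.LinkedHashSet;
import java.util.List;

public class HeadPipelining {
    // Builds the byte stream for a pipelined batch of HEAD requests.
    // LinkedHashSet deduplicates while preserving request order, so
    // responses can be matched back to digests positionally.
    static String buildPipeline(String host, List<String> digests) {
        StringBuilder sb = new StringBuilder();
        for (String d : new LinkedHashSet<>(digests)) {
            sb.append("HEAD /cas/").append(d).append(" HTTP/1.1\r\n")
              .append("Host: ").append(host).append("\r\n")
              .append("\r\n");
        }
        return sb.toString();
    }
}
```

Because HEAD responses carry no body, head-of-line blocking is much less of a concern here than for pipelined GETs.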

I am considering multiple cache shards in the company’s internal network, each with 2 × 10 Gbps network interfaces. Instead of paying for load balancer proxy hardware with the same amount of network bandwidth capacity, it would be attractive to have lighter load balancers that simply redirect with HTTP 3xx responses, and have the bulk traffic go directly from clients to the shards.
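Such a redirect-only balancer could be very small: it never proxies a byte, it just answers every request with a 302 whose Location points at the shard derived from the digest in the URI. A sketch using the JDK's built-in HTTP server (shard addresses are made up):

```java
// Minimal sketch of a "light" load balancer that does no proxying:
// it replies 302 with the shard chosen from the digest in the URI,
// so the bulk traffic flows directly between clients and shards.
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;

public class RedirectBalancer {
    static final List<String> SHARDS =
        List.of("http://cache-0:8080", "http://cache-1:8080");

    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            String path = exchange.getRequestURI().getPath();
            String hex = path.substring(path.lastIndexOf('/') + 1);
            int shard = Integer.parseInt(hex.substring(0, 2), 16) % SHARDS.size();
            exchange.getResponseHeaders().add("Location", SHARDS.get(shard) + path);
            exchange.sendResponseHeaders(302, -1); // no body; client follows redirect
            exchange.close();
        });
        server.start();
        return server;
    }
}
```

The trade-off is one extra round trip per request, in exchange for the balancer needing almost no bandwidth or CPU.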

Are there more people in the community with interest in this area?

BR,
Ulrik

Ulf Adams

Sep 10, 2020, 3:08:07 PM9/10/20
to ulrik....@gmail.com, bazel-dev
Hi Ulrik,

I would try to avoid the complexity of using both protocols if possible. I would expect that all remote execution systems *also* support remote caching, and if the remote execution system comes with a load balancer, then that should cover both use cases. Have you considered that already?

I have a patch to reduce the chattiness of Bazel by caching findMissingDigest calls here, which I think applies to both HTTP and gRPC caches:

I was also wondering if switching to HTTP/2 might give us better performance.

Cheers,

-- Ulf

EngFlow GmbH
Fischerweg 51, 82194 Gröbenzell, Germany
Amtsgericht München, HRB 255664
Geschäftsführer (CEO): Ulf Adams


ulrik....@gmail.com

Sep 11, 2020, 11:35:15 AM9/11/20
to bazel-dev
Thanks Ulf!

Pipelining HTTP HEAD requests and handling HTTP 3xx redirects would add complexity, but only locally in the com.google.devtools.build.lib.remote.http.* package. Do you think that could be acceptable?

The rest of Bazel is not affected, since both the gRPC and HTTP clients already implement the same Java interface, com.google.devtools.build.lib.remote.common.RemoteCacheClient, which is used by com.google.devtools.build.lib.remote.RemoteExecutionCache.

We deploy Buildbarn’s bb-scheduler and bb-worker. They have a generic blobstore interface, and already have implementations for gRPC, GCS, S3, Azure, Redis, as well as the HTTP cache protocol.

Buildbarn's bb-storage can act as a load-balancing proxy with sharding, but I have performance issues with bb-storage, both as a storage backend and when used only as a proxy: I can't get it to saturate a 20 Gbps network. So we use Buildbarn but replace bb-storage with the bazel-remote cache, which saturates the network easily and uses much less CPU. Buildbarn and bazel-remote work really well together in our deployment. Being able to use them with a generic HTTP load balancer would be great for us.

Yes, https://github.com/ulfjack/bazel/tree/blob-existence-cache will probably allow me to skip some of the extra functionality I play around with in the HTTP client. Thanks!
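For illustration, the existence-caching idea might look roughly like this (hypothetical names; not the code on that branch): digests already confirmed present are remembered, so later findMissingDigests calls only check the remainder.

```java
// Sketch of a blob-existence cache: remember digests already confirmed
// present so repeated existence checks can be skipped. This applies
// equally to HTTP HEAD probes and gRPC FindMissingBlobs calls.
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class ExistenceCache {
    private final Set<String> knownPresent = new HashSet<>();

    // Returns only the digests whose presence must still be checked remotely.
    public Set<String> filterUnknown(Set<String> digests) {
        Set<String> unknown = new LinkedHashSet<>(digests);
        unknown.removeAll(knownPresent);
        return unknown;
    }

    // Record digests the remote cache confirmed, or that we just uploaded.
    // Note: entries must eventually expire if the remote cache evicts blobs.
    public void markPresent(Set<String> digests) {
        knownPresent.addAll(digests);
    }
}
```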

Perhaps HTTP/2 header compression could be beneficial, but Bazel does not send that many headers. I like the following about HTTP/1.1:
  • Many alternative implementations are available, both for storage and load balancing.
  • HTTP connection pooling works well in practice, and minimizes imbalance when using link aggregation of network interfaces.
  • It is efficient, since it allows using Linux's sendfile(2) to transfer data directly in kernel space, between file and socket.
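The sendfile(2) point can be illustrated in Java as well: FileChannel.transferTo to a socket channel may be implemented with sendfile on Linux, moving bytes kernel-to-kernel without copying through user space. A sketch (not Bazel or bazel-remote code):

```java
// Sketch: zero-copy transfer of a cache blob to a client connection.
// On Linux, FileChannel.transferTo to a socket channel can use
// sendfile(2), avoiding a round trip through user-space buffers.
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopyServe {
    static long serveBlob(Path blob, WritableByteChannel socket) throws IOException {
        try (FileChannel in = FileChannel.open(blob, StandardOpenOption.READ)) {
            long size = in.size();
            long sent = 0;
            // transferTo may send fewer bytes than requested, so loop.
            while (sent < size) {
                sent += in.transferTo(sent, size - sent, socket);
            }
            return sent;
        }
    }
}
```

HTTP/2 framing rules this out, because each file must be split into DATA frames with per-frame headers written from user space.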

Thanks,
Ulrik

Ulf Adams

Sep 17, 2020, 5:51:59 PM9/17/20
to ulrik....@gmail.com, bazel-dev
Hi Ulrik,

On Fri, Sep 11, 2020 at 5:35 PM ulrik....@gmail.com <ulrik....@gmail.com> wrote:
Thanks Ulf!

Pipelining HTTP HEAD requests and handling HTTP 3xx redirects would add complexity, but only locally in the com.google.devtools.build.lib.remote.http.* package. Do you think that could be acceptable?

Handling 3xx redirects seems fine. Pipelining in HTTP/1.1 is problematic due to head-of-line blocking, which is why almost nobody does it, AFAIK. If you want to restrict it to HEAD requests: which part of Bazel actually makes HEAD requests? The HttpCacheClient currently doesn't implement finding missing blobs, but maybe you have a patch for that to avoid duplicate uploads? If so, is it worth it?

(Note that I'm not part of the Bazel team anymore, and I do not have committer access. I can only make recommendations.)


The rest of Bazel is not affected, since both the gRPC and HTTP clients already implement the same Java interface, com.google.devtools.build.lib.remote.common.RemoteCacheClient, which is used by com.google.devtools.build.lib.remote.RemoteExecutionCache.

We deploy Buildbarn’s bb-scheduler and bb-worker. They have a generic blobstore interface, and already have implementations for gRPC, GCS, S3, Azure, Redis, as well as the HTTP cache protocol.

IIUC, the bb-scheduler can route gRPC CAS calls, so why not use that as a proxy and have it distribute the calls to a shared storage backend?
 

Buildbarn's bb-storage can act as a load-balancing proxy with sharding, but I have performance issues with bb-storage, both as a storage backend and when used only as a proxy: I can't get it to saturate a 20 Gbps network. So we use Buildbarn but replace bb-storage with the bazel-remote cache, which saturates the network easily and uses much less CPU. Buildbarn and bazel-remote work really well together in our deployment. Being able to use them with a generic HTTP load balancer would be great for us.

Do you really have that many giant files, or why else do you need to saturate a 20 Gbps network?


Yes, https://github.com/ulfjack/bazel/tree/blob-existence-cache will probably allow me to skip some of the extra functionality I play around with in the HTTP client. Thanks!

Cool. I'm in the process of getting that merged.
 

Perhaps HTTP/2 header compression could be beneficial, but Bazel does not send that many headers. I like the following about HTTP/1.1:
  • Many alternative implementations are available, both for storage and load balancing.
I would expect that most web servers do have support for HTTP/2 at this point. Not sure about load balancers.
  • HTTP connection pooling works well in practice, and minimizes imbalance when using link aggregation of network interfaces.
  • It is efficient, since it allows using Linux's sendfile(2) to transfer data directly in kernel space, between file and socket.
Right, that last one isn't possible with HTTP/2, because it requires splitting files into smaller frames with stream headers.

Btw. I just sent out a patch to enable parallel uploads within an action for the HTTP cache: https://github.com/bazelbuild/bazel/pull/12115
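For illustration only (hypothetical names, not the code in that PR), uploading an action's outputs in parallel rather than sequentially might be sketched like this:

```java
// Sketch: upload all output files of an action concurrently, bounded
// by a small thread pool, and fail the whole upload if any file fails.
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class ParallelUpload {
    static void uploadAll(List<String> outputs, Consumer<String> uploadOne) {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            CompletableFuture<?>[] futures = outputs.stream()
                .map(o -> CompletableFuture.runAsync(() -> uploadOne.accept(o), pool))
                .toArray(CompletableFuture[]::new);
            // join() rethrows if any single upload failed.
            CompletableFuture.allOf(futures).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

Parallelism within an action matters most for actions with many mid-sized outputs, where serial uploads leave the link idle between files.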

Cheers,

-- Ulf
 

Josh Katz

Sep 18, 2020, 12:24:36 AM9/18/20
to Ulf Adams, ulrik....@gmail.com, bazel-dev
Hey Everyone,

> Do you have so many giant files or why do you need to saturate a 20 Gbps network?

I'm not the original author, but in my very small monorepo I have a single binary that is >500 MB, which is signed and packaged in another part of my build pipeline. This 500 MB binary is often touched by our engineers; it's pushed to an IoT device and contains a lot of stuff. To hit 20 Gbps I'd need to (assuming zero overhead)...

(20 Gbps) / (500 MB × 8 bits per byte) = 20×10⁹ / 4×10⁹ = 5 per second

... build this binary about 5 times per second (assuming each build modifies this massive binary). This is obviously the wrong way to use Bazel, but it's easy to see how more reasonable workflows could be affected by bottlenecks in this area. For example, suppose you have a `*_test` that depends on 10 × 50 MB binaries that you'd like to test on an architecture different from your dev environment (develop on x86, test on ARM, for an Android use case). Running bazel-watcher against this `*_test` could pretty easily get you to 5 executions per second with just a few developers plus CI running.

This is a bit contrived, as it assumes no local disk cache on the RBE workers, but as you scale the number of engineers and RBE nodes, it's easy to see how each time you schedule this test you could land on a node that has never executed the job before, and so has never warmed its local cache of this binary.

Also, while experimenting at home, I've hit some pain points with rules_docker, where containers used in integration tests can end up being pretty big.

Thanks,
- Josh