Can you mention, at a high level, some specific pain points in the other remote systems, thinking specifically about Bazel Remote, Buildbarn, and Buildfarm? I'm curious because I don't want to make the same mistakes myself.

Also, don't forget to add TurboCache to the remote-apis-testing repository: https://remote-apis-testing.gitlab.io/remote-apis-testing/

Best regards,
Fredrik

--
On Friday, April 15, 2022 at 05:04:02 UTC+2, thegr...@gmail.com wrote:

Hi team,
TL;DR
There's a new Bazel remote execution / cache project: https://github.com/allada/turbo-cache
I thought I'd put this out there and see if there is any early feedback on a new project I've been working on on-and-off for the last couple of years, and now I actually have lots of time to bring it to completion. It is currently unlicensed because I have not yet chosen what to do with it, but it is open source and will likely get at least an LGPL (or an even more permissive) license in the near future.
I'd love any feedback any of you have.
My Background
A few years ago, at my previous job, I built out our remote execution farm (Buildbarn) that serviced about:
~1.2 million unit tests per day (seconds to run)
~20k integration tests per day (median test duration was ~8 mins)
~300k build jobs per day
~1 petabyte of cache / month
We spent a huge amount of time trying to keep things stable and to keep infra-related issues to a minimum. Over winter break of 2020 I had some free time, so I decided to start a new project from scratch and build the remote execution / cache server myself, with all the hindsight available to me.
I decided to write the entire thing in Rust to help with stability (and I wanted to try the brand-new async/await, which is awesome, by the way). I also wanted to implement some cool features that I thought this space was lacking.
Current State
There are two main parts to the project: Remote Cache and Remote Execution.
Remote Cache
Remote cache is in the alpha stage. It currently supports:
Memory store - Data lives in local machine memory (with eviction policies)
S3 store - Serves objects stored in any service that supports S3 calls
Compression store - Compresses data (lz4), then forwards it to another store
Dedup (de-duplication) store - Uses a rolling-hash algorithm to find parts of files that are identical, and only processes and stores the parts that have changed (a similar algorithm to the one rsync & bup use). Very efficient for large files with only a few changes in them.
FastSlow store - Tries the fast store first; on a miss, falls back to the slow store (and then populates the fast store)
Filesystem store - Stores objects on disk
SizePartitioning store - Chooses a store to place objects based on the size field of the digest.
Retry logic - For some stores (like S3), operations can be retried on error. Retry & recovery are supported in these cases (transparently, without the client knowing).
Heavily tested - over 100 unit tests so far. Every detected bug gets a regression test.
Extremely small memory footprint & no garbage collection
gRPC-only endpoint
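To make the store-composition idea above concrete, here is a minimal synchronous sketch. This is not turbo-cache's actual API: the `Store` trait, `MemStore`, and `FastSlow` names are hypothetical stand-ins that only illustrate how a FastSlow store can wrap any two other stores and backfill the fast tier on a miss.

```rust
use std::collections::HashMap;

// Hypothetical minimal store interface; the real project's stores are
// async and streaming, which this sketch deliberately omits.
trait Store {
    fn get(&mut self, key: &str) -> Option<Vec<u8>>;
    fn put(&mut self, key: &str, data: Vec<u8>);
}

// Simple in-memory store, standing in for e.g. a Memory or Filesystem store.
struct MemStore {
    map: HashMap<String, Vec<u8>>,
}

impl MemStore {
    fn new() -> Self {
        MemStore { map: HashMap::new() }
    }
}

impl Store for MemStore {
    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }
    fn put(&mut self, key: &str, data: Vec<u8>) {
        self.map.insert(key.to_string(), data);
    }
}

// FastSlow combinator: read from the fast store first; on a miss,
// fall back to the slow store and populate the fast one.
struct FastSlow<F: Store, S: Store> {
    fast: F,
    slow: S,
}

impl<F: Store, S: Store> Store for FastSlow<F, S> {
    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        if let Some(data) = self.fast.get(key) {
            return Some(data);
        }
        let data = self.slow.get(key)?;
        self.fast.put(key, data.clone()); // backfill the fast tier
        Some(data)
    }
    fn put(&mut self, key: &str, data: Vec<u8>) {
        // Write to both tiers so later reads hit the fast store.
        self.fast.put(key, data.clone());
        self.slow.put(key, data);
    }
}
```

Because every combinator is itself a `Store`, compositions like Compression-over-S3 or FastSlow(Memory, Filesystem) fall out of the same pattern.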
Remote Execution
Remote execution is still a work in progress; I estimate it will reach the alpha stage sometime in May/June. Currently, Bazel properly talks to the scheduler & CAS, the scheduler appears to properly schedule jobs with priorities, and the worker API for interacting with the scheduler is fully implemented. The next stage is to implement the workers.
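As an illustration of the priority-scheduling idea, here is a hedged sketch using a `std::collections::BinaryHeap` as a max-heap. The `QueuedAction` type and its fields are hypothetical; the real scheduler tracks far more state (digests, platform properties, worker assignments, and so on).

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Hypothetical queued action: higher priority values run first.
#[derive(Debug, Eq, PartialEq)]
struct QueuedAction {
    priority: i32,
    action_digest: String,
}

// Order primarily by priority; tie-break on the digest so that
// Ord stays consistent with the derived Eq.
impl Ord for QueuedAction {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            .then_with(|| self.action_digest.cmp(&other.action_digest))
    }
}

impl PartialOrd for QueuedAction {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Pop the highest-priority action, as BinaryHeap (a max-heap) does naturally.
fn next_action(queue: &mut BinaryHeap<QueuedAction>) -> Option<QueuedAction> {
    queue.pop()
}
```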
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/fe9051ff-8ee1-496b-8732-7bc76dd65079n%40googlegroups.com.