ActionCache salting/epochs


Erik Mavrinac

Jul 14, 2020, 12:17:13 PM
to Remote Execution APIs Working Group
Hello all, a question I did not get to ask at the monthly today: Do implementations of ActionCache add in an arbitrarily sourced "salt" or "epoch" value to the basis of the hash? The intent is to be able to disown the contents of a cache universe because of bugs or cache entry poisoning by bad tools. We build these into our internal systems. At the operator level we can discard the entire cache - obviously a major event, and rarely used. At the repo level we allow a team to discard the cache for just their repo, either self-directed or directed by our support.

In the actual implementation, at the repo level we typically add the hash of a specific file's contents (if present); the epoch in that file is typically just a counter ("1", "2", "3", etc.) incremented on each breakaway from a previous cache universe. At the operator level it is similar: another configured string is hashed and added as a second salt to the hash.
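
A minimal Python sketch of the two-level salting described above, for illustration only. The function names, the choice of SHA-256, and the order in which the salts are mixed in are assumptions and not part of any Remote Execution API.

```python
import hashlib
from pathlib import Path


def read_repo_epoch(epoch_file: Path) -> bytes:
    """Return the epoch file's contents, or b"" when the file is absent."""
    return epoch_file.read_bytes().strip() if epoch_file.is_file() else b""


def salted_key(action_hash: str, operator_salt: bytes, repo_epoch: bytes) -> str:
    """Derive a cache key from an action hash plus two independent salts.

    Each salt is hashed on its own first, so two variable-length values
    cannot run together and collide with a different salt pair.
    """
    h = hashlib.sha256()
    h.update(hashlib.sha256(operator_salt).digest())  # operator-level salt
    h.update(hashlib.sha256(repo_epoch).digest())     # repo-level epoch
    h.update(action_hash.encode("ascii"))
    return h.hexdigest()
```

Bumping the counter in the epoch file from "1" to "2" changes every derived key for that repo, while changing the operator salt moves every repo to a new cache universe.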

There is no specification for this possibility in the ActionCache hash generation algorithm described in the protobuf. GetActionResultRequest.action_digest is of necessity the hash of the Action in the CAS and cannot be salted like this. Aside from instance_name on the request (not dedicated to this purpose and dangerous to attempt to reuse), nothing in the Action serves as a salt that would change its hash in GetActionResultRequest.

Does anyone else do this? And if so, how is it implemented within the Remote Exec framework?

If not, would this make a suitable issue+PR-proposal?

Erik Mavrinac

Ulf Adams

Jul 14, 2020, 12:28:55 PM
to Erik Mavrinac, Remote Execution APIs Working Group
We discussed doing something like this. We haven't implemented it yet, though. The action cache is not a CAS so we can choose action cache keys freely. Our plan was to just add the salt to the action cache key, e.g., "<salt>/<cache_key>".
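
A minimal sketch of the "<salt>/<cache_key>" idea, assuming a path-like string key in a key/value store. The function name and the rule that the salt may not contain "/" are illustrative assumptions.

```python
def salted_key(salt: str, cache_key: str) -> str:
    """Build a storage key of the form "<salt>/<cache_key>"."""
    if not salt or "/" in salt:
        # Assumption: keep the salt a single path segment so the key
        # splits unambiguously into salt and cache key.
        raise ValueError("salt must be non-empty and must not contain '/'")
    return f"{salt}/{cache_key}"


# A plain dict stands in for the storage backend.
store = {salted_key("1", "abc123"): b"cached result"}

# After bumping the salt, the old entry is no longer reachable.
hit_after_bump = store.get(salted_key("2", "abc123"))
```

Entries written under the old salt stay in storage until they are garbage-collected; they are simply never looked up again.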

--
You received this message because you are subscribed to the Google Groups "Remote Execution APIs Working Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to remote-execution...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/remote-execution-apis/CAOpW8NAWQJhFX0gSa3z6OJLhYwdWJLuVppBMDU_bw5%3DiJVEuGA%40mail.gmail.com.

Erik Mavrinac

Jul 14, 2020, 12:36:36 PM
to Ulf Adams, Remote Execution APIs Working Group
You mean using that prefixed string value as a replacement for the hash string in a Digest? Or somewhere else?

If in V3 we move from string hashes to binary (https://github.com/bazelbuild/remote-apis/issues/135) that would mean prefixing a binary digest with salt bytes, correct?

Might be worth an explicit extra field in Action since the hash of Action is the basis for ActionCache.

Ulf Adams

Jul 14, 2020, 1:11:56 PM
to Erik Mavrinac, Remote Execution APIs Working Group
The syntax I used was based on the paths that we generate for Google Cloud Storage or AWS S3 - those use file paths as the key. We internally use a binary encoding of the digest for the key for caching and whatnot; in that case, we would just prefix the digest with the salt.
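
For the binary case, a sketch of prefixing the digest bytes with the salt. The one-byte length prefix is an assumption added here so that variable-length salts stay unambiguous; the thread does not specify an encoding.

```python
def binary_key(salt: bytes, digest_hex: str) -> bytes:
    """Return length-prefixed salt bytes followed by the raw digest bytes."""
    if len(salt) > 255:
        raise ValueError("salt must fit a one-byte length prefix")
    return bytes([len(salt)]) + salt + bytes.fromhex(digest_hex)
```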

Eric Burnett

Jul 14, 2020, 2:15:10 PM
to Erik Mavrinac, Remote Execution APIs Working Group
The way RBE uses it today (and what we recommend to our cache-only Bazel customers) is to add a Platform key/value with the salt (example), which affects the command digest and in turn the action digest. This functions exactly as you describe: it can be changed build-side to move to a new cache universe if poisoning is observed. It also serves as a mechanism to segregate cache entries between mutually incompatible workers: in effect, you may want to have multiple universes in use at the same time.
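
A simplified sketch of why a salt stored as a Platform property changes the action digest. It uses canonical JSON over plain dicts as a stand-in for the real protobuf messages, and the property name "cache-silo-key" is only an example; both are assumptions, not the actual wire format.

```python
import hashlib
import json


def digest(message: dict) -> str:
    """Hash a canonical JSON encoding (stand-in for serialized protobuf)."""
    data = json.dumps(message, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def action_digest(arguments, input_root_digest, platform_properties):
    """Compute an action digest that depends on the command's platform."""
    command = {
        "arguments": list(arguments),
        # Sorted so that property order does not affect the digest.
        "platform": sorted(platform_properties.items()),
    }
    action = {
        "command_digest": digest(command),
        "input_root_digest": input_root_digest,
    }
    return digest(action)


old = action_digest(["cc", "-c", "a.c"], "root1", {"cache-silo-key": "1"})
new = action_digest(["cc", "-c", "a.c"], "root1", {"cache-silo-key": "2"})
```

Here `old` and `new` differ even though the arguments and inputs are identical, so the second build starts in a fresh cache universe while the key itself remains a genuine digest.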

I'd be fine moving to a different field in Action for that purpose, provided it remains easy for clients like Bazel to pipe through. I'd not be comfortable using the action cache key for that purpose, as it's supposed to be a digest and implementations should be capable of validating that. (IIRC in RBE we actually pull the action proto on UpdateActionResult, so we care about the digest being an actual digest.)



Ed Schouten

Jul 15, 2020, 8:33:01 AM
to Erik Mavrinac, Remote Execution APIs Working Group
Hey Erik,

On Tue, Jul 14, 2020 at 18:17, Erik Mavrinac <erikma...@gmail.com> wrote:
> If not, would this make a suitable issue+PR-proposal?

Sounds reasonable to me. Be sure to file a PR!

On a somewhat related note: I hope that at some point in the future,
REvX gains support for letting workers cryptographically sign the
ActionResults that they yield. This (combined with other changes)
would allow clients/workers to safely cooperate in environments where
centralized storage nodes/schedulers are not necessarily trustworthy.

Having cryptographic signing would automatically bring in a feature
like this. One could simply do a key rollover to invalidate previous
AC entries.
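
A rough sketch of how a key rollover would invalidate earlier entries. It uses an HMAC from the Python standard library as a stand-in for a real public-key signature, so the worker and the verifier share a secret here; the function names and message layout are assumptions.

```python
import hashlib
import hmac


def sign_result(key: bytes, action_hash: str, result_bytes: bytes) -> bytes:
    """Tag an ActionResult, binding it to the action it belongs to."""
    message = action_hash.encode("ascii") + b"\0" + result_bytes
    return hmac.new(key, message, hashlib.sha256).digest()


def is_trusted(trusted_keys, action_hash, result_bytes, signature) -> bool:
    """Accept the result only if a currently trusted key produced the tag."""
    return any(
        hmac.compare_digest(sign_result(k, action_hash, result_bytes), signature)
        for k in trusted_keys
    )
```

A client that trusts only the new key after a rollover rejects every entry tagged with the old key, which has the same effect as moving to a new cache universe.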

--
Ed Schouten <e...@nuxi.nl>

Erik Mavrinac

Jul 15, 2020, 3:55:17 PM
to Ed Schouten, Remote Execution APIs Working Group

Sander Striker

Jul 16, 2020, 5:08:56 PM
to Erik Mavrinac, Ed Schouten, Remote Execution APIs Working Group
Having caught up on this thread and seeing the PR, I wonder if the salt is a substitute for more fully qualified environments. Is that really the route we want to take? Or is there better advice we can give?

Cheers,

Sander


Daniel Wagner-Hall

Jul 16, 2020, 5:25:21 PM
to Sander Striker, Erik Mavrinac, Ed Schouten, Remote Execution APIs Working Group
On Thu, 16 Jul 2020 at 22:08, Sander Striker <s.st...@striker.nl> wrote:
> Having caught up on this thread and seeing the PR, I wonder if the salt is a substitute for more fully qualified environments. Is that really the route we want to take? Or is there better advice we can give?

FWIW we built this into Pants by always setting an env var to a configurable salt. We did this for two reasons:
1. To be able to do something easily if we detected cache pollution.
2. To be able to run performance tests - we could replay previous builds sequentially and get "start from a clean cache, and then do other builds and see how effective the cache is" kind of metrics, or run multiple builds against the same cluster to compare performance. (Still needing to control for the CAS :))

I'd love to be able to get rid of the hacky env var and use something purpose-built for the job.
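
A small sketch of the environment-variable approach described above, including a fresh salt per benchmark run. The variable name BUILD_CACHE_SALT, the fingerprint function, and the JSON encoding are hypothetical and do not reflect how Pants actually implements this.

```python
import hashlib
import json
import uuid

SALT_ENV_VAR = "BUILD_CACHE_SALT"  # hypothetical variable name


def salted_env(env: dict, salt: str) -> dict:
    """Return a copy of the process environment with the salt injected."""
    salted = dict(env)
    salted[SALT_ENV_VAR] = salt
    return salted


def command_fingerprint(argv, env: dict) -> str:
    """Fingerprint a process by its arguments and (sorted) environment."""
    payload = json.dumps(
        {"argv": list(argv), "env": sorted(env.items())},
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def fresh_universe_salt(label: str) -> str:
    """Make a unique salt so a benchmark run starts from a cold action cache."""
    return f"{label}-{uuid.uuid4().hex}"
```

Replaying a sequence of builds under one value from `fresh_universe_salt("perf")` measures cache effectiveness from a clean start, although blobs already present in the CAS can still skew the numbers.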
 

Sergio Campamá

Jul 16, 2020, 9:07:47 PM
to Remote Execution APIs Working Group
In llbuild2 we use a "version" string that we prefix onto the keys before checking the storage for the key, specifically for the file-backed implementation of the cache (https://github.com/apple/swift-llbuild2/blob/master/Sources/llbuild2/FunctionCache/FileBackedFunctionCache.swift). This has allowed us in some environments to use the hash of the binary as the version for quick development iteration, and it is also purposeful enough to use as a compatibility key that can be modified when some functionality has changed that should invalidate previous cache hits, or when there is cache pollution. For the file-based implementation it's also nice, since you can easily delete the directory corresponding to the version to wipe that part of the cache for development purposes. The remote version of this is designed in a similar vein, where the client controls the version and the backend implementation supports concurrent namespaces being stored.

We initially thought to salt the function keys for similar purposes, but that seemed unnecessary if it can be configured at the storage layer.
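
A Python sketch of a version-namespaced, file-backed cache along the lines described above. This is not the llbuild2 Swift implementation; the class and method names are illustrative, and keys are assumed to be safe to use as file names.

```python
import shutil
from pathlib import Path
from typing import Optional


class VersionedFileCache:
    """Stores each entry at <root>/<version>/<key>."""

    def __init__(self, root: Path, version: str):
        self.dir = Path(root) / version

    def get(self, key: str) -> Optional[bytes]:
        path = self.dir / key
        return path.read_bytes() if path.is_file() else None

    def put(self, key: str, value: bytes) -> None:
        self.dir.mkdir(parents=True, exist_ok=True)
        (self.dir / key).write_bytes(value)

    def wipe(self) -> None:
        """Delete only this version's directory, leaving other versions."""
        shutil.rmtree(self.dir, ignore_errors=True)
```

Two caches that share a root but use different versions never see each other's entries, and `wipe()` on one leaves the other intact.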

Steven Bergsieker

Jul 16, 2020, 10:24:44 PM
to Sergio Campamá, Remote Execution APIs Working Group
I don't think that fully describing the build host environment can or should be defined by the RE API because doing so excludes some use cases. It's entirely possible to build tools that don't depend on the local environment at all, provided you have no expectation that remotely-built binaries would work locally (think: cross-compiling). I think the most that the API can do is what's defined here--offer a flexible field that the client can use to stuff in whatever aspects of the local environment it thinks are important.

Now, if you're trying to tell me that build tools should have a built-in way to capture the relevant data about the local system, THAT I can get behind (although I'll also note capturing ONLY the relevant data, such that the cache remains useful, is a hard task).

Sander Striker

Jul 17, 2020, 3:43:42 PM
to Remote Execution APIs Working Group
Thanks for the responses; this sounds reasonable to me.

Cheers,

Sander
