Mutable Caches


Stu Hood

May 28, 2020, 4:13:25 PM
to remote-exe...@googlegroups.com
Wait! I know mutability is blasphemy, but please don't stop reading quite yet.

Hey folks!

We'd like to propose an extension to the v2 remote execution protocol to allow the use of well-behaved, global, append-only (or otherwise concurrency-safe) mutable caches.

The reasoning behind this approach is explained in Mutable Caching for Processes, but, in short: many well-behaved and widely used processes cannot trivially be invoked recursively (which would allow them to use digest-based process caching). I've listed a few, but I'm certain that you can think of a few more!

Please see the document for more information, but the gist is that the Command struct would gain approximately two fields (a rough sketch follows the list):
  1. a set of string cache names that would be provided at a well-known location within an execution sandbox,
  2. a cache namespace id that would allow a client to disambiguate itself, or to ignore existing caches.
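
To make that concrete, here is a very rough sketch of what those additions could look like (the field names and numbers here are placeholders of mine, not settled API):

    // Hypothetical sketch only -- field names and numbers are placeholders.
    message Command {
      // ... existing REAPI Command fields (arguments, environment_variables,
      // platform, working_directory, ...) ...

      // Names of mutable caches to provide at a well-known location
      // (e.g. `.cache/<name>`) inside the execution sandbox.
      repeated string mutable_cache_names = 20;

      // Opaque namespace that lets a client disambiguate itself from other
      // clients of the same workers, or ignore existing cache contents.
      string mutable_cache_namespace = 21;
    }
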
Thanks for your consideration!

Stu Hood

Jun 2, 2020, 3:48:53 PM
to Remote Execution APIs Working Group
Would still love feedback on this! Have suggested adding it to the agenda for next week's meeting.

Eric Burnett

Jun 2, 2020, 5:23:21 PM
to Stu Hood, Remote Execution APIs Working Group
Hiya Stu,

Thanks for the proposal! I've read the linked doc, but I don't think I fully understand what you're proposing, so I'll take a stab at it but you might have to correct a bunch of misconceptions here. It sounds like there are two possibilities:
  1. Worker-local caches, where an action may ask for a named cache of the form "name, namespace", to be mounted at .cache/$name, and persisted across actions run on that worker so that if it still exists when a subsequent action asks for it, it's provided again with contents intact.
  2. "Global" caches, which I gather to be something akin to mounting a shared volume read/write (e.g. a NFS share) that all actions concurrently leverage?
Am I correct in interpreting your doc as proposing #2?

I've heard requests before for #1 (worker-local named caches), and think it could well have sufficient utility to add to the protocol broadly. It's "somewhat" dangerous, in that a shared cache can cause spurious failures for subsequent actions run on that worker, but I think that can be decently mitigated by deleting the cache whenever an action that touches it does not run successfully (nonzero exit code or error), and I can understand cases where it'd be quite useful for things like append-only download caches or built-wheel caches or whatnot as you describe.

I've not heard any requests before for #2 though (concurrently accessed, non-local cache), and I'm a little dubious about it. The cache being non-local removes most of the performance wins vs explicit inputs and outputs, and "shared mutable" at scale seems like it'd be extremely hard to make safe and reliable. (What filesystem operations are required to be atomic? What happens if a writer dies partway? How does garbage collection work? Etc.) I'd suggest that if this is in fact what you want to standardize, more clarity around the semantics of what exactly you're proposing, and an argument for how much of a performance boon this would be to the use-cases of interest, would be helpful in evaluating it.

Of course, you are free to experiment with such things yourself absent anyone else's involvement - platform properties are a good place to put in a side-channeled feature, leverageable from many clients (like bazel) without code change. I find a prototype is often a good way to shake out the details, after which it's easier to standardize changes.
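
For instance, something like the following in the action's Platform could carry it (the property names here are invented purely for illustration, not any agreed convention):

    # Hypothetical Platform.properties encoding; property names are illustrative only.
    properties { name: "mutable-cache-namespace" value: "my-org" }
    properties { name: "mutable-cache" value: "pip" }
    properties { name: "mutable-cache" value: "wheel-cache" }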

Hope that helps!
--E


Ulf Adams

Jun 3, 2020, 7:20:50 AM
to Eric Burnett, Stu Hood, Remote Execution APIs Working Group
Hi all,

same here wrt. understanding what is proposed. I think you are thinking of execution as a process that recursively visits a DAG, and you're using recursively in that sense. From the point of view of the remote execution system, there is no action graph, and there is no visible connection between the actions that are requested. In addition, independent actions can be executed in parallel. I'm missing two things from the proposal:

1. a discussion of what cache is available to what action, taking into account that actions from the same and different builds may execute in parallel; why is it safe to reuse caches like this (semantics)
2. a discussion of how that would be represented in the protocol, and maybe why it would even need to be represented in the protocol (mechanics)

I have been independently looking at the combination of persistent workers and remote execution, and I have a prototype for 'remote persistent workers' that does not require any protocol changes. This provides a significant improvement to Java builds (I've seen ~3-4x faster on a trivial benchmark, haven't tested larger builds yet). However, this is probably different from what Stu's looking at - in the Javac case, the primary benefit is avoiding JVM startup time, and this sounds like you're trying to avoid certain expensive computations.

Cheers,

-- Ulf

Stu Hood

Jun 3, 2020, 12:56:06 PM
to Ulf Adams, Eric Burnett, Remote Execution APIs Working Group
Thanks folks!

Eric:
Am I correct in interpreting your doc as proposing #2?

The document is proposing #1. These caches would be mounted or symlinked into a worker's sandbox before execution... where they actually lived (local FS, NFS, etc.) would be an implementation detail that wouldn't affect the API.
 
I've heard requests before for #1 (worker-local named caches), and think it could well have sufficient utility to add to the protocol broadly. It's "somewhat" dangerous, in that a shared cache can cause spurious failures for subsequent actions run on that worker, but I think that can be decently mitigated by deleting the cache whenever an action that touches it does not run successfully (nonzero exit code or error), and I can understand cases where it'd be quite useful for things like append-only download caches or built-wheel caches or whatnot as you describe.

Yea, that's what is being proposed.
 
I've not heard any requests before for #2 though (concurrently accessed, non-local cache), and I'm a little dubious about it. The cache being non-local removes most of the performance wins vs explicit inputs and outputs, and "shared mutable" at scale seems like it'd be extremely hard to make safe and reliable. (What filesystem operations are required to be atomic? What happens if a writer dies partway? How does garbage collection work? Etc.) I'd suggest that if this is in fact what you want to standardize, more clarity around the semantics of what exactly you're proposing, and an argument for how much of a performance boon this would be to the use-cases of interest, would be helpful in evaluating it.

I think the word "global" is what might be confusing here (I should avoid that word in the future!). I did not mean "global" with regard to the cluster: only global on a worker. So, as you said: "worker-local" is a better way to describe it.

Stu Hood

Jun 3, 2020, 1:15:22 PM
to Ulf Adams, Eric Burnett, Remote Execution APIs Working Group
Ulf:
same here wrt. understanding what is proposed. I think you are thinking of execution as a process that recursively visits a DAG, and you're using recursively in that sense. From the point of view of the remote execution system, there is no action graph, and there is no visible connection between the actions that are requested. In addition, independent actions can be executed in parallel. I'm missing two things from the proposal:

I've responded to Eric's question in the document, which might clarify.

Yes, agreed that in the context of remote execution there is no action graph. But the dependencies between rules/targets/modules are still relevant. A tool that can efficiently be invoked "recursively" would avoid doing redundant work at each level of the graph.

To inline some of my comment from the doc: imagining modules A, B, C where A depends on B depends on C: if C is something being compiled, then we can expect that the execution of B will consume some or all of C's output in order to avoid "re-doing" any of the work that went into building C in the first place.

This is notably _not_ the case with resolvers: if you invoked a dependency resolver on the same A, B, C graph in "C then B then A" order, you could do lots of wasted work: C would choose a set of dependencies, and then B might change the resolution entirely. While it's possible that "some" of C's work could be reused, it's definitely not reliable, and you're still wasting CPU time.

So resolvers tend not to be invoked recursively like this: instead, you invoke them once with all of A, B, C. But that is a very large cache key... it makes action caching effectively useless. You would nonetheless like to run it on your cluster in order to avoid downloading things to your client and then having to upload them to the cluster.

And that's where the proposal comes in.
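
As a purely hypothetical sketch, assuming the proposed fields and a well-known `.cache/<name>` mount point in the sandbox, a whole-repo pip resolve might look roughly like:

    # Hypothetical Command (text format) for a whole-repo pip resolve; the
    # mutable_cache_* fields are the proposed (placeholder-named) additions.
    arguments: "pip"
    arguments: "download"
    arguments: "--dest"
    arguments: "resolved_wheels"
    arguments: "-r"
    arguments: "3rdparty/requirements.txt"
    environment_variables { name: "PIP_CACHE_DIR" value: ".cache/pip" }
    output_directories: "resolved_wheels"
    mutable_cache_names: "pip"
    mutable_cache_namespace: "example-org"

The action's cache key still covers the full set of requirements, but repeated runs on the same worker would hit pip's own download cache rather than the network.
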

1. a discussion of what cache is available to what action,

This is discussed near the bottom of the proposal: the `Process` sketch corresponds to the REAPI Command struct. So processes would only have the caches that they asked for made available.
 
taking into account that actions from the same and different builds may execute in parallel; why is it safe to reuse caches like this (semantics)

Right: a lot hinges on this. Tools that opted in to use these caches would have to be known to be well-behaved and (generally) append-only.
 
2. a discussion of how that would be represented in the protocol, and maybe why it would even need to be represented in the protocol (mechanics)

This is discussed in the very last paragraph of the proposal.

I have been independently looking at the combination of persistent workers and remote execution, and I have a prototype for 'remote persistent workers' that does not require any protocol changes. This provides a significant improvement to Java builds (I've seen ~3-4x faster on a trivial benchmark, haven't tested larger builds yet). However, this is probably different from what Stu's looking at - in the Javac case, the primary benefit is avoiding JVM startup time, and this sounds like you're trying to avoid certain expensive computations.
This is also very interesting, and yea: the solution is very different. It's the second of the Classification of caches in the document: tools for which something mutable is needed per-target (in this case, the in-memory state of a worker). Would be interested in discussing that in a different thread though!


Ulf Adams

Jun 3, 2020, 2:26:05 PM
to Stu Hood, Eric Burnett, Remote Execution APIs Working Group
On Wed, Jun 3, 2020 at 7:15 PM Stu Hood <stu...@toolchain.com> wrote:
Ulf:
same here wrt. understanding what is proposed. I think you are thinking of execution as a process that recursively visits a DAG, and you're using recursively in that sense. From the point of view of the remote execution system, there is no action graph, and there is no visible connection between the actions that are requested. In addition, independent actions can be executed in parallel. I'm missing two things from the proposal:

I've responded to Eric's question in the document, which might clarify.

Yes, agreed that in the context of remote execution there is no action graph. But the dependencies between rules/targets/modules are still relevant. A tool that can efficiently be invoked "recursively" would avoid doing redundant work at each level of the graph.

To inline some of my comment from the doc: imagining modules A, B, C where A depends on B depends on C: if C is something being compiled, then we can expect that the execution of B will consume some or all of C's output in order to avoid "re-doing" any of the work that went into building C in the first place.

This is notably _not_ the case with resolvers: if you invoked a dependency resolver on the same A, B, C graph in "C then B then A" order, you could do lots of wasted work: C would choose a set of dependencies, and then B might change the resolution entirely. While it's possible that "some" of C's work could be reused, it's definitely not reliable, and you're still wasting CPU time.

Is it possible to only do it once for each 'binary' or deployable unit?
 

So resolvers tend not to be invoked recursively like this: instead, you invoke them once with all of A, B, C. But that is a very large cache key... it makes action caching effectively useless. You would nonetheless like to run it on your cluster in order to avoid downloading things to your client and then having to upload them to the cluster.

I don't see how you can avoid the large cache key. What makes it safe to reuse the cache if the inputs change? And if it is safe to do so, what process ensures that the cache doesn't grow unbounded?
 

Stu Hood

Jun 3, 2020, 2:36:12 PM
to Ulf Adams, Eric Burnett, Remote Execution APIs Working Group
Is it possible to only do it once for each 'binary' or deployable unit?

Yep, exactly. I inlined some of this explanation into the document with a footnote about this: I've generally called these "roots" of the graph, since tests also count.
 

I don't see how you can avoid the large cache key. What makes it safe to reuse the cache if the inputs change? And if it is safe to do so, what process ensures that the cache doesn't grow unbounded?

The proposal is that these caches do not have a key beyond their name (which is stable) and a namespace (which a client could change to discard or disambiguate cache clients). So as Eric said, a lot hinges on the tools having trustworthy on-disk cache implementations.

Stu Hood

Jun 9, 2020, 12:30:24 PM
to Ulf Adams, Eric Burnett, Remote Execution APIs Working Group
Thanks for the feedback at today's meeting!

I've updated the API and Remote sections of the document to indicate that clients should send a mapping/dict from cache name to the cache's relative path within the sandbox (rather than only a name), and described how to map these fields into Platform.properties in the interim. I'll implement the prototype in our client, and we'd be interested in working with server implementations to gain experience with this facility (and might submit patches to that end!).
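
For reference, the interim Platform.properties shape for that mapping would be something like the following (the exact property-name spelling is in the document; the names here are illustrative):

    # Illustrative shape only -- one property per cache, value = path within the sandbox.
    properties { name: "mutable-cache-namespace" value: "example-org" }
    properties { name: "mutable-cache:pip" value: ".cache/pip" }
    properties { name: "mutable-cache:pex_root" value: ".cache/pex_root" }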

Stu Hood

Jul 1, 2020, 4:10:28 PM
to Ulf Adams, Eric Burnett, Remote Execution APIs Working Group
Hey folks! A very brief update on this: the prototype of including this information in the platform properties caused some issues, because a few different server implementations exact-match the platform properties in order to locate workers (RBE does not exact-match, but Buildbarn and Buildgrid both do).

We are continuing to use the local prototype, but including the information in remote executions is on hold for a bit. Our next attempt to include this information will likely put it in the action's environment variables instead.
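
Sketching that (the variable names here are invented for illustration), the same mapping could ride along in Command.environment_variables, which those servers don't use for worker selection:

    # Illustrative only: the cache request carried in environment variables, so
    # that servers which exact-match Platform.properties are unaffected.
    environment_variables { name: "MUTABLE_CACHE_NAMESPACE" value: "example-org" }
    environment_variables {
      name: "MUTABLE_CACHES"
      value: "pip=.cache/pip,pex_root=.cache/pex_root"
    }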