Am I correct in interpreting your doc as proposing #2?
I've heard requests before for #1 (worker-local named caches), and I think it could well have sufficient utility to add to the protocol broadly. It's somewhat dangerous, in that a shared cache can cause spurious failures for subsequent actions run on that worker, but that can be decently mitigated by deleting the cache whenever an action that touches it does not run successfully (nonzero exit code or error). And I can see cases where it'd be quite useful for things like append-only download caches or built-wheel caches, as you describe.
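To make the delete-on-failure mitigation concrete, here is a rough sketch of what a worker could do; everything here (runWithNamedCache, the NAMED_CACHE_DIR variable) is invented for illustration and is not part of any existing implementation.

package namedcache

import (
	"os"
	"os/exec"
	"path/filepath"
)

// runWithNamedCache executes an action with a worker-local named cache
// available to it. If the action fails for any reason, the cache is
// discarded wholesale, so a half-written cache cannot poison subsequent
// actions on this worker.
func runWithNamedCache(cacheRoot, cacheName, workDir string, argv []string) error {
	cacheDir := filepath.Join(cacheRoot, cacheName)
	if err := os.MkdirAll(cacheDir, 0o755); err != nil {
		return err
	}
	cmd := exec.Command(argv[0], argv[1:]...)
	cmd.Dir = workDir
	// Tell the tool where the cache lives; a real worker would bind-mount
	// or symlink it into the action's sandbox instead.
	cmd.Env = append(os.Environ(), "NAMED_CACHE_DIR="+cacheDir)
	if err := cmd.Run(); err != nil {
		// Nonzero exit or execution error: assume the cache may be
		// inconsistent and delete it.
		os.RemoveAll(cacheDir)
		return err
	}
	return nil
}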
I've not heard any requests before for #2 (a concurrently accessed, non-local cache), though, and I'm a little dubious about it. The cache being non-local removes most of the performance wins versus explicit inputs and outputs, and "shared mutable" at scale seems like it'd be extremely hard to make safe and reliable. (Which filesystem operations are required to be atomic? What happens if a writer dies partway? How does garbage collection work? Etc.) If this is in fact what you want to standardize, I'd suggest spelling out the semantics of exactly what you're proposing, plus an argument for how large a performance win it would be for the use cases of interest; that would make it much easier to evaluate.
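For what it's worth, one discipline that answers part of the atomicity question is stage-then-rename: writers build content under a temporary name and publish it with a single rename, so a writer that dies partway leaves only unreferenced temp files that a garbage collector can sweep by age. A minimal sketch, assuming a POSIX filesystem where rename within one filesystem is atomic (exactly the kind of guarantee a non-local shared cache would need to spell out):

package sharedcache

import (
	"os"
	"path/filepath"
)

// publish atomically installs data at finalPath inside the shared cache.
// Readers either see the complete file or no file at all; concurrent
// writers of the same key race benignly, with one rename winning.
func publish(finalPath string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(finalPath), ".staging-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up the stale temp file on failure
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before publishing
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), finalPath)
}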
Same here w.r.t. understanding what is proposed. I think you are thinking of execution as a process that recursively visits a DAG, and you're using "recursively" in that sense. From the point of view of the remote execution system, there is no action graph, and there is no visible connection between the actions that are requested. In addition, independent actions can be executed in parallel. I'm missing two things from the proposal:
1. a discussion of what cache is available to which action, taking into account that actions from the same and different builds may execute in parallel, and why it is safe to reuse caches like this (semantics);
2. a discussion of how that would be represented in the protocol, and maybe why it would even need to be represented in the protocol at all (mechanics); one possible shape is sketched below.
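As a strawman for the mechanics point: one shape that avoids any new protocol messages is to piggyback on the existing Platform properties of a Command, which are already free-form key/value pairs. The "named-cache" property name below is invented for this sketch; nothing in REAPI v2 defines it.

package sketch

import (
	repb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
)

// commandWithNamedCache builds a Command that asks the worker to mount a
// named cache. A worker that understands the (hypothetical) "named-cache"
// property would expose the cache at the given path; one that doesn't can
// reject the action or ignore the property.
func commandWithNamedCache(args []string) *repb.Command {
	return &repb.Command{
		Arguments: args,
		Platform: &repb.Platform{
			Properties: []*repb.Platform_Property{
				{Name: "named-cache", Value: "pip-download:/cache/pip"},
			},
		},
	}
}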
I have been independently looking at the combination of persistent workers and remote execution, and I have a prototype for 'remote persistent workers' that does not require any protocol changes. This provides a significant improvement for Java builds (I've seen ~3-4x faster on a trivial benchmark; I haven't tested larger builds yet). However, this is probably different from what Stu's looking at: in the javac case, the primary benefit is avoiding JVM startup time, whereas it sounds like you're trying to avoid certain expensive computations.
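For context, a persistent worker keeps the tool process alive and feeds it one request at a time, so JVM startup is paid once rather than per action. The sketch below is loosely modeled on the JSON variant of Bazel's persistent-worker protocol and is a simplification, not the actual prototype; compileJava is a hypothetical stand-in for a warm in-process compiler.

package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// One newline-delimited JSON request/response per action.
type workRequest struct {
	Arguments []string `json:"arguments"`
	RequestID int      `json:"requestId"`
}

type workResponse struct {
	ExitCode  int    `json:"exitCode"`
	Output    string `json:"output"`
	RequestID int    `json:"requestId"`
}

func main() {
	in := bufio.NewScanner(os.Stdin)
	out := json.NewEncoder(os.Stdout)
	for in.Scan() {
		var req workRequest
		if err := json.Unmarshal(in.Bytes(), &req); err != nil {
			fmt.Fprintln(os.Stderr, "bad request:", err)
			continue
		}
		// In a real javac worker this call reuses the warm in-process
		// compiler instead of forking a fresh JVM for every action.
		code, output := compileJava(req.Arguments)
		_ = out.Encode(workResponse{ExitCode: code, Output: output, RequestID: req.RequestID})
	}
}

// compileJava is a placeholder for the actual compilation entry point.
func compileJava(args []string) (int, string) {
	return 0, fmt.Sprintf("compiled with args %v", args)
}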
Ulf: same here wrt. understanding what is proposed. I think you are thinking of execution as a process that recursively visits a DAG, and you're using recursively in that sense. From the point of view of the remote execution system, there is no action graph, and there is no visible connection between the actions that are requested. In addition, independent actions can be executed in parallel. I'm missing two things from the proposal:

I've responded to Eric's question in the document, which might clarify.

Yes, agreed that in the context of remote execution there is no action graph. But the dependencies between rules/targets/modules are still relevant: a tool that can efficiently be invoked "recursively" avoids doing redundant work at each level of the graph.

To inline some of my comment from the doc: imagine modules A, B, C, where A depends on B, which depends on C. If C is something being compiled, then we can expect that the execution of B will consume some or all of C's output, in order to avoid re-doing any of the work that went into building C in the first place.
This is notably _not_ the case with resolvers: if you invoked a dependency resolver on the same A, B, C graph in "C, then B, then A" order, you could do a lot of wasted work. C would choose a set of dependencies, and then B might change the resolution entirely. While it's possible that some of C's work could be reused, it's definitely not reliable, and you're still wasting CPU time.
So resolvers tend not to be invoked recursively like this: instead, you invoke them once with all of A, B, C at once. But that is a very large cache key... it makes action caching effectively useless. You would nonetheless like to run the resolver on your cluster, in order to avoid downloading things to your client and then having to upload them back to the cluster.
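To make the cache-key point concrete: because the resolver has to see every module's manifest at once, the action key digests all of them, so editing any one manifest invalidates the cached resolution for the entire graph. A toy illustration (the helper name is hypothetical):

package resolver

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// resolutionActionKey digests every manifest in the build; a change to any
// single one produces a new key and therefore an action-cache miss.
func resolutionActionKey(manifests map[string][]byte) string {
	h := sha256.New()
	names := make([]string, 0, len(manifests))
	for name := range manifests {
		names = append(names, name)
	}
	sort.Strings(names) // stable order so the digest is deterministic
	for _, name := range names {
		h.Write([]byte(name))
		h.Write(manifests[name])
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}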
Is it possible to only do it once for each 'binary' or deployable unit?
I don't see how you can avoid the large cache key. What makes it safe to reuse the cache if the inputs change? And if it is safe to do so, what process ensures that the cache doesn't grow unbounded?