Linking remote build actions

Peter Ebden

Mar 2, 2021, 1:21:42 PM
to Remote Execution APIs Working Group
Hi all,

Wondering if anyone's done any work (or even any thinking) in the space of linking remotely executed actions together? It seems that at present only the client knows which actions depend on which others; that knowledge doesn't exist server-side except very implicitly, in that some output blobs may also be inputs to other actions, but that's very difficult and expensive to trace.

I can think of a few cases where it would be useful to be able to trace back from a final action through all the dependent actions that contribute to it - for example to verify signatures (if they were signed, which is a separate discussion :D ), or to identify actions that are "needed" to contribute to an output.
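To make that a bit more concrete, here's a very rough sketch (Go, against the REAPI v2 protos) of what one level of that trace could look like on the server side. The OutputIndex below is hypothetical - a reverse index from an output blob's digest to the action that produced it - and is exactly the piece that doesn't exist today; the rest is just walking the input Merkle tree out of the CAS.

package trace

import (
	repb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
)

// BlobStore fetches Directory protos out of the CAS.
type BlobStore interface {
	GetDirectory(d *repb.Digest) (*repb.Directory, error)
}

// OutputIndex is the hypothetical piece: a reverse index from an output blob's
// digest to the digest of the Action that produced it.
type OutputIndex interface {
	ProducerOf(d *repb.Digest) (action *repb.Digest, ok bool)
}

// DirectDeps walks an action's input root and returns the upstream actions
// whose outputs appear somewhere in that input tree, keyed by hash.
func DirectDeps(cas BlobStore, idx OutputIndex, inputRoot *repb.Digest) (map[string]*repb.Digest, error) {
	deps := map[string]*repb.Digest{}
	queue := []*repb.Digest{inputRoot}
	for len(queue) > 0 {
		dir, err := cas.GetDirectory(queue[0])
		queue = queue[1:]
		if err != nil {
			return nil, err
		}
		for _, f := range dir.Files {
			// Any input file that some known action produced is a dependency edge.
			if producer, ok := idx.ProducerOf(f.Digest); ok {
				deps[producer.Hash] = producer
			}
		}
		for _, sub := range dir.Directories {
			queue = append(queue, sub.Digest)
		}
	}
	return deps, nil
}

Even with such an index this only finds direct dependencies; recursing through each producer's own input root to get the full transitive set is where it gets expensive.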

Any thoughts on this? If not I can start thinking more about how it might look, but wanted to see if there was any prior art in this space first.

Peter

Ulf Adams

Mar 2, 2021, 6:32:27 PM
to Peter Ebden, Remote Execution APIs Working Group
When I worked at Google, we looked into dumping the full action graph from Bazel. Unfortunately, that turned out to be impossible because - in its expanded form - it's way too large for most builds at Google. Note that Bazel internally uses a compressed format which is much more compact.

I'll also say that Bazel doesn't run all actions remotely, most importantly symlink actions (but also a few other things), which means the remote execution system doesn't have the full picture. Again, there was an attempt at Google to reconstruct the action graph from the remote execution logs, and there were too many holes in the graph.

What are you trying to do here?

Cheers,

-- Ulf

Peter Ebden

Mar 3, 2021, 4:01:27 AM
to Ulf Adams, Remote Execution APIs Working Group
I've got several things in mind right now:
 - When we're garbage collecting the backing storage for some of our servers, we attempt to infer which build actions should be kept or removed. Ideally we would be able to identify the dependencies of actions we're keeping and retain those too; currently it's conceptually OK to lose them, but this process is still imperfect and very hard to get right, and one reason for that is the limited view available from the server side.
 - I'm thinking about options around cryptographic signing of actions; an interesting possibility during verification would be tracing back to check that all dependencies were also signed (although this might be hindered by pseudo-actions like the symlink ones you mention that don't exist remotely; I'll need to check what we do there). I could possibly dump this out on the client side, of course. This is still pretty speculative...
 - It would be helpful to have this info when debugging. Given a case like "blob xyz is missing", that often means it came from some dependency which is now incomplete for some reason (probably the GC process I mentioned above); it's not easy to work out what those dependencies are, though (it requires tracking things back through client-side logs etc.). A sketch of the kind of check I'd like to be able to run follows below.
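To illustrate that last point: the easy half is asking the CAS which of an action's input blobs have gone missing, via the standard FindMissingBlobs RPC; the hard half is mapping a missing digest back to the action that produced it, which is exactly the linkage that doesn't exist today. A rough sketch of the easy half in Go (the list of input digests is assumed to come from walking the action's input tree, as in the earlier sketch):

package debugtools

import (
	"context"

	repb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
)

// MissingInputs asks the CAS which of an action's input blobs no longer exist.
// The inputs slice is assumed to come from walking the action's input root;
// mapping any missing digest back to the action that produced it is the part
// that needs the dependency linkage this thread is about.
func MissingInputs(ctx context.Context, cas repb.ContentAddressableStorageClient, instance string, inputs []*repb.Digest) ([]*repb.Digest, error) {
	resp, err := cas.FindMissingBlobs(ctx, &repb.FindMissingBlobsRequest{
		InstanceName: instance,
		BlobDigests:  inputs,
	})
	if err != nil {
		return nil, err
	}
	return resp.MissingBlobDigests, nil
}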

Eric Burnett

Mar 3, 2021, 10:28:56 AM
to Peter Ebden, Ulf Adams, Remote Execution APIs Working Group
For garbage collection we use essentially TTL + extend-on-touch semantics. As long as you see clean builds more frequently than your TTL expires, this has the effect that all relevant actions are touched at least once and won't be collected.
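In code terms, the semantics are roughly the following (a minimal sketch in Go of the general idea, not how any particular implementation actually works):

package gc

import (
	"sync"
	"time"
)

// Cache tracks last-access times for CAS/AC entries; anything not touched
// within the TTL is eligible for collection.
type Cache struct {
	mu      sync.Mutex
	ttl     time.Duration
	touched map[string]time.Time // digest hash -> last access time
}

func New(ttl time.Duration) *Cache {
	return &Cache{ttl: ttl, touched: map[string]time.Time{}}
}

// Touch is called on every read or write of a blob or action result,
// extending its lifetime by another TTL.
func (c *Cache) Touch(hash string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.touched[hash] = time.Now()
}

// Sweep returns the entries whose TTL has lapsed; the caller deletes them
// from the underlying storage.
func (c *Cache) Sweep() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	var expired []string
	cutoff := time.Now().Add(-c.ttl)
	for h, t := range c.touched {
		if t.Before(cutoff) {
			expired = append(expired, h)
			delete(c.touched, h)
		}
	}
	return expired
}

The nice property is that liveness is defined purely by access, so the server never needs to know the dependency graph at all; the failure mode is exactly the "blob xyz is missing" case, when something isn't rebuilt or read within the TTL.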

We've considered doing something with transitive data dependencies a couple of times, but it never seemed tractable to pursue - along with the issues Ulf raised, there's a data volume problem: each blob probably has tens or hundreds of direct ancestors and inordinately many indirect ones, and many blobs (including toolchains and the empty blob) will be a data dependency of millions or possibly billions of others. So a forward graph would be nearly impossible to store.

A reverse graph from blob to known ancestors sounds more plausible, but I don't think it is in practice - consider a linked binary that takes a bunch of object files, strips out most of the data, and produces a relatively small output binary. Various changes to source files can be made that don't affect the final output, since their contents get stripped out of the final binary. That means this output blob now has a logical reverse dependency on every possible link action since the last "meaningful" change, possibly going arbitrarily far back in time... any one of those actions could produce the blob, but you can't tell which is relevant to current builds from data dependencies alone.

(Or worse, imagine an action that takes a binary as input and produces a blob containing 'OK'. That 'OK' blob would have a potential data dependency on every version of that binary ever, and all transitive actions above it).

For signing purposes, the transitive closure of data available to the builder is more informative anyway - the source tree at whatever hash, versions of any repositories pulled, etc. Reverse-engineering that from blobs seems much harder than propagating trustable information forward... at the end of the day the blob dependency closure is going to end at "pure inputs" uploaded by Bazel anyway, and you still need something to map those back to what they logically are and where they came from before your signing system can infer anything useful from them.

