Summary:
A new verb, `autopkg build`, constructs a single recipe chain made up of an override and all of its parents. Running the chain requires no recipe repos to be present, which reduces the overall dependencies of a full AutoPkg run.
Abstract/High level description:
AutoPkg's greatest strengths are rooted in its flexibility. The relatively decentralized nature of AutoPkg recipes and the way its GitHub organization is structured allow authors to create individual components of common processes: downloading a file, assembling it into a package/archive, uploading it to a distribution tool like Munki or Jamf. Authors create "complete" processes using AutoPkg's concept of "parent recipes," which chains those individual components together in a defined order. This keeps recipes deterministic yet flexible, and different environments can customize them to a large degree with custom recipes, custom processors, override files that change the Inputs, and so on.
AutoPkg treats these components and variables like Lego building blocks, assembling them only at recipe runtime. This gives authors the flexibility to customize their environments.
The downside of this Lego-like flexibility is that you need access to all of those building blocks at runtime. As more organizations push AutoPkg into a CI-like environment - a structured, carefully designed execution environment meant to offload software management from someone's personal desktop Mac - this flexibility becomes genuinely challenging to get right.
In a CI environment, you need all of an override's parent recipes present in order to run it. These parents are often spread across multiple repos, with no guarantee they all live in the same one. So in your CI environment, you either have to pre-download/pre-add all of your AutoPkg repos (a list you now have to maintain somewhere), or you fetch the repos just-in-time with AutoPkg's "-p" argument, which searches GitHub for them as needed.
Unfortunately for everyone, GitHub's API search has become less reliable over time. Large organizations also run into the API's rate limiting, where the results you get back depend on whether you happen to be throttled at that moment. That makes it useless for CI, because the results are non-deterministic. Here at Meta, where our sheer size means we are constantly rate-limited by GitHub's API, our AutoPkg jobs succeed or fail unpredictably when trying to resolve recipe parentage.
This gave rise to my idea that maybe we should pivot away from AutoPkg's just-in-time assembly of its building blocks and toward a pre-packaged "solution". That's what a "recipe chain" is: all the recipes and their dependencies chained together into a single file, and that file is the only thing needed to run a full AutoPkg process (download, assemble, upload, etc.).
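To make the flattening idea concrete, here's a rough sketch of how a chain could be assembled: walk the parent list from topmost parent down to the override, concatenating each recipe's Process steps (parents run first) and merging Inputs (children win collisions), then serialize the result as one plist. The recipe dicts below are hypothetical stand-ins, and this is not AutoPkg's actual implementation.

```python
import plistlib

def flatten_chain(recipes):
    """Merge an ordered list of recipe dicts (topmost parent first,
    override last) into a single self-contained chain recipe."""
    chain = {"Input": {}, "Process": []}
    for recipe in recipes:
        # Parents run first, so their processor steps come first.
        chain["Process"].extend(recipe.get("Process", []))
        # Later recipes (children/overrides) win on Input collisions.
        chain["Input"].update(recipe.get("Input", {}))
    return chain

# Hypothetical parent (.download) and child (.munki) recipes:
download = {
    "Input": {"NAME": "GoogleChrome"},
    "Process": [{"Processor": "URLDownloader"}],
}
munki = {
    "Input": {"NAME": "GoogleChrome", "MUNKI_REPO_SUBDIR": "apps/chrome"},
    "Process": [{"Processor": "MunkiImporter"}],
}
chain = flatten_chain([download, munki])
# Serialize like a .recipe plist so the chain is one portable file.
chain_plist = plistlib.dumps(chain)
```

The key property is that the output contains everything needed at runtime, so no repo lookups (and no GitHub API calls) happen when the chain runs.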
Current implementation:
Right now, in its early stages, it's very simple and naive:
```
autopkg build <recipe name>
```
This will output a file named "RecipeName.type.chain.recipe" into a folder creatively titled "RecipeChains." Example:
```
% ./autopkg build GoogleChrome.munki
Assembling chain for GoogleChrome.munki...
Recipe chain verified!
Chain stored at /Users/nmcspadden/Library/AutoPkg/RecipeChains/GoogleChrome.munki.chain.recipe
```
You can then run the chain directly:
```
% ./autopkg run -vvvv ~/Library/AutoPkg/RecipeChains/GoogleChrome.munki.chain.recipe
```
And it works just as if you'd run the override through the traditional AutoPkg workflow.
Future ideas (OPINIONS WANTED!):
There are a lot of implementation details to consider in this design.
Trust/security/updating:
How do we handle trust verification, and how do we update trust? We already know all the parents used to assemble the chain, so we can use the existing mechanism to hash all the component files (parent recipes, processors, etc.). But how do we tell when the chain is out of date?
Maybe AutoPkg should have stronger security by design. What if every recipe repo had to list the hashes of all the files it contains, like an index? This is much like what traditional package managers do. It would then be easy to determine whether something is out of date: compare the hash of each file we used when we built the chain against the current index, and if there's a mismatch, ask the operator to rebuild the chain.
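As a minimal sketch of the index idea (the index format and the notion of hashes recorded at build time are both hypothetical, not existing AutoPkg features): hash every file in a repo, record those hashes when the chain is built, and later diff against a freshly built index to find stale components.

```python
import hashlib
import pathlib
import tempfile

def build_index(repo_dir):
    """Map each file's repo-relative path to its sha256 digest."""
    root = pathlib.Path(repo_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def stale_components(chain_hashes, index):
    """Return components whose current hash no longer matches the
    hash recorded when the chain was built."""
    return [path for path, digest in chain_hashes.items()
            if index.get(path) != digest]

# Demo with a throwaway "repo":
repo = pathlib.Path(tempfile.mkdtemp())
(repo / "GoogleChrome.download.recipe").write_text("<plist>v1</plist>")
recorded = build_index(repo)  # hashes captured at chain build time
(repo / "GoogleChrome.download.recipe").write_text("<plist>v2</plist>")
changed = stale_components(recorded, build_index(repo))
```

A nice side effect is that the staleness check needs only the index file, not a full recipe-repo checkout, which fits the chain's goal of minimizing runtime dependencies.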
Signing/trusting chains:
Could we sign chains, using actual codesigning tools, on all platforms? This could improve security: we would know for sure the chain can't be modified by anyone other than the admin/author who created it (i.e., someone holding the private key). But it adds complexity, because environments now have to handle key validation. And once signed, recipe chains would likely no longer be plain text, which would make reading and auditing them more challenging.
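One way to keep chains auditable is a detached signature: the signature lives in a sibling file, so the chain itself stays plain text. The sketch below uses HMAC-SHA256 purely as a runnable stand-in; a real design would use asymmetric signing (e.g. `codesign` on macOS, or an openssl/minisign-style detached signature elsewhere), since with HMAC every verifier would need the secret key.

```python
import hashlib
import hmac
import pathlib
import tempfile

def sign_chain(chain_path, key):
    """Write a detached signature alongside the chain, leaving the
    chain file itself plain text and auditable. HMAC stands in for a
    real asymmetric signature here."""
    data = pathlib.Path(chain_path).read_bytes()
    sig = hmac.new(key, data, hashlib.sha256).hexdigest()
    pathlib.Path(f"{chain_path}.sig").write_text(sig)

def verify_chain(chain_path, key):
    """Recompute the signature and compare it to the stored one."""
    data = pathlib.Path(chain_path).read_bytes()
    expected = hmac.new(key, data, hashlib.sha256).hexdigest()
    actual = pathlib.Path(f"{chain_path}.sig").read_text()
    return hmac.compare_digest(expected, actual)

# Demo: sign a chain, verify it, then tamper with it.
chain = pathlib.Path(tempfile.mkdtemp()) / "GoogleChrome.munki.chain.recipe"
chain.write_text("<plist>chain contents</plist>")
sign_chain(chain, b"demo-key")
ok_before = verify_chain(chain, b"demo-key")
chain.write_text("<plist>tampered</plist>")
ok_after = verify_chain(chain, b"demo-key")
```

The detached-signature approach sidesteps the "signed chains are no longer plain text" objection, though key distribution and validation remain the hard part either way.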
I'd love to hear opinions on this in particular.