Summary:
A new verb, `autopkg build`, constructs a single recipe chain made up of an override and all of its parents. Running the chain requires no recipe repos to be present, which reduces the overall dependencies of a full AutoPkg run.
Abstract/High level description:
AutoPkg's greatest strengths are rooted in its flexibility. The relatively decentralized nature of AutoPkg recipes and the way its GitHub organization is structured allow authors to create individual components of common processes: downloading a file, assembling it into a package/archive, uploading it to a distribution tool like Munki or Jamf. Authors create "complete" processes using AutoPkg's concept of "parent recipes," which chains those individual components together in a defined order. This keeps recipes deterministic yet flexible, and different environments can customize them to a large degree with custom recipes, custom processors, override files that change the Inputs, and so on.
AutoPkg treats these components and variables like Lego building blocks, assembling them only at recipe runtime. This gives authors the flexibility to customize their environments.
The downside of this Lego-like flexibility is that you need access to all of those building blocks at runtime. As more organizations push AutoPkg into a CI-like environment - a structured, carefully designed execution environment meant to offload software management from someone's personal desktop Mac - this flexibility becomes genuinely challenging to get right.
In a CI environment, you need all of an override's parent recipes present in order to run it. These parents are often spread across multiple repos, with no guarantee they all live in the same one. So in your CI environment, you either have to pre-download/pre-add all of your AutoPkg repos (a list you now have to maintain somewhere), or you fetch the repos just-in-time with AutoPkg's "-p" argument, which searches GitHub for them as needed.
Unfortunately for everyone, GitHub's API search has become less reliable over time. Large organizations also run into the API's rate limiting, where the results you get back depend on whether you happen to be throttled at that moment. That makes it useless for CI, because the results are non-deterministic. Here at Meta, where our sheer size means we are constantly rate-limited by GitHub's API, our AutoPkg jobs succeed or fail unpredictably when trying to resolve recipe parentage.
This gave rise to my idea that maybe we should pivot away from AutoPkg's just-in-time assembly of its building blocks and toward a pre-packaged "solution". That's what a "recipe chain" is: all the recipes and their dependencies chained together into a single file, and that file is the only thing needed to run a full AutoPkg process (download, assemble, upload, etc.).
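To make the flattening idea concrete, here's a rough sketch of how a chain could be assembled: walk the parent list from topmost parent down to the override, concatenating each recipe's Process steps (parents run first) and merging Inputs (children win collisions), then serialize the result as one plist. The recipe dicts below are hypothetical stand-ins, and this is not AutoPkg's actual implementation.

```python
import plistlib

def flatten_chain(recipes):
    """Merge an ordered list of recipe dicts (topmost parent first,
    override last) into a single self-contained chain recipe."""
    chain = {"Input": {}, "Process": []}
    for recipe in recipes:
        # Parents run first, so their processor steps come first.
        chain["Process"].extend(recipe.get("Process", []))
        # Later recipes (children/overrides) win on Input collisions.
        chain["Input"].update(recipe.get("Input", {}))
    return chain

# Hypothetical parent (.download) and child (.munki) recipes:
download = {
    "Input": {"NAME": "GoogleChrome"},
    "Process": [{"Processor": "URLDownloader"}],
}
munki = {
    "Input": {"NAME": "GoogleChrome", "MUNKI_REPO_SUBDIR": "apps/chrome"},
    "Process": [{"Processor": "MunkiImporter"}],
}
chain = flatten_chain([download, munki])
# Serialize like a .recipe plist so the chain is one portable file.
chain_plist = plistlib.dumps(chain)
```

The key property is that the output contains everything needed at runtime, so no repo lookups (and no GitHub API calls) happen when the chain runs.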
Current implementation:
Right now, in its early stages, it's very simple and naive:
```
autopkg build <recipe name>
```
This will output a file named "RecipeName.type.chain.recipe" into a folder creatively titled "RecipeChains." Example:
```
% ./autopkg build GoogleChrome.munki
Assembling chain for GoogleChrome.munki...
Recipe chain verified!
Chain stored at /Users/nmcspadden/Library/AutoPkg/RecipeChains/GoogleChrome.munki.chain.recipe
```
You can then run the chain directly:
```
% ./autopkg run -vvvv ~/Library/AutoPkg/RecipeChains/GoogleChrome.munki.chain.recipe
```
And it works just as if you'd run the override through the traditional AutoPkg workflow.
Future ideas (OPINIONS WANTED!):
There are a lot of implementation details to consider in this design.
Trust/security/updating:
How do we handle trust verification, and how do we update trust? We already know all the parents used to assemble the chain, so we can use the existing mechanism to hash all the component files (parent recipes, processors, etc.). But how do we tell when the chain is out of date?
Maybe AutoPkg should have stronger security by design. What if every recipe repo had to list the hashes of all the files it contains, like an index? This is much like what traditional package managers do. It would then be easy to determine whether something is out of date: compare the hash of each file we used when we built the chain against the current index, and if there's a mismatch, ask the operator to rebuild the chain.
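As a minimal sketch of the index idea (the index format and the notion of hashes recorded at build time are both hypothetical, not existing AutoPkg features): hash every file in a repo, record those hashes when the chain is built, and later diff against a freshly built index to find stale components.

```python
import hashlib
import pathlib
import tempfile

def build_index(repo_dir):
    """Map each file's repo-relative path to its sha256 digest."""
    root = pathlib.Path(repo_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*")) if p.is_file()
    }

def stale_components(chain_hashes, index):
    """Return components whose current hash no longer matches the
    hash recorded when the chain was built."""
    return [path for path, digest in chain_hashes.items()
            if index.get(path) != digest]

# Demo with a throwaway "repo":
repo = pathlib.Path(tempfile.mkdtemp())
(repo / "GoogleChrome.download.recipe").write_text("<plist>v1</plist>")
recorded = build_index(repo)  # hashes captured at chain build time
(repo / "GoogleChrome.download.recipe").write_text("<plist>v2</plist>")
changed = stale_components(recorded, build_index(repo))
```

A nice side effect is that the staleness check needs only the index file, not a full recipe-repo checkout, which fits the chain's goal of minimizing runtime dependencies.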
Signing/trusting chains:
Could we sign chains, using actual codesigning tools, on all platforms? This could improve security: we would know for sure the chain can't be modified by anyone other than the admin/author who created it (i.e., someone holding the private key). But it adds complexity, because environments now have to handle key validation. And once signed, recipe chains would likely no longer be plain text, which would make reading and auditing them more challenging.
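One way to keep chains auditable is a detached signature: the signature lives in a sibling file, so the chain itself stays plain text. The sketch below uses HMAC-SHA256 purely as a runnable stand-in; a real design would use asymmetric signing (e.g. `codesign` on macOS, or an openssl/minisign-style detached signature elsewhere), since with HMAC every verifier would need the secret key.

```python
import hashlib
import hmac
import pathlib
import tempfile

def sign_chain(chain_path, key):
    """Write a detached signature alongside the chain, leaving the
    chain file itself plain text and auditable. HMAC stands in for a
    real asymmetric signature here."""
    data = pathlib.Path(chain_path).read_bytes()
    sig = hmac.new(key, data, hashlib.sha256).hexdigest()
    pathlib.Path(f"{chain_path}.sig").write_text(sig)

def verify_chain(chain_path, key):
    """Recompute the signature and compare it to the stored one."""
    data = pathlib.Path(chain_path).read_bytes()
    expected = hmac.new(key, data, hashlib.sha256).hexdigest()
    actual = pathlib.Path(f"{chain_path}.sig").read_text()
    return hmac.compare_digest(expected, actual)

# Demo: sign a chain, verify it, then tamper with it.
chain = pathlib.Path(tempfile.mkdtemp()) / "GoogleChrome.munki.chain.recipe"
chain.write_text("<plist>chain contents</plist>")
sign_chain(chain, b"demo-key")
ok_before = verify_chain(chain, b"demo-key")
chain.write_text("<plist>tampered</plist>")
ok_after = verify_chain(chain, b"demo-key")
```

The detached-signature approach sidesteps the "signed chains are no longer plain text" objection, though key distribution and validation remain the hard part either way.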
I'd love to hear opinions on this in particular.