Rethinking Collections


Eric Sorenson

Dec 5, 2016, 4:25:52 PM
to puppe...@googlegroups.com
Hi all. tl;dr: We are proposing moving the open source package repositories back to a single repository for Puppet-owned projects and their dependencies. This is a shift from our stated plan of shipping major versions that might contain backwards incompatibilities into their own Puppet Collection repositories, but as a result the packages will be less confusing to use and easier to stay current with.

Long version: When we released Puppet 3.0 in 2013, backward incompatibilities between it and Puppet 2.7 broke a number of sites that had configured their provisioning or package updates to install the latest version of Puppet from our repositories. To prevent similar breakage when we released Puppet 4 in April 2015, we introduced it in a new repository called Puppet Collection 1 (PC1), so users had to opt in rather than opt out. The idea was that future backward-incompatible updates would trigger new Puppet Collections, also opt-in, so that a user could stay on PC1 and only move to PC2 when they were ready (background reading: https://puppet.com/blog/welcome-to-puppet-collections ). In practice, the switching costs to get everyone onto a new repository proved really high, and for the most part the impact of releasing into the existing collection was low, so instead we either shipped releases like PuppetDB 4.0 into PC1 or deferred shipping versions with big changes, such as when we rolled back from Ruby 2.3 to 2.1 for puppet-agent-1.7.0.

We've been exploring our options to balance between the following criteria:

- avoid breaking sites, to not repeat the Puppet 2 to 3 pain
- provide a set of component packages that are known to work with each other, and provide a basis for Puppet Enterprise platform releases
- encourage rapid adoption of new releases by the open source community
- provide commercial differentiation on support lifecycle, similar to the RHEL / Fedora model

We talked through a number of options in pretty exhaustive detail and have tentatively settled on this as the best – or maybe "least bad" – course of action:

- make a release package with a new name (probably "puppet-release"), eliminating the public face of "Collections"
- move the existing repository directory structure over to a top-level "puppet" repo, leaving links in place for current PC1 users to avoid breaking them.
- publish and promote the plan (probably including re-visiting that blog post above and making a new one to advertise what's happening), including instructions on how to avoid incompatible updates if you don't want them, and updating https://docs.puppet.com/puppet/latest/reference/puppet_collections.html#puppet-collection-contents
- continue publishing any and all open-source releases to the "puppet" repo, including major-version releases.

The patching/update policy will remain as it is today, where only the latest series receives patches. For instance, once Puppet 4.9.0 is out, there will be no more 4.8.x releases. The package repositories which contain Long Term Support Puppet Enterprise point releases will continue to be private, but the branches/tags of the components that comprise these point releases will remain public, so people could rebuild them if they wanted to.

Speaking of community upstream, we want to enable builds of Puppet that behave reliably, stay current with our bugfixes and release cadence, and run on OSes that Puppet Inc. doesn't commercially support. We've been working to enable outside folks to rebuild and distribute our software and are going to continue to focus energy on this. As a few examples, we are:
- working to get Puppet 4.x and Facter 3 built as standalone packages for Solaris
- investigating the OS-native build toolchain for OSes with current compilers like Ubuntu Yakkety and Fedora 25 (to avoid having to rebuild the world to get the C++ packages built)
- making facter-3 installable via gem for testing and distro packaging (FACT-1523)
- working on including the Docker-ized Puppet Server stack in CI so new versions are automatically built and uploaded to Docker Hub along with traditional packages.

I'd love to hear your feedback (just reply on this thread) on the proposal overall and additional steps that would make your lives easier (with respect to packaging and repos, that is). Although the next major versions won't be out for a few more months, we're looking to make the infrastructure and policy changes before the end of the year, so please chime in.

--eric0

Rob Nelson

Dec 5, 2016, 4:39:56 PM
to puppe...@googlegroups.com
Eric, what IS the rough outline on how to avoid incompatible updates with a consolidated repo? I'm particularly interested in how it would work with puppetlabs/puppet_agent (since `latest` would suddenly have a much different meaning).
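For reference, pinning through that module today looks something like this (assuming the module's package_version parameter; the version shown is just an example):

```puppet
# Pinning the agent via puppetlabs/puppet_agent rather than relying
# on 'latest'; the version string here is only an example.
class { '::puppet_agent':
  package_version => '1.8.0',
}
```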

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-dev/A53809BB-4B46-4EC1-86BD-3BAD1EC71C2E%40puppet.com.
For more options, visit https://groups.google.com/d/optout.

Trevor Vaughan

Dec 5, 2016, 5:11:57 PM
to puppe...@googlegroups.com
Hi Eric,

Unfortunately, you've now built a distribution (a very minor distribution, but one nonetheless). As such, I would *highly* recommend following the time honored (and massively tedious) method of having something like collections for rolling major release updates and snapshots in time as you roll forward.

This is how we do it and we modeled it off of the CentOS, Ubuntu, Debian, RedHat, Suse, etc... models where a lot of disparate moving parts need to have some level of stability over time.

So, what I would find easiest to deal with would be something like keeping the PC1, PC2, etc.... nomenclature but taking that whole hog.

So:

PC1 -> Latest everything in the PC1 distribution
PC1.0.0 -> Package snapshot at PC1.0.0
PC1.0.1 -> Package snapshot at bugfix release PC1.0.1
PC2 -> Latest everything in the PC2 distribution
PC2.0.0 -> Major breaking change from PC1.X

PCX -> Insane stuff that might break, basically Fedora/Tumbleweed for Puppet

Etc....

I would definitely not recommend making it difficult for users that need to mirror repositories for offline networks. In terms of high compliance environments, this is pretty much all of them.

Thanks,

Trevor

On Mon, Dec 5, 2016 at 4:25 PM, Eric Sorenson <eric.s...@puppet.com> wrote:



--
Trevor Vaughan
Vice President, Onyx Point, Inc

-- This account not approved for unencrypted proprietary information --

Toni Schmidbauer

Dec 6, 2016, 2:26:44 PM
to Eric Sorenson, puppe...@googlegroups.com
Eric Sorenson <eric.s...@puppet.com> writes:
> - working to get Puppet 4.x and Facter 3 built as standalone packages
> for Solaris

This is great news. I'm the current maintainer of the OpenCSW (opencsw.org) Puppet packages; my plan was to look into building Puppet 4 and Facter 3 for OpenCSW at the beginning of next year.

Just to be sure:

- does this mean puppet.com is going to release a package for Solaris (11, I suppose) to the community?

- is there a timeline for releasing such a package?

If this is not the case, would it be possible to get help on building these two packages via this list?

Thanks for your time and help,
toni

Eric Sorenson

Dec 7, 2016, 3:50:55 PM
to puppe...@googlegroups.com
The bare minimum is to not 'ensure => latest' on production systems against a repo you don't maintain. I've seen two patterns to implement this - assuming you are managing puppet components via puppet itself, obviously:

1. use the upstream repos but 'ensure' to known-good versions. Test new upstream releases on canary nodes and roll them out in a controlled deployment
2. use ensure => latest but control the *repos* that hosts are pointed to. I did this at a previous job because there ended up being a number of upstream repositories that we wanted to mirror for bandwidth and availability reasons, and it just became a question of pointing hosts at 'canary' vs 'production' repositories for all packaged software to test upgrades.

Clearly the second one requires a little more setup. I took a swing at documenting the first one on the (drafted but now outdated) collections docs page; I'd welcome any wordsmithing or additional patterns you'd suggest: https://docs.puppet.com/puppet/latest/puppet_collections.html
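The two patterns can be sketched as Puppet code like this (class names, URLs, and version numbers are all made up for illustration, and the two snippets are alternatives, not meant to coexist in one catalog):

```puppet
# Pattern 1: pin to a known-good version instead of 'latest'.
# Roll the number forward deliberately, after canary testing.
package { 'puppet-agent':
  ensure => '1.8.2',
}

# Pattern 2: keep 'ensure => latest' on hosts, but control which repo
# they see. $repo_stage ('canary' or 'production') would come from
# Hiera or a fact; upgrades are tested by flipping canary hosts first.
yumrepo { 'puppet':
  baseurl  => "https://yum.internal.example.com/${repo_stage}/puppet/el/7/x86_64",
  enabled  => 1,
  gpgcheck => 1,
}
```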

--eric0



Eric Sorenson - eric.s...@puppet.com 
director of product, puppet ecosystem

Eric Sorenson

Dec 7, 2016, 4:16:00 PM
to puppe...@googlegroups.com
On Dec 5, 2016, at 2:11 PM, Trevor Vaughan <tvau...@onyxpoint.com> wrote:

Hi Eric,

Unfortunately, you've now built a distribution (a very minor distribution, but one nonetheless). As such, I would *highly* recommend following the time honored (and massively tedious) method of having something like collections for rolling major release updates and snapshots in time as you roll forward.

This is how we do it and we modeled it off of the CentOS, Ubuntu, Debian, RedHat, Suse, etc... models where a lot of disparate moving parts need to have some level of stability over time.

But you control which agents are pointed at a given repo, right? That is a fundamentally different design constraint: we want nearly everyone to start out from the same place, and to carry as many people forward with new releases as possible without them having to revisit their repository setup.

So, what I would find easiest to deal with would be something like keeping the PC1, PC2, etc.... nomenclature but taking that whole hog.

So:

PC1 -> Latest everything in the PC1 distribution
PC1.0.0 -> Package snapshot at PC1.0.0
PC1.0.1 -> Package snapshot at bugfix release PC1.0.1
PC2 -> Latest everything in the PC2 distribution
PC2.0.0 -> Major breaking change from PC1.X


Do you mean making each one of these things an entirely separate repository? With its own release package, directory structure, and package contents for every operating system supported at that point?

PCX -> Insane stuff that might break, basically Fedora/Tumbleweed for Puppet

I kicked around the idea of a smaller version of this for a while, but got stuck on the problem of how to get people using the latest-and-greatest *off* it if they wanted to stabilize onto a numbered release, and conversely what they'd do if a repo they were on was EOL'ed and no longer receiving security updates.


Etc....

I would definitely not recommend making it difficult for users that need to mirror repositories for offline networks. In terms of high compliance environments, this is pretty much all of them.

Is there something in the proposal that makes mirroring harder or easier than it is today?

Trevor Vaughan

Dec 8, 2016, 10:37:26 AM
to puppe...@googlegroups.com
On Wed, Dec 7, 2016 at 4:15 PM, Eric Sorenson <eric.s...@puppet.com> wrote:

On Dec 5, 2016, at 2:11 PM, Trevor Vaughan <tvau...@onyxpoint.com> wrote:

Hi Eric,

Unfortunately, you've now built a distribution (a very minor distribution, but one nonetheless). As such, I would *highly* recommend following the time honored (and massively tedious) method of having something like collections for rolling major release updates and snapshots in time as you roll forward.

This is how we do it and we modeled it off of the CentOS, Ubuntu, Debian, RedHat, Suse, etc... models where a lot of disparate moving parts need to have some level of stability over time.

But you control which agents are pointed at a given repo, right? That is kind of a fundamentally different design constraint: we want nearly everyone to start out from the same place and to carry as many people forward with new releases without them having to revisit their repository setup.

Yes, but nowhere I've been or worked would allow an upstream vendor to dictate the update schedule. Every update had to be mirrored, approved, tested, and scheduled.
 

So, what I would find easiest to deal with would be something like keeping the PC1, PC2, etc.... nomenclature but taking that whole hog.

So:

PC1 -> Latest everything in the PC1 distribution
PC1.0.0 -> Package snapshot at PC1.0.0
PC1.0.1 -> Package snapshot at bugfix release PC1.0.1
PC2 -> Latest everything in the PC2 distribution
PC2.0.0 -> Major breaking change from PC1.X


Do you mean making each one of these things an entirely separate repository? With its own release package, directory structure, and package contents for every operating system supported at that point?

Yes, this would mirror most projects that I've used and mirrors the expectation that was met by the PackageCloud team.

If you're self hosting, it's usually just symlink management.
 

PCX -> Insane stuff that might break, basically Fedora/Tumbleweed for Puppet

I kicked around the idea of a smaller version of this for a while, but got stuck on the problem of how to get people using the latest-and-greatest *off* it if they wanted to stabilize onto a numbered release, and conversely what they'd do if a repo they were on was EOL'ed and no longer receiving security updates.

The same as Fedora/Tumbleweed/etc... Front page announcements and mailing list posts that you're going to have a bad time if you stay on release X.

A lot of us patch our code from the repos on the fly because we need new features and/or want to test things. Being able to just point our test system at a repo and let it fly would be nice.
 


Etc....

I would definitely not recommend making it difficult for users that need to mirror repositories for offline networks. In terms of high compliance environments, this is pretty much all of them.

Is there something in the proposal that makes mirroring harder or easier than it is today?


Maybe I'm reading it wrong, but what I need is a guarantee that I'm downloading the 'collection formerly known as PC1.0.0', which is difficult right now, so I have a bunch of janky scripts to cobble it together.

If I had snapshot repos, people who were approved to use PC1.0.0 would have a known, stable place to start from.

People who could upgrade to PC1.1.0 could point at that repository, etc...

Alternatively, for YUM systems at least, you could put together a groups file that coalesces everything in a given version. That makes it more difficult for non-yum sync tools to deal with, though.
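With snapshot repos, the approved baseline would just be an ordinary repo definition against a frozen path, something like (the URL and naming below are made up):

```puppet
# Hypothetical frozen snapshot repo: every mirror sync of this URL
# returns the identical package set approved for PC1.0.0, with no
# script glue needed to reconstruct it.
yumrepo { 'puppet-pc1.0.0':
  baseurl  => 'https://yum.example.com/PC1.0.0/el/7/x86_64',
  enabled  => 1,
  gpgcheck => 1,
}
```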
 

