Jira (PUP-8710) Smarter cached catalogs

Smarter cached catalogs

Change By:	Branan Riley
Component/s:	puppet-runtime
Key:	PUP-8710 (was PA-2022)
Project:	Puppet Agent

This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)

Henrik Lindberg (JIRA)

May 11, 2018, 6:44:02 AM

to puppe...@googlegroups.com

Henrik Lindberg commented on

With the addition of Deferred, it is possible to add a call to a function that runs on the agent before the catalog is applied (that is, each time the catalog is applied). What is then needed is a function that invalidates the cached catalog and starts over.

More is needed to be able to write some logic (such as hysteresis around 80% free disk), which should probably be implemented as a function that can be run on the agent.

# in the catalog (both function names are hypothetical; Deferred takes its arguments as an array)

Deferred('request_new_catalog', [Deferred('is_disk_almost_full')])

Not sure where that should go, though. It needs to be in a resource attribute, so we need a resource representing the catalog itself. That in turn means we have the opportunity to implement everything as a type/provider instead of using a Deferred value.
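The hysteresis mentioned above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: the method names, the two watermarks, and the choice to measure percent used (rather than percent free) are hypothetical and not part of any existing Puppet API.

```ruby
# Hypothetical agent-side check with hysteresis; both watermarks are assumptions.
HIGH_WATERMARK = 80.0 # percent used at which the "almost full" state is entered
LOW_WATERMARK  = 70.0 # percent used below which the state is left again

# Returns the new "almost full" state given the current usage and the previous state.
def disk_almost_full?(used_percent, previously_full)
  if previously_full
    # Stay in the "full" state until usage drops to the low watermark or below.
    used_percent > LOW_WATERMARK
  else
    # Enter the "full" state only once the high watermark is reached.
    used_percent >= HIGH_WATERMARK
  end
end

# A fresh catalog is only worth requesting when the state actually changes.
def request_new_catalog?(used_percent, previously_full)
  disk_almost_full?(used_percent, previously_full) != previously_full
end
```

With two watermarks, a disk hovering around 80% does not cause a new catalog request on every run; a request happens only when the state flips.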

Eric Sorenson (JIRA)

May 14, 2018, 4:58:04 PM

to puppe...@googlegroups.com

Eric Sorenson commented on

This doesn't feel like a Deferred problem to me, and I don't think it ought to be added on the agent directly. It's more related to the concept of event-driven infrastructure: the input event is async facts submission (now possible with PUP-7779) on facts that the administrator cares about; the condition is the threshold management (subject to hysteresis, as Henrik notes; hold-down is a key property of event-response frameworks); and the action is to trigger a fresh Puppet run that invalidates the local cache.
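As a rough illustration of the event / condition / action split with a hold-down, here is a minimal Ruby sketch. The class name, the fact key, and the return symbols are all hypothetical; a real event-response system would live outside the agent and look quite different.

```ruby
# Hypothetical rule; names, threshold, and fact key are illustrative only.
class HoldDownRule
  def initialize(fact:, threshold:, hold_down_seconds:)
    @fact = fact
    @threshold = threshold
    @hold_down_seconds = hold_down_seconds
    @last_fired_at = nil
  end

  # Event: a facts hash submitted at time `now` (a Time or numeric seconds).
  # Returns :refresh_catalog when the action should fire, otherwise :none.
  def handle(facts, now)
    value = facts[@fact]
    # Condition: the fact must be present and at or above the threshold.
    return :none if value.nil? || value < @threshold
    # Hold-down: suppress repeat actions until the window has elapsed.
    return :none if @last_fired_at && (now - @last_fired_at) < @hold_down_seconds

    @last_fired_at = now
    :refresh_catalog
  end
end
```

The hold-down window keeps a node that stays above the threshold from requesting a fresh run on every facts submission.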

Craig Gomes (JIRA)

May 29, 2018, 12:50:12 PM

to puppe...@googlegroups.com

Craig Gomes updated an issue

Change By:	Craig Gomes
Team:	Coremunity

Neil Binney (JIRA)

Jun 27, 2018, 3:20:03 AM

to puppe...@googlegroups.com

Neil Binney updated an issue

Change By:	Neil Binney
CS Priority:	Normal
CS Impact:	There is a workaround in which facts can be uploaded, but it requires some configuration to get working. With the agent running from cron, gather facts locally on the agent and compare them against a threshold. If the threshold has not been breached, the agent can continue to run against the cached catalog. Increasing the number of compile masters in the setup would be recommended to handle additional load, rather than relying on cached catalogs. It is a good idea but would require a considerable update to the Puppet Agent. Further use cases will be needed to make the business case for resourcing this feature. The number of large-scale customers using cached catalogs is quite low.
CS Severity:	2 - Annoyance
CS Business Value:	2 - $$$
CS Frequency:	1 - 1-5% of Customers
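The cron workaround described under CS Impact could look roughly like the Ruby sketch below. The fact name and the threshold are assumptions made for illustration; `--onetime`, `--no-daemonize`, and the `use_cached_catalog` setting are existing `puppet agent` options, but how the facts hash is gathered (for example by parsing `facter --json`) is left to the wrapper.

```ruby
# Hypothetical cron wrapper logic; the fact name and threshold are assumptions.
DISK_USED_THRESHOLD = 80.0

# Decide whether the locally gathered facts breach the threshold.
def threshold_breached?(facts)
  facts.fetch('disk_used_percent', 0.0) >= DISK_USED_THRESHOLD
end

# Build the agent command line: stay on the cached catalog unless breached.
def agent_command(facts)
  cache_flag = threshold_breached?(facts) ? '--no-use_cached_catalog' : '--use_cached_catalog'
  ['puppet', 'agent', '--onetime', '--no-daemonize', cache_flag]
end
```

A wrapper script would then pass the result to something like `system(*agent_command(facts))`.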

Josh Cooper (Jira)

Apr 1, 2020, 2:46:04 AM

to puppe...@googlegroups.com

Josh Cooper commented on

It would also be really nice if there was at least an ability for nodes in cached catalog to pluginsync new facts.

Pluginsync creates/updates/deletes all facts/types/providers from all modules in that environment. Doing a pluginsync, but not updating the catalog, can break puppet the next time it tries to apply your cached catalog. So I'd recommend against doing that.

I also agree with what Eric Sorenson said, that it ought not be added to the agent directly. I'm going to close this as such.

William Rodriguez (Jira)

Apr 1, 2020, 8:53:03 PM

to puppe...@googlegroups.com

William Rodriguez commented on

OK, so yeah, I agree, pluginsync doesn't make sense. You'd lose half the stuff that's making the cached catalog work in the first place. That's fine. But, as the large customer in question, I don't agree that this wouldn't be a valuable feature. Yes, thanks to PUP-7779 we are now able to get fresh facts on all of our nodes in cached catalog. However, if we wanted to act on those facts in some capacity, we'd have to feed the facts to some sort of event management system, then write rules there to trigger a call back to the agent to replace the catalog. Now don't get me wrong: there are easier ways too. I could make an exec that runs a facter command every Puppet run, uses some sketchy bash logic, and then flips the value of use_cached_catalog to false for the next run, but that's not at all performant and is, of course, a total hack.

The problem for us is that with the amount of change control nightmares and regulatory scrutiny that we have, we can't avoid running our fleet in cached catalog. And when you run a fleet of tens of thousands in cached catalog, you quickly find yourself quite limited. Being able to leverage some sort of control over those nodes running cached catalogs would make our lives so much easier, and give us back at least some of the flexibility we lost when we made the move to managing most of our fleet in cached_catalog.

Another thing this could potentially enable is things like phased deployments. If we could set rules in facts that determine when a system is supposed to get a new catalog, which some of our teams are currently doing for their own servers via other hacks, we could do things like planned catalog refreshes, instead of our nodes sitting on an old catalog until the end of time because they got missed in the last rollout. This is probably the biggest use-case for us at the moment, but with the ability for puppet to respond to facts returned to us, I'm sure we'd find others very quickly.
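To make the planned-refresh idea concrete, here is a minimal Ruby sketch of a rule driven by a fact. The fact name `catalog_refresh_after` and its ISO 8601 format are hypothetical, chosen only for illustration; nothing like this exists in Puppet today.

```ruby
require 'time'

# Hypothetical fact name; its value is an ISO 8601 timestamp after which
# the node should stop using its cached catalog and fetch a new one.
REFRESH_FACT = 'catalog_refresh_after'

# Returns true when the node's refresh time has been reached.
def refresh_due?(facts, now)
  raw = facts[REFRESH_FACT]
  return false if raw.nil?

  now >= Time.iso8601(raw)
rescue ArgumentError
  false # an unparseable timestamp never triggers a refresh
end
```

Staggering the timestamp across groups of nodes would give a phased rollout, and a node missed in one rollout would still pick up a new catalog once its time passes.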

In short, can we keep this around?