Force Fact within manifest


Paul Oyston

Oct 7, 2013, 6:10:25 AM
to puppet...@googlegroups.com
I have a requirement where I want a Fact to be stored in PuppetDB during the manifest run and not during the initial fact gathering phase.

I know I can, in my manifests, create a file in /etc/facter/facts.d, or I can write a Ruby fact that is distributed by pluginsync. But both of those methods only publish the fact during the initial fact-gathering phase of an agent run. What I want is to set a fact during the manifest portion of the run and have it stored in PuppetDB without an additional puppet run.
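(For concreteness, the two methods I mean look roughly like this; the key path below is just an assumption for illustration:)

# External fact: a key=value text file, e.g. /etc/facter/facts.d/ssh_root_key.txt:
#   ssh_root_key=ssh-rsa AAAAB3Nza...

# Custom fact distributed by pluginsync, e.g. <module>/lib/facter/ssh_root_key.rb:
Facter.add(:ssh_root_key) do
  setcode do
    keyfile = '/root/.ssh/id_rsa.pub'   # assumed location
    File.read(keyfile).strip if File.exist?(keyfile)
  end
end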

For example, I want to use it to create an SSH key for the system and then immediately publish that as an ssh_root_key fact. Given the way it works at the moment, the fact would only be gathered during the initial phase of the run, so two runs would be needed before it was available for other servers to pick up using puppetdbquery.

Does anyone know if this can be forced in some way?

Wolf Noble

Oct 7, 2013, 10:36:01 AM
to puppet...@googlegroups.com
Hi Paul,

Here's a diagram showing how the puppet run process flows:

As you can see, Facter is run exactly once, before the catalog is created.

Facter is not invoked again until the next run. I suppose you could have your SSH key resource notify the puppet service, which would subsequently trigger another run.
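Something like this sketch, assuming the key is generated by an exec and the agent runs as the "puppet" service (resource names here are illustrative):

# sketch: generating the key restarts the agent, which kicks off another run
exec { 'generate-root-ssh-key':
  command => '/usr/bin/ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa',
  creates => '/root/.ssh/id_rsa',
  notify  => Service['puppet'],
}

service { 'puppet':
  ensure => running,
  enable => true,
}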

Paul Oyston

Oct 8, 2013, 5:31:17 AM
to puppet...@googlegroups.com
Hi Wolf,

Thanks for that diagram, that's incredibly helpful.

It seems a bit of an oversight not to allow facts to be updated during the manifest phase, since manifests make changes to the system and therefore potentially modify fact values as they run. I might look into modifying the agent run so that facts can be updated during the manifest phase, using a custom function of some description.

I was looking into calling an exec to run "puppet facts upload", but for some reason it errors out during the puppet run, even after auth.conf has been modified to allow the save call from nodes. It does work directly on the command line, though, which I assume comes down to a lock or an environment variable that differs during the agent run.
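(For reference, the exec in question was nothing more exotic than this sketch:)

exec { 'upload-facts':
  command   => '/usr/bin/puppet facts upload',
  logoutput => true,
}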

Paul Oyston

Oct 8, 2013, 10:19:54 AM
to puppet...@googlegroups.com
For anyone else wanting to do something similar:

For now I've just set postrun_command on the puppet agents so that the facts are uploaded to the server once modifications have been made.

e.g. in puppet.conf on the agent:

[agent]
postrun_command = puppet facts upload

This re-uploads the facts once the puppet agent has run, giving anything querying PuppetDB the most up-to-date information. I really cannot understand for the life of me why this isn't the default behaviour; it seems ludicrous that a tool designed to report facts back to the master would hold out-of-date information once puppet has modified the system.

jcbollinger

Oct 8, 2013, 3:27:34 PM
to puppet...@googlegroups.com


On Tuesday, October 8, 2013 4:31:17 AM UTC-5, Paul Oyston wrote:
Hi Wolf,

Thanks for that diagram, that's incredibly helpful.

It seems a bit of an oversight not to allow facts to be updated during the manifest phase, since manifests make changes to the system and therefore potentially modify fact values as they run.


You need to better understand the operational model.  Puppet evaluates the manifests provided to it, in light of a target node identity and a set of facts, to compute, all in advance, the details of the desired state of the target node.  The result is packaged in a compiled form -- a catalog -- and handed off to the client-side Puppet runtime for those computed state details to be ensured.

Puppet uses fact values only during catalog compilation, before it changes anything about the target node.  They can be interpolated into resource properties, such as file names or content, but their identity is thereby lost.  Changing or setting fact values in PuppetDB during catalog application will not have any effect on the catalog application process, except that in principle you could apply Exec resources that perform PuppetDB queries and do something dependent on the result.
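For example (a sketch only; the PuppetDB host, port, and API version are assumptions for illustration):

# query PuppetDB for a fact at apply time and act on the result
exec { 'fetch-peer-keys':
  command => '/usr/bin/curl -sf http://puppetdb.example.com:8080/v3/facts/ssh_root_key -o /tmp/peer_keys.json',
}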

 
I might look into modifying the agent run so that facts can be updated during the manifest phase, using a custom function of some description.


Puppet functions run during catalog compilation, not during catalog application, so that particular approach wouldn't work.  Even if you could make it work, you could not thereby modify the catalog during its application.

These are not the hooks you're looking for.
 

John

Paul Oyston

Oct 8, 2013, 7:08:56 PM
to puppet...@googlegroups.com
I really only want an up-to-date set of facts once the puppet agent has finished making changes to the system; I'm not trying to modify the catalog run process by modifying facter values during the run. I'm aware that the facts are evaluated and the manifests compiled at the beginning of the agent run.

It's simply a matter of PuppetDB not containing the most up-to-date facts once that catalog has been completed.

In the simplest possible example, you might push a fact using pluginsync that checks whether mongo is installed. At the start of the agent run the fact is pulled in and evaluates to false. The manifests then go away and make changes to that node, installing mongo. When the agent finishes, running the fact on the system would return true, but the puppet master still holds false, because the fact was only evaluated at the start. You then can't rely on the PuppetDB information until the agent runs again, which could be an hour after the first run, with a further delay before other nodes pick up the new fact value.
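(Such a fact is only a few lines of Ruby; the mongod path is an assumption:)

# pluginsync'd custom fact: is mongo installed on this node?
Facter.add(:mongo_installed) do
  setcode do
    File.exist?('/usr/bin/mongod')
  end
end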

What you described about using Exec is exactly what I want to do (in actuality I'm querying PuppetDB using puppetdbquery): have nodes perform actions based on the facts about other nodes stored in PuppetDB.

As I say though, I've already got a workaround: a postrun script that updates the facts using "puppet facts upload". I just need some additional checking to see whether there actually were changes in the run, and only update the facts if there were.
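(The check I have in mind is a small postrun script that reads the agent's last_run_summary.yaml before re-uploading; a sketch, with the state file path assumed for an open source 3.x agent:)

#!/usr/bin/env ruby
# only re-upload facts if the last run actually changed something
require 'yaml'

summary = YAML.load_file('/var/lib/puppet/state/last_run_summary.yaml')
changed = summary.fetch('resources', {}).fetch('changed', 0).to_i

system('puppet facts upload') if changed > 0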

jcbollinger

Oct 9, 2013, 9:59:06 AM
to puppet...@googlegroups.com


On Tuesday, October 8, 2013 6:08:56 PM UTC-5, Paul Oyston wrote:
I really only want an up-to-date set of facts once the puppet agent has finished making changes to the system; I'm not trying to modify the catalog run process by modifying facter values during the run. I'm aware that the facts are evaluated and the manifests compiled at the beginning of the agent run.

It's simply a matter of PuppetDB not containing the most up-to-date facts once that catalog has been completed.

In the simplest possible example, you might push a fact using pluginsync that checks whether mongo is installed. At the start of the agent run the fact is pulled in and evaluates to false. The manifests then go away and make changes to that node, installing mongo. When the agent finishes, running the fact on the system would return true, but the puppet master still holds false, because the fact was only evaluated at the start. You then can't rely on the PuppetDB information until the agent runs again, which could be an hour after the first run, with a further delay before other nodes pick up the new fact value.

What you described about using Exec is exactly what I want to do (in actuality I'm querying PuppetDB using puppetdbquery): have nodes perform actions based on the facts about other nodes stored in PuppetDB.

As I say though, I've already got a workaround: a postrun script that updates the facts using "puppet facts upload". I just need some additional checking to see whether there actually were changes in the run, and only update the facts if there were.



I still say you are trying to put the recorded facts to a use for which they were not intended and are not well suited.  The recorded facts necessarily capture a snapshot of fact values for the target node, and all you're doing is changing the time point for that snapshot.  On most machines, most of the time, node facts will be the same after a catalog run as they were before anyway, except for those that change continuously (e.g. the uptime facts).  And that brings me to my next point: to the extent that there is a risk of fact values changing during a Puppet run, you cannot rely on them remaining constant between Puppet runs, either.

The normal interpretation of the node facts stored in PuppetDB is "these are the facts that informed compilation of the node's latest catalog".  What you are describing changes that meaning, which presents a moderate risk given that PuppetDB is primarily an internal service for Puppet.  Among other things, you may cause the master occasionally to serve up a stored catalog when it really needed to compile a fresh one.


John

Jeff Sparrow

Feb 29, 2016, 2:24:44 PM
to Puppet Users
I'm going to second this guy.  I believe puppet should ALWAYS collect facts after it is done.  The amount of changes it can make to a server during a catalog run can be massive.  Without reporting those changes, bad things can occur.  It really limits the potential of having a complete and correct inventory system.

As Paul has done, I too make the last thing to run on all my catalogs another upload of puppet facts.  Now I know what changes have been made.  There are many other people out there who have requested this, so we can't be the only ones.

On the most basic level, I don't see any possible way to argue for the current method.

Current method:
Collect facts -> change system config -> Don't collect those changes

Our method:
Collect facts -> change system config -> recollect changed facts


jcbollinger

Mar 1, 2016, 11:16:22 AM
to Puppet Users


On Monday, February 29, 2016 at 1:24:44 PM UTC-6, JS wrote:
I'm going to second this guy.


Which guy?  The OP?

 
 I believe puppet should ALWAYS collect facts after it is done.


That's not what anyone else in this necro'd thread said back in its previous life, so exactly what and who are you agreeing with?

 
 The amount of changes it can make to a server during a catalog run can be massive.


Yes, they can, but normally they are zilch (nada, bumpkus, zero).  Moreover, even when massive changes are applied, they rarely change the values of any of the standard facts.  Of course, all bets are off with custom facts.

 
 Without reporting those changes, bad things can occur.


What? Why?  How?

 
 It really limits the potential of having a complete and correct inventory system.


Well that's not what Puppet is, or ever was intended to be.

 

As Paul has done, I too make the last thing to run on all my catalogs another upload of puppet facts.  Now I know what changes have been made.


Communicating what changes have been made is the purpose of Puppet's reports.

I suppose, though, that you mean you know nodes' current state, inasmuch as node state is reflected by the node facts recorded in PuppetDB.  As I observed in my previous response to this thread, however, node facts only ever give you a snapshot of node state, and Puppet is by no means the only thing that might change that state.  Even if you upload fresh facts at the end of each Puppet run, it would be a potentially dangerous mistake to rely on PuppetDB to hold an accurate representation of node state between Puppet runs.  Furthermore, it is difficult to determine at the time you query PuppetDB whether the node in question actually is between runs, as for each node there will be from seconds to minutes out of each catalog run interval in which a catalog run is in progress.

 
 There are many other people out there who have requested this, so we can't be the only ones.


I'm not so sure that yours is a commonly requested feature.  It is far more common to hope that Puppet will recognize changes to fact values during a run, so as to affect what it will apply later in that run.  I don't see that particular behavior being implemented any time in the foreseeable future.  Your request is at least more feasible.

 

On the most basic level, I don't see any possible way to argue for the current method.



Is that a solicitation?

 
Current method:
Collect facts -> change system config -> Don't collect those changes

Our method:
Collect facts -> change system config -> recollect changed facts



That characterization seems calculated to support your position, based, moreover, on the faulty premise that node facts as recorded in PuppetDB can or should be useful out-of-band as indicators of current node state.

Collecting and recording extra node facts puts a little additional load on every machine, and a moderate amount of additional load on PuppetDB.  This is a scalability consideration.

In the standard configuration, a node's current facts describe the basis on which its most recent catalog was built.  This could, in principle, contribute to evaluating whether cached node catalogs are stale.  Uploading facts after a run makes such a determination unreliable, thereby foreclosing an opportunity to reduce the work of the most stressed part of the Puppet system.  For similar reasons, it also makes troubleshooting harder.

Puppet is a configuration management system, not an inventory system.  To the extent that it can also serve incidentally as a poor man's inventory system, that's great, but not of much import to me.  As far as I am concerned, Puppet is better suited to working alongside or even under an inventory system than it is to working as an inventory system.

You are in luck, however: Puppet's source is open, and PuppetLabs (sometimes) accepts community contributions.  You are free to make your proposed changes yourself, and to submit them to PL for consideration.  You would want to do so in the context of a feature-request ticket.


John

JS

Mar 1, 2016, 12:09:04 PM
to Puppet Users
Which guy?  The OP?
Yep, OP, per the rest of the post.

That's not what anyone else in this necro'd thread said back in its previous life, so exactly what and who are you agreeing with?
Yes, it was actually this necro'd thread's main point of emphasis, including the subject, introduction, middle, and the OP's final answer.  Even the word "requirement" was used.

Yes, they can, but normally they are zilch (nada, bumpkus, zero).  Moreover, even when massive changes are applied, they rarely change the values of any of the standard facts.  Of course, all bets are off with custom facts.
"Normally" and Puppet don't really go hand in hand, given how widely customizable it is, especially when custom facts come into the equation.

Well that's not what Puppet is, or ever was intended to be.
Many products start out not intending to be what they become. 
It may not be the purpose of Puppet, but Puppet uses Facter, which does report facts, so the two are basically bonded to each other.  If something becomes what it wasn't intended to be, should the designer and creator keep telling users they are using it wrong?  It seems to me a lot of things have failed that way.

it would be a potentially dangerous mistake to rely on PuppetDB to hold an accurate representation of node state between Puppet runs.  Furthermore, it is difficult to determine at the time you query PuppetDB whether the node in question actually is between runs, as for each node there will be from seconds to minutes out of each catalog run interval in which a catalog run is in progress.
Querying isn't an issue with MCollective, and puppet isn't going to run while a statelock is in place.  I even have a custom fact that records when the facts were gathered, so I have exact timestamps.
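(That timestamp fact is trivial; something like:)

# custom fact recording when this fact set was generated
require 'time'

Facter.add(:facts_gathered_at) do
  setcode { Time.now.utc.iso8601 }
end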

I'm not so sure that yours is a commonly requested feature. 
The word "common" means something different to each individual.  However, I have had 3 customers request this feature, which led to searches, which turned up quite a few similar requests from others over the years, just as the OP has made here.
I wouldn't say it's common to "hope" puppet recognizes fact values during a run; I would almost say that is expected.

I think the best argument I've read against our wishes is that facts should only be viewed as what they were prior to a catalog run.  I guess that makes sense.  However, since they CAN be and ARE used as a reporting method or "inventory", there should be some way of seeing what they have changed to.
The other side of this debate, though, is users who want to see what the facts are BEFORE a puppet run.  Currently reported facts only show what they were before the previous run, which is also not an "accurate representation".

Puppet is a configuration management system, not an inventory system.  To the extent that it can also serve incidentally as a poor man's inventory system, that's great, but not of much import to me.  As far as I am concerned, Puppet is better suited to working alongside or even under an inventory system than it is to working as an inventory system
I suppose most of what I said could use substituting the word "Puppet" with "Facter".  I do agree that Puppet doesn't need to, and probably shouldn't always, grab the changed facts at the end of the run.  However, Facter itself is widely used as a reporting or inventory system (and is even marketed by puppetlabs as such).  So I agree with what you say in regards to Puppet; however, they are two separate systems that work together.  I think most people just want to see Facter expand on the whole gathering of inventory: there's no need to pull in another inventory management system when Facter can do it.  Facter and Puppet let you get new facts at the end of a run, but don't provide any native way of doing so.  I think that is the main point people requesting this are trying to make.

You are in luck, however: Puppet's source is open
Yep, and that's what makes it such an amazing tool with a great community, and it lets users like myself, the OP, and others who want this kind of reporting feature actually build it.

All in all, I truly understand what you're saying, and I even agree when it comes to Puppet.  I also believe all things can be made better, though, and giving users the ability to query changed or custom facts after a puppet run (or even without Puppet at all, just via Facter) seems like something that could be improved.

Right now (especially since postrun_command is broken on Windows) I run a quick PowerShell script in its own new session that calls "puppet facts upload", or a nohup in the background (on *nix), after every puppet run.  This keeps it in its own environment, outside of puppet's ruby process and env vars.  I then use an ENC script to pull in the latest facts and push them to reporting.  If I need to see the facts prior to Puppet's run, I can view those too, as I store them on the master as well.  Best of both worlds: the latest correct facts, and the latest prior to puppet.  Or, if need be, we can even go back 2 weeks and see every fact for each time puppet was run.
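(On the *nix side that amounts to little more than this sketch, fired off as the last step of each agent run:)

# detach the upload from the agent's own process and environment
nohup puppet facts upload >/dev/null 2>&1 &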

Thanks for your input and its always good to hear others reasonings and opinions.




Kylo Ginsberg

Mar 2, 2016, 3:03:57 PM
to puppet...@googlegroups.com
FYI this same suggestion came up recently in https://tickets.puppetlabs.com/browse/PUP-5934 (and I've linked this thread there). 

This strikes me as an ecosystem concern: i.e. I'm not convinced offhand we'd want to address the base use cases here by submitting facts with the report, because I wouldn't want to lose the current fact-set-to-catalog linkage in puppetdb. So I think we'd want to consider how this would be supported in puppetdb if/as we consider something like this.

Also, in the spirit of linking related conversations: this idea may also dovetail with a current thread on puppet-dev: https://groups.google.com/d/msg/puppet-dev/bebmBUyRETg/v0VFTogWCgAJ. Specifically, one idea there is to specify fact TTLs, which might in turn impact fact submission times, fact storage schema, etc.

Kylo
 







--
Kylo Ginsberg | ky...@puppetlabs.com | irc: kylo | twitter: @kylog

JS

Mar 7, 2016, 12:55:11 PM
to Puppet Users
I like the idea.  Personally I've always wished to just see facts in the form of:

pre_puppet => post_puppet

You now have a record of what it was before going into puppet as well as a record of what it is when it comes out.  That's about the best accuracy one can hope for.

Trevor Vaughan

Mar 7, 2016, 1:11:28 PM
to puppet...@googlegroups.com
Personally, I've never come across a case where I needed a post-mortem fact collection.  But I do tend to write custom types when I need to get things into the catalog, so that may just be my particular way of doing things showing through.

While this could be useful in some cases for reporting, I would very much like to have the (thread-related) ability to add one-off values to the fact DB from either the master or the client.

Outside of that, a simple method for extending the PuppetDB schema to store items of interest would be most welcome.

Trevor





--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699

-- This account not approved for unencrypted proprietary information --