Puppet 'node data' when using common node_names?


Matt W

Aug 14, 2014, 1:39:16 PM
to puppet...@googlegroups.com
We noticed that the puppet reports and puppet node data stored on our puppet servers are always written out under the 'node name'. So when we use a node name like 'prod_web' across many webserver machines, we get a tree of reports and node data like this:

/var/lib/puppet/yaml/node/prod_web.yaml
/var/lib/puppet/yaml/facts/prod_web.yaml
/var/lib/puppet/reports/prod_web
/var/lib/puppet/reports/prod_web/201408130200.yaml
/var/lib/puppet/reports/prod_web/201408140811.yaml
/var/lib/puppet/reports/prod_web/201408121328.yaml
/var/lib/puppet/reports/prod_web/201408130743.yaml
/var/lib/puppet/reports/prod_web/201408140454.yaml

Each of those reports likely reflects a compilation run for a different host, and the facts/node files at the top are constantly re-written as new clients check in.

Is there a way to change this behavior so that the data is written out based on the host's ${::fqdn} (or certname) rather than its node name?

(our client puppet configs ...)
[main]
...
    node_name = facter
    node_name_fact = puppet_node

(a client puppet fact file...)
puppet_node=prod_web
puppet_environment=production
package=frontend=some-version-here
app_group=us1

Matt W

Aug 22, 2014, 10:37:31 AM
to puppet...@googlegroups.com
Anyone have any thoughts on this?

Wil Cooley

Aug 22, 2014, 10:11:57 PM
to puppet-users group


On Aug 22, 2014 7:37 AM, "Matt W" <ma...@nextdoor.com> wrote:
>
> Anyone have any thoughts on this?
>

I have to say, using an identical node name as a way of assigning the node's role is an "interesting" approach. I would not be surprised if you run into other difficulties with it, some even harder to track down. Even appending a unique identifier (host ID, MAC address, serial number, a SHA1 hash, etc.) would have been better.

Be that as it may, life would be dull if we didn't have to live with the sins of the past. You might check the configuration reference at https://docs.puppetlabs.com/references/3.6.latest/configuration.html but, thinking about it, if you found a relevant setting and tried to use a fact in it, you'd probably just get the master's fact.

The reports, at least, should be easy: since they're pluggable, you could copy the existing "lib/puppet/reports/store.rb" to a new name and module and tweak the storage location.
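
Something along these lines, perhaps (untested, and the processor name plus the directory-naming logic are just placeholders for whatever you actually want):

    # <modulepath>/<yourmodule>/lib/puppet/reports/store_by_host.rb
    require 'puppet'
    require 'fileutils'

    Puppet::Reports.register_report(:store_by_host) do
      desc "Store reports on disk, in a directory derived from the report's host field."

      def process
        # 'host' is whatever name the agent reported as; sanitize it before
        # using it as a path component, and swap in your own naming scheme here.
        dirname = host.gsub(/[^-\w.]/, '_')
        dir = File.join(Puppet[:reportdir], dirname)
        FileUtils.mkdir_p(dir)

        file = File.join(dir, "#{Time.now.strftime('%Y%m%d%H%M')}.yaml")
        File.open(file, 'w', 0640) { |f| f.write(to_yaml) }
      end
    end

Then add it to the 'reports' setting on the master (e.g. reports = store_by_host) so it actually gets used.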

Wil


Matt Wise

Aug 23, 2014, 1:46:59 PM
to puppet...@googlegroups.com
Wil,
  Thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar, i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix. For example, I can quickly boot up a "prod-mwise-dev-test-web-server-thingy" using the same node definition as our "prod-frontend-host" for some testing, without worrying about the hostname regex structure.

  Anyway, that said... what I'm really interested in knowing is why the puppet agents are pulling DOWN their "node information" from the puppet masters. Is it possible that they upload their node information, then ask for that information back, and then somehow use the downloaded information for their catalog request? I could see some interesting race conditions if that were the case.

Matt Wise
Sr. Systems Architect
Nextdoor.com



jcbollinger

Aug 25, 2014, 9:55:14 AM
to puppet...@googlegroups.com


On Saturday, August 23, 2014 12:46:59 PM UTC-5, Matt W wrote:
Wil,
  Thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar


And therein is one of the key problems: "similar", not "identical".  If any node facts (including $hostname, $fqdn, etc.) vary among these hosts that are identifying themselves to the master as the same machine, then you are putting yourself at risk for problems.  Moreover, if security around your puppet catalogs is a concern for you, then be aware that positioning your node-type certificates as a shared resource makes it far more likely that they will be breached.  Additionally, you cannot limit which machines can get configuration from your master.

Lest it didn't catch your eye as it went by, I re-emphasize that Puppet is built around the idea that a machine's SSL certname is a unique machine identifier within the scope of your certificate authority.  What you are doing can work with Puppet, but you will run into issues such as the file naming effects you asked about.

 
.. i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.


Classifying based on a fact instead of based on host name is a fine idea, provided that you are willing to trust clients to give their type accurately to the server.  Having accepted that risk, however, you do not by any means need the node-type fact to be expressed to the master as the node's identity.  It could as easily be expressed via an ordinary fact.

In particular, your site manifest does not need a separate node block for each node [identity], nor even to enumerate all the known node names.  In fact, it doesn't need any node blocks at all if you are not going to classify based on node identity.  Even if you're using an ENC, it is possible for it to get the node facts to use for classification.
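
As a rough sketch (the fact name 'node_role' and the role classes here are invented; substitute whatever you actually use), your whole site.pp could be as little as:

    node default {
      case $::node_role {
        'prod_web': { include role::web }
        'prod_db':  { include role::db }
        default:    { fail("Unknown node_role '${::node_role}' for ${::clientcert}") }
      }
    }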

 
For example, I can quickly boot up a "prod-mwise-dev-test-web-server-thingy" using the same node definition as our "prod-frontend-host" for some testing, without worrying about the hostname regex structure.


And you could do that, too, with a plain fact.

 

  Anyway, that said... what I'm really interested in knowing is why the puppet agents are pulling DOWN their "node information" from the puppet masters.


Can you say a bit more about that?  What do you see that suggests agents are pulling down "node information" other than their catalogs (and later, any 'source'd files)?


John

Matt Wise

Aug 25, 2014, 12:13:40 PM
to puppet...@googlegroups.com
Comments inline

Matt Wise
Sr. Systems Architect
Nextdoor.com


On Mon, Aug 25, 2014 at 6:55 AM, jcbollinger <John.Bo...@stjude.org> wrote:


On Saturday, August 23, 2014 12:46:59 PM UTC-5, Matt W wrote:
Wil,
  Thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar


And therein is one of the key problems: "similar", not "identical".  If any node facts (including $hostname, $fqdn, etc.) vary among these hosts that are identifying themselves to the master as the same machine, then you are putting yourself at risk for problems.  Moreover, if security around your puppet catalogs is a concern for you, then be aware that positioning your node-type certificates as a shared resource makes it far more likely that they will be breached.  Additionally, you cannot limit which machines can get configuration from your master.

To be very clear, we do not share certs across nodes. We absolutely use independent certs and sign them uniquely -- in fact, bug #7244 was opened by me specifically for improving the security around SSL certs and auto signing. We make heavy use of dynamic CSR facts to securely sign our keys. 

More specifically, we've been waiting for the CSR attribute system to allow us to embed the puppet 'node type' (note: not the identifier) in the SSL certs, so that clients can't possibly retrieve a node type that isn't their own (Bug #7243). It looks like this has finally been implemented, so we'll be looking into using it very soon.
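
Roughly, the plan looks like this (just a sketch: the OID is picked from Puppet's reserved private-extension range, the value is our role name, and on 3.4/3.5 masters it needs trusted_node_data enabled):

    # /etc/puppet/csr_attributes.yaml, written on the agent before its first run
    extension_requests:
      1.3.6.1.4.1.34380.1.2.1: prod_web

    # On the master the extension then shows up in trusted data, and the agent
    # can't change it after the cert is signed:
    $node_role = $trusted['extensions']['1.3.6.1.4.1.34380.1.2.1']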
  

Lest it didn't catch your eye as it went by, I re-emphasize that Puppet is built around the idea that a machine's SSL certname is a unique machine identifier within the scope of your certificate authority.  What you are doing can work with Puppet, but you will run into issues such as the file naming effects you asked about.

 
.. i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.


Classifying based on a fact instead of based on host name is a fine idea, provided that you are willing to trust clients to give their type accurately to the server.  Having accepted that risk, however, you do not by any means need the node-type fact to be expressed to the master as the node's identity.  It could as easily be expressed via an ordinary fact.

In particular, your site manifest does not need a separate node block for each node [identity], nor even to enumerate all the known node names.  In fact, it doesn't need any node blocks at all if you are not going to classify based on node identity.  Even if you're using an ENC, it is possible for it to get the node facts to use for classification.

Combining our nodes self-identifying their type with the puppet node-name architecture allows us to leverage the security of the 'auth' config file, while also having dynamically configured nodes where the hostname doesn't matter. Realistically, hostnames are a terrible basis for security... someone could always break into a 'www' server, rename it to 'prod-db-thingy' so that it matches the regex, and subsequently get the database puppet manifest. (Just as a stupid simple example.)

For what it's worth, our old model was a single 'default' node type and a simple fact ('base_class=my_web_server'). This worked extremely well, but left us open to basically any client being able to request any catalog. The auth file in this world was effectively useless for preventing already-verified nodes from doing bad things.

 
For example, I can quickly boot up a "prod-mwise-dev-test-web-server-thingy" using the same node definition as our "prod-frontend-host" for some testing, without worrying about the hostname regex structure.


And you could do that, too, with a plain fact.

 

  Anyway, that said... what I'm really interested in knowing is why the puppet agents are pulling DOWN their "node information" from the puppet masters.


Can you say a bit more about that?  What do you see that suggests agents are pulling down "node information" other than their catalogs (and later, any 'source'd files)?


With nearly every puppet catalog compile, we also see GET requests like this:

10.216.61.76 - XXX - puppet "GET /production/node/xyz? HTTP/1.1" 200 13733 "-" "-" 0.021
 
Where 10.216.61.76 is *not* the local IP of the puppet master... it's the remote IP of the ELB, which indicates that it's remote traffic from our puppet clients.


John


jcbollinger

Aug 26, 2014, 2:22:46 PM
to puppet...@googlegroups.com


On Monday, August 25, 2014 11:13:40 AM UTC-5, Matt W wrote:
Comments inline

Matt Wise
Sr. Systems Architect
Nextdoor.com


On Mon, Aug 25, 2014 at 6:55 AM, jcbollinger <John.Bo...@stjude.org> wrote:


On Saturday, August 23, 2014 12:46:59 PM UTC-5, Matt W wrote:
Wil,
  Thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar


And therein is one of the key problems: "similar", not "identical".  If any node facts (including $hostname, $fqdn, etc.) vary among these hosts that are identifying themselves to the master as the same machine, then you are putting yourself at risk for problems.  Moreover, if security around your puppet catalogs is a concern for you, then be aware that positioning your node-type certificates as a shared resource makes it far more likely that they will be breached.  Additionally, you cannot limit which machines can get configuration from your master.

To be very clear, we do not share certs across nodes. We absolutely use independent certs and sign them uniquely -- in fact, bug #7244 was opened by me specifically for improving the security around SSL certs and auto signing. We make heavy use of dynamic CSR facts to securely sign our keys. 

More specifically, we've been waiting for the CSR attribute system to allow us to embed the puppet 'node type' (note: not the identifier) in the SSL certs, so that clients can't possibly retrieve a node type that isn't their own (Bug #7243). It looks like this has finally been implemented, so we'll be looking into using it very soon.
  

Lest it didn't catch your eye as it went by, I re-emphasize that Puppet is built around the idea that a machine's SSL certname is a unique machine identifier within the scope of your certificate authority.  What you are doing can work with Puppet, but you will run into issues such as the file naming effects you asked about.

 
.. i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.


Classifying based on a fact instead of based on host name is a fine idea, provided that you are willing to trust clients to give their type accurately to the server.  Having accepted that risk, however, you do not by any means need the node-type fact to be expressed to the master as the node's identity.  It could as easily be expressed via an ordinary fact.

In particular, your site manifest does not need a separate node block for each node [identity], nor even to enumerate all the known node names.  In fact, it doesn't need any node blocks at all if you are not going to classify based on node identity.  Even if you're using an ENC, it is possible for it to get the node facts to use for classification.

Combining our nodes self-identifying their type with the puppet node-name architecture allows us to leverage the security of the 'auth' config file, while also having dynamically configured nodes where the hostname doesn't matter.


 
Realistically, hostnames are a terrible basis for security... someone could always break into a 'www' server, rename it to 'prod-db-thingy' so that it matches the regex, and subsequently get the database puppet manifest. (Just as a stupid simple example.)



No, they couldn't.  By default, Puppet relies on nodes' certnames to identify them.  That the agent uses hostname as certname by default is a convenient irrelevance once the master signs the node's certificate.  Changing a node's hostname does not enable that node to get a different node's configuration, at least not under Puppet's ordinary configuration, because its certname does not change.  A node's certname cannot change without the CA's participation.

 
For what it's worth, our old model was a single 'default' node type and a simple fact ('base_class=my_web_server'). This worked extremely well, but left us open to basically any client being able to request any catalog. The auth file in this world was effectively useless for preventing already-verified nodes from doing bad things.


So you want to maintain information on the master that informs it what configuration(s) your nodes are permitted to request, but within those limits you want nodes to be able to request different configurations?  It seems like you could have added such a validation pretty easily to your old scheme, and it might still be to your advantage to do so.  It would be easy to record in Hiera either which configurations each node is permitted to request, or which nodes (by certname) are permitted to request each configuration.
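
Sketched very roughly (the fact and key names here are invented, and member() comes from puppetlabs-stdlib):

    # hieradata/nodes/web01.example.com.yaml -- keyed by certname
    allowed_base_classes:
      - my_web_server

    # site.pp -- check the client-requested class before acting on it
    $allowed = hiera('allowed_base_classes', [])
    unless member($allowed, $::base_class) {
      fail("${::clientcert} is not permitted to request '${::base_class}'")
    }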

 
Can you say a bit more about that?  What do you see that suggests agents are pulling down "node information" other than their catalogs (and later, any 'source'd files)?


With nearly every puppet catalog compile, we also see GET requests like this:

10.216.61.76 - XXX - puppet "GET /production/node/xyz? HTTP/1.1" 200 13733 "-" "-" 0.021
 
Where 10.216.61.76 is *not* the local IP of the puppet master... it's the remote IP of the ELB, which indicates that it's remote traffic from our puppet clients.



That traffic might be coming from nodes, but all you know for sure is that it is traversing the ELB.  Surely the master could send requests through the ELB that end up coming back to it.  For all I know, the ELB might preferentially route such requests back to the originating host.

From the perspective of the Puppet service lifecycle, the two most likely sources of such traffic are (1) an ENC retrieving node facts, and (2) the master determining nodes' environments.  I don't know any reason why nodes would be requesting their own node information, and even if they did, I can't see how that would affect the catalog the master serves to them.


John

Erik Dalén

Aug 26, 2014, 3:27:56 PM
to puppet...@googlegroups.com
The reason they do this is so that they can fetch plugins from the environment configured on the master. So the agent first fetches its node info from the master to see whether the environment in it differs from what it has configured locally. This behavior is new as of Puppet 3.0.

In Puppet 2.7, if I remember correctly, it used the fact plugins from the agent-configured environment and the catalog from the master-configured environment.
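
For what it's worth, the endpoint the agents hit for this is the node one, which in the stock auth.conf (quoting the Puppet 3 default from memory) only allows a client to fetch the node object whose name matches its own certname:

    # allow nodes to retrieve their own node definition
    path ~ ^/node/([^/]+)$
    method find
    allow $1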

--
Erik Dalén

Nigel Kersten

Aug 26, 2014, 7:24:57 PM
to public puppet users
Yes, and also file contents were served from the agent-configured environment even though the resources came from a catalog compiled in the master-configured environment, which resulted in much hilarity.


jcbollinger

Aug 27, 2014, 11:10:51 AM
to puppet...@googlegroups.com


I am well aware of all the old hilarity surrounding determining the environment from which to serve various bits, but I was unaware that the resolution involved agents requesting their environment from the master.  That implies that the master still relies on the agent to correctly specify (echo back) the environment from which to serve those bits, else why would the agent need to know?

If that's really what's happening then it's a poor design (which I guess is why I supposed it wasn't what was happening).  If the master is authoritative for a piece of information -- as it is for nodes' environments -- then it should not rely on relaying that information back to itself through an external actor -- that undermines its authoritativeness for the information.  Moreover, to the extent that the master does have such a reliance, it leaves Puppet open to malicious manipulation of the requested environment.

So, um, are you sure?


John

Nigel Kersten

Sep 2, 2014, 12:14:27 PM
to public puppet users
Yes.  The bit of info we haven't mentioned is that if the client and server environments don't match, and the server is set to be authoritative, then it triggers the client to do a new pluginsync and run with the server environment. 

Tracking back to older tickets, there's a succinct description here from Daniel Pittman:

(which has related tickets for the rest of the change)

"The reason this was removed was to support the changes that made the ENC authoritative over the agent environment. As part of that we had a bootstrapping problem: the agent had an idea of the environment to request, used that in pluginsync, and then as part of the request for the catalog.

If that idea was wrong, the catalog would be returned from the correct, ENC specified environment, but it would have been generated with the wrong set of plugins – including custom facts. So, the agent would detect that, pluginsync to the new environment in the catalog, and compile a new catalog.

That fixed the problem, but was inefficient – every agent run with an incorrect environment would mean two catalog compilations, and doubling master load in a common situation (ENC says !production, agent run from cron) was pretty unacceptable.

So, instead, the agent was changed to query the master for node data about itself – and to use the environment that came back from that."
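
For reference, the environment the agent picks up this way is the one carried in the node object, i.e. the same thing an ENC expresses. An ENC's output is just YAML along these lines (the class and parameter names here are invented):

    ---
    environment: production
    classes:
      role::web:
    parameters:
      datacenter: us1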

Brian Wilkins

Sep 2, 2014, 12:37:37 PM
to puppet...@googlegroups.com
Matt,

There is a better way, and that is to use the roles and profiles pattern. I use that, together with a custom Facter ruby script that reads the fqdn from a YAML file and assigns its role. Puppet takes over from there.
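
Roughly like this (the file path and key layout are just examples of how mine happens to work):

    # modules/roles/lib/facter/role.rb
    require 'facter'
    require 'yaml'

    Facter.add(:role) do
      setcode do
        mapping_file = '/etc/puppet/role_map.yaml'
        if File.exist?(mapping_file)
          # role_map.yaml maps an fqdn to a role name, e.g.
          #   web01.example.com: role::web
          map = YAML.load_file(mapping_file) || {}
          map[Facter.value(:fqdn)]
        end
      end
    end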

jcbollinger

Sep 2, 2014, 5:27:23 PM
to puppet...@googlegroups.com


On Tuesday, September 2, 2014 11:14:27 AM UTC-5, Nigel Kersten wrote:



On Wed, Aug 27, 2014 at 8:10 AM, jcbollinger <John.Bo...@stjude.org> wrote:


On Tuesday, August 26, 2014 6:24:57 PM UTC-5, Nigel Kersten wrote:

[...]

 

I am well aware of all the old hilarity surrounding determining the environment from which to serve various bits, but I was unaware that the resolution involved agents requesting their environment from the master.  That implies that the master still relies on the agent to correctly specify (echo back) the environment from which to serve those bits, else why would the agent need to know?

If that's really what's happening then it's a poor design (which I guess is why I supposed it wasn't what was happening).  If the master is authoritative for a piece of information -- as it is for nodes' environments -- then it should not rely on relaying that information back to itself through an external actor -- that undermines its authoritativeness for the information.  Moreover, to the extent that the master does have such a reliance, it leaves Puppet open to malicious manipulation of the requested environment.

So, um, are you sure?

Yes.  The bit of info we haven't mentioned is that if the client and server environments don't match, and the server is set to be authoritative, then it triggers the client to do a new pluginsync and run with the server environment. 

Tracking back to older tickets, there's a succinct description here from Daniel Pittman:

(which has related tickets for the rest of the change)

"The reason this was removed was to support the changes that made the ENC authoritative over the agent environment. As part of that we had a bootstrapping problem: the agent had an idea of the environment to request, used that in pluginsync, and then as part of the request for the catalog.

If that idea was wrong, the catalog would be returned from the correct, ENC specified environment, but it would have been generated with the wrong set of plugins – including custom facts. So, the agent would detect that, pluginsync to the new environment in the catalog, and compile a new catalog.

That fixed the problem, but was inefficient – every agent run with an incorrect environment would mean two catalog compilations, and doubling master load in a common situation (ENC says !production, agent run from cron) was pretty unacceptable.

So, instead, the agent was changed to query the master for node data about itself – and to use the environment that came back from that."



What I'm hearing is that the master, when it is set authoritative, does rely on the agent's self-specified environment for plugin sync, but for catalog requests it uses that data only to verify that the agent knows the correct environment to request.  That's better, but it still means that plugins cannot be secured against access from other environments.

I suppose the issue there is that the determination of a node's environment may depend on its facts, which may depend on its environment....  I guess it was judged better to open the possibility of infinite looping than to foreclose the possibility of choosing an environment based on custom fact values.

And I also suppose that the agent requests its last-assigned environment prior to plugin-sync, to avoid syncing twice every time when the ENC overrides the agent's self-specified environment.

I still don't like it.  If the master desires that the agent first sync plugins from its last-assigned environment, then why does it make the agent jump through hoops and create extra network traffic to do that?  It should look like this instead:

Agent: Please give me the plugins for your best guess as to my correct environment [which I think should be 'bar']

Master: Here you are.  These are the plugins for environment 'foo'.

Agent: Here are my facts for environment 'foo' [even though I think my environment should be 'bar'].  Please give me my catalog.

(option 1) Master: Here you are.
(option 2) Master: Oops, you're right, your environment should be 'bar'.  Please sync again.  [...]
(option 3) Master: Oops, your environment should be 'plugh'.  Please sync again. [...]

Agent: Thank you.  Now please give me the content of File['/etc/example'] in environment 'foo'|'bar'|'plugh'

(option 1) Master: Here you are.
(option 2) Master: You're daft.  I already told you your environment was <other environment>.  Go away.


Not only does that eliminate one network request, but it also allows plugins to be better secured.


John
