/var/lib/puppet/yaml/node/prod_web.yaml
/var/lib/puppet/yaml/facts/prod_web.yaml
/var/lib/puppet/reports/prod_web
/var/lib/puppet/reports/prod_web/201408130200.yaml
/var/lib/puppet/reports/prod_web/201408140811.yaml
/var/lib/puppet/reports/prod_web/201408121328.yaml
/var/lib/puppet/reports/prod_web/201408130743.yaml
/var/lib/puppet/reports/prod_web/201408140454.yaml
[main]
...
node_name = facter
node_name_fact = puppet_node
puppet_node=prod_web
puppet_environment=production
package=frontend=some-version-here
app_group=us1
On Aug 22, 2014 7:37 AM, "Matt W" <ma...@nextdoor.com> wrote:
>
> Anyone have any thoughts on this?
>
I have to say, using an identical node name as a way of assigning the node's role is an "interesting" approach. I would not be surprised if you run into other difficulties with it, some even harder to find. Even something like an appended unique identifier, such as the host ID, MAC address, serial number, or an SHA-1 hash, would have been better.
Be that as it may, life would be dull if we didn't have to live with the sins of the past. You might check the config guide at https://docs.puppetlabs.com/references/3.6.latest/configuration.html, but in thinking about it, if you found a setting and tried to use a fact in it, you'd probably just get the master's fact.
The reports, at least, should be easy - since they're pluggable, you could copy the existing "lib/puppet/reports/store.rb" to a new name & module and tweak the storage location.
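Something along these lines, for instance (an untested sketch -- the :store_by_time name and the exact layout are invented; the register_report / reportdir / host / to_yaml pieces are the same ones store.rb already uses):

require 'puppet'
require 'fileutils'

Puppet::Reports.register_report(:store_by_time) do
  desc "Like the stock 'store' report, but with a tweakable on-disk layout."

  def process
    # 'host' is the node name the agent reported; with node_name_fact set,
    # many machines will share it, so fold something extra into the path or
    # the file name if you need to tell their reports apart.
    dir = File.join(Puppet[:reportdir], host)
    FileUtils.mkdir_p(dir) unless File.directory?(dir)

    # Include seconds so two agents reporting in the same minute don't
    # clobber each other's files.
    file = File.join(dir, Time.now.utc.strftime("%Y%m%d%H%M%S") + ".yaml")
    File.open(file, "w") { |f| f.write(to_yaml) }
  end
end

Drop it into a module's lib/puppet/reports/ directory and add its name to the master's 'reports' setting.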
Wil
Will, thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar, i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.
For example, I can quickly boot up a "prod-mwise-dev-test-web-server-thingy" using the same node definition as our "prod-frontend-host" for some testing, without worrying about the hostname regex structure.
Anyway, that said, what I'm really interested in knowing is why the puppet agents are pulling DOWN their "node information" from the puppet masters.
On Saturday, August 23, 2014 12:46:59 PM UTC-5, Matt W wrote:
> Will, thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar ...
And therein is one of the key problems: "similar", not "identical". If any node facts (including $hostname, $fqdn, etc.) vary among these hosts that are identifying themselves to the master as the same machine, then you are putting yourself at risk for problems. Moreover, if security around your puppet catalogs is a concern for you, then be aware that positioning your node-type certificates as a shared resource makes it far more likely that they will be breached. Additionally, you cannot limit which machines can get configuration from your master.
Lest it didn't catch your eye as it went by, I re-emphasize that Puppet is built around the idea that a machine's SSL certname is a unique machine identifier within the scope of your certificate authority. What you are doing can work with Puppet, but you will run into issues such as the file naming effects you asked about.
> ... i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.
Classifying based on a fact instead of based on host name is a fine idea, provided that you are willing to trust clients to give their type accurately to the server. Having accepted that risk, however, you do not by any means need the node-type fact to be expressed to the master as the node's identity. It could as easily be expressed via an ordinary fact.
In particular, your site manifest does not need a separate node block for each node [identity], nor even to enumerate all the known node names. In fact, it doesn't need any node blocks at all if you are not going to classify based on node identity. Even if you're using an ENC, it is possible for it to get the node facts to use for classification.
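By way of illustration only (the 'node_role' fact and the role::* class names below are invented, not anything from your site), a site manifest that classifies purely on an ordinary fact can be as small as:

node default {
  case $::node_role {
    'web':   { include role::web }
    'db':    { include role::db }
    default: { fail("${::clientcert} reported unknown node_role '${::node_role}'") }
  }
}

Every machine keeps its own certname, and the only thing the fact controls is which classes get included.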
> For example, I can quickly boot up a "prod-mwise-dev-test-web-server-thingy" using the same node definition as our "prod-frontend-host" for some testing, without worrying about the hostname regex structure.
And you could do that, too, with a plain fact.
> Anyway, that said, what I'm really interested in knowing is why the puppet agents are pulling DOWN their "node information" from the puppet masters.
Can you say a bit more about that? What do you see that suggests agents are pulling down "node information" other than their catalogs (and later, any 'source'd files)?
10.216.61.76 - XXX - puppet "GET /production/node/xyz? HTTP/1.1" 200 13733 "-" "-" 0.021
John
Comments inline.

Matt Wise
Sr. Systems Architect
Nextdoor.com

On Mon, Aug 25, 2014 at 6:55 AM, jcbollinger <John.Bo...@stjude.org> wrote:
> On Saturday, August 23, 2014 12:46:59 PM UTC-5, Matt W wrote:
> > Will, thanks for the response. I know it's a bit of a unique model -- but when you think about it, it makes a decent amount of sense. We run hundreds of nodes that are fundamentally similar ...
> And therein is one of the key problems: "similar", not "identical". If any node facts (including $hostname, $fqdn, etc.) vary among these hosts that are identifying themselves to the master as the same machine, then you are putting yourself at risk for problems. Moreover, if security around your puppet catalogs is a concern for you, then be aware that positioning your node-type certificates as a shared resource makes it far more likely that they will be breached. Additionally, you cannot limit which machines can get configuration from your master.
To be very clear, we do not share certs across nodes. We absolutely use independent certs and sign them uniquely -- in fact, bug #7244 was opened by me specifically for improving the security around SSL certs and autosigning. We make heavy use of dynamic CSR facts to securely sign our keys.

More specifically, we've been waiting for the CSR attribute system to allow us to embed the puppet 'node type' (note, not identifier) in the SSL certs so that clients can't possibly retrieve a node type that isn't their own (bug #7243). It looks like this has finally been implemented, so we'll be looking into using it very soon (here).
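Roughly, the plan (the values here are placeholders) is to drop a csr_attributes.yaml on each instance at boot, before the first agent run, so the role ends up as an extension in the signed cert:

# /etc/puppet/csr_attributes.yaml
extension_requests:
  pp_role: prod_web

and then read it back on the master from $trusted['extensions']['pp_role'] (with trusted_node_data enabled on Puppet 3.4+) instead of trusting an agent-supplied fact.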
> Lest it didn't catch your eye as it went by, I re-emphasize that Puppet is built around the idea that a machine's SSL certname is a unique machine identifier within the scope of your certificate authority. What you are doing can work with Puppet, but you will run into issues such as the file naming effects you asked about.
> > ... i.e. "this is a web server, it gets the XYZ package installed" and "this is a web server, it gets the ABC package installed". Using hostnames to identify a system's node definition makes very little sense and leaves quite a bit of room for error. Explicitly setting the node type as a fact allows us to re-use the same node types across many different environments and keeps hostnames out of the mix.
> Classifying based on a fact instead of based on host name is a fine idea, provided that you are willing to trust clients to give their type accurately to the server. Having accepted that risk, however, you do not by any means need the node-type fact to be expressed to the master as the node's identity. It could as easily be expressed via an ordinary fact.
> In particular, your site manifest does not need a separate node block for each node [identity], nor even to enumerate all the known node names. In fact, it doesn't need any node blocks at all if you are not going to classify based on node identity. Even if you're using an ENC, it is possible for it to get the node facts to use for classification.
Using a combination of our nodes self-identifying themselves and the Puppet node-name architecture allows us to leverage the security of the 'auth' config file, while also having dynamically configured nodes where the hostname doesn't matter.
Realistically, hostnames are a terrible method for security ... someone could always break into a 'www' server and rename it to 'prod-db-thingy' and have it match the regex and subsequently get the database puppet manifest. (Just as a stupid simple example).
For what it's worth, our old model was a single 'default' node type and a simple fact ('base_class=my_web_server'). This worked extremely well, but left us more open to basically any client being able to request any catalog compilation. The auth file in this world was effectively useless for preventing already-verified nodes from doing bad things.
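To make that concrete: the stock auth.conf rule only ties a catalog request to the requester's own certname,

path ~ ^/catalog/([^/]+)$
method find
allow $1

which says nothing about what ends up in that catalog when classification hangs off a self-reported fact. With per-type node names we can instead scope who is even allowed to ask for each type, along the lines of (hypothetical certname glob, not our real naming):

path ~ ^/catalog/prod_web$
method find
allow *.web.us1.example.com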
> Can you say a bit more about that? What do you see that suggests agents are pulling down "node information" other than their catalogs (and later, any 'source'd files)?

With nearly every puppet catalog compile, we also see GET requests like this:

10.216.61.76 - XXX - puppet "GET /production/node/xyz? HTTP/1.1" 200 13733 "-" "-" 0.021

where 10.216.61.76 is *not* the local IP of the puppet master... it's the remote IP of the ELB, which indicates that it's remote traffic from our puppet clients.
On Wed, Aug 27, 2014 at 8:10 AM, jcbollinger <John.Bo...@stjude.org> wrote:
> On Tuesday, August 26, 2014 6:24:57 PM UTC-5, Nigel Kersten wrote:
> I am well aware of all the old hilarity surrounding determining the environment from which to serve various bits, but I was unaware that the resolution involved agents requesting their environment from the master. That implies that the master still relies on the agent to correctly specify (echo back) the environment from which to serve those bits, else why would the agent need to know?
> If that's really what's happening then it's a poor design (which I guess is why I supposed it wasn't what was happening). If the master is authoritative for a piece of information -- as it is for nodes' environments -- then it should not rely on relaying that information back to itself through an external actor -- that undermines its authoritativeness for the information. Moreover, to the extent that the master does have such a reliance, it leaves Puppet open to malicious manipulation of the requested environment.
> So, um, are you sure?

Yes. The bit of info we haven't mentioned is that if the client and server environments don't match, and the server is set to be authoritative, then it triggers the client to do a new pluginsync and run with the server environment.
Tracking back to older tickets, there's a succinct description here from Daniel Pittman (which has related tickets for the rest of the change):

"The reason this was removed was to support the changes that made the ENC authoritative over the agent environment. As part of that we had a bootstrapping problem: the agent had an idea of the environment to request, used that in pluginsync, and then as part of the request for the catalog.

If that idea was wrong, the catalog would be returned from the correct, ENC-specified environment, but it would have been generated with the wrong set of plugins – including custom facts. So, the agent would detect that, pluginsync to the new environment in the catalog, and compile a new catalog.

That fixed the problem, but was inefficient – every agent run with an incorrect environment would mean two catalog compilations, and doubling master load in a common situation (ENC says !production, agent run from cron) was pretty unacceptable.

So, instead, the agent was changed to query the master for node data about itself – and to use the environment that came back from that."