|
The agent issues a node request via the indirector as one of the first things it does, and sets its environment to the environment of the returned node. However, the agent does not send its facts with this node request. This means the classifier queries the facts indirection, which loads them from a yaml facts cache for the node if available or, as would be the case if this is the first node check-in, queries PuppetDB (PDB) for the facts (possibly returning none).
This server-side facts search can be slow, especially when querying PDB under load.
The agent doesn't do anything with the facts (parameters) returned with the node request. Next, the agent downloads plugins/plugin-facts, runs facter to resolve facts, and then issues the catalog request with the facts.
The agent could instead query local facts before the node request and send them with the node request, so that the classifier does not have to look them up. The agent would still have to subsequently download plugins/plugin-facts and evaluate these, but we could explicitly search only the plugin paths when evaluating facts the second time, and then merge the results with the facts already resolved for the node request. This would be a backwards-compatible change that is likely invisible to users.
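The proposed ordering can be sketched in plain Ruby. This is only an illustration of the sequence described above, not Puppet code: every method name here (`resolve_local_facts`, `resolve_plugin_facts`, `node_request`, `agent_run`) is hypothetical, the fact values are made up, and the choice to let plugin facts win on a name collision is an assumption.

```ruby
# Hypothetical sketch of the proposed agent-side ordering.
# None of these methods are real Puppet or Facter APIs.

# Stand-in for a facter run over the default (core) fact paths.
def resolve_local_facts
  { 'os' => 'linux', 'hostname' => 'agent01' }
end

# Stand-in for a second facter run restricted to the plugin-sync paths.
def resolve_plugin_facts
  { 'site_role' => 'web' }
end

# Stand-in for the node request. It records whether the server would
# have had to look the facts up itself (i.e. none were submitted).
def node_request(facts)
  { environment: 'production', server_side_fact_lookup: facts.nil? }
end

def agent_run
  local_facts = resolve_local_facts          # 1. resolve local facts first
  node = node_request(local_facts)           # 2. send them with the node request
  # 3. plugins/plugin-facts would be downloaded here
  plugin_facts = resolve_plugin_facts        # 4. evaluate only the plugin paths
  # 5. merge; assumption: plugin facts win if a name collides
  catalog_facts = local_facts.merge(plugin_facts)
  { node: node, catalog_facts: catalog_facts }
end

result = agent_run
```

In this sketch the core facts are resolved once, and the catalog request would carry the merged `catalog_facts` hash.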
Note that with this change, we would effectively eliminate the need for a yaml facts cache, as neither the node nor catalog request would rely on a facts lookup. Per Josh Cooper's comment in PUP-7662, we could add a new setting value to disable the facts cache terminus by default (though users could still enable it).
The upside to this change is that we eliminate the last of the potentially costly facts lookups by the classifier during an agent run. A potential downside is that we would also need to (be able to) deserialize the submitted facts in the node terminus. While not necessarily complicated from a code perspective, I'm not sure what the performance hit of doing so would be, if any. I assume the PDB terminus already does something similar, so it seems reasonable that the cost would roughly even out.
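A minimal sketch of what the server side might do, assuming the facts arrive as a JSON string under a `:facts` request parameter. Both the parameter name and the encoding are assumptions made for illustration, since the wire format is not specified here, and `lookup_facts` and `facts_for_node` are hypothetical helpers rather than Puppet APIs. The point is the fallback: a request without facts takes the existing lookup path.

```ruby
require 'json'

# Hypothetical stand-in for the existing server-side lookup
# (yaml facts cache first, then PDB). Returns no facts here.
def lookup_facts(_node_name)
  {}
end

# Returns [facts, source]. Uses submitted facts when present,
# otherwise falls back to the lookup.
def facts_for_node(node_name, params)
  serialized = params[:facts] # assumed parameter name
  if serialized.nil? || serialized.empty?
    [lookup_facts(node_name), :lookup]
  else
    [JSON.parse(serialized), :request]
  end
end

# A request that includes facts is served from the request body.
submitted = JSON.generate({ 'os' => 'linux' })
with_facts = facts_for_node('agent01', { facts: submitted })

# A request without facts takes the old lookup path.
without_facts = facts_for_node('agent01', {})
```

Because the facts parameter is optional in this sketch, a request that omits it behaves exactly as it does today.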
In Scope
- Modify Puppet::Configurer so that local facts are queried and then submitted with the node request.
- These facts should be merged with those subsequently downloaded via plugin-sync. To evaluate plugin-synced facts, facter should explicitly evaluate only the plugin-sync paths, not the default fact paths, so that we avoid evaluating the core facts twice.
- The node classifier terminus should be modified to accept a serialized facts body if submitted with the request, and deserialize it for use during the node lookup. This should be done in a backwards-compatible way so that newer agents can still talk to older masters and older agents can talk to newer masters.
- Disable caching yaml facts in puppet server by default.
- This is accomplished by setting Puppet[:facts_cache_terminus] to none by default and then setting Puppet::Node::Facts.indirection.cache_class to the setting value, here: https://github.com/puppetlabs/puppetserver/blob/master/src/ruby/puppetserver-lib/puppet/server/puppet_config.rb
|