Puppet new deployment questions - deployment patterns, sensitivity to network errors, and certificate headaches.


Stephen Morton

Jun 16, 2014, 3:33:12 PM
to puppet...@googlegroups.com
I've got some newbie puppet questions.
My team has a tremendous amount of linux/computer knowledge, but we're new to Puppet.
We recently started using puppet to manage some 100 servers. Their configs are all pretty similar with some small changes.

----
History

Prior to Puppet, we already had a management system: config files under revision control, with the repo checked out on every server and the files symlinked into the appropriate places in the filesystem. Updating the repo would update these files. This was mostly just great, with the following limitations:

  • If the symlink got broken, it didn't work.
  • Some files require very specific ownership, or must not be symlinks (e.g. /etc/sudoers, and I think the /etc/vsftpd/ files).
  • Updating a daemon's config file does not restart the daemon, e.g. updating /etc/httpd/conf/httpd.conf does not trigger a "service httpd reload".
  • You can't add a new symlink.
  • All files must be in revision control to be linked to. Some security-sensitive files should be available only to some servers, and something like Puppet that can send files over the network is a good solution to this.
----

Puppet to the rescue?

So we've tried a very conservative Puppet implementation. We've left our existing infrastructure and we just add new rules in Puppet. So far, we have a single site.pp file and only a dozen or so rules. But already we're seeing problems.
  1. Puppet is good for configuring dynamic stuff that changes. But it seems silly to have rules for stuff that is configured just once and then never changes. If we set up some files, we don't expect them to disappear; in fact, if they do disappear, we might not want them silently fixed up: we probably want to know what's going on. Doing everything in Puppet results in ever-growing manifests. I don't know of a way to specify different manifests, e.g. every 30 minutes I want Puppet to run the lean-and-mean regular manifest, and then once a week I want it to run the "make sure everything is in the right place" manifest.
  2. Puppet seems very sensitive to network glitches. We run puppet from a cron job and errors were so frequent that we just started sending all output to /dev/null.
  3. Endless certificate issues. It's crazy. Sometimes hosts would get "dropped": for unknown reasons their certificates were no longer accepted. Because we'd already suppressed output (see the previous point), we would not know this, and the server would quietly stop being updated. And when you do get a certificate problem, often simply deleting the cert on the agent and master won't fix it; sometimes a restart of the master service (or more?) is required.
    • The solution to this, to me, is not "you should run Puppet Dashboard, then you'd know". This shouldn't be failing in the first place. If something is that flaky, I don't want to run it.
(We're running version 3.4.2 on CentOS 6.5, 64-bit.)

---

Questions.

So my questions for the above three issues are, I guess, as follows:
  1. Is there a common Puppet pattern to address this? Or am I thinking about things all wrong?
  2. Is there a way to get puppet to be more fault-tolerant, or at least complain less?
  3. Are endless certificate woes the norm? Once an agent has successfully got its certificates working with the server, is it a known issue that they should sometimes start failing later?


Thanks,


Steve

Rich Burroughs

Jun 17, 2014, 12:45:49 AM
to puppet...@googlegroups.com
I'm not sure about your #2 and #3. I've not really experienced either of those, and I wouldn't expect they're common for most people. It would probably be more helpful if you could post specifics when one of those things happens.

As to #1, it maybe depends on how you administer your systems. Even if files should not be changing under normal circumstances, managing them with Puppet can make sure that they don't (or at least that it's corrected on the next agent run). It sounds like you were already putting your configs in version control; from that point it's not a ton of extra work to write Puppet code to manage them. If you're just starting out with it, you'll get faster/better at that part too. And if those resources aren't changing, then it's not a lot of overhead for Puppet to deal with them. Agent runs where no changes need to be applied should be pretty fast.

I'd also encourage you to think about scenarios like losing a node and having to rebuild it from scratch. Suddenly all of those files that don't normally change are gone and need to be replaced. Puppet can do that very quickly. Or needing to spin up additional nodes to do the same task, that can become very easy.

One of the other things I see as a huge benefit with Puppet is that it's self-documenting. If you want to know what's going on with your systems, you can just look at the code. I was in a situation about a year ago where I inherited a Puppet install when a co-worker left, and it was a huge advantage that the code carried so much information. And since it's running all the time, you don't have to worry about whether it's stale like a Wiki page.

That all said, if there's something you feel it's not necessary to manage, then that's up to you and your team. I've definitely found that the more I've used Puppet the easier it's gotten, which means it's less of a burden to take the time to manage extra things. Puppet isn't the best tool for everything, but it's a great one for managing files.

There's not really a way to tell an agent to run against just a subset of the manifest on some runs and the whole thing on others, without doing something pretty goofy. You can make managing files a bit easier in a few ways, like specifying default attributes/values and using arrays of filenames, if you want to apply the same settings to multiple files.
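A minimal sketch of those two tricks, plus the restart-on-change behavior the symlink scheme couldn't do (the 'base' module path and the file list here are invented for illustration):

```puppet
# Resource defaults: every file resource in this scope inherits these
# attributes unless it overrides them.
File {
  ensure => file,
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
}

# An array of titles declares one resource per path, all sharing the
# same settings:
file { ['/etc/cron.allow', '/etc/at.allow']:
  content => "root\n",
}

# notify wires a config file to its daemon, so a change triggers a
# service refresh automatically:
file { '/etc/httpd/conf/httpd.conf':
  source => 'puppet:///modules/base/httpd.conf',  # invented module path
  notify => Service['httpd'],
}

service { 'httpd':
  ensure => running,
  enable => true,
}
```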



Rich

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/puppet-users/08b72832-d18a-4397-9587-a769f0ee2d6e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Doug Forster

Jun 17, 2014, 1:23:22 AM
to puppet...@googlegroups.com
Steve,

I think you said you put all your configuration in a single site.pp. This is often bad form and limits the flexibility of your deployment. Something we do is lay out modules.

The common pattern is:
/etc/puppet/environments/production
-> manifests/site.pp
-> hieradata/*.yaml
-> modules/foo

This keeps everything in the production environment, which all of your clients will use by default. One of the beauties of doing this is that you can add a weekly cron job that runs against a different environment, like "weekly". To take further advantage of the flexibility, you may want to group servers into buckets by adding custom facts and including modules based on them.
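That split might look roughly like this in a cron file on the agents (the run times and the "weekly" environment name are my assumptions, not established practice):

```shell
# /etc/cron.d/puppet (sketch; crontab syntax)
# Lean regular run every 30 minutes against the default environment:
*/30 * * * * root puppet agent --onetime --no-daemonize --logdest syslog
# Full "make sure everything is in place" run, once a week:
15 3 * * 0   root puppet agent --onetime --no-daemonize --logdest syslog --environment weekly
```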

As for the network issues: I think you may be running your puppetmaster with the built-in WEBrick server. In my experience this is single-threaded, and it fails if more than one client connects at a time. Set up Apache with Passenger to allow multiple concurrent requests.

Info on Environments:
http://docs.puppetlabs.com/puppet/latest/reference/environments.html#enabling-directory-environments (note this was introduced in 3.5 so you may want to checkout the legacy way to achieve the same thing. )
Passenger with Apache:
http://docs.puppetlabs.com/guides/passenger.html

Finally, to get visibility into the network, I would strongly suggest setting up PuppetDB with Puppetboard, as Puppet Dashboard is effectively dead.

Info on PuppetDB
http://docs.puppetlabs.com/puppetdb/2.0/install_from_packages.html
PuppetBoard module on the forge.
https://forge.puppetlabs.com/nibalizer/puppetboard

The last tip I would give to someone new is to use an IDE that helps you code. Puppet Labs maintains Geppetto for this purpose.
http://docs.puppetlabs.com/geppetto/4.0/

--

Stephen Morton

Jun 17, 2014, 9:43:01 AM
to puppet...@googlegroups.com
Thanks Doug and Rich.

Yes, I guess I am using the default WEBrick server. I just enabled the puppet-master service in init.d and assumed that was enough. We already have an Apache instance and I will look into Passenger. Based on what I see on the Passenger page, this alone could be the cause of all my network issues.

I do know all about organizing puppet manifests into modules. I didn't really want to get into this, but here you go: the idea behind currently having just one site.pp file is that we're evaluating Puppet and don't have many rules. It is just not a good use of anybody's time to have 12 rules split up into 6 modules with 18 (24?) different files when you could do it all in one file. As our puppet rules grow (along with our puppet experience) we will refactor.
   Good to know that Puppet Dashboard is on the outs and puppetdb+puppetboard is the way to go. Should we need such functionality in the future, I'll keep that in mind.
   (IDE? Vim and emacs do my syntax highlighting just fine thank you. ;-)
   The last three comments could be summarized by our philosophy here at work that we're lean and mean and we'd never install a "framework" when a few lines of bash or perl code would do the trick just as well.

Yes, we do use custom facts to define a server's geographical location (e.g. important for our NTP and SNMP config) and its internal purpose. Doug hints at using some kind of dynamically generated fact to get a different manifest from the master (e.g. daily vs. weekly manifest); I will investigate that.
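For what it's worth, grouping by fact doesn't require Ruby: an external fact file plus a case statement in site.pp is enough (the fact names and module names below are invented):

```puppet
# Facter (1.7+) reads key=value pairs from /etc/facter/facts.d/*.txt,
# e.g. a file dropped in by kickstart containing:
#   location=nyc
#   server_role=webserver
#
# site.pp can then branch on those facts:
case $::location {
  'nyc':   { include ntp::nyc }
  default: { include ntp::generic }
}

if $::server_role == 'webserver' {
  include apache
}
```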

Puppet being useful for deploying servers: absolutely, that's a great point. Thing is, we already have a fully functional kickstart + post-install bash script process that does just that. We may replace it with Puppet if we decide we love Puppet and can't live without it. But for now: if it ain't broke, don't spend weeks coding and debugging a replacement that's no better.

Thanks again,

Steve

jcbollinger

Jun 17, 2014, 10:39:42 AM
to puppet...@googlegroups.com


On Monday, June 16, 2014 2:33:12 PM UTC-5, Stephen Morton wrote:
I've got some newbie puppet questions.
My team has a tremendous amount of linux/computer knowledge, but we're new to Puppet.
We recently started using puppet to manage some 100 servers. Their configs are all pretty similar with some small changes.

----
History

Prior to Puppet, we already had a management system that involved having config files under revision control and the config file repo checked out on every server and the repo config files symlinked into the appropriate place in the filesystem. Updating the repo would update these files. This was mostly just great, with the following limitations:

  • If the symlink got broken, it didn't work.
  • Some files require very specific ownership, or were required not to be symlinks (e.g. /etc/sudoers. /etc/vsftpd/ files I think)
  • Updating a daemon's config file does not mean that the daemon is restarted. e.g. updating /etc/httpd/conf/httpd.conf does not do a "service httpd reload"
  • You can't add a new symlink.
  • All files must be in revision control to link to. Some security-sensitive files we want to only be available to some servers and something like puppet that can send files over the network is a good solution to this.
----

Puppet to the rescue?

So we've tried a very conservative Puppet implementation. We've left our existing infrastructure and we just add new rules in Puppet. So far, we have a single site.pp file and only a dozen or so rules. But already we're seeing problems.
  1. Puppet is good for configuring dynamic stuff that changes. But it seems silly to have rules for stuff that will be configured just one time and then will not change. If we set up some files, we don't expect them to disappear. In fact, if they do disappear, we might not want them silently fixed up; we probably want to know what's going on.

Puppet is fine for stuff that changes from time to time, but it is even better suited to stuff that, once configured, is stable for a long time.  The core concept around which it is designed is that you describe the state you want your machines to be in, and Puppet both puts them in that state and makes sure they stay there (on a per-run basis).

If you want Puppet just to check the resources declared for the target node without syncing them, then you can run it in --noop mode, and Puppet will flag resources that are out of sync.  Alternatively, your manifests can declare individual resources to be managed in noop mode if you want finer granularity.  In any case, Puppet certainly notifies you when it syncs an out-of-sync resource, both in its output and in the reports it sends back to the master (if you enable those).  Additionally, you can use the --detailed-exitcodes option to make the agent's return code yield information about whether anything changed and/or whether there were any failed resources.
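As a sketch of that last point, a cron wrapper could branch on the exit codes instead of discarding all output (the message wording is mine; the code meanings come from the --detailed-exitcodes documentation):

```shell
# Interpret 'puppet agent --onetime --detailed-exitcodes' results
# in a cron wrapper instead of sending everything to /dev/null.
interpret_puppet_exit() {
  case "$1" in
    0) echo "clean: no changes needed" ;;
    2) echo "changed: out-of-sync resources were fixed" ;;
    4) echo "failed: some resources could not be applied" ;;
    6) echo "changed and failed: both of the above" ;;
    1) echo "error: the run itself failed" ;;
    *) echo "unknown exit code: $1" ;;
  esac
}

# Intended cron usage (not executed here):
#   puppet agent --onetime --no-daemonize --detailed-exitcodes
#   status=$?
#   [ "$status" -eq 0 ] || interpret_puppet_exit "$status" | logger -t puppet-cron
```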
 
  1.   Doing everything in puppet results in ever-growing manifests. I don't know of a way to specify different manifests, e.g. every 30 minutes I want Puppet to run and request the lean and mean regular manifest and then once a week I want it to run the "make sure everything is in the right place" manifest.

Yes, everything you configure for Puppet to manage must be described in a manifest file, therefore the more you bring under Puppet management, the larger the volume of your manifests.  That's like saying "every time I want a new feature in my program, I have to add source code!"

Puppet does offer facilities for limiting the scope of runs.  The main ones are the --tags agent option to select a subset of the resources that normally would be applied, and schedules to declare master-side limits on when and how frequently particular resources and groups of resources should be applied.
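A sketch of the schedule mechanism (names invented; note that a schedule restricts when a resource *may* be applied, it doesn't trigger extra runs):

```puppet
schedule { 'weekly_window':
  period => weekly,
  repeat => 1,
}

# This file is checked at most once per week, even though the agent
# itself still runs on its normal half-hourly cadence:
file { '/etc/ssh/sshd_config':
  ensure   => file,
  source   => 'puppet:///modules/ssh/sshd_config',  # invented module path
  schedule => 'weekly_window',
}
```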

 
  2. Puppet seems very sensitive to network glitches. We run puppet from a cron job and errors were so frequent that we just started sending all output to /dev/null.

I'm not sure I understand.  What sort of network glitches are we talking about?  Are these frequent in your environment?  And what sort of errors?
 
  3. Endless certificate issues. It's crazy. So sometimes hosts would get "dropped"... for unknown reasons their certificates were no longer accepted. Because we'd already stopped output (see previous bullet point) we would not know this and the server would be quietly not updated. And when you get a certificate problem, often simply deleting the cert on the agent and master won't fix it. Sometimes a restart of the master service (or more?) is required.
    • The solution to this to me is not "you should run puppet dashboard, then you'd know". This shouldn't be failing in the first place. If something is that flaky, I don't want to run it.
(We're running version 3.4.2 on CentOS 6.5, 64-bit.)

---

Questions.

So my questions for the above three issues are, I guess, as follows:
  1. Is there a common Puppet pattern to address this? Or am I thinking about things all wrong?
  2. Is there a way to get puppet to be more fault-tolerant, or at least complain less?

If you are not running in --verbose mode (also implied by --test), do not have --debug messages enabled, and do not have the 'show_diff' option enabled (defaults to disabled, unless you are using --test), then you are getting the minimum messages that the agent emits.  You can, however, configure Puppet to send them to a logfile or to syslog (--logdest), and if you send them to syslog then you can use that subsystem's facilities to filter what messages are actually recorded, and where.
 
  3. Are endless certificate woes the norm? Once an agent has successfully got its certificates working with the server, is it a known issue that it should sometimes start to subsequently fail?

No, they are not the norm.  Once a client has a cert signed by a CA that the master recognizes (most often the CA provided by the master itself), it is normally good until the certificate expires.  The default certificate lifetime is <mumble> years.

Is it possible that something is occasionally damaging or removing your clients' certificates?  If a client's certificate is occasionally removed, then on its next run after such an event the agent will generate a new one.  The master will not accept or sign that new cert, however, because it already has a signed cert for the requested certname (else the system would be wide open to spoofing).  The client will in that case log the new certificate generation, but only on the run when it is generated, and that could be easy to miss.
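When a cert does get wedged, the recovery recipe most people reach for on 3.x is short; this sketch just prints the commands so nothing destructive runs by accident (the certname is a placeholder):

```shell
# Print the usual Puppet 3.x cert-recovery steps for a given agent.
# Nothing here executes them; run the emitted commands by hand.
print_cert_recovery() {
  certname="$1"
  echo "# on the master:"
  echo "puppet cert clean ${certname}"
  echo "# on the agent (wipes its SSL state; a fresh CSR is sent on the next run):"
  echo "rm -rf \$(puppet agent --configprint ssldir)"
  echo "puppet agent --test --waitforcert 60"
}
```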

We might have other ideas if you provided some additional detail about the SSL issues you are seeing.


Overall, though, I wonder whether you might find Puppet's "apply" face a more comfortable fit than the "agent" face.  You already have an infrastructure by which you could distribute manifests and data to each server, and you're already running under a separate scheduler rather than in daemon mode.  Puppet apply does not depend on SSL (it builds catalogs locally, from local files), and it provides more direct, file-based mechanisms for selecting which resources to apply.


John

Ramin K

Jun 17, 2014, 11:31:54 AM
to puppet...@googlegroups.com
On 6/16/2014 12:33 PM, Stephen Morton wrote:
> I've got some newbie puppet questions.
> My team has a tremendous amount of linux/computer knowledge, but we're
> new to Puppet.
> We recently started using puppet to manage some 100 servers. Their
> configs are all pretty similar with some small changes.
>
> ----
> History
>
> Prior to Puppet, we already had a management system that involved having
> config files under revision control and the config file repo checked out
> on every server and the repo config files symlinked into the appropriate
> place in the filesystem. Updating the repo would update these files.This
> was mostly just great, with the following limitations:
>
> * If the symlink got broken, it didn't work.
> * Some files require very specific ownership, or were required not to
> be symlinks (e.g. /etc/sudoers. /etc/vsftpd/ files I think)
> * Updating a daemon's config file does not mean that the daemon is
> restarted. e.g. updating /etc/httpd/conf/httpd.conf does not do a
> "service httpd reload"
> * You can't add a new symlink.
> * All files must be in revision control to link to. Some
> security-sensitive files we want to only be available to some
> servers and something like puppet that can send files over the
> network is a good solution to this.
>
> ----
>
> Puppet to the rescue?
>
> So we've tried a very conservative Puppet implementation. We've left our
> existing infrastructure and we just add new rules in Puppet. So far, we
> have a single site.pp file and only a dozen or so rules. But already
> we're seeing problems.
>
> 1. Puppet is good for configuring dynamic stuff that changes. But it
> seems silly to have rules for stuff that will be configured just one
> time and then will not change. If we set up some files, we don't
> expect them to disappear. In fact if they do disappear we might not
> want them silently fixed up we probably want to know what's going
> on. Doing everything in puppet results in ever-growing manifests. I
> don't know of a way to specify different manifests, e.g. every 30
> minutes I want Puppet to run and request the lean and mean regular
> manifest and then once a week I want it to run the "make sure
> everything is in the right place" manifest.
> 2. Puppet seems very sensitive to network glitches. We run puppet from
> a cron job and errors were so frequent that we just started sending
> all output to /dev/null.
> 3. Endless certificate issues. It's crazy. So sometimes hosts would get
> "dropped"... for unknown reasons their certificates were no longer
> accepted. Because we'd already stopped output (see previous bullet
> point) we would not know this and the server would be quietly not
> updated. And when you get a certificate problem, often simply
> deleting the cert on the agent and master won't fix it. Sometimes a
> restart of the master service (or more?) is required.
> * The solution to this to me is not "you should run puppet
> dashboard, then you'd know". This shouldn't be failing in the
> first place. If something is that flaky, I don't want to run it.
>
> (We're running version 3.4.2 on CentOS 6.5, 64-bit.)
>
> ---
>
> Questions.
>
> So my questions for the above three issue are I guess as follows
>
> 1. Is there a common Puppet pattern to address this? Or am I thinking
> about things all wrong.
> 2. Is there a way to get puppet to be more fault-tolerant, or at least
> complain less?
> 3. Are endless certificate woes the norm? Once an agent has
> successfully got its certificates working with the server, is it a
> known issue that it should sometimes start to subsequently fail?
>
> Thanks,
> Steve

1. I don't think about it as manifests increasing in size, but as whether
I can completely recreate a server at any time, accurately. Or, more
importantly, can I provision 12 more of any server ASAP. It's been my
experience that active/passive sites usually drift into active/not-updated
sites, and I believe the same would apply to a Puppet install that had
one methodology for install and another for updates.

That said, we do have servers that are short-lived enough that we run
Puppet on install and then run specifically targeted updates when needed
using Puppet's --tags feature.

http://docs.puppetlabs.com/puppet/latest/reference/lang_tags.html#the-tag-metaparameter
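In practice such a targeted run is just a one-liner (the tag names here are invented):

```shell
# Apply only resources carrying the 'ntp' or 'sudoers' tags; everything
# else in the catalog is skipped for this run:
puppet agent --onetime --no-daemonize --tags ntp,sudoers
```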

2. I run Puppet masters in one US site and have agent machines in five
others, including three sites outside of the US. We average roughly one
network-related problem a month on the 50-100 nodes that aren't in the
main site. Without more information, logs, etc., it would appear that
your network's stability is the problem.

The symptoms you describe might be the result of an overloaded master.
If that sounds possible, I'd look at the number of Puppet master
processes you've configured in Apache/Passenger (or similar) and the
concurrent requests to the master during the day. Agents, when left to
their own devices, like to clump up over time. Additionally, if you're
still using the puppetmasterd startup script, your master won't be able
to handle more than one concurrent request.

3. I've been running Puppet for over four years and have never had the
sort of cert problems you've described. IIRC the cert expiry time is
five years, so expiration seems unlikely as well.

My best guess is clock drift, though I would expect transactions to
remain broken until NTP corrected it.

Ramin

Ramin K

Jun 17, 2014, 11:46:09 AM
to puppet...@googlegroups.com
google-groups appeared to have eaten the first version from yesterday.
Pardons if this is sent twice.