Simultaneous Client updates

GordonJB

Nov 29, 2012, 10:35:48 AM
to puppet...@googlegroups.com
Hi all,

I'm currently getting a lot of update failures showing up with no logs in the dashboard. This happens for about half of our 28 nodes, about once an hour. When re-run half an hour later, everything seems fine.

Through a combination of server restarts and our Puppet master locking up entirely yesterday, almost all of our nodes are now trying to do their half-hourly update at the same time. Could this be the reason half of them fail? If so, how can I avoid this?

Thanks,
Gordon

Matthew Burgess

Nov 29, 2012, 10:48:10 AM
to puppet...@googlegroups.com
Are you using any webserver in front of Puppet (Apache/Passenger for
example)? If not, you're probably running into the single-threaded
limitation of WEBrick (Puppet's default HTTP server), whereby it can only
service a request from a single node at a time. Depending on your
clients' update schedule (by default it's once every 30 minutes) and
number of clients, you may end up with multiple clients trying to
request a catalog at the same time.

You can either set up Apache/Passenger to sit in front of Puppet,
thereby enabling more clients to be handled concurrently, or you
can have your clients check in less frequently and schedule them (via
cron, for example) so as to avoid multiple clients checking in with
the master concurrently (this obviously does not scale well at all).
If you want to go the Passenger route, there's a guide at
http://docs.puppetlabs.com/guides/passenger.html.
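
For the cron route, here is a rough sketch, assuming each node derives a stable minute offset from its own hostname so that runs spread across the half hour. The helper names 'splay_minute' and 'puppet_cron_line' are made up for illustration, and the output is in /etc/cron.d format (with a user field), which is an assumption about your setup:

```shell
#!/bin/sh
# Hypothetical helpers: spread agent runs by hashing the hostname.

# Print a stable offset in [0, interval) for a hostname.
# $1: hostname, $2: run interval in minutes
splay_minute() {
    # cksum prints "<crc> <byte count>"; keep only the CRC
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( crc % $2 ))
}

# Print an /etc/cron.d style line that runs the agent twice an hour.
# $1: hostname
puppet_cron_line() {
    m=$(splay_minute "$1" 30)
    echo "$m,$(( m + 30 )) * * * * root /usr/bin/puppet agent --onetime --no-daemonize"
}

# Example hostname; on a real node you would pass its own FQDN.
puppet_cron_line node01.example.com
```

If you go this way you would normally stop the daemonized agent so the two don't overlap. For the daemonized agent, Puppet's own 'splay' setting in puppet.conf is probably the simpler option.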

Hope this helps,

Matt.

GordonJB

Nov 29, 2012, 11:15:51 AM
to puppet...@googlegroups.com
I believe I've set up Apache/Passenger correctly, yes. Running passenger-status shows 12 processes running under the /usr/share/puppet/rack/puppetmasterd domain. Would setting up cron jobs be on top of this passenger configuration, or should Apache/Passenger be enough?

Thanks, 
Gordon

Matthew Burgess

Nov 29, 2012, 11:37:56 AM
to puppet...@googlegroups.com
On Thu, Nov 29, 2012 at 4:15 PM, GordonJB <g.bon...@gmail.com> wrote:
> I believe I've set up Apache/Passenger correctly, yes. Running
> passenger-status shows 12 processes running under the
> /usr/share/puppet/rack/puppetmasterd domain. Would setting up cron jobs be
> on top of this passenger configuration, or should Apache/Passenger be
> enough?

No, Apache/Passenger should definitely be enough. So, do you happen
to have storeconfigs enabled? If so, and it's pointing to an SQLite
DB, or it's pointing to a networked DB (MySQL or PostgreSQL) and
your puppet.conf's dbconnections setting is too low, then you'd
probably be hitting contention there (see
http://docs.puppetlabs.com/references/3.0.latest/configuration.html#dbconnections).

Regards,

Matt.

GordonJB

Nov 29, 2012, 11:44:14 AM
to puppet...@googlegroups.com
storeconfigs is not enabled on the master.

Just realised I probably should have mentioned versions, I'm on master & nodes 3.0.1 and dashboard is 1.2.12.

Ramin K

Nov 29, 2012, 1:47:55 PM
to puppet...@googlegroups.com
On 11/29/2012 8:44 AM, GordonJB wrote:
> storeconfigs is not enabled on the master.
>
> Just realised I probably should have mentioned versions, I'm on master &
> nodes 3.0.1 and dashboard is 1.2.12.

I'm betting that most of your servers are checking in at the same time.
Half of the clients make it through. The other half are waiting while
the Puppet master processes fight for resources with Puppet dashboard
which is busy trying to process the reports that have just come in. At
some point the clients give up.

I like these settings for Passenger 3.x

PassengerMaxPoolSize <number of cores * 2>
PassengerMinInstances 1
PassengerMaxRequests 10000
PassengerStatThrottleRate 30

If you're running 12 Puppet master Rack processes then you should have
at least 6 cores, though keep in mind that Puppet dashboard will need a
core or two as well.

The other limiting factor is RAM. If you've experienced lockups, you
might have too many Rack processes, which clock in at around 150-250 MB each.
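
A back-of-the-envelope version of the two limits above, assuming you take the smaller of 'cores * 2' and 'RAM you can spare / memory per Rack process'. Combining them with a minimum is an assumption for illustration, not something from the Passenger docs, and 'pool_size' is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: suggest a PassengerMaxPoolSize value.
# $1: CPU cores
# $2: RAM (in MB) you are willing to give to Rack processes
# $3: estimated MB per Rack process (150-250 per the figures above)
pool_size() {
    by_cpu=$(( $1 * 2 ))   # the "number of cores * 2" rule of thumb
    by_ram=$(( $2 / $3 ))  # how many processes fit in the RAM budget
    if [ "$by_ram" -lt "$by_cpu" ]; then
        echo "$by_ram"
    else
        echo "$by_cpu"
    fi
}

# Example: 4 cores, 2 GB set aside for Rack, 250 MB per process.
pool_size 4 2048 250
```

On a real master you could feed it the actual core count, e.g. 'pool_size "$(nproc)" 2048 250', and leave a core or two out for the dashboard.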

The simplest solution is to make sure your clients' check-ins are spaced
somewhat evenly apart. However, with only 30 clients you'd likely be fine
as long as you have a four-core Puppet master and don't restart the
Puppet client on all machines at once.

Also set up a daily cron job to run the cleanup scripts against the Puppet
dashboard DB and delete old reports, or it'll grow quite large.

Ramin

Matthew Burgess

Nov 29, 2012, 2:09:41 PM
to puppet...@googlegroups.com
On Thu, Nov 29, 2012 at 4:44 PM, GordonJB <g.bon...@gmail.com> wrote:
> storeconfigs is not enabled on the master.
>
> Just realised I probably should have mentioned versions, I'm on master &
> nodes 3.0.1 and dashboard is 1.2.12.

Hmm, OK then. Just so the "let's not assume anything" base is covered:

Can you confirm that:

1) You do not have any puppetmaster daemons running?
2) You have Apache running with a VirtualHost listening on port 8140?
3) Your agents' puppet.conf files point to the correct server (via the
'server' parameter)?

All of that can probably be confirmed easily enough by looking in
Apache's access logs to see the incoming connections from your
clients.

After that, it sounds like it might be Passenger tuning, which Ramin's
post covers.

One more thought: are you using PuppetDB for anything, just in case it
isn't Apache/Passenger contention?

Regards,

Matt.

GordonJB

Nov 30, 2012, 6:29:52 AM
to puppet...@googlegroups.com
Ramin: Yes, they were all checking in within a span of about two minutes.

Passenger tuning is something that needs doing, I think. The failures seem to have calmed down a lot overnight, and the runs are a tiny bit more spread out, but obviously I'd like to avoid having this problem in the future. Cleaning up the dashboard DB is also something that needs to be done.

Matt:
1) I didn't think so, but looking in my ps, I have the following:

puppet   17826     1  1 03:27 ?        00:04:37 master
www-data 18614  1350  0 Nov29 ?        00:00:00 /usr/sbin/apache2 -k start
root     18626  1350  0 Nov29 ?        00:00:06 /usr/lib/phusion_passenger/ApplicationPoolServerExecutable 0 /usr/lib/phusion_passenger/passenger-spawn-server  /usr/bin/ruby  /tmp/passenger.1350
root     18628 18626  0 Nov29 ?        00:02:19 Passenger spawn server
www-data 18638  1350  0 Nov29 ?        00:04:41 /usr/sbin/apache2 -k start
www-data 18639  1350  0 Nov29 ?        00:04:39 /usr/sbin/apache2 -k start

I thought the puppet master service was called puppet-master; is that it at the top?

2) Apache is running and listening on 8140.

3) The server parameter hasn't been set, as the hostname 'puppet' resolves to the puppet master in our DNS.

I'm not using PuppetDB for anything currently.

Thanks,
Gordon

Matthew Burgess

Nov 30, 2012, 7:02:06 AM
to puppet...@googlegroups.com
On Fri, Nov 30, 2012 at 11:29 AM, GordonJB <g.bon...@gmail.com> wrote:

> 1) I didn't think so, but looking in my ps, I have the following:
>
> puppet 17826 1 1 03:27 ? 00:04:37 master
> www-data 18614 1350 0 Nov29 ? 00:00:00 /usr/sbin/apache2 -k start
> root 18626 1350 0 Nov29 ? 00:00:06
> /usr/lib/phusion_passenger/ApplicationPoolServerExecutable 0
> /usr/lib/phusion_passenger/passenger-spawn-server /usr/bin/ruby
> /tmp/passenger.1350
> root 18628 18626 0 Nov29 ? 00:02:19 Passenger spawn server
> www-data 18638 1350 0 Nov29 ? 00:04:41 /usr/sbin/apache2 -k start
> www-data 18639 1350 0 Nov29 ? 00:04:39 /usr/sbin/apache2 -k start
>
> I thought the puppet master service was puppet-master, is that it up the
> top?

Yes, that's it. For puppet-3.0, it's shown as 'master'; I think
previous puppet versions would display it as 'puppetmasterd'. Now,
the fact that its parent PID is 1 suggests that it was launched via
init, i.e. it's the standalone puppet master, not one of your
Passenger-invoked masters. I don't have a Passenger setup here at the
moment, but can set one up to provide a comparison if required.
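
The parent-PID check can be sketched as a small filter over 'ps -ef' output. This is only a heuristic (a parent PID of 1 hints at init but doesn't prove it), and 'init_owned_masters' is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: list PIDs of master processes whose parent is PID 1.
# Reads 'ps -ef' output on stdin (columns: UID PID PPID C STIME TTY TIME CMD).
init_owned_masters() {
    # $3 is the PPID column, $8 the first word of the command.
    # Matches both the bare 'master' title and '... puppet master'.
    awk '$3 == 1 && ($8 == "master" || $0 ~ /puppet master/) { print $2 }'
}

# Usage on a live system:
#   ps -ef | init_owned_masters
```

Any PID it prints is worth a closer look with 'ps -fp <pid>' before you stop or kill anything.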

For now, I'd suggest you do whatever the equivalent of 'service
puppetmaster stop; chkconfig puppetmaster off' is on your OS (i.e.
stop the standalone puppet master, and prevent it from starting across
reboots). You may also need to restart your httpd as I'd be surprised
if your VirtualHost is listening on 8140 with that standalone puppet
master bound to that port already.

Hopefully I've not misinterpreted the above!

Thanks,

Matt.

GordonJB

Nov 30, 2012, 9:08:32 AM
to puppet...@googlegroups.com
I don't suppose you know how to stop the service on 3.0? There is no puppetmaster process left, just puppet, and I'd like to keep the puppet client running!

Thanks, Gordon

Matthew Burgess

Nov 30, 2012, 9:24:54 AM
to puppet...@googlegroups.com
On Fri, Nov 30, 2012 at 2:08 PM, GordonJB <g.bon...@gmail.com> wrote:
> I don't suppose you know how to stop the service on 3.0? There is no
> puppetmaster process left, just puppet, and I'd like to keep the puppet
> client running!

If there's no master process left, then it is stopped, surely?

Here's an output of 'ps -ef | grep puppet' on my server:

puppet 2777 1 0 10:17 ? 00:02:11 /usr/bin/ruby /usr/bin/puppet master
root 2959 1 0 10:17 ? 00:00:03 /usr/bin/ruby /usr/bin/puppet agent

So, assuming you only have your 'puppet agent' process showing up
there, then all is good.

Now, what you probably want to do is restart your httpd server. Once
it's back up and running, use 'lsof -i :8140' to confirm that httpd is
listening on the port that your puppet agents will be trying to
connect to. Now watch the output of 'ps' whilst an agent checks in
(you can trigger this by running 'puppet agent --test'). Depending on
your Passenger settings, you should see a 'master' process appear,
then disappear again once the agent has completed its run.

Regards,

Matt.

Dan White

Nov 30, 2012, 9:37:06 AM
to puppet...@googlegroups.com
You are using Passenger, right?
I believe you have to stop the Passenger process and/or the Apache process it is running under.

“Sometimes I think the surest sign that intelligent life exists elsewhere in the universe is that none of it has tried to contact us.”
Bill Waterson (Calvin & Hobbes)

GordonJB

Nov 30, 2012, 10:03:40 AM
to puppet...@googlegroups.com
I am, but there were standalone processes I'm not sure how to kill and disable.

I've currently got (from ps -ef | grep puppet):
root      1296     1  0 11:34 ?        00:00:13 /usr/bin/ruby /usr/bin/puppet agent
puppet    2246     1  0 11:58 ?        00:01:29 master
puppet    5868     1  1 14:33 ?        00:00:26 master

Is that OK, or do those master processes need getting rid of?

Thanks,
Gordon

Matthew Burgess

Nov 30, 2012, 10:24:33 AM
to puppet...@googlegroups.com
On Fri, Nov 30, 2012 at 3:03 PM, GordonJB <g.bon...@gmail.com> wrote:
> I am, but there were standalone processes I'm not sure how to kill and
> disable.
>
> I've currently got (from ps -ef | grep puppet):
> root 1296 1 0 11:34 ? 00:00:13 /usr/bin/ruby
> /usr/bin/puppet agent
> puppet 2246 1 0 11:58 ? 00:01:29 master
> puppet 5868 1 1 14:33 ? 00:00:26 master
>
> Is that OK, or do those master processes need getting rid of?

I don't know how those master processes have started up like that.
Here's what the process tree should look like:

ps -ef | grep master:
puppet 3113 3052 0 15:16 ? 00:00:00 master
puppet 3145 1 0 15:16 ? 00:00:00 Rack: /usr/share/puppet/rack/puppetmasterd

See how 'master' is *not* owned by the init process? So, what owns
it? A succession of 'ps -fp <pid>' runs (taking the PPID of each
process) shows:

puppet 3113 3052 0 15:16 ? 00:00:00 master
root 3052 3050 1 15:16 ? 00:00:04 Passenger spawn server
root 3050 3046 0 15:16 ? 00:00:00 PassengerHelperAgent
root 3046 3045 0 15:16 ? 00:00:00 PassengerWatchdog
root 3045 1 0 15:16 ? 00:00:00 /usr/sbin/httpd

In answer to your question of how to get rid of those processes, I'd
start by stopping httpd. If that doesn't get rid of them, then just
use `kill' on them.
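
Rather than running 'ps -fp <pid>' by hand for each parent, the walk up the process tree can be sketched like this. 'ancestry' is a hypothetical helper; it reads "PID PPID COMMAND" lines on stdin so it works on a saved listing as well as live output:

```shell
#!/bin/sh
# Hypothetical helper: print a process and each of its ancestors, one per line.
# $1: starting PID; stdin: lines of "PID PPID COMMAND..."
ancestry() {
    awk -v pid="$1" '
        # Remember each parent and command, keyed by PID.
        { p = $1; parent[p] = $2; $1 = ""; $2 = ""; sub(/^ +/, ""); cmd[p] = $0 }
        END {
            # Follow PPID links until we leave the table (or hit a loop).
            while ((pid in parent) && !(pid in seen)) {
                seen[pid] = 1
                print pid, cmd[pid]
                pid = parent[pid]
            }
        }
    '
}

# Usage on a live system (PID 3113 is the example from the listing above):
#   ps -e -o pid= -o ppid= -o args= | ancestry 3113
```

If the chain for a 'master' process runs through the Passenger spawn server and httpd, it belongs to Passenger; if it stops almost immediately at init, it was started some other way.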

Regards,

Matt.

GordonJB

Nov 30, 2012, 11:34:02 AM
to puppet...@googlegroups.com
Killing Apache seemed to get rid of that master process. Restarting the Apache server, however, brought back another master process after a couple of minutes:

puppet    8549     1 22 16:25 ?        00:00:03 master

I think this was due to clients checking in; however, the master processes did not disappear afterwards. I currently have two showing in ps -ef:

puppet_master@puppet:~$ ps -ef | grep puppet
root      1296     1  0 11:34 ?        00:00:18 /usr/bin/ruby /usr/bin/puppet agent
root      2178   787  0 11:42 ?        00:00:00 sshd: puppet_master [priv]
1000      2181  2178  0 11:42 ?        00:00:00 sshd: puppet_master@pts/0
puppet    8549     1  2 16:25 ?        00:00:12 master
puppet    8629     1  0 16:28 ?        00:00:01 master
1000      8681  2182  0 16:32 pts/0    00:00:00 grep --color=auto puppet
puppet_master@puppet:~$ ps -fp 8549
UID        PID  PPID  C STIME TTY          TIME CMD
puppet    8549     1  2 16:25 ?        00:00:13 master
puppet_master@puppet:~$ ps -fp 1
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 11:33 ?        00:00:01 /sbin/init


Also, 'lsof -i :8140' returned nothing.

Matthew Burgess

Nov 30, 2012, 3:54:28 PM
to puppet...@googlegroups.com
On Fri, Nov 30, 2012 at 4:34 PM, GordonJB <g.bon...@gmail.com> wrote:

> Also, 'lsof -i :8140' returned nothing.

OK, let's start there then. You should have an
/etc/httpd/conf.d/puppetmaster.conf that looks somewhat like the one
at http://docs.puppetlabs.com/guides/passenger.html#apache-configuration-for-puppet-024x.
That guide is quite old now, but that config file is still good
enough to get Puppet-3.0.x working under Passenger. All you need to
do is change the 'puppet-server.inqnet.at.pem' string in both the
SSLCertificateFile and SSLCertificateKeyFile to match the FQDN of your
puppet server.

If that's what you've got already, then your Apache logs under
/etc/httpd/logs/error_log should contain some clues as to what might
be wrong with the VirtualHost.

Once you've got httpd listening on 8140, take a look at your
/usr/share/puppet/rack/puppetmasterd/config.ru file. It should match
the one at https://raw.github.com/puppetlabs/puppet/master/ext/rack/files/config.ru.
That file needs to be owned by the 'puppet' user.

Hopefully this gets you a bit further along.

Regards,

Matt.