Puppetmaster can't keep up with our 1400 nodes.


Georgi Todorov

Oct 30, 2014, 10:45:16 AM
to puppet...@googlegroups.com
Hi group,

We have a VM with 24 cores (E7-8857 v2 @ 3.00GHz) and 32 GB of RAM (on big ESX hosts with a fast storage backend) that is our Foreman/puppetmaster, with the following tuning params:

Passenger:
  PassengerMaxRequests 10000
  PassengerStatThrottleRate 180 
  PassengerMaxRequestQueueSize 300
  PassengerMaxPoolSize 18
  PassengerMinInstances 1
  PassengerHighPerformance on

PGSQL:
constraint_exclusion = on
checkpoint_completion_target = 0.9
checkpoint_segments = 16
max_connections = 100
maintenance_work_mem = 1GB
effective_cache_size = 22GB
work_mem = 192MB
wal_buffers = 8MB
shared_buffers = 7680MB

Apache:
  StartServers        50
  MinSpareServers     5
  MaxSpareServers     20
  ServerLimit         256
  MaxClients          256
  MaxRequestsPerChild 4000


IPv6 disabled
vm.swappiness = 0
SELinux disabled
iptables flushed.

We have about 1400 hosts that check in every 30 minutes and report facts. Facter execution time is less than 1 second on the nodes.

The bottleneck seems to be the "Passenger RackApp: /etc/puppet/rack" processes.

There is one of these for each Passenger worker, and each sits at 100% CPU all the time. A typical strace of one looks like this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.17   42.235808        1320     31988     15964 futex
  3.17    1.393038           0   5722020           rt_sigprocmask
  0.51    0.225576          14     16157         3 select
  0.12    0.051727           1     93402     83142 stat
  0.01    0.006303           0     13092     13088 lstat
  0.01    0.003000        1500         2           fsync
...

Here are the versions of software we've moved through:
Master OS: CentOS 6.5, 6.6
Foreman: 1.4.9, 1.5.1, 1.6.2
puppet: 3.5.1, 3.6.2, 3.7.2
Ruby: 1.8.7 (centos...)
Passenger: 4.0.18, 4.0.53

Settings we've tried in various combinations:
  PassengerMaxPoolSize 12, 18, 24
  PassengerMaxRequestQueueSize 150, 200, 250, 350
  PassengerStatThrottleRate 120, 180
  ServerLimit 256, 512
  MaxClients 256, 512

The request queue is always maxed out and a lot of nodes just time out.

What am I missing? Our node count doesn't seem to be that big, and our catalogs are fairly small too (basically just a bunch of autofs maps via a module and 2-3 files).

Thanks!

Ramin K

Oct 30, 2014, 1:39:32 PM
to puppet...@googlegroups.com
On 10/30/14 7:45 AM, Georgi Todorov wrote:
> Hi group,
>
> We have a VM with 24 E7-8857 v2 @ 3.00GHz cores and 32G of ram (on big
> ESX hosts and fast backend) that is our foreman/puppetmaster with the
> following tuning params:
>
> Passanger:
> PassengerMaxRequests 10000
> PassengerStatThrottleRate 180
> PassengerMaxRequestQueueSize 300
> PassengerMaxPoolSize 18
> PassengerMinInstances 1
> PassengerHighPerformance on

Puppet masters are CPU limited. I would increase MaxPoolSize until
you're consistently using 80% of the CPU available. Do keep an eye on
RAM usage as well. You may want to split your master into multiple parts
if you're already consuming all the resources on your current hardware.
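
As a rough illustration only (these numbers are a starting point to iterate on, not a recommendation tuned to your workload), that could mean stepping the pool up a few workers at a time and re-checking CPU:

  PassengerMaxPoolSize         22
  PassengerMaxRequestQueueSize 300
  PassengerMinInstances        1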

I wrote about tuning Passenger for Puppet masters here.
http://ask.puppetlabs.com/question/13433/how-should-i-tune-passenger-to-run-puppet/

Ramin

Christopher Wood

Oct 30, 2014, 1:58:13 PM
to puppet...@googlegroups.com
My related issue is that I want more simultaneous catalog compilations. Without sounding too lazy, my reaction is to add cores to the puppetmaster VMs and throw some more puppetmasters into the load balancer pools. 8x more cores will help me quickly, and I can tune later when I can shift some priorities around and get time to think.

https://docs.puppetlabs.com/guides/scaling_multiple_masters.html

(As well, 1400 machines is past the point I would sleep well with a no-redundancy puppetmaster setup. You're very brave, Georgi.)

> Ramin
>

Garrett Honeycutt

Oct 30, 2014, 3:05:17 PM
to puppet...@googlegroups.com
Hi Georgi,

How long does it take to compile a catalog? Is your VM host oversubscribed? Here's the formula for figuring out how many cores you need dedicated to compiling catalogs. Note this is *dedicated* to compiling, so subtract two for the OS, subtract the number of workers if you run Dashboard, and subtract a few more if you are running PuppetDB and Postgres.

Take a look at my post[1] to ask.puppetlabs.com regarding sizing.

cores = (nodes) * (check-ins per hour) * (seconds per catalog) / (seconds per hour)

Another way to look at this is how many nodes the current hardware should support:

nodes = (cores) * (seconds per hour) / (check-ins per hour) / (seconds per catalog)
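
As a rough worked example with the numbers from this thread (1400 nodes, check-ins twice an hour, and an assumed ~10 seconds per catalog compile, since the actual compile time hadn't been measured yet at this point):

  cores = 1400 * 2 * 10 / 3600 = ~7.8, so budget roughly 8 dedicated compile cores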


[1] -
http://ask.puppetlabs.com/question/3/where-can-i-find-information-about-sizing-for-puppet-servers/?answer=101#post-id-101

Best regards,
-g

--
Garrett Honeycutt
@learnpuppet
Puppet Training with LearnPuppet.com
Mobile: +1.206.414.8658

Georgi Todorov

Oct 30, 2014, 4:32:21 PM
to puppet...@googlegroups.com
Chris, I sleep very well :). Our master is backed up hourly (the entire VM) and all configs go through git. Redeploying/restoring the master should be fairly quick (I have not tried it, though). Also, the way we use puppet, if it is down there is no real harm. It's only needed to push changes, which we don't do that often.

Ramin and Garrett, I was considering throwing more CPU at it, seeing how it is CPU bound, but the strace told me something else was the problem. And I finally solved it. The culprit was Ruby. Puppet agent runs used to take anywhere from 30 to 250 seconds depending on ... the weather? I'm guessing it depended on where in the queue they were. The VM cluster is not oversubscribed, and in fact I had the VM isolated on a single DL580 host for testing, just to make sure nothing was interfering. I ended up compiling Ruby 2.1.4, installed all the gems needed for Foreman (about 75), and now have both Foreman and the puppet master running on Ruby 2.1.4. The load average on the machine is now ~9 (down from about 17), and requests in queue stay at 0 almost all the time, with the occasional "jump" to 20 - nothing like my constantly full queue.
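
For anyone repeating this: besides building Ruby and installing the gems, the one extra bit of wiring is pointing Passenger at the new interpreter in the Apache config, roughly:

  # path is an assumption - point it at wherever your Ruby 2.1.4 build landed
  PassengerRuby /usr/local/bin/ruby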

So hopefully this will be helpful for anyone trying to run a puppet master on CentOS.

And thank you guys, I have actually read both of those links before. When we add the rest of our infra, if we start hitting a bottleneck, I'll split the master and increase the CPU count.

Cheers,
Georgi


Garrett Honeycutt

Oct 30, 2014, 11:56:14 PM
to puppet...@googlegroups.com
Hi Georgi,

The catalog compilation time is how long it takes to compile the catalog
*on the master*. You can find it on CentOS with `grep Compile
/var/log/messages`. The amount of time it takes for your agent to run is
not at all tied to how long it takes to compile the catalog. Your
puppet agents are not talking to the puppet master once they have
received the catalog, except for file requests[1] and to submit a report.

If you are solving for long agent runs, check out the logs which include
timing information. A good visualization of this can be had with Puppet
Dashboard which will break down a run with times for each resource type.
Typically bottlenecks include exec, package, and service resources and
custom functions. Especially packages if you talk to the internet
instead of local mirrors.

By chance are you serving any large binary files with Puppet?

[1] -
http://4.bp.blogspot.com/-0xlYPWw61Hw/UpVulZU1qTI/AAAAAAAAAwY/egPhvnpn0jI/s1600/puppet_technical_flow.jpg

Georgi Todorov

Oct 31, 2014, 9:50:41 AM
to puppet...@googlegroups.com

On Thursday, October 30, 2014 11:56:14 PM UTC-4, Garrett Honeycutt wrote:
Hi Georgi,

The catalog compilation time is how long it takes to compile the catalog
*on the master*. You can find it on CentOS with `grep Compile
/var/log/messages`. The amount of time it takes for your agent to run is
not at all tied with how long it takes to compile the catalog. Your
puppet agents are not talking to the puppet master once they have
received the catalog, except for file requests[1] and to submit a report.
 
The catalog compile on Ruby 1.8.7 was about 12-14 seconds. On 2.1.4 it is now about 7-8. In either case we should be well within our limit (I came up with ~5400 nodes for 2.1.4 and ~3300 for 1.8.7 from your formula above).
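
(That is presumably the second formula with our 24 cores and two check-ins per hour plugged in, using ~8 s and ~13 s per catalog:

  nodes = 24 * 3600 / 2 / 8  = 5400    -> Ruby 2.1.4
  nodes = 24 * 3600 / 2 / 13 = ~3320   -> Ruby 1.8.7)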
 
If you are solving for long agent runs, check out the logs which include
timing information.
 
My issue was that "requests in queue" in the passenger-watch output kept hitting the max and client runs would just start timing out. Actually, sometime last night something happened and puppet stopped processing requests altogether. Stopping and starting httpd fixed this, but it could just be some bug in one of the new versions of software I upgraded to. I'll keep monitoring.
 
By chance are you serving any large binary files with Puppet?

No, all we're serving is some autofs maps and about a dozen tiny config files. 

Georgi Todorov

Nov 3, 2014, 11:39:14 AM
to puppet...@googlegroups.com
On Friday, October 31, 2014 9:50:41 AM UTC-4, Georgi Todorov wrote:
 Actually, sometime last night something happened and puppet stopped processing requests altogether. Stopping and starting httpd fixed this, but this could be just some bug in one of the new versions of software I upgraded to. I'll keep monitoring.

So, unfortunately the issue is not fixed :(. For whatever reason, everything ran great for a day. Catalog compiles were taking around 7 seconds, client runs finished in about 20s - happy days. Then overnight, the catalog compile times jumped to 20-30 seconds and client runs were taking 200+ seconds. A few hours later, no more requests were arriving at the puppet master at all. Is my HTTP server flaking out?

Running with --trace --evaltrace and under strace, it looks like most of the time is spent stat-ing:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.01    5.743474           9    673606    612864 stat
  7.72    0.534393           7     72102     71510 lstat
  6.76    0.467930       77988         6           wait4

That's a pretty poor "hit" rate (7k out of 74k stats)...

I've increased the check-in interval to 1 hour on all clients, and the master seems to be keeping up for now: catalog compile avg 8 seconds, client run avg 15 seconds, queue size 0.
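
For reference, that is just the agent-side run interval; a sketch of the change, assuming the stock config layout:

  # /etc/puppet/puppet.conf on the agents (3600 seconds = 1 hour)
  [agent]
    runinterval = 3600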

 Here is what a client run looks like when the server is keeping up:

Notice: Finished catalog run in 11.93 seconds
Changes:
Events:
Resources:
            Total: 522
Time:
       Filebucket: 0.00
             Cron: 0.00
         Schedule: 0.00
          Package: 0.00
          Service: 0.68
             Exec: 1.07
             File: 1.72
   Config retrieval: 13.35
         Last run: 1415032387
            Total: 16.82
Version:
           Config: 1415031292
           Puppet: 3.7.2


And when the server is just about dead:
Notice: Finished catalog run in 214.21 seconds
Changes:
Events:
Resources:
            Total: 522
Time:
             Cron: 0.00
       Filebucket: 0.00
         Schedule: 0.01
          Package: 0.02
          Service: 1.19
             File: 128.94
         Last run: 1415027092
            Total: 159.21
             Exec: 2.25
   Config retrieval: 26.80
Version:
           Config: 1415025705
           Puppet: 3.7.2


Probably 500 of the "Resources" are autofs maps using https://github.com/pdxcat/puppet-module-autofs/commits/master 

So there is definitely some bottleneck in the system; the problem is I can't figure out what it is. Is it disk I/O (iostat doesn't seem to think so)? Is it CPU (top looks fine)? Memory (ditto)? Is the httpd/Passenger combo not up to the task? Is the Postgres server not keeping up? There are so many components that it is hard for me to do a proper profile and find where the bottleneck is. Any ideas?

So far I've timed the ENC script that pulls the classes for a node: it takes less than 1 second.
From /var/log/messages, the catalog compile ranges from 7 seconds to 25 seconds (worst case, overloaded server).

Anyway, figured I'd share that; unfortunately Ruby was not the issue. Back to poking around and testing.

Henrik Lindberg

Nov 3, 2014, 7:49:33 PM
to puppet...@googlegroups.com
> Notice: Finished catalog run in *11.93* seconds
> Changes:
> Events:
> Resources:
> Total: 522
> Time:
> Filebucket: 0.00
> Cron: 0.00
> Schedule: 0.00
> Package: 0.00
> Service: 0.68
> Exec: 1.07
> *File: 1.72*
Your move away from Ruby 1.8.7 was a good move. That is essentially the same as installing more hardware, since Ruby versions after 1.8.7 are faster.

It may be worth trying Ruby 1.9.3 (p448 or later) just to ensure it is not a Ruby 2.x issue. It should be about on par with Ruby 2.x in terms of performance.

That is, I am thinking that with the slow Ruby 1.8.7 you were simply running out of compute resources, and then on Ruby 2.x you may have hit something else.

- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/

jcbollinger

Nov 4, 2014, 9:50:15 AM
to puppet...@googlegroups.com


On Monday, November 3, 2014 10:39:14 AM UTC-6, Georgi Todorov wrote:
On Friday, October 31, 2014 9:50:41 AM UTC-4, Georgi Todorov wrote:
 Actually, sometime last night something happened and puppet stopped processing requests altogether. Stopping and starting httpd fixed this, but this could be just some bug in one of the new versions of software I upgraded to. I'll keep monitoring.

So, unfortunately issue is not fixed :(. For whatever reason, everything ran great for a day. Catalog compiles were taking around 7 seconds, client runs finished in about 20s - happy days. Then overnight, the catalog compile times jumped to 20-30 seconds and client runs were now taking 200+ seconds. Few hours later, and there would be no more requests arriving at the puppet master at all. Is my http server flaking out? 

Running some --trace --evaltrace and strace it looks like most of the time is spent stat-ing:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.01    5.743474           9    673606    612864 stat
  7.72    0.534393           7     72102     71510 lstat
  6.76    0.467930       77988         6           wait4

That's a pretty poor "hit" rate (7k out of 74k stats)...

I've increased the check time to 1 hour on all clients, and the master seems to be keeping up for now - catalog compile avg 8 seconds, client run avg - 15 seconds, queue size = 0;

 Here is what a client run looks like when the server is keeping up:

Notice: Finished catalog run in 11.93 seconds
[...]
             File: 1.72
   Config retrieval: 13.35
[...]

And when the server is just about dead:
[...]
             File: 128.94
[...]
   Config retrieval: 26.80

 [...]
 
Probably 500 of the "Resources" are autofs maps using https://github.com/pdxcat/puppet-module-autofs/commits/master 

So there is definitely some bottle neck on the system, the problem is I can't figure out what it is. Is disk IO (iostat doesn't seem to think so), is it CPU (top looks fine), is it memory (ditto), is http/passenger combo not up to the task, is the postgres server not keeping up? There are so many components that it is hard for me to do a proper profile to find where the bottleneck is. Any ideas?

So far I've timed  the ENC script that pulls the classes for a node - takes less than 1 second.
From messages the catalog compile is from 7 seconds to 25 seconds (worst case, overloaded server). 

Anyway, figured I'd share that, unfortunately ruby was not the issue. Back to poking around and testing.


By far the biggest difference is File retrieval time.  This will be for File resources where you specify content via a 'source' parameter rather than via a 'content' property.  The agent must make a separate request to the master for each such file, and those are collectively taking a long time.  Most likely they are backing up behind a bottleneck, so that much of the time consumed per node is actually spent waiting for service.

If the CPU is not overloaded and you have free physical RAM, then it seems to me that the system service (i.e. httpd) and the I/O subsystem are your remaining candidates for the locus of the issue.  As you attempt to identify the bottleneck, do not ignore the total number of transactions serviced.  That could be as important as -- or even more important than -- the volume of data exchanged.

If the problem turns out to be related to the number / rate of transactions handled by httpd, then you could consider addressing it by switching File resources from using 'source' to using 'content' to specify file content.  That's a pretty clear win performance-wise for very small files, and it may be a win for you for somewhat larger files as well.  (Yes, I'm being vague.  Any hard numbers I threw out for "large" and "small" would be made up.)
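
A minimal sketch of the two styles (made-up paths, and of course you would use one form or the other, not both):

  # 'source': the agent makes an extra request to the master for each file
  file { '/etc/example.conf':
    ensure => file,
    source => 'puppet:///modules/example/example.conf',
  }

  # 'content': the file body ships inside the catalog, so no extra round trip
  file { '/etc/example.conf':
    ensure  => file,
    content => template('example/example.conf.erb'),
  }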


John

Georgi Todorov

Nov 4, 2014, 10:32:32 AM
to puppet...@googlegroups.com
On Tuesday, November 4, 2014 9:50:15 AM UTC-5, jcbollinger wrote:
By far the biggest difference is is File retrieval time.  This will be for File resources where you specify content via a 'source' parameter rather than via a 'content' property.  The agent must make a separate request to the master for each such file, and those are collectively taking a long time.  Most likely they are backing up behind a bottleneck, so that much of the time consumed per node is actually spent waiting for service.

I agree, there is something odd about File retrieval. We are creating a total of 10 autofs maps; however, when I run with --trace, it looks like every separate entry creates a concat::fragment (puppetlabs-concat) which is later combined into a single file. I'm not too familiar with what concat::fragment does, but a quick scan of the code shows a lot of "source" references. We might have to just go to template files instead of using the autofs module. That might take care of all our issues.

 
If the CPU is not overloaded and you have free physical RAM then It seems to me that the system service (i.e. httpd) and the I/O subsystem are your remaining candidates for the locus of the issue.  As you attempt to identify the bottleneck, do not ignore the total number of transactions serviced.  That could be as important as -- or even more important than -- the volume of data exchanged.
I've started graphing iostat, and after a day of data I'll switch back to the 30-minute interval to see what gets hit the hardest.
 
Henrik, I might have to go to 1.9.3 regardless, because foreman-rake does not work with the new Rake in 2.1. Something to do with routes... I'm not a Ruby/Rake person, so I can't fix it myself.

jcbollinger

Nov 5, 2014, 9:47:02 AM
to puppet...@googlegroups.com


On Tuesday, November 4, 2014 9:32:32 AM UTC-6, Georgi Todorov wrote:
On Tuesday, November 4, 2014 9:50:15 AM UTC-5, jcbollinger wrote:
By far the biggest difference is is File retrieval time.  This will be for File resources where you specify content via a 'source' parameter rather than via a 'content' property.  The agent must make a separate request to the master for each such file, and those are collectively taking a long time.  Most likely they are backing up behind a bottleneck, so that much of the time consumed per node is actually spent waiting for service.

I agree, there is something odd about File retrieval. We are creating a total of 10 autofs maps, however when I run trace, it looks like every separate entry creates a concat::fragment (puppetlabs-concat) which is later combined into a single file. I'm not too familliar with what concat::fragment does, but a quick scan of the code shows a lot of "source" references.


The Concat module provides a mechanism for managing a single file as a collection of separately-managed pieces, or 'fragments'.  This has a number of advantages, mostly deriving from the ability to spread management of a single target file across multiple manifests, modules, or even separate nodes.  As a practical matter, though, the implementation manages each fragment as a separate File, and combines them on the client.

Those fragment files, one per Concat::Fragment resource, can use either 'content' or 'source', as specified for that fragment.  I find it a bit odd that an autofs module would choose the 'source' option (with a 'puppet://*' URL as its value), but if it's indeed doing so then perhaps that is generating large numbers of tiny, 'source'd files. 
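
For illustration only -- this is generic puppetlabs-concat usage with invented names, not the autofs module's actual code -- a fragment built with 'content' rather than 'source' looks roughly like:

  concat { '/etc/auto.data':
    owner => 'root',
    group => 'root',
    mode  => '0644',
  }

  concat::fragment { 'auto.data_home':
    target  => '/etc/auto.data',
    content => "home -rw,intr nfs01:/export/home\n",
    order   => '10',
  }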

 
We might have to just go to template files, instead of using the autofs module. That might take care of all our issues.



It might.  Or you might find an alternative autofs module that operates differently.  Or perhaps you could just use your current autofs module in a different way.


John

Georgi Todorov

Nov 5, 2014, 12:33:37 PM
to puppet...@googlegroups.com
On Wednesday, November 5, 2014 9:47:02 AM UTC-5, jcbollinger wrote:
On Tuesday, November 4, 2014 9:32:32 AM UTC-6, Georgi Todorov wrote:
On Tuesday, November 4, 2014 9:50:15 AM UTC-5, jcbollinger wrote:
By far the biggest difference is is File retrieval time.  This will be for File resources where you specify content via a 'source' parameter rather than via a 'content' property.  The agent must make a separate request to the master for each such file, and those are collectively taking a long time.  Most likely they are backing up behind a bottleneck, so that much of the time consumed per node is actually spent waiting for service.

I agree, there is something odd about File retrieval. We are creating a total of 10 autofs maps, however when I run trace, it looks like every separate entry creates a concat::fragment (puppetlabs-concat) which is later combined into a single file. I'm not too familliar with what concat::fragment does, but a quick scan of the code shows a lot of "source" references.

The Concat module provides a mechanism for managing a single file as a collection of separately-managed pieces, or 'fragments'.  This has a number of advantages, mostly deriving from the ability to spread management of a single target file across multiple manifests, modules, or even separate nodes.  As a practical matter, though, the implementation manages each fragment as a separate File, and combines them on the client.

Those fragment files, one per Concat::Fragment resource, can use either 'content' or 'source', as specified for that fragment.  I find it a bit odd that an autofs module would choose the 'source' option (with a 'puppet://*' URL as its value), but if it's indeed doing so then perhaps that is generating large numbers of tiny, 'source'd files. 

You are right, I went through the autofs module code and it is using 'content' and not 'source' for the maps.

So far, with hourly check-ins there are no issues at all. It is not really a big deal for us.

Thanks for your help!

Wil Cooley

Nov 7, 2014, 8:59:39 PM
to puppet-users group


On Nov 4, 2014 7:32 AM, "Georgi Todorov" <ter...@gmail.com> wrote:
>
> I agree, there is something odd about File retrieval. We are creating a total of 10 autofs maps, however when I run trace, it looks like every separate entry creates a concat::fragment (puppetlabs-concat) which is later combined into a single file. I'm not too familliar with what concat::fragment does, but a quick scan of the code shows a lot of "source" references. We might have to just go to template files, instead of using the autofs module. That might take care of all our issues.
>

If you're generating the same map for a large subset of hosts, why not do that on the master and put the resulting file in a place the master's file service can serve it from?
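
Roughly like this -- the mount name and paths are invented for the example:

  # /etc/puppet/fileserver.conf on the master
  [autofs_maps]
    path /srv/puppet/autofs_maps
    allow *

  # and on the nodes, one File per generated map instead of hundreds of fragments
  file { '/etc/auto.data':
    ensure => file,
    source => 'puppet:///autofs_maps/auto.data',
  }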

Wil
