Trying to isolate performance issues with config retrieval.

Trevor Vaughan

May 3, 2012, 4:25:10 PM
to puppe...@googlegroups.com
All,

I've noticed an understandable correlation between catalog size and
config retrieval time. However, I'm a bit lost as to why it seems so
slow.

For an example 1.5M catalog:

24 seconds compile time
Almost instantaneous transfer time (localhost)
40 seconds registered by the puppet client as the 'Config retrieval time'

Why is there almost as much "other stuff" on the client as the
original compile time and is there anywhere that I can start looking
to potentially optimize it?

I'm perfectly willing to accept that it's just doing a lot and that's
how long it takes but it seems a bit odd.

Thanks,

Trevor

--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

Luke Kanies

May 5, 2012, 9:08:30 PM
to puppe...@googlegroups.com
On May 3, 2012, at 1:25 PM, Trevor Vaughan wrote:

> All,
>
> I've noticed an understandable correlation between catalog size and
> config retrieval time. However, I'm a bit lost as to why it seems so
> slow.
>
> For an example 1.5M catalog:
>
> 24 seconds compile time
> Almost instantaneous transfer time (localhost)
> 40 seconds registered by the puppet client as the 'Config retrieval time'
>
> Why is there almost as much "other stuff" on the client as the
> original compile time and is there anywhere that I can start looking
> to potentially optimize it?
>
> I'm perfectly willing to accept that it's just doing a lot and that's
> how long it takes but it seems a bit odd.

I can't quite remember exactly what is there any more, but there are multiple things that the client does besides just downloading the catalog, and I think most of it is captured within that recorded time:

* Download and load plugins
* Discover facts
* Upload them
* Download catalog
* Convert catalog to RAL resources

There are some other things that might show up in this time, but I don't remember:

* Add autorequires
* Convert containment edges to dependency edges (creation of relationship graph)
* Prefetch appropriate resource types

Most of it is relatively clear in the Configurer class.
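
If you want to see where that time actually goes, a rough sketch of timing each phase with stdlib Benchmark is below; the helper and the phase names are illustrative, not what Configurer really calls:

require 'benchmark'

# Illustrative helper only - wrap any agent phase and print how long it took.
def timed_phase(name)
  seconds = Benchmark.realtime { yield }
  puts "#{name}: #{'%.2f' % seconds}s"   # a real agent would use Puppet.notice
end

# Stand-in work; in practice each block would be one of the steps above
# (plugin download, fact discovery/upload, catalog download, conversion to RAL).
timed_phase("download catalog")       { sleep 0.1 }
timed_phase("convert catalog to RAL") { sleep 0.2 }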

--
Luke Kanies | http://about.me/lak | http://puppetlabs.com/ | +1-615-594-8199

Emile MOREL

May 4, 2012, 4:19:55 AM
to puppe...@googlegroups.com
Hi,

Trevor Vaughan wrote:
> All,
>
> I've noticed an understandable correlation between catalog size and
> config retrieval time. However, I'm a bit lost as to why it seems so
> slow.
>
> For an example 1.5M catalog:
>
> 24 seconds compile time
> Almost instantaneous transfer time (localhost)
> 40 seconds registered by the puppet client as the 'Config retrieval time'
>
> Why is there almost as much "other stuff" on the client as the
> original compile time and is there anywhere that I can start looking
> to potentially optimize it?
>

'Config retrieval time' includes the 'caching catalog time' (writing the
catalog to disk in YAML format). Your catalog is pretty big, so this
could be very slow.
You can check this by adding these lines in
lib/puppet/indirector/indirection.rb:

+ beginning_time = Time.now
Puppet.info "Caching #{self.name} for #{request.key}"
cache.save request(:save, result, *args)
+ Puppet.debug "Caching catalog time: #{(Time.now - beginning_time)}"

> I'm perfectly willing to accept that it's just doing a lot and that's
> how long it takes but it seems a bit odd.
>

I have made a patch to add an option for caching the catalog in Marshal
format (reading and writing this format is much faster); I will open a
pull request for this patch shortly.

> Thanks,
>
> Trevor
>
>

Otherwise, I work on Puppet optimizations and I have seen some strange
things too. I work with big catalogs (more than 1000 resources) and I want
to reduce the application time (around 20 sec).
I have seen two kinds of things:
- First, recent Puppet versions are slower (see PuppetVersions.png in the
attachment), mainly the latest one: 2.7.12.
- Second, Ruby 1.9.3 is very slow compared to Ruby EE (see PuppetRuby.png
in the attachment). I think the problem comes from the YAML library: caching
the catalog is very slow.

Has anyone else seen the same things?
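
If it helps, here is a rough stand-alone comparison I would run to see how
much of this is just the serialization library (the nested hash is only a
stand-in for a real catalog, not actual catalog data):

require 'benchmark'
require 'yaml'
require 'json'    # assumed available; on Ruby 1.8 it may need the json gem

# Illustrative data only - a nested structure roughly the shape of a catalog.
data = {
  "resources" => Array.new(10_000) do |i|
    { "type" => "File", "title" => "/tmp/file_#{i}",
      "parameters" => { "ensure" => "present", "mode" => "0644" } }
  end
}

Benchmark.bm(8) do |b|
  b.report("yaml")    { YAML.dump(data) }
  b.report("marshal") { Marshal.dump(data) }
  b.report("json")    { JSON.generate(data) }
end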

Thanks,
Émile


PuppetRuby.png
PuppetVersions.png

Emile MOREL

May 9, 2012, 3:28:21 AM
to puppe...@googlegroups.com
Hi,

Trevor Vaughan wrote:
> All,
>
> I've noticed an understandable correlation between catalog size and
> config retrieval time. However, I'm a bit lost as to why it seems so
> slow.
>
> For an example 1.5M catalog:
>
> 24 seconds compile time
> Almost instantaneous transfer time (localhost)
> 40 seconds registered by the puppet client as the 'Config retrieval time'
>
Which versions of Ruby and Puppet are you running?

'Config retrieval time' includes the 'caching catalog time' (writing the
catalog to disk in YAML format). Your catalog is pretty big, so caching
could be very slow.
You can check this by adding these lines in
lib/puppet/indirector/indirection.rb on the agent side:

+ beginning_time = Time.now
Puppet.info "Caching #{self.name} for #{request.key}"
cache.save request(:save, result, *args)
+ Puppet.debug "Caching catalog time: #{(Time.now - beginning_time)}"

> Why is there almost as much "other stuff" on the client as the
> original compile time and is there anywhere that I can start looking
> to potentially optimize it?
>

I have made a patch to add an option for caching the catalog in Marshal
format (reading and writing this format is much faster); I will open a
pull request for this patch shortly.

> I'm perfectly willing to accept that it's just doing a lot and that's
> how long it takes but it seems a bit odd.
>
> Thanks,
>
> Trevor
>
>

Émile

Peter Meier

May 9, 2012, 5:15:09 AM
to puppe...@googlegroups.com
> Otherwise i work on Puppet optimizations and i have seen some
> strange things too. I work on big catalog (more than 1000 resources)
> and i want to reduce the application time (around 20 sec).
> I have seen two kind of things:
> - First, recent Puppet versions are slowest (see PuppetVersions.png
> in attachment), mainly the last one: 2.7.12.

I can confirm that.

It's one of the reasons I'm still on 2.6 in a lot of places, as I see a
10-20% catalog compilation time increase with 2.7. That was with a
master pre-2.7.12 (might have been 2.7.1 or so), which is what I
tested; I never read that it got better, hence I never retested it.

> - Second, ruby 1.9.3 is very slow compare to ruby EE (see
> PuppetRuby.png in attachment). I think the problem come from the
> yaml library: caching catalog is very slow.

It is interesting that 1.9 is slower.

In any case, I remember that caching the catalog (read: serializing it as
YAML to disk) has been very slow for years, and this is well known. It
usually makes the agent look like it's hanging. This especially happens on
huge catalogs - I have catalogs with up to 10k resources.

Having a Puppet release focused on stability and performance would
really be appreciated. I think there is some room for improvement
in various places.

~pete

David Schmitt

May 9, 2012, 11:31:31 AM
to puppe...@googlegroups.com
On 09.05.2012 11:15, Peter Meier wrote:
> But anyway I remember that caching the catalog (read serializing it as
> yaml to disk) has been very slow for years and also known. It usually
> makes the agent look like it's hanging. This especially happens on huge
> catalogs - I have catalogs with up to 10k resources.
>
> Having a puppet release focused on stability and performance would
> really be appreciated. I think there could some room for improvements at
> various places.

+1!


Best Regards, David

Ashley Penney

May 9, 2012, 1:59:25 PM
to puppe...@googlegroups.com
It's interesting this problem has come up because all of a sudden I'm plagued
by incredibly slow config retrieval:

Changes:
            Total: 2
Events:
          Success: 2
            Total: 2
Resources:
          Changed: 2
      Out of sync: 2
            Total: 390
          Skipped: 6
Time:
       Filebucket: 0.00
        Resources: 0.00
             Host: 0.00
           Anchor: 0.00
   Ssh authorized key: 0.00
             User: 0.00
          Yumrepo: 0.01
         Firewall: 0.06
          Package: 0.14
             Exec: 0.17
            Total: 124.05
         Last run: 1336586198
             File: 15.17
           Augeas: 19.98
          Service: 4.21
   Config retrieval: 84.30
Version:
           Config: 1336586075
           Puppet: 2.7.14

I haven't been able to isolate a cause for it yet and I'm not really sure how to troubleshoot it
further. I have 3 different puppetmasters and I see this kind of retrieval time for all of
them recently. It started about a week ago.

(My catalog is about ~460k)







Daniel Pittman

May 9, 2012, 4:19:41 PM
to puppe...@googlegroups.com
On Wed, May 9, 2012 at 2:15 AM, Peter Meier <peter...@immerda.ch> wrote:

>> Otherwise i work on Puppet optimizations and i have seen some strange
>> things too. I work on big catalog (more than 1000 resources) and i want to
>> reduce the application time (around 20 sec).
>> I have seen two kind of things:
>> - First, recent Puppet versions are slowest (see PuppetVersions.png in
>> attachment), mainly the last one: 2.7.12.
>
> I can confirm that.
>
> It's one of the reasons I'm still on 2.6 at a lot of places, as I have a
> 10-20% catalog compilation time increase with 2.7. Even with a master
> pre-2.7.12 (might have been 2.7.1 or so), which is what I tested and I never
> read it got better, hence I never retested it.

We are aware that there have been some performance drops - and that we
have some fundamental problems, like a tendency to flush some cached
data too often when using environments - even if we don't fully
understand them yet.

Internally we are working on having some CI around performance so we
can know when things slow down.

>> - Second, ruby 1.9.3 is very slow compare to ruby EE (see PuppetRuby.png
>> in attachment). I think the problem come from the yaml library: caching
>> catalog is very slow.
>
> It is interesting that 1.9 is slower.
>
> But anyway I remember that caching the catalog (read serializing it as yaml
> to disk) has been very slow for years and also known. It usually makes the
> agent look like it's hanging. This especially happens on huge catalogs - I
> have catalogs with up to 10k resources.

It would be great to improve that, from our point of view, but I
wonder a bit: when is this a killer problem for everyone?

Also, someone commented about using Marshal to speed this up - sadly,
that isn't an acceptable solution. We have had problems with Marshal
data being impossible to transport between Ruby versions, and worse,
causing segfaults. That means we can't use it to persist data where
it might be read by a different Ruby version later.

> Having a puppet release focused on stability and performance would really be
> appreciated. I think there could some room for improvements at various
> places.

I don't think we will ever have a release that is exclusively focused
on performance.

The discovery of the horrible performance drop in 2.7, though, has
brought us to see that we need to focus on it in a more formal way.

You can expect that performance, stability, and correctness will drive
the roadmap for the platform team much more than "shiny new features"
will over the coming months.

These are vital things to deliver, and my team is absolutely committed to them.

...and I am genuinely sorry that we have not communicated about this
to everyone effectively. This should have been obvious outside our
walls, and wasn't.

--
Daniel Pittman
⎋ Puppet Labs Developer – http://puppetlabs.com
♲ Made with 100 percent post-consumer electrons

Deepak Giridharagopal

May 9, 2012, 4:55:17 PM
to puppe...@googlegroups.com
On Thu, May 3, 2012 at 2:25 PM, Trevor Vaughan <tvau...@onyxpoint.com> wrote:
All,

I've noticed an understandable correlation between catalog size and
config retrieval time. However, I'm a bit lost as to why it seems so
slow.

For an example 1.5M catalog:

24 seconds compile time
Almost instantaneous transfer time (localhost)
40 seconds registered by the puppet client as the 'Config retrieval time'

Why is there almost as much "other stuff" on the client as the
original compile time and is there anywhere that I can start looking
to potentially optimize it?

I'm perfectly willing to accept that it's just doing a lot and that's
how long it takes but it seems a bit odd.

Whether or not the 'config retrieval time' is unavoidable, this certainly seems like a part of the agent/master flow that could use some more transparency. If we logged the time spent for various parts of config retrieval, would that help at least triangulate the problem?

I've created https://projects.puppetlabs.com/issues/14386 to track this.

deepak

--
Deepak Giridharagopal / Puppet Labs

Jeff Weiss

May 10, 2012, 1:42:29 AM
to puppe...@googlegroups.com
First, Émile, your performance plots are awesome! Thank you so much for taking the time to put these together. They will help us tremendously as we investigate and kill off these issues.

Second, we recognized the performance drop prior to this thread and Max and Carl had already been doing outstanding work to isolate, categorize, and prioritize the various problems.

Third, we take these performance problems seriously and are committed to fixing them. Significant internal discussions have been taking place, though outside this public forum. We are neither ignoring this problem nor giving up. If we have appeared unresponsive, it's not because we don't think it's important, but rather because we failed at communicating how important it truly is. Bottom line: Performance is critical. We will fix the problems.

Fourth, Peter, Émile, and Trevor (or anyone else experiencing the problem), would you be willing to be pre-release testers of improvements? Our ops team is seeing the problem too, but that's only a single real-world data point. We need to make sure we don't self-optimize. We need your help to make sure the performance fixes address *your* problems not just ours.

Fifth, thank you for keeping this issue in the spotlight. If something seems "off," please don't just assume that's how it's supposed to be. A delightful user experience is a core Puppet Labs value. If something doesn't work the way you think it should then we need to make our user experience better, not beat you into submission. We will undoubtedly make decisions that adversely affect a small number of individuals in, hopefully, only a minor way, but performance is not, and will not be, one of those tradeoffs.

Sixth, if you are not getting the timely response you think you should, please feel free to publicly shame me (or anyone else at Puppet Labs) for not engaging you. Something like this would be awesome: "Dammit, Jeff! I thought you guys were working on this, but I haven't seen anything in a while. Give us a build to test already!"

-Jeff

----
Jeff Weiss
Software Developer
Puppet Labs, Inc.
jeff....@puppetlabs.com

Brice Figureau

May 10, 2012, 3:52:10 AM
to puppe...@googlegroups.com
On Wed, 2012-05-09 at 11:15 +0200, Peter Meier wrote:
> But anyway I remember that caching the catalog (read serializing it as
> yaml to disk) has been very slow for years and also known. It usually
> makes the agent look like it's hanging. This especially happens on
> huge catalogs - I have catalogs with up to 10k resources.

This has always been an issue (at least for me).

Back in the old days of 2.6.x, I even proposed that instead of
deserializing the catalog and serializing it again in a different format
for future use, we instead directly dump the serialized version we get
from the wire to disk (see #2892 [1]).
Of course this would break compatibility with previous versions, but at
least we could get rid of this (sometimes) large performance drop.

Note that this problem will certainly become much more pronounced if you
start to ship your catalogs with file content, instead of using file
resources.

[1]: https://projects.puppetlabs.com/issues/2892
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

Peter Meier

May 10, 2012, 4:44:28 AM
to puppe...@googlegroups.com
>> But anyway I remember that caching the catalog (read serializing it as yaml
>> to disk) has been very slow for years and also known. It usually makes the
>> agent look like it's hanging. This especially happens on huge catalogs - I
>> have catalogs with up to 10k resources.
>
> It would be great to improve that, from our point of view, but I
> wonder a bit: when is this a killer problem for everyone?

If I push a simple change and want to apply it immediately, hence
manually, and Puppet seems to hang for 2 minutes serializing the
catalog to disk, then this is not a killer problem, but it is at least very
annoying. I usually use tags to skip most of the catalog while
pushing immediate changes, but since the whole catalog still has to be
serialized, tags only speed things up while applying the catalog.

This means that if the shortened run (using tags) takes ~4 minutes
and 2 of those minutes seem to be spent serializing the catalog, it slows you
down quite a bit. Puppet doesn't have to be super fast, but if you start
thinking twice or thrice about whether you're really ready to run it with the
new changes, we're past the point where we could say it's still reasonably
fast.

Not to forget the memory consumption during serialization, which we
mentioned in #2892; it can spike quite high and affect the other running
services. And since the only thing I gain (AFAIR) from the
serialization is a cached catalog (which I don't really use,
as I run things from cron without using the cached catalog), it's
quite a high price to pay.

>> Having a puppet release focused on stability and performance would really be
>> appreciated. I think there could some room for improvements at various
>> places.
>
> I don't think we will ever have a release that is exclusively focused
> on performance.
>
> The discovery of the horrible performance drop in 2.7, though, has
> brought us to see that we need to focus on it in a more formal way.
>
> You can expect that performance, stability, and correctness will drive
> the roadmap for the platform team much more than "shiny new features"
> will over the coming months.
>
> These are vital things to deliver, and my team is absolutely
> committed to them.
>
> ...and I am genuinely sorry that we have not communicated about this
> to everyone effectively. This should have been obvious outside our
> walls, and wasn't.

AFAIR, a couple of points have been raised in the past few years where
Puppet could be improved regarding performance and resource usage, but
none of them were really followed up on.

However, in my opinion each of the past few releases has become
significantly slower, without a real investigation into why
this happened and whether we're willing to pay that price.

Having a focus on performance, and also looking at changes regarding
their performance impact (or improvement), would already be an
improvement for me. So if we knew with each new release whether Puppet
is slower or even faster and why that's the case, that would be a
good first step! :)

Thanks a lot!

~pete

Peter Meier

May 10, 2012, 4:48:08 AM
to puppe...@googlegroups.com
> Fourth, Peter, Émile, and Trevor (or anyone else experiencing the
> problem), would you be willing to be pre-release testers of
> improvements? Our ops team is seeing the problem too, but that's
> only a single real-world data point. We need to make sure we don't
> self-optimize. We need your help to make sure the performance fixes
> address *your* problems not just ours.

I can usually do some testing. It might take me a couple of days to
report back, but if you're offering me something to test and I say yes,
I'll do it - and you can always come back and ask me for it (or
yell ;) ) if I don't seem to report back.

So, yes!

Thanks...

~pete

DEGREMONT Aurelien

May 10, 2012, 11:36:50 AM
to puppe...@googlegroups.com
Daniel Pittman wrote:
> It would be great to improve that, from our point of view, but I
> wonder a bit: when is this a killer problem for everyone?
>

We are using Puppet only interactively, on the command line. When writing
your manifests, you apply your freshly written module a
couple of times to check whether it works as expected. If each Puppet run
takes one minute or more (as is currently the case), it is really
annoying for the admin.
Moreover, users do not understand why it takes 40 seconds to simply copy a
file when running "puppet agent -t --tags foo".
It is sometimes difficult to explain to them that Puppet is such a
great tool :)



Aurélien

Ashley Penney

May 10, 2012, 1:30:28 PM
to puppe...@googlegroups.com
On Thu, May 10, 2012 at 1:42 AM, Jeff Weiss <jeff....@puppetlabs.com> wrote:

Fourth, Peter, Émile, and Trevor (or anyone else experiencing the problem), would you be willing to be pre-release testers of improvements? Our ops team is seeing the problem too, but that's only a single real-world data point.  We need to make sure we don't self-optimize.  We need your help to make sure the performance fixes address *your* problems not just ours.

I just wanted to chime in to offer to test any performance patches because I'm definitely interested in helping get to the bottom of any performance drops.  We also see the massive memory use during runs that other people have seen, which has resulted in us stopping puppet runs on certain over-committed boxes as it can tip them over the edge.  Anything that targets either of these areas is something I'll happily hack in by hand for testing purposes.  (I have 4 puppetmasters currently, so it's easy to target just a few and not the rest for comparisons.)

Nick Lewis

May 10, 2012, 2:44:18 PM
to puppe...@googlegroups.com
On that note: After spending a few hours hunting yesterday, we identified the cause of a significant performance regression in compilation when using modules with metadata. I've attached a patch for anyone who would like to test it out. We're seeing approximately 5-10x faster compilation.

The basic gist of the issue is that every time Puppet looks up a class (via include/import/autoload), it constructs a list of possible modules the class could be in (based on its name). For each module, it will create a Puppet::Module instance, so that it can ask which manifests in that module may contain the class. Unfortunately, creating the Puppet::Module instance will cause the metadata.json file to be read and parsed. Given that this happens several times *per include*, it gets expensive quickly.

Fortunately, as we already have a list of Puppet::Module objects for all the modules, the simple solution is to just use one of those.

Once again, a patch is attached. Please test this and provide feedback. Note that this patch should not be applied to Puppet 2.6, or to 2.7.3 or earlier. Due to a bug in caching, it will cause significantly *worse* performance.
typeloader-compilation-performance.patch
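
The general shape of the fix, for those curious, is simply to stop rebuilding
the expensive object on every lookup. A simplified illustration of the pattern
(not the attached patch, and not Puppet's actual classes):

require 'json'

# Hypothetical stand-in for Puppet::Module: parsing metadata.json is the
# expensive part, so memoize it per object.
class FakeModule
  def initialize(path)
    @path = path
  end

  def metadata
    @metadata ||= begin
      file = File.join(@path, "metadata.json")
      File.exist?(file) ? JSON.parse(File.read(file)) : {}
    end
  end
end

# Slow pattern: a fresh object (and a fresh metadata.json parse) per class lookup.
def slow_lookup(path)
  FakeModule.new(path).metadata
end

# Fast pattern: build the list of modules once and reuse those objects.
MODULES = Hash.new { |cache, path| cache[path] = FakeModule.new(path) }
def fast_lookup(path)
  MODULES[path].metadata
end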

James Turnbull

May 10, 2012, 3:39:18 PM
to puppe...@googlegroups.com
Nick Lewis wrote:
> On Thursday, May 10, 2012 at 10:30 AM, Ashley Penney wrote:
>> On Thu, May 10, 2012 at 1:42 AM, Jeff Weiss <jeff....@puppetlabs.com
>> <mailto:jeff....@puppetlabs.com>> wrote:
>>>
>>> Fourth, Peter, Émile, and Trevor (or anyone else experiencing the
>>> problem), would you be willing to be pre-release testers of
>>> improvements? Our ops team is seeing the problem too, but that's only
>>> a single real-world data point. We need to make sure we don't
>>> self-optimize. We need your help to make sure the performance fixes
>>> address *your* problems not just ours.
>>
>> I just wanted to chime in to offer to test any performance patches
>> because I'm definitely interested in helping get to the bottom of any
>> performance drops. We also see the massive memory use during run
>> issues other people have seen which has resulted in us stopping puppet
>> runs on certain over-commited boxes as it can tip them over the edge.
>> Anything that targets either of this areas is something I'll happily
>> hack in by hand for testing purposes. (I have 4 puppetmasters
>> currently so it's easy to target just a few and not the rest
>> for comparisons).
>
> On that note: After spending a few hours hunting yesterday, we
> identified the cause of a significant performance regression in
> compilation when using modules with metadata. I've attached a patch for
> anyone who would like to test it out. We're seeing approximately 5-10x
> faster compilation.
>

Kudos guys! That's awesome.

James


--
James Turnbull
Puppet Labs
1-503-734-8571
To schedule a meeting with me: http://tungle.me/jamtur01

Luke Kanies

May 10, 2012, 4:02:51 PM
to puppe...@googlegroups.com
Excellent work, Nick - thank you.

-- 
<typeloader-compilation-performance.patch>

Ashley Penney

May 10, 2012, 4:37:27 PM
to puppe...@googlegroups.com
You guys are the best, quite the quick turn around!  I'll be applying this to our largest puppetmaster tomorrow and running a series of tests so I can give you some feedback.

Trevor Vaughan

May 10, 2012, 6:12:50 PM
to puppe...@googlegroups.com
Nick,

I just tried this patch with 2.7.13 and I didn't notice a large
difference in timing overall but I also didn't notice any adverse
effects.

And, yes, I'm more than happy to try out new patches for optimization.

Thanks,

Trevor

On Thu, May 10, 2012 at 2:44 PM, Nick Lewis <ni...@puppetlabs.com> wrote:

Trevor Vaughan

May 10, 2012, 6:15:18 PM
to puppe...@googlegroups.com
Apologies for the double post but my Ruby version is 1.8.7.

Thanks,

Trevor

Trevor Vaughan

May 10, 2012, 6:24:15 PM
to puppe...@googlegroups.com
Based on this change, the caching catalog time appears to be fairly
consistently about 1/10 of the total catalog retrieval number.

Thanks,

Trevor

On Wed, May 9, 2012 at 3:28 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
> Hi,
>
> Trevor Vaughan wrote:
> Émile
>
>



Peter Meier

May 11, 2012, 4:25:37 PM
to puppe...@googlegroups.com
> On that note: After spending a few hours hunting yesterday, we
> identified the cause of a significant performance regression in
> compilation when using modules with metadata. I've attached a patch
> for anyone who would like to test it out. We're seeing approximately
> 5-10x faster compilation.

Awesome work!

> The basic gist of the issue is that every time Puppet looks up a
> class (via include/import/autoload), it constructs a list of possible
> modules the class could be in (based on its name). For each module,
> it will create a Puppet::Module instance, so that it can ask which
> manifests in that module may contain the class. Unfortunately,
> creating the Puppet::Module instance will cause the metadata.json
> file to be read and parsed. Given that this happens several times
> *per include*, it gets expensive quickly.
>
> Fortunately, as we already have a list of Puppet::Module objects for
> all the modules, the simple solution is to just use one of those.
>
> Once again, a patch is attached. Please test this and provide
> feedback. Note that this patch should not be applied to Puppet 2.6,
> or to 2.7.3 or earlier. Due to a bug in caching, it will cause
> significantly *worse* performance.

Question: I have only about 2-5 modules (of >120) that have a
metadata.json file. Would it make sense to test this in my
situation as well? I don't yet understand whether even looking for the file
is already an expensive process, or whether mainly reading the metadata.json
file is expensive.
But looking at your patch, it sounds better anyway not to create a lot of
unnecessary objects.

So maybe this would also explain the compilation time increase I see
with 2.7.

Thanks

~pete

PS: I assume I can also find that patch in your git repo?

signature.asc

Nick Lewis

May 11, 2012, 4:29:48 PM
to puppe...@googlegroups.com
Well it's probably worth trying at least, since the patch is simple. I think the difference really depends on which particular modules have metadata, and how your modules are related. If you often refer to nested classes from within those modules, or refer to nested classes that live in those modules, it may definitely help. Also, even if the files don't exist, it's still a lot of stats, which could add up.

This would definitely be an interesting data point.

Trevor Vaughan

May 11, 2012, 8:16:03 PM
to puppe...@googlegroups.com
For another point of data, I tried a handful of different ciphers
between the server and client with no noticeable difference in time.

Emile MOREL

May 14, 2012, 5:11:09 AM
to puppe...@googlegroups.com
Daniel Pittman wrote:
>
> It would be great to improve that, from our point of view, but I
> wonder a bit: when is this a killer problem for everyone?
>
> Also, someone commented about using marshall to speed this up - sadly,
> that isn't an acceptable solution. We have had problems with Marshall
> data being impossible to transport between Ruby versions, and worse,
> causing segfaults. That means we can't use it to persist data where
> it might be read by a different version Ruby later.
>
>
The cached catalog may not be used as a cache at all. We are using Puppet only
in interactive mode, and we use the cache to extract some data (like a kind
of report).
So if the cache doesn't survive Ruby upgrades, or if it sometimes causes a
segfault (it has never happened to me so far), it does not matter. On the
other hand, gaining a few seconds is much appreciated.

Émile

Emile MOREL

May 14, 2012, 5:16:11 AM
to puppe...@googlegroups.com
Jeff Weiss wrote:
> First, Émile, your performance plots are awesome! Thank you so much for taking the time to put these together. They will help us tremendously as we investigate and kill off these issues.
>

You're welcome; sharing my results with the community is only natural ;)

> Second, we recognized the performance drop prior to this thread and Max and Carl had already been doing outstanding work to isolate, categorize, and prioritize the various problems.
>
> Third, we take these performance problems seriously and are committed to fixing them. Significant internal discussions have been taking place, though outside this public forum. We are neither ignoring this problem nor giving up. If we have appeared unresponsive, it's not because we don't think it's important, but rather because we failed at communicating how important it truly is. Bottom line: Performance is critical. We will fix the problems.
>
> Fourth, Peter, Émile, and Trevor (or anyone else experiencing the problem), would you be willing to be pre-release testers of improvements? Our ops team is seeing the problem too, but that's only a single real-world data point. We need to make sure we don't self-optimize. We need your help to make sure the performance fixes address *your* problems not just ours.
>

Of course, we will test any proposals that improve performance.

Émile

Brice Figureau

May 14, 2012, 8:09:22 AM
to puppe...@googlegroups.com
On Fri, 2012-05-11 at 20:16 -0400, Trevor Vaughan wrote:
> For another point of data, I tried a handful of different ciphers
> between the server and client with no noticeable difference in time.

I think (but might be totally wrong) that since you're not seeing any
real improvement with Nick's patch, and you also don't gain much by
disabling caching altogether, what might be taking time for you is the
PSON deserialization or the catalog conversion. Both will create numerous
objects, which is known to be quite expensive.

I suggest you time (with the Puppet::Util.benchmark method, for instance)
the various parts of the Configurer and report the details back to us.

Hope that helps,

Daniel Pittman

May 14, 2012, 12:46:58 PM
to puppe...@googlegroups.com
On Mon, May 14, 2012 at 2:11 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
> Daniel Pittman wrote:
>
>> It would be great to improve that, from our point of view, but I
>> wonder a bit: when is this a killer problem for everyone?
>>
>> Also, someone commented about using marshall to speed this up - sadly,
>> that isn't an acceptable solution.  We have had problems with Marshall
>> data being impossible to transport between Ruby versions, and worse,
>> causing segfaults.  That means we can't use it to persist data where
>> it might be read by a different version Ruby later.
>
> Caching catalog may not be used as a cache. We are using puppet only in
> interactive mode and we use the cache to extract some data (like a kind of
> report).

That is an interesting use-case, and not one I had expected - I am
surprised someone is using the "cached" catalog as an API like that.
What report style data do you extract?


Supporting multiple cache formats is a trade-off, because there is
additional complexity to identify the content to read safely, which
means more code - and more code means more bugs. It sounds like you
would almost be as happy to just disable the caching entirely, apart
from your reporting needs, yes?

Luke Kanies

May 14, 2012, 12:49:00 PM
to puppe...@googlegroups.com
On May 14, 2012, at 9:46 AM, Daniel Pittman wrote:

> On Mon, May 14, 2012 at 2:11 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
>> Daniel Pittman wrote:
>>
>>> It would be great to improve that, from our point of view, but I
>>> wonder a bit: when is this a killer problem for everyone?
>>>
>>> Also, someone commented about using marshall to speed this up - sadly,
>>> that isn't an acceptable solution. We have had problems with Marshall
>>> data being impossible to transport between Ruby versions, and worse,
>>> causing segfaults. That means we can't use it to persist data where
>>> it might be read by a different version Ruby later.
>>
>> Caching catalog may not be used as a cache. We are using puppet only in
>> interactive mode and we use the cache to extract some data (like a kind of
>> report).
>
> That is an interesting use-case, and not one I had expected - I am
> surprised someone is using the "cached" catalog as an API like that.
> What report style data do you extract?

This is actually really common - we've actually built reports like this for customers, even.

It's the only "database" available when storeconfigs is too slow to use in production.

I'd like to see the 'catalog' face extended to provide the operations people want, so they're a bit more isolated from the storage format.

> Supporting multiple cache formats is a trade-off, because there is
> additional complexity to identify the content to read safely, which
> means more code - and more code to have bugs. It sounds like you
> would almost be as happy to just disable the caching entirely, other
> than your reporting needs, yes?



--
Luke Kanies | http://about.me/lak | http://puppetlabs.com/ | +1-615-594-8199

Daniel Pittman

May 14, 2012, 12:53:29 PM
to puppe...@googlegroups.com
On Mon, May 14, 2012 at 9:49 AM, Luke Kanies <lu...@puppetlabs.com> wrote:
> On May 14, 2012, at 9:46 AM, Daniel Pittman wrote:
>> On Mon, May 14, 2012 at 2:11 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
>>> Daniel Pittman wrote:
>>>
>>>> It would be great to improve that, from our point of view, but I
>>>> wonder a bit: when is this a killer problem for everyone?
>>>>
>>>> Also, someone commented about using marshall to speed this up - sadly,
>>>> that isn't an acceptable solution.  We have had problems with Marshall
>>>> data being impossible to transport between Ruby versions, and worse,
>>>> causing segfaults.  That means we can't use it to persist data where
>>>> it might be read by a different version Ruby later.
>>>
>>> Caching catalog may not be used as a cache. We are using puppet only in
>>> interactive mode and we use the cache to extract some data (like a kind of
>>> report).
>>
>> That is an interesting use-case, and not one I had expected - I am
>> surprised someone is using the "cached" catalog as an API like that.
>> What report style data do you extract?
>
> This is actually really common - we've actually built reports like this for customers, even.
> It's the only "database" available when storeconfigs is too slow to use in production.

Hah. I must have missed that. Obviously, I do need to better
understand that particular use.

> I'd like to see the 'catalog' face extended to provide the operations people want, so they're a bit more isolated from the storage format.

That would allow greater abstraction here - regardless of our internal
format, that could perform transformations into whatever consumption
format people want.

R.I.Pienaar

May 14, 2012, 12:59:24 PM
to puppe...@googlegroups.com
I've built a few things around the catalogs: a diff tool to help with upgrades
(you can compare catalogs compiled with different versions of Puppet to see if
there are any changes due to parser behaviour etc. that will bite you), and also
little things to dump out all files or all packages etc. for a given catalog,
which is great to help people discover what is and isn't managed on a machine.

Others have built things for vim that warn as soon as you edit a managed
file, etc.
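
For anyone who wants to roll their own along those lines, a minimal sketch: it
assumes you have the catalog saved as JSON/PSON somewhere (for example the
output of 'puppet master --compile <node>' with any leading log lines
stripped); the field names are from memory, so adjust them to whatever your
dump actually contains.

#!/usr/bin/env ruby
# List every File resource title found in a JSON/PSON catalog dump.
require 'json'

doc = JSON.parse(File.read(ARGV[0] || "catalog.json"))
resources = (doc["data"] || doc)["resources"] || []

resources.each do |res|
  puts res["title"] if res["type"] == "File"
end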

Luke Kanies

May 14, 2012, 1:21:15 PM
to puppe...@googlegroups.com
On May 14, 2012, at 9:53 AM, Daniel Pittman wrote:

> On Mon, May 14, 2012 at 9:49 AM, Luke Kanies <lu...@puppetlabs.com> wrote:
>> On May 14, 2012, at 9:46 AM, Daniel Pittman wrote:
>>> On Mon, May 14, 2012 at 2:11 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
>>>> Daniel Pittman wrote:
>>>>
>>>>> It would be great to improve that, from our point of view, but I
>>>>> wonder a bit: when is this a killer problem for everyone?
>>>>>
>>>>> Also, someone commented about using marshall to speed this up - sadly,
>>>>> that isn't an acceptable solution. We have had problems with Marshall
>>>>> data being impossible to transport between Ruby versions, and worse,
>>>>> causing segfaults. That means we can't use it to persist data where
>>>>> it might be read by a different version Ruby later.
>>>>
>>>> Caching catalog may not be used as a cache. We are using puppet only in
>>>> interactive mode and we use the cache to extract some data (like a kind of
>>>> report).
>>>
>>> That is an interesting use-case, and not one I had expected - I am
>>> surprised someone is using the "cached" catalog as an API like that.
>>> What report style data do you extract?
>>
>> This is actually really common - we've actually built reports like this for customers, even.
>> It's the only "database" available when storeconfigs is too slow to use in production.
>
> Hah. I must have missed that. Obviously, I do need to better
> understand that particular use.

Are you making changes to this area right now? Not that I have a complete picture, but I'm happy to do a brain-dump on this. Basically, think local access to a single-node PuppetDB instance.

>> I'd like to see the 'catalog' face extended to provide the operations people want, so they're a bit more isolated from the storage format.
>
> That would allow greater abstraction here - regardless of our internal
> format, that could perform transformations into whatever consumption
> format people want.

Yep. I've been wanting a 'puppet catalog select' action to do all kinds of subgraph selection and presentation, with optional piping that to 'puppet apply', for instance.

Daniel Pittman

May 14, 2012, 2:52:33 PM
to puppe...@googlegroups.com
On Mon, May 14, 2012 at 10:21 AM, Luke Kanies <lu...@puppetlabs.com> wrote:
> On May 14, 2012, at 9:53 AM, Daniel Pittman wrote:
>> On Mon, May 14, 2012 at 9:49 AM, Luke Kanies <lu...@puppetlabs.com> wrote:
>>> On May 14, 2012, at 9:46 AM, Daniel Pittman wrote:
>>>> On Mon, May 14, 2012 at 2:11 AM, Emile MOREL <mor...@ocre.cea.fr> wrote:
>>>>> Daniel Pittman wrote:
>>>>>
>>>>>> It would be great to improve that, from our point of view, but I
>>>>>> wonder a bit: when is this a killer problem for everyone?
>>>>>>
>>>>>> Also, someone commented about using marshall to speed this up - sadly,
>>>>>> that isn't an acceptable solution.  We have had problems with Marshall
>>>>>> data being impossible to transport between Ruby versions, and worse,
>>>>>> causing segfaults.  That means we can't use it to persist data where
>>>>>> it might be read by a different version Ruby later.
>>>>>
>>>>> Caching catalog may not be used as a cache. We are using puppet only in
>>>>> interactive mode and we use the cache to extract some data (like a kind of
>>>>> report).
>>>>
>>>> That is an interesting use-case, and not one I had expected - I am
>>>> surprised someone is using the "cached" catalog as an API like that.
>>>> What report style data do you extract?
>>>
>>> This is actually really common - we've actually built reports like this for customers, even.
>>> It's the only "database" available when storeconfigs is too slow to use in production.
>>
>> Hah.  I must have missed that.  Obviously, I do need to better
>> understand that particular use.
>
> Are you making changes to this area right now?   Not that I have a complete picture, but I'm happy to do a brain-dump on this.  Basically, think local access to a single-node PuppetDB instance.

No, our immediate roadmap doesn't touch on improving the space.

With Telly on the verge of going out the door, though, we are going to
be refocusing on what that list *should* be.

>>> I'd like to see the 'catalog' face extended to provide the operations people want, so they're a bit more isolated from the storage format.
>>
>> That would allow greater abstraction here - regardless of our internal
>> format, that could perform transformations into whatever consumption
>> format people want.
>
> Yep.  I've been wanting a 'puppet catalog select' action to do all kinds of subgraph selection and presentation, with optional piping that to 'puppet apply', for instance.

*nod* That sounds like an awesome demonstration that we have
delivered useful access to that catalog data.

Trevor Vaughan

May 14, 2012, 5:19:28 PM
to puppe...@googlegroups.com
Any suggestions on where to benchmark here? I tried to figure out
where to hit it but didn't really get anywhere with it.

I did try using pure ruby to load the catalog from disk using the
Puppet methods and it took 7 seconds in a case where Config retrieval
is reporting around 40 seconds.

Thanks,

Trevor



Nick Lewis

May 14, 2012, 5:22:38 PM
to puppe...@googlegroups.com
On Monday, May 14, 2012 at 2:19 PM, Trevor Vaughan wrote:
Any suggestions on where to benchmark here? I tried to figure out
where to hit it but didn't really get anywhere with it.

I did try using pure ruby to load the catalog from disk using the
Puppet methods and it took 7 seconds in a case where Config retrieval
is reporting around 40 seconds.

The catalog is being serialized or deserialized three times: once to PSON on the master to transfer to the agent, once from PSON to revive on the agent, and once to YAML for the agent to cache it. That seems like it basically ought to account for the extra time, if one deserialize was 7 seconds.

Trevor Vaughan

May 14, 2012, 5:29:21 PM
to puppe...@googlegroups.com
Yes, this does make sense.

Mystery solved I suppose.

But, in that case, couldn't you skip that last phase if you just pass
YAML the whole way and cut off a third of the load time?

Trevor

Nick Lewis

May 14, 2012, 5:45:39 PM
to puppe...@googlegroups.com
On Monday, May 14, 2012 at 2:29 PM, Trevor Vaughan wrote:
Yes, this does make sense.

Mystery solved I suppose.

But, in that case, couldn't you skip that last phase if you just pass
YAML the whole way and cut off a third of the load time?

Yes, or more ideally, we would store the catalog in PSON rather than YAML. Unfortunately, catalog retrieval is somewhat inflexible today. The bit doing the caching to YAML has no idea where the catalog came from, making it difficult to effectively spool the catalog to disk like this. That would definitely be an improvement once we can make it, though.

Interestingly, I just tried this with a catalog of my own (only 7 seconds to compile), and it took around .5 seconds to save to PSON, but 5.6 seconds to save to YAML. On the other hand, it took 2.5 seconds to revive from PSON and only .94 to revive from YAML. So in any case, using PSON as the transmission format definitely seems optimal, and storing as PSON rather than YAML could also significantly reduce the overhead from caching it.
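
One rough way to sidestep the extra serialization entirely would be to spool
the wire payload straight to disk and only parse it when the cached copy is
actually needed. A sketch of the idea (the path and method names here are
hypothetical; this is not how the Configurer is wired today):

require 'json'

CACHE_FILE = "/var/lib/puppet/client_data/catalog.json"  # illustrative path

# Write the raw response body as-is instead of reviving it and re-dumping to YAML.
def cache_raw_catalog(response_body)
  File.open(CACHE_FILE, "w") { |f| f.write(response_body) }
end

# Parse only when the cached catalog is really needed (e.g. the master is down).
def load_cached_catalog
  JSON.parse(File.read(CACHE_FILE))
end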

Luke Kanies

May 14, 2012, 6:39:19 PM
to puppe...@googlegroups.com
On May 14, 2012, at 2:22 PM, Nick Lewis wrote:

On Monday, May 14, 2012 at 2:19 PM, Trevor Vaughan wrote:
Any suggestions on where to benchmark here? I tried to figure out
where to hit it but didn't really get anywhere with it.

I did try using pure ruby to load the catalog from disk using the
Puppet methods and it took 7 seconds in a case where Config retrieval
is reporting around 40 seconds.

The catalog is being serialized or deserialized three times: once to PSON on the master to transfer to the agent, once from PSON to revive on the agent, and once to YAML for the agent to cache it. That seems like it basically ought to account for the extra time, if one deserialize was 7 seconds.

JSON serialization is a negligible amount of time - most of that is being eaten up in YAML.

Just getting rid of that yaml serialization -- or switching to JSON -- would make a huge difference.

Brice Figureau

May 15, 2012, 3:35:32 AM
to puppe...@googlegroups.com
I'll sound like a broken record, but that'd be awesome if we could
resurrect #2892 from the grave:
https://projects.puppetlabs.com/issues/2892

James Turnbull

May 15, 2012, 3:43:20 AM
to puppe...@googlegroups.com
>> Just getting rid of that yaml serialization -- or switching to JSON --
>> would make a huge difference.
>>
>
> I'll sound like a broken record, but that'd be awesome if we could
> resurrect #2892 from the grave:
> https://projects.puppetlabs.com/issues/2892
>

Didn't that get supplanted by:

https://projects.puppetlabs.com/issues/3714

Brice Figureau

May 15, 2012, 4:17:08 AM
to puppe...@googlegroups.com
On Tue, 2012-05-15 at 00:43 -0700, James Turnbull wrote:
> >> Just getting rid of that yaml serialization -- or switching to JSON --
> >> would make a huge difference.
> >>
> >
> > I'll sound like a broken record, but that'd be awesome if we could
> > resurrect #2892 from the grave:
> > https://projects.puppetlabs.com/issues/2892
> >
>
> Didn't that get supplanted by:
>
> https://projects.puppetlabs.com/issues/3714
>

It's complementary: caching in JSON is good because it's faster.
I still think we don't need to re-serialize what we already have
serialized, so we'd save a serialization in the cycle (i.e. 1/3 of the
total time spent retrieving the catalog), hence my request to resurrect
#2892.

James Turnbull

May 15, 2012, 4:25:04 AM
to puppe...@googlegroups.com
Brice Figureau wrote:
> On Tue, 2012-05-15 at 00:43 -0700, James Turnbull wrote:
>>>> Just getting rid of that yaml serialization -- or switching to JSON --
>>>> would make a huge difference.
>>>>
>>> I'll sound like a broken record, but that'd be awesome if we could
>>> resurrect #2892 from the grave:
>>> https://projects.puppetlabs.com/issues/2892
>>>
>> Didn't that get supplanted by:
>>
>> https://projects.puppetlabs.com/issues/3714
>>
>
> It's additional: caching in JSON is good because it's faster.
> I still think we don't need to re-serialize what we already have
> serialized, so we'd gain a serialization in the cycle (ie 1/3 of the
> total time retrieving the catalog), hence my request of resurrecting
> #2892.

I re-opened and assigned to Daniel for a comment.

Trevor Vaughan

May 15, 2012, 1:09:37 PM
to puppe...@googlegroups.com
Just out of curiosity: since all clients report back to the server
which versions of Puppet and Ruby they are running, couldn't you
dynamically select a serialization format based on optimization
factors?

Marshal -> PSON -> YAML

Just curious.
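
Purely as a sketch of what I mean (rubyversion/puppetversion are standard
facts, but the selection logic is just my suggestion, nothing Puppet does
today):

require 'rubygems'

# Illustrative only: pick a cache format from what the agent reports about itself.
def preferred_format(facts)
  if facts["rubyversion"] == RUBY_VERSION
    :marshal   # only safe when agent and master run the identical Ruby
  elsif Gem::Version.new(facts["puppetversion"].to_s) >= Gem::Version.new("2.7.0")
    :pson
  else
    :yaml
  end
end

puts preferred_format("rubyversion" => "1.8.7", "puppetversion" => "2.7.14")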

Thanks,

Trevor

Andrew Parker

May 15, 2012, 1:34:12 PM
to puppe...@googlegroups.com
That ticket (#2892) is about the amount of memory being used during conversion.
Is that still the problem? Or is the problem really now centered around the amount
of time that the conversion is taking? Either way using JSON all the way through
would probably fix both problems.

Luke Kanies

May 15, 2012, 1:43:58 PM
to puppe...@googlegroups.com
Exactly - the problem is that we're using YAML for a potentially very large file, and definitely for a very complex data structure with lots of internal references. That problem results in lots of symptoms, including huge memory consumption and painful slowness.

Luke Kanies

May 15, 2012, 2:01:18 PM
to puppe...@googlegroups.com
We're already always sending over JSON - we're just storing in YAML on the client. This is my stupidity in the beginning; I just couldn't get it done with the current model of how we do caching.

Aurélien Degrémont

May 15, 2012, 3:13:26 PM
to puppe...@googlegroups.com, James Turnbull
On 15/05/2012 10:25, James Turnbull wrote:
> Brice Figureau wrote:
>> On Tue, 2012-05-15 at 00:43 -0700, James Turnbull wrote:
>>>>> Just getting rid of that yaml serialization -- or switching to JSON --
>>>>> would make a huge difference.
>>>>>
>>>> I'll sound like a broken record, but that'd be awesome if we could
>>>> resurrect #2892 from the grave:
>>>> https://projects.puppetlabs.com/issues/2892
>>>>
>>> Didn't that get supplanted by:
>>>
>>> https://projects.puppetlabs.com/issues/3714
>>>
>> It's additional: caching in JSON is good because it's faster.
>> I still think we don't need to re-serialize what we already have
>> serialized, so we'd gain a serialization in the cycle (ie 1/3 of the
>> total time retrieving the catalog), hence my request of resurrecting
>> #2892.
> I re-opened and assigned to Daniel for a comment.
>
When I see the performance difference with PSON, even without removing
the extra re-serialization, I think it could be a huge improvement if the
caching could be done in PSON. This would be a significant speed-up for big
catalogs! We need it!


Aurélien

Daniel Pittman

May 15, 2012, 4:48:21 PM
to puppe...@googlegroups.com
On Tue, May 15, 2012 at 1:25 AM, James Turnbull <ja...@puppetlabs.com> wrote:
> Brice Figureau wrote:
>> On Tue, 2012-05-15 at 00:43 -0700, James Turnbull wrote:
>>>>> Just getting rid of that yaml serialization -- or switching to JSON --
>>>>> would make a huge difference.
>>>>>
>>>> I'll sound like a broken record, but that'd be awesome if we could
>>>> resurrect #2892 from the grave:
>>>> https://projects.puppetlabs.com/issues/2892
>>>>
>>> Didn't that get supplanted by:
>>>
>>> https://projects.puppetlabs.com/issues/3714
>>>
>>
>> It's additional: caching in JSON is good because it's faster.
>> I still think we don't need to re-serialize what we already have
>> serialized, so we'd gain a serialization in the cycle (ie 1/3 of the
>> total time retrieving the catalog), hence my request of resurrecting
>> #2892.
>
> I re-opened and assigned to Daniel for a comment.

That seems reasonable. The real goal is better captured by the first
ticket - to make the "caching catalog" phase faster.

PSON support is desirable, but may or may not be related to the first step.

Thanks.

david-dasz

May 23, 2012, 4:19:41 AM
to puppe...@googlegroups.com
>> Which version of Ruby and Puppet ?
>>
>> 'Config retrieval time' include the 'caching catalog time' (write
>> catalog on
>> disk in YAML format). Your catalog is pretty big, so caching could be
>> very
>> slow.
>> You can check this by add theses lines in
>> lib/puppet/indirector/indirection.rb on the agent side:
>>
>> +      beginning_time = Time.now
>>      Puppet.info "Caching #{self.name} for #{request.key}"
>>      cache.save request(:save, result, *args)
>> +      Puppet.debug "Caching catalog time: #{(Time.now -
>> beginning_time)}"

Here are my results for a catalog of ~2000 resources, some of which are
tidies of big directories:

[root@aaa ~]# time puppetd --test --noop
notice: Ignoring --listen on onetime run
info: Retrieving plugin
info: Loading facts in mysql_exists
info: Loading facts in mysql_exists
info: Caching catalog for aaa
info: Caching catalog time: 12.668795
info: Applying configuration version '1337759696'
notice: Finished catalog run in 65.62 seconds

real 4m41.662s
user 1m29.677s
sys 0m13.375s
[root@aaa ~]# puppetd --version
2.7.1
[root@aaa ~]# ruby --version
ruby 1.8.5 (2006-08-25) [x86_64-linux]
[root@aaa ~]# lsb_release -a
LSB
Version: :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: ScientificSL
Description: Scientific Linux SL release 5.4 (Boron)
Release: 5.4
Codename: Boron
[root@aaa ~]#

There is no excessive swap or IO while the agent is running. Compilation
of the catalog takes ~100s, mostly due to not yet having switched to
PuppetDB (;-)

Best Regards, David

Peter Meier

May 23, 2012, 9:00:48 AM
to puppe...@googlegroups.com
>>> +      beginning_time = Time.now
>>>      Puppet.info "Caching #{self.name} for #{request.key}"
>>>      cache.save request(:save, result, *args)
>>> +      Puppet.debug "Caching catalog time: #{(Time.now -
>>> beginning_time)}"
>
> Here're my results for a catalog of ~2000 resources, some of which are
> tidys of big directories:

Here some results for a host with ~5K of resources:

[root@xxx (staff) ~]# ruby -v
ruby 1.8.6 (2010-09-02 patchlevel 420) [x86_64-linux]
[root@xxx (staff) ~]# lsb_release
LSB Version:
:core-4.0-amd64:core-4.0-ia32:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-ia32:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-ia32:printing-4.0-noarch
[root@xxx (staff) ~]# cat /etc/redhat-release
CentOS release 5.8 (Final)
[root@xxx (staff) ~]# rpm -qa | grep puppet
puppet-2.6.14-1.1.el5.centos

info: Caching catalog time: 102.1323


Compilation time is about 160.12s, probably switching to PuppetDB will
also save me here. ;)

~pete

Jeff McCune

May 23, 2012, 12:39:00 PM
to puppe...@googlegroups.com
On Wed, May 23, 2012 at 6:00 AM, Peter Meier <peter...@immerda.ch> wrote:
+      beginning_time = Time.now
     Puppet.info "Caching #{self.name} for #{request.key}"
     cache.save request(:save, result, *args)
+      Puppet.debug "Caching catalog time: #{(Time.now -
beginning_time)}"

Here're my results for a catalog of ~2000 resources, some of which are
tidys of big directories:

Here some results for a host with ~5K of resources:

[root@xxx (staff) ~]# ruby -v
ruby 1.8.6 (2010-09-02 patchlevel 420) [x86_64-linux]

I can't stress how strongly I (we?) discourage the use of Ruby 1.8.6.  =)

We (Puppet Labs) really don't do any testing against this version of Ruby, whether for quality, performance, or anything else.

I'm not saying you'll see a huge difference with MRI 1.8.7 but at least you'll be comparing similar results.

-Jeff

Peter Meier

May 23, 2012, 1:46:04 PM
to puppe...@googlegroups.com
> I can't stress how strongly I (we?) discourage the use of Ruby 1.8.6. =)

Yeah, this is a backported version from one of the fedoras for EL5...

> We (Puppet Labs) really don't do any testing; quality, performance, or
> otherwise, against this version of Ruby.
>
> I'm not saying you'll see a huge difference with MRI 1.8.7 but at least
> you'll be comparing similar results.

I should update that one anyway (security-wise), but are there any
known issues for 1.8.6 that I'm not aware of and to which you are referring
(performance-wise)?

thanks

~pete

signature.asc