Possible fix?

1 view
Skip to first unread message

Ken

unread,
Apr 13, 2010, 12:50:51 PM4/13/10
to Puppet Developers
One of the guys here (who is too busy to post this himself :-) has
proposed a slight change to the file: puppet/parser/lexer.rb

<snip>
# this is probably pretty damned inefficient...
# it'd be nice not to have to load the whole file first...
def file=(file)
@file = file
@line = 1
File.open(file) { |of|
str = ""
- of.each { |line| str += line }
+ of.each { |line| str << line }
@scanner = StringScanner.new(str)
}
</snip>

The idea being that << is more efficient in memory than +=:

http://www.rubyfleebie.com/appending-to-a-string/

We were suffering due to very stupid large node_foo.pp files (and yes
- we are moving to external nodes soon). And this seemed to help.

I'm no Ruby expert and just wanted to open debate - does this sound
like a viable (and simple) improvement?

ken.

Luke Kanies

unread,
Apr 13, 2010, 2:14:08 PM4/13/10
to puppe...@googlegroups.com

Certainly seems simple and viable to me.

And really, it'd be nice to replace StringScanner with something that
didn't require we read the whole file in, but it looks like it doesn't
support that, so we'd have to start from scratch.

--
This space intentionally has nothing but text explaining why this
space has nothing but text explaining that this space would otherwise
have been left blank, and would otherwise have been left blank.
---------------------------------------------------------------------
Luke Kanies -|- http://puppetlabs.com -|- +1(615)594-8199

Markus Roberts

unread,
Apr 13, 2010, 2:19:18 PM4/13/10
to puppet-dev
 It does, but on first reading I'm not seeing why we aren't doing:


def file=(file)
  @file = file
  @line = 1
  @scanner = StringScanner.new(File.read(file))
 :
 :

instead.  It should be even faster and (and least from the code shown here) have exactly the same effect.

-- Markus

Ken

unread,
Apr 14, 2010, 10:11:50 AM4/14/10
to Puppet Developers
Some numbers to chew on. I did all this using the tip of the 0.25.x
branch.

Test manifests: This is with 55 *.pp files ... totalling 63526 Puppet
dsl lines across all files.
System: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz, 3G ram,
Ubuntu 10.x (64 bit)


<No changes>
kbarber@purple:~/tmp/puppet/lak/jaro$ time ~/tmp/puppet/lak/real-
puppet/bin/puppet --noop ~/tmp/puppet/lak/jaro/manifests/site.pp --
modulepath=/home/ken/tmp/puppet/lak/jaro/modules/
Could not find default node or by name with 'purple.rim.net,
purple.rim, purple' on node purple.rim.net

real 3m0.705s
user 2m54.930s
sys 0m5.690s
kbarber@purple:~/tmp/puppet/lak/jaro$

< With KW's change (ie. << from +=) .... >

kbarber@purple:~/tmp/puppet/lak/jaro$ time ~/tmp/puppet/lak/real-
puppet/bin/puppet --noop ~/tmp/puppet/lak/jaro/manifests/site.pp --
modulepath=/home/ken/tmp/puppet/lak/jaro/modules/
Could not find default node or by name with 'purple.rim.net,
purple.rim, purple' on node purple.rim.net

real 0m18.183s
user 0m16.250s
sys 0m1.900s
kbarber@purple:~/tmp/puppet/lak/jaro$

More samples:

18.218
18.025
18.154

< With MR's suggestion >

kbarber@purple:~/tmp/puppet/lak/jaro$ time ~/tmp/puppet/lak/real-
puppet/bin/puppet --noop ~/tmp/puppet/lak/jaro/manifests/site.pp --
modulepath=/home/ken/tmp/puppet/lak/jaro/modules/
Could not find default node or by name with 'purple.rim.net,
purple.rim, purple' on node purple.rim.net
real 0m17.936s
user 0m16.020s
sys 0m1.930s
kbarber@purple:~/tmp/puppet/lak/jaro$

More samples:

17.936
18.011
17.945

So your recommendation Markus is slightly faster than KW's - and
probably easier to read as well. ~ 2 minutes 40 second average saving
though either way ... which is quite good.

If we are happy, what are the chances of getting this in 0.25.5?

ken.

Markus Roberts

unread,
Apr 14, 2010, 12:39:24 PM4/14/10
to puppet-dev
If we are happy, what are the chances of getting this in 0.25.5?

No chance on 0.25.5 since we are in the rc cycle, within hours of blessing 0.25.5rc1 as 0.25.5 and after the pain of 0.25.2-4 we've vowed to live virtuous lives and not try to slip changes, no matter how innocuous seeming, in at this point in the cycle. 

On the other hand, the next feature release (2.6.0, aka Rowlf) is preparing for liftoff and still accepting new changes, and there will probably be a 0.25.6 point release within the next month as well, so the learnings made here won't languish on the sidelines.

Thanks for all the effort, and especially for the quantification of the results!  It's always much better to let data drive these things.

-- Markus

Luke Kanies

unread,
Apr 14, 2010, 12:40:58 PM4/14/10
to puppe...@googlegroups.com
On Apr 14, 2010, at 7:11 AM, Ken wrote:

> Some numbers to chew on. I did all this using the tip of the 0.25.x
> branch.

> [...]


> So your recommendation Markus is slightly faster than KW's - and
> probably easier to read as well. ~ 2 minutes 40 second average saving
> though either way ... which is quite good.

Heh, I think you do yourself a disservice - 2 minutes and 40 seconds
sounds good, but 90% reduction in consumed time sounds far better and
is correct.

It looks like you ran that with some odd options, probably so you just
tested the compile, right?

For the record, though, for most purposes this is a one-time cost:
long-running servers will only do this once for every change in
configurations. Still expensive, but not as horrible as it looks.

> If we are happy, what are the chances of getting this in 0.25.5?

Hmm. We're all ready to cut the release, and my guess is that Markus
and James are a bit sketchy on the idea of including this at this late
stage.

OTOH, it's a wicked-small patch with a big improvement, so I think it
should go in.

On a related note, I've backported my performance improvement to 0.25.x:

http://github.com/lak/puppet/tree/tickets/0.25.x/1903-metaparam_searching_is_faster

Could someone who has slow compiles do a similar comparison on that?

It's not quite as small a fix, but it might have close to the same
performance improvements, and I might push James and Markus to include
that, if we can get good numbers.

> --
> You received this message because you are subscribed to the Google
> Groups "Puppet Developers" group.
> To post to this group, send email to puppe...@googlegroups.com.
> To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com
> .
> For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en
> .
>


--
The great thing about television is that if something important
happens anywhere in the world, day or night, you can always change
the channel. -- From "Taxi"

Ken

unread,
Apr 14, 2010, 12:49:55 PM4/14/10
to Puppet Developers
> On a related note, I've backported my performance improvement to 0.25.x:
> http://github.com/lak/puppet/tree/tickets/0.25.x/1903-metaparam_searc...

KW (irc nic - matti) asked me to test that as well patch as well. I've
sent him the results - I'll let him take up the discussion with you on
irc.

ken.

Ken

unread,
Apr 14, 2010, 12:53:48 PM4/14/10
to Puppet Developers
Fair enough. Shall I raise a bug with the patch + results etc.?

Peter Meier

unread,
Apr 14, 2010, 12:58:49 PM4/14/10
to puppe...@googlegroups.com
Zitat von Ken <k...@bob.sh>:

> Fair enough. Shall I raise a bug with the patch + results etc.?

yes please.

cheers pete

Luke Kanies

unread,
Apr 14, 2010, 12:59:10 PM4/14/10
to puppe...@googlegroups.com

I look forward to it. Any hints? :)

--
The great aim of education is not knowledge but action.
-- Herbert Spencer

Ken

unread,
Apr 14, 2010, 1:01:17 PM4/14/10
to Puppet Developers
No noticeable improvement basically. Not in our test case anyway.

Test with change: 261b3ce0800b7c2c9c2bee4d797c90541c510be7
(Fixing #1903 - metaparam inheritance is much faster)

kbarber@purple:~/tmp/puppet/lak/jaro$ time ~/tmp/puppet/lak/real-
puppet/bin/puppet --noop ~/tmp/puppet/lak/jaro/manifests/site.pp --
modulepath=/home/ken/tmp/puppet/lak/jaro/modules/
Could not find default node or by name with 'purple.rim.net,
purple.rim, purple' on node purple.rim.net

real 2m59.585s
user 2m54.190s
sys 0m5.380s
kbarber@purple:~/tmp/puppet/lak/jaro$

ken.

Ken

unread,
Apr 14, 2010, 1:16:00 PM4/14/10
to Puppet Developers

Luke Kanies

unread,
Apr 14, 2010, 1:47:07 PM4/14/10
to puppe...@googlegroups.com
Is this on top of your patch?

I'd only expect to see a noticeable improvement for cases where
compiling itself is slow - e.g., the main test that stuck out was
cutting a 10 second compile to 3 seconds.

The cost of this process is closely linked to the number of resources
in the catalog and the depth of the graph, so you have to have a
certain amount of complexity there to see much benefit.

Sent from mobile device

Ken

unread,
Apr 14, 2010, 3:54:09 PM4/14/10
to Puppet Developers
I ran the test without any other patch - I just cherry picked your
change into the 0.25.x branch.

I don't believe complexity is our problem at the moment as the
manifest development has only just started here. So I doubt our data
sample can confirm your enhancements atm, except for the obvious "it
didn't make it any worse" :-).

Luke Kanies

unread,
Apr 14, 2010, 3:55:33 PM4/14/10
to puppe...@googlegroups.com
Ah, I see.

Well, a good validation, anyway.

Thanks.

> For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en
> .
>


--
I don't always know what I'm talking about, but I'm always pretty much
convinced that I'm right. -- Mojo Nixon

Markus Roberts

unread,
Apr 29, 2010, 5:16:01 PM4/29/10
to puppet-dev
If we are happy, what are the chances of getting this in 0.25.5?

So the answer has changed from "no" to "yes" on this; we ran into issues with 0.25.5rc1 that are going to require a 0.25.5rc2 so I'll be sweeping up a few small targeted changes such as this.

-- Markus
 

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages