fileserver and distributing many files

Thomas Rasmussen

Feb 28, 2011, 3:36:00 AM
to Puppet Users
Hello

I'm beginning to learn how to use puppet, but I have stumbled upon a
problem that I can't seem to find a solution to.

We want to use puppet to deploy applications (MySQL and JBoss
installations), but when I create a manifest like this

class mysql::version559 {
  include mysql_user
  file { "/path/mysql-5.5.9":
    ensure  => directory,
    recurse => true,
    purge   => true,
    force   => true,
    owner   => "root",
    group   => "root",
    source  => "puppet://puppet-server/modules/mysql/mysql-5.5.9",
  }
}

it runs very slowly, at about 3-5 seconds per file (even the small
ones!), taking several hours to deploy 300 MB, which I really don't like.

We are running a pretty basic setup with puppet 2.6.4 installed on
Ubuntu 10.04.02 LTS on both clients and server. I tried to make a
tar-ball of the directory, and then it only takes a minute or two,
transferring at about 3 MB/s, which is OK.

Will switching to mongrel instead of webrick solve my problem? The
server is not loaded (only serving one client at the moment), but
while running puppetd, the client is at 100% CPU load.

Any ideas on what the best solution is? It is NOT a solution to simply
set up a manifest that installs the app from the Ubuntu repository. Is
there any way of using e.g. rsync to deploy the files instead of
puppet?

Thanks

Regards
Thomas

Patrick

Feb 28, 2011, 3:53:45 AM
to puppet...@googlegroups.com

On Feb 28, 2011, at 12:36 AM, Thomas Rasmussen wrote:

> Hello
>
> I'm beginning to learn howto puppet but I have stumbled upon a problem
> that I can't seem to find a solution to.
>
> We want to use puppet to deploy applications (MySQL and JBOSS
> installations) but when I create a manifests like this
>
> class mysql::version559 {
> include mysql_user
> file { "/path/mysql-5.5.9":
> ensure => directory,
> recurse => true,
> purge => true,
> force => true,
> owner => "root",
> group => "root",
> source => "puppet://puppet-server/modules/mysql/mysql-5.5.9",
> }
> }
>
> it runs very slowly, with about 3-5 seconds per file (even the small
> ones!)

Assuming your definition of small matches mine (less than 50 KB), in my experience Puppet will only do this if the server is loaded (not applicable to you) or if you have high latency (more than 100 ms ping). Switching away from WEBrick is strongly advised because two clients running at the same time can load it down heavily when serving files, but I know that doesn't apply to you.

In my case, I use an exec managed by puppet that uses rsync to sync the files at 2am. Here it is, although it doesn't sound like it's very useful to you. There's also a bit more code to force it to run on the first run using creates.


exec { "/usr/bin/rsync -avz simba.outer::www/ /var/www/":
  schedule => long_maintenance,
  require  => [Package["apache2"], Package["rsync"]],
}

schedule { long_maintenance:
  period => daily,
  repeat => 1,
  range  => "1:30 - 2:30",
}
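
The first-run trigger mentioned above is not shown in this thread. A minimal sketch of how it might look, assuming a second exec guarded by creates. The resource title and the marker file /var/www/.synced are hypothetical; the sketch assumes the source tree ships such a marker file:

```puppet
# Hypothetical companion to the scheduled exec: runs the same rsync once,
# outside the maintenance window, on a machine that has never synced.
exec { "initial-www-sync":
  command => "/usr/bin/rsync -avz simba.outer::www/ /var/www/",
  # Assumption: .synced is part of the synced tree. Once it exists,
  # Puppet skips this exec and only the scheduled one keeps running.
  creates => "/var/www/.synced",
  require => [Package["apache2"], Package["rsync"]],
}
```

The creates parameter makes the exec a no-op as soon as the named file exists, so it only fires on a fresh machine.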


> Any ideas on what the best solution is? It is NOT a solution to simply
> setup a manifests that installs the app from the ubuntu repository. Is
> there any way of using ie. rsync to deploy the files instead of
> puppet?

Again, I'm giving you what you asked for, but this is rather simple.

Thomas Rasmussen

Feb 28, 2011, 5:19:20 AM
to Puppet Users
Hi

My network is 100Mbit (approximately, but through a VPN so not that
fast) with latency around 2ms (right now our test-setup is running on
servers right beside each other :-))

I have tried switching to Passenger and it does not seem much
faster; it still takes a very long time to run. I have now tried
copying the tar-ball and unpacking it into the target directory and
running puppet again. Now it just wants to correct permissions (which
is OK because they are wrong in the tar-ball), and this takes 2-5
seconds per file, which is pretty much unusable (I still have the
manifest that copies from the master to clients).

I'm not that happy about making a solution like yours. It might be the
solution we choose, but I really don't see this as the best one. I'd
rather have puppet serve the files on its own, but it seems as though
that is not feasible?

Still hoping for solutions :-)

Thomas

Daniel Piddock

Feb 28, 2011, 5:28:01 AM
to puppet...@googlegroups.com
On 28/02/11 10:19, Thomas Rasmussen wrote:
> Hi
>
> My network is 100Mbit (approximately, but through a VPN so not that
> fast) with latency around 2ms (right now our test-setup is running on
> servers right beside each other :-))
>
> I have tried to switch to passenger and this does not seem that much
> faster, still uses very very long time to run. I now have tried to
> copy the tar-ball and unpackaged this to the target directory and run
> puppet again, now it just wants to correct permissions (which is OK
> because they are wrong in the tar-ball) and this takes 2-5 seconds per
> file!) which is pretty much unusable (I still have the manifest to
> copy from the master to clients)

Directory recursion is horribly inefficient and quite broken. It's
md5summing every source and target file twice. It then md5sums the
target again at the end to ensure it did it right.

See https://projects.puppetlabs.com/issues/5650 , 6003 and 6004.

> I'm not that happy about making a solution like yours, it might be the
> solution we choose but I really don't see this as the best one. I'd
> rather have puppet serve the files on its own, but it seems as though
> it is not feasible?

If you *really* want puppet to manage the files you have two solutions:
* Put up with the horrible delay and brokenness until it's eventually fixed.
* List each file and subdirectory in your manifest.
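
As a rough illustration of the second option, here is a sketch with one resource per directory and per file instead of a single recursive resource. The bin/mysqld entry is only an assumed example of a file in the tree, and a 300 MB installation would need a very long list of these:

```puppet
# Hypothetical excerpt: every directory and file is declared explicitly.
# The file names below are illustrative, not taken from the thread.
file { "/path/mysql-5.5.9":
  ensure => directory,
  owner  => "root",
  group  => "root",
}

file { "/path/mysql-5.5.9/bin":
  ensure => directory,
  owner  => "root",
  group  => "root",
}

file { "/path/mysql-5.5.9/bin/mysqld":
  ensure => file,
  owner  => "root",
  group  => "root",
  mode   => "0755",
  source => "puppet://puppet-server/modules/mysql/mysql-5.5.9/bin/mysqld",
}
```

Puppet autorequires parent directories that are managed in the same catalog, so no explicit ordering is needed between these resources.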

Personally, I went with rsync run from a script with a schedule of
daily, similar to Patrick:

exec { '/usr/local/scripts/installMaps.sh':
  schedule => daily,
}

Thomas Rasmussen

Feb 28, 2011, 7:23:19 AM
to Puppet Users
hey

OK, now I have tried to do it via rsync and it seems to be working...
but the recurse bug is apparently very serious... I now have a
manifest that does:

file { "/pack/mysql-5.5.9":
  ensure  => directory,
  recurse => true,
  force   => true,
  owner   => "root",
  group   => "root",
  require => Exec[rsync_mysql_install],
}

This takes about the same time as if I was copying (I need to be sure
of permissions of rsync'ed files). Is the recurse feature really that
bad?

Thomas

On Feb 28, 11:28 am, Daniel Piddock <dgp-g...@corefiling.co.uk> wrote:
> On 28/02/11 10:19, Thomas Rasmussen wrote:
>
> > Hi
>
> > My network is 100Mbit (approximately, but through a VPN so not that
> > fast) with latency around 2ms (right now our test-setup is running on
> > servers right beside each other :-))
>
> > I have tried to switch to passenger and this does not seem that much
> > faster, still uses very very long time to run. I now have tried to
> > copy the tar-ball and unpackaged this to the target directory and run
> > puppet again, now it just wants to correct permissions (which is OK
> > because they are wrong in the tar-ball) and this takes 2-5 seconds per
> > file!) which is pretty much unusable (I still have the manifest to
> > copy from the master to clients)
>
> Directory recursion is horribly inefficient and quite broken. It's
> md5summing every source and target file twice. It then md5sums the
> target again at the end to ensure it did it right.
>
> See https://projects.puppetlabs.com/issues/5650, 6003 and 6004.

Daniel Piddock

Feb 28, 2011, 7:33:20 AM
to puppet...@googlegroups.com
On 28/02/11 12:23, Thomas Rasmussen wrote:
> hey
>
> OK, now I have tried to do it via rsync and it seems to be working...
> but the recurse bug is apparently very serious... I now have a
> manifest that does:
>
> file { "/pack/mysql-5.5.9":
> ensure => directory,
> recurse => true,
> force => true,
> owner => "root",
> group => "root",
> require => Exec[rsync_mysql_install],
> }
>
> This takes about the same time as if I was copying (I need to be sure
> of permissions of rsync'ed files). Is the recurse feature really that
> bad?

It'll still be reading and md5summing all those files multiple times
even though there's nothing to compare against. Try setting "checksum =>
none" to disable it.
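
Applied to the manifest quoted above, that suggestion might look like the following sketch. Only the checksum line is new; everything else is taken from the quoted resource:

```puppet
file { "/pack/mysql-5.5.9":
  ensure   => directory,
  recurse  => true,
  force    => true,
  owner    => "root",
  group    => "root",
  # No source is set, so there is no content to compare against;
  # "none" tells Puppet not to checksum the files at all.
  checksum => none,
  require  => Exec["rsync_mysql_install"],
}
```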

If it's still really slow, run "puppet agent --test" or "puppet apply"
in strace to see what's taking so long. Search for known file names.

Dan

Patrick

Feb 28, 2011, 3:35:32 PM
to puppet...@googlegroups.com

On Feb 28, 2011, at 4:23 AM, Thomas Rasmussen wrote:

> hey
>
> OK, now I have tried to do it via rsync and it seems to be working...
> but the recurse bug is apparently very serious... I now have a
> manifest that does:
>
> file { "/pack/mysql-5.5.9":
> ensure => directory,
> recurse => true,
> force => true,
> owner => "root",
> group => "root",
> require => Exec[rsync_mysql_install],
> }
>
> This takes about the same time as if I was copying (I need to be sure
> of permissions of rsync'ed files). Is the recurse feature really that
> bad?

If the permissions you need to be sure of are all "root,root,755", it will be much faster to just do a chmod and chown at the end and put that and the rsync in a shell script.
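
A minimal sketch of that idea, inlined into the exec rather than kept in a separate script file. The rsync source puppet-server::mysql/mysql-5.5.9/ is a hypothetical rsync daemon module, and the sketch assumes a Package["rsync"] resource is declared elsewhere:

```puppet
# Sketch only: sync the tree, then fix ownership and mode in one pass
# instead of letting a recursive file resource visit every file.
exec { "rsync_mysql_install":
  # Wrapped in /bin/sh -c so that the && chaining is handled by a shell.
  command => "/bin/sh -c '/usr/bin/rsync -a puppet-server::mysql/mysql-5.5.9/ /pack/mysql-5.5.9/ && /bin/chown -R root:root /pack/mysql-5.5.9 && /bin/chmod -R 755 /pack/mysql-5.5.9'",
  require => Package["rsync"],
}
```

Note that chmod -R 755 marks every file as executable, so this only fits when a uniform "root,root,755" really is what the tree needs.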

Thomas Rasmussen

Mar 1, 2011, 2:58:03 AM
to Puppet Users


On Feb 28, 9:35 pm, Patrick <kc7...@gmail.com> wrote:
> On Feb 28, 2011, at 4:23 AM, Thomas Rasmussen wrote:
>
>
>
>
>
> > hey
>
> > OK, now I have tried to do it via rsync and it seems to be working...
> > but the recurse bug is apparently very serious... I now have a
> > manifest that does:
>
> >    file { "/pack/mysql-5.5.9":
> >      ensure => directory,
> >      recurse => true,
> >      force => true,
> >      owner => "root",
> >      group => "root",
> >      require => Exec[rsync_mysql_install],
> >    }
>
> > This takes about the same time as if I was copying (I need to be sure
> > of permissions of rsync'ed files). Is the recurse feature really that
> > bad?
>
> If the permissions you need to be sure of are all "root,root,755", it will be much faster to just do a chmod+chown at the end and put that and the rsync in a shellscript.
>

That is a hack, not a solution. Honestly, I'm really disappointed that
puppet does not handle this very efficiently. I've been used to
cfengine2 for the past year or so, but for a particular project we
have decided to use puppet, mainly because we like its manifests
better than cfengine's.

So I hope that the performance issue when recursing directories gets
attention and gets fixed soon.

Thomas

Patrick

Mar 1, 2011, 4:37:02 AM
to puppet...@googlegroups.com

On Feb 28, 2011, at 11:58 PM, Thomas Rasmussen wrote:

>
>
> On Feb 28, 9:35 pm, Patrick <kc7...@gmail.com> wrote:
>> On Feb 28, 2011, at 4:23 AM, Thomas Rasmussen wrote:
>>
>>
>>
>>
>>
>>> hey
>>
>>> OK, now I have tried to do it via rsync and it seems to be working...
>>> but the recurse bug is apparently very serious... I now have a
>>> manifest that does:
>>
>>> file { "/pack/mysql-5.5.9":
>>> ensure => directory,
>>> recurse => true,
>>> force => true,
>>> owner => "root",
>>> group => "root",
>>> require => Exec[rsync_mysql_install],
>>> }
>>
>>> This takes about the same time as if I was copying (I need to be sure
>>> of permissions of rsync'ed files). Is the recurse feature really that
>>> bad?
>>
>> If the permissions you need to be sure of are all "root,root,755", it will be much faster to just do a chmod+chown at the end and put that and the rsync in a shellscript.
>>
>
> That is a hack but not a solution, honestly I'm really disappointed in
> puppet not handling this very efficiently.... I've been used to
> cfengine2 for the past year or so, but for a particular project we
> have decided to use puppet, mainly because we like the manifests
> better here than on cfengine...
>
> So I hope that the performance issue when recursing directories gets
> attention and gets fixed soon.
>

Frankly, many people have said this over the last few years. It's a lot better than it was (yes, really), but I'm not very hopeful for a fix in the near future. This has been a problem for years.

Brice Figureau

Mar 1, 2011, 5:13:02 AM
to puppet...@googlegroups.com
On Mon, 2011-02-28 at 23:58 -0800, Thomas Rasmussen wrote:
>
> On Feb 28, 9:35 pm, Patrick <kc7...@gmail.com> wrote:
> > On Feb 28, 2011, at 4:23 AM, Thomas Rasmussen wrote:
> >
> >
> >
> >
> >
> > > hey
> >
> > > OK, now I have tried to do it via rsync and it seems to be working...
> > > but the recurse bug is apparently very serious... I now have a
> > > manifest that does:
> >
> > > file { "/pack/mysql-5.5.9":
> > > ensure => directory,
> > > recurse => true,
> > > force => true,
> > > owner => "root",
> > > group => "root",
> > > require => Exec[rsync_mysql_install],
> > > }
> >
> > > This takes about the same time as if I was copying (I need to be sure
> > > of permissions of rsync'ed files). Is the recurse feature really that
> > > bad?
> >
> > If the permissions you need to be sure of are all "root,root,755", it will be much faster to just do a chmod+chown at the end and put that and the rsync in a shellscript.
> >
>
> That is a hack but not a solution, honestly I'm really disappointed in
> puppet not handling this very effeciently.... I've been used to
> cfengine2 for the past year or so, but for a particular project we
> have decided to use puppet, mainly because we like the manifests
> better here than on cfengine...

You need to understand the issue first:
When puppet manages a file (i.e. all aspects of it), it internally creates
a resource object (as it does for every other resource you manage). This
resource is then evaluated and does whatever is necessary so that its
target (the file on disk) is modified to match what you wanted.
To simplify recursive file resource management, puppet manages all
the files found by recursively walking the hierarchy as if they were
independent resources. This means that puppet has to create many
instances in memory (one per managed file/dir found during the walk),
which in turn exposes some scalability issues in the event system and
transaction system of puppet.

> So I hope that the performance issue when recursing directories gets
> attention and gets fixed soon.

I also hope this will be the case. Unfortunately each time I wanted to
address this issue, I found that this part of the code is horribly
complex, and all my attempts to do it differently were doomed to fail.
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

Jeffrey Goldschrafe

Mar 1, 2011, 9:24:38 AM
to puppet...@googlegroups.com
Is there any reason it's not feasible to build a package for your distribution and then use Puppet to install it? If there's a "right" way to do what you want, that's going to be it.
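
For illustration, once such a package exists, the Puppet side could be as small as the sketch below. The package name and version are hypothetical, and the sketch assumes the package has been published to an apt repository the clients already use:

```puppet
# Hypothetical package built from the MySQL 5.5.9 tree.
package { "mysql-custom":
  ensure => "5.5.9-1",
}
```

The package manager then handles file transfer, ownership and permissions, so no recursive file resource is involved.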