large file download, timeout?

Fernando Padilla

Jul 17, 2009, 9:58:38 PM
to puppet...@googlegroups.com
Hi. I'm a beginner, but I have a basic puppet setup working. I am
doing a manual tarball installation and it seems to be hanging then
eventually timing out on just downloading the file:

file { "/opt/hadoop-0.20.0.tar.gz":
  source => "puppet:///hadoop020/hadoop-0.20.0.tar.gz",
}

I have another module that does the same thing and works, so my only
guess is the size of the tarball:

modules/hadoop020/files/hadoop-0.20.0.tar.gz - 41M
modules/zookeeper320/files/zookeeper-3.2.0.tar.gz - 12M

Any ideas or suggestions to speed up file transfers?

If I manually scp the file, it takes only 30 seconds (between the office
and EC2). Why would it take so long and eventually time out inside the
colo (EC2)?

Sylvain Avril

Jul 18, 2009, 11:13:12 AM
to puppet...@googlegroups.com
I myself don't use puppet to pull big files.
Maybe you are running puppet with the default WEBrick HTTP frontend. You
could try another frontend such as Mongrel or Passenger:
http://reductivelabs.com/trac/puppet/wiki/UsingMongrel
http://reductivelabs.com/trac/puppet/wiki/UsingPassenger

For my own use, I have an HTTP server and a custom curl definition. But
on slow connections, it didn't resolve the timeout problem.

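# Downloads ${source} and unpacks it into ${target}. The resource name
# doubles as a marker file, so the download runs only once.
define common::archive::tar-gz($source, $target) {
  exec { "${name} unpack":
    command => "curl ${source} | tar -xzf - -C ${target} && touch ${name}",
    creates => $name,
    # exec needs a search path (or fully qualified commands)
    path    => ["/bin", "/usr/bin"],
  }
}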

But the more elegant solution would be to package hadoop.
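
On the timeout with the curl approach: if the limit being hit is the
exec's own (300 seconds by default, as far as I know), the exec type has
a "timeout" parameter that can be raised. A minimal sketch, not tested
against 0.24.x; the define name, the 1800 second default, the marker
path and the URL are all made up for illustration:

```puppet
# Hypothetical variant of the curl definition with a longer exec timeout.
define archive_tar_gz($source, $target, $timeout = 1800) {
  exec { "${name} unpack":
    # -f makes curl fail on HTTP errors, -s keeps the progress meter quiet
    command => "curl -f -s ${source} | tar -xzf - -C ${target} && touch ${name}",
    creates => $name,
    path    => ["/bin", "/usr/bin"],
    # seconds the command may run before puppet gives up on it
    timeout => $timeout,
  }
}

# The resource title is the marker file that records a finished unpack.
archive_tar_gz { "/opt/.hadoop-0.20.0.unpacked":
  source => "http://files.example.com/hadoop-0.20.0.tar.gz",
  target => "/opt",
}
```

This only helps with the exec limit; it does nothing for timeouts in
puppet's own file serving.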

Peter Meier

Jul 19, 2009, 3:35:46 PM
to puppet...@googlegroups.com
Hi

> Any ideas or suggestions to speed up file transfers??

Try 0.25.0beta; file serving should be heavily improved there.

> If I manually scp the file, it takes only 30seconds (between office and
> ec2), why would it take so long and eventually timeout inside the colo (
> ec2)?

Currently puppet uses XML-RPC, even to transfer files, which means that
every file has to be loaded into memory, escaped, sent over the wire,
unescaped and written to disk.

With the change to REST in the upcoming 0.25.0, this should improve.

cheers pete

Fernando Padilla

Jul 19, 2009, 8:32:36 PM
to puppet...@googlegroups.com
Thank you. I suppose that's an easy way around it.. I wonder if I want
puppetmaster to also host a simple apache..

Or.. does the "source" attribute support http/ftp, or just the
file/puppet protocols?



Fernando Padilla

Jul 19, 2009, 8:34:02 PM
to puppet...@googlegroups.com
Oh. right. I remember reading about this..

So.. are there any rpm/deb packages for the latest puppet?? From the
standard apt-repository (for jaunty), it still only has 0.24.5 (not even
0.24.8).


James Turnbull

Jul 19, 2009, 8:40:32 PM
to puppet...@googlegroups.com
Fernando Padilla wrote:
> Oh. right. I remember reading about this..
>
> So.. are there any rpm/deb packages for the latest puppet?? From the
> standard apt-repository (for jaunty), it still only has 0.24.5 (not even
> 0.24.8).

http://reductivelabs.com/trac/puppet/wiki/DownloadingPuppet

Regards

James Turnbull

--
Author of:
* Pro Linux Systems Administration
(http://tinyurl.com/linuxadmin)
* Pulling Strings with Puppet
(http://tinyurl.com/pupbook)
* Pro Nagios 2.0
(http://tinyurl.com/pronagios)
* Hardening Linux
(http://tinyurl.com/hardeninglinux)

Greg

Jul 20, 2009, 12:18:41 AM
to Puppet Users
For files, it's puppet or file sources only at this stage. (Though
packages can handle HTTP, so long as the package manager understands an
HTTP file source...)
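
A minimal sketch of that package route, assuming an RPM-based system
where rpm itself can install from a URL; the package name and URL are
hypothetical:

```puppet
# Hypothetical package pulled straight from an HTTP server.
package { "foo":
  ensure   => installed,
  # the rpm provider hands the source to rpm, which accepts URLs
  provider => rpm,
  source   => "http://files.example.com/packages/foo-1.0-1.noarch.rpm",
}
```

Puppet never transfers the file itself here, so its own file serving is
out of the picture.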

I started up some discussion on HTTP as a source for files as a means
to overcome this. (Linky:
http://groups.google.com.au/group/puppet-users/browse_thread/thread/b0d3004ad9daf40c/cf95773e76622eb5
)

It's apparently something that is in the works, and may well go some
way towards this under 0.25 when it's available... I haven't really
looked at the 0.25 betas yet, so I'm not sure how it has changed in this
regard...

For now, I'm making do with an NFS mount under autofs (shudder - Wish
I didn't have to do this), and will migrate all the large file serving
to run under http once the new release is available and tested... Not
ideal, but it means I can do:

file { "/nfs/mount/foo.pkg": ensure => present }
exec { "/nfs/mount/foo.sh":
  refreshonly => true,
  subscribe   => File["/nfs/mount/foo.pkg"],
}

Of course this means that foo.sh gets called each time foo.pkg changes
(ie. upgrades...)

Greg


Trevor Vaughan

Jul 20, 2009, 12:49:00 PM
to puppet...@googlegroups.com
The only problem that I have with most methods is that they transfer
the whole file every time.

I prefer to set up an rsync server and use an rsync pull command, so
that data is transferred only when the file has actually changed.

I suppose that storing the files in SVN would work as well, but that
seems like an awful lot of overhead.
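
A minimal sketch of such an rsync pull, assuming an rsync daemon is
already running; the host name and the "tarballs" rsync module are
hypothetical:

```puppet
# Hypothetical pull of the tarball from an rsync daemon.
exec { "rsync hadoop tarball":
  # -t preserves modification times, so rsync's size+mtime check can
  # skip the transfer on later runs when nothing has changed
  command => "rsync -t rsync://files.example.com/tarballs/hadoop-0.20.0.tar.gz /opt/",
  path    => ["/bin", "/usr/bin"],
}
```

The exec still runs (and is logged) on every puppet run; it is rsync
that decides whether any data needs to move.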

Trevor

Fernando Padilla

Jul 20, 2009, 1:24:47 PM
to puppet...@googlegroups.com
I mean, if they support multiple protocols, why can't they support
rsync? Maybe through an ssh proxy set up by puppet?

Fernando Padilla

Jul 20, 2009, 1:47:14 PM
to puppet...@googlegroups.com
great, I'll look at what I can do.

It's weird there is no RubyGem for the latest 0.25...



sadly there is no deb package for jaunty, only karmic ( ubuntu alpha ).

Fernando Padilla

Jul 20, 2009, 5:19:25 PM
to puppet...@googlegroups.com
I have now installed puppet using RubyGems, but nothing seems to work..

Below are the logs of me installing rubygems1.9 and then having
rubygems install puppet. Although it appears to have installed puppet
successfully, I can't find any puppet files (/etc/puppet,
/etc/init.d/puppetd, /usr/sbin/puppet*, etc.)..

any help would be appreciated!



root@domU-12-31-39-00-E5-94:~# apt-get install rubygems1.9
...
root@domU-12-31-39-00-E5-94:~# gem install --remote puppet
Successfully installed facter-1.5.6
Successfully installed puppet-0.24.8
2 gems installed
Installing ri documentation for facter-1.5.6...
Installing ri documentation for puppet-0.24.8...
Updating class cache with 0 classes...
Installing RDoc documentation for facter-1.5.6...
Could not find main page README
Could not find main page README
Could not find main page README
Could not find main page README
Installing RDoc documentation for puppet-0.24.8...

Could not find main page README
Could not find main page README
Could not find main page README
Could not find main page README

James Turnbull

Jul 20, 2009, 6:13:35 PM
to puppet...@googlegroups.com
Fernando Padilla wrote:
> great, I'll look at what I can do.
>
> It's weird there is no RubyGem for the latest 0.25...
>

It's not weird - I didn't create Gems for the beta. Partially because
you can't version a Ruby Gem with a numeric and text - for example
0.25.0beta1. This increases the risk of a 0.25.0 beta gem being
mistaken for production. If Gems allowed proper versioning then I'd
create them.

Regards

James Turnbull

Fernando Padilla

Jul 20, 2009, 7:03:14 PM
to puppet...@googlegroups.com
Alright, so I have just added an apache2 server onto my puppet master,
and will host files there. Wow, it's much faster than any other options
( within ec2 ). It took just a few seconds to download the 42M file, i
blinked and it was done. :)

So until I can properly work out a good rpm/deb of 0.25, then this will
be my solution :) :)

thank you.
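
For reference, a minimal sketch of what the client side of this could
look like, assuming curl is installed on the nodes; the host name and
the /files/ path on the apache server are hypothetical:

```puppet
# Hypothetical fetch of the tarball from the apache on the puppet master.
exec { "fetch hadoop tarball":
  # download to a temporary name and rename on success, so an aborted
  # transfer is not mistaken for the finished file by "creates"
  command => "curl -f -s -o /opt/hadoop-0.20.0.tar.gz.part http://puppet.example.com/files/hadoop-0.20.0.tar.gz && mv /opt/hadoop-0.20.0.tar.gz.part /opt/hadoop-0.20.0.tar.gz",
  creates => "/opt/hadoop-0.20.0.tar.gz",
  path    => ["/bin", "/usr/bin"],
}
```

With "creates" the download happens once; a changed tarball on the
server is not picked up unless the local copy is removed first.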

James Turnbull

Jul 20, 2009, 7:26:18 PM
to puppet...@googlegroups.com
Fernando Padilla wrote:
> Alright, so I have just added an apache2 server onto my puppet master,
> and will host files there. Wow, it's much faster than any other options
> ( within ec2 ). It took just a few seconds to download the 42M file, i
> blinked and it was done. :)
>
> So until I can properly work out a good rpm/deb of 0.25, then this will
> be my solution :) :)
>

Well you could try here:

http://tmz.fedorapeople.org/repo/puppet/

And I am pretty sure someone on the list built .deb files too.

Regards

James Turnbull

Matt

Jul 21, 2009, 9:02:54 AM
to puppet...@googlegroups.com
If you're hosting files in S3, I've been having great success using:

http://reductivelabs.com/trac/puppet/wiki/Recipes/AmazonWebService

Basically uses curl to pull an authenticated URL.

Peter Meier

Jul 21, 2009, 4:13:07 PM
to puppet...@googlegroups.com
Fernando Padilla wrote:
> i mean, if they support multiple protocols, why can't they support
> rsync? Maybe through a ssh-proxy setup by puppet?

Maybe because nobody has written the rsync part yet?!

cheers pete
