About puppet and fabric (WAS: operational overhead for HBase)

About puppet and fabric (WAS: operational overhead for HBase) Alex Holmes 8/17/11 1:42 PM
Hi,

On thread "operational overhead for HBase", J-D gave out some
interesting insights into automated deployments:

  " - Have tools to automate cluster maintenance, such as doing
rolling upgrades. We use Puppet and Fabric[2]."

I'm currently evaluating the use of Puppet for Hadoop/HBase automated
deploys, and Fabric looks a lot simpler and more descriptive.  I'm
curious how well Fabric would work in its own right, without Puppet,
for automated installs?

Apologies if this isn't 100% related to HBase.

Cheers,
Alex

Re: About puppet and fabric (WAS: operational overhead for HBase) Ryan Rawson 8/17/11 2:33 PM
I think my assessment would be that everyone has their pre-chosen toolset
and goes with it. You can make any of them work (with enough effort).

Personally, we are using Chef. They are building service orchestration,
which few toolsets support.

Re: About puppet and fabric (WAS: operational overhead for HBase) Jean-Daniel Cryans 8/17/11 4:45 PM
> I'm currently evaluating the use of Puppet for Hadoop/HBase automated
> deploys, and Fabric looks a lot simpler and more descriptive.  I'm
> curious how well Fabric would work in its own right, without Puppet,
> for automated installs?

I'll let my puppet masters answer that.

>
> Apologies if this isn't 100% related to HBase.

Still sorta is :)

J-D

Re: About puppet and fabric (WAS: operational overhead for HBase) Dave Barr 8/17/11 8:17 PM
On Wed, Aug 17, 2011 at 4:45 PM, Jean-Daniel Cryans <jdcr...@apache.org> wrote:
>> I'm currently evaluating the use of Puppet for Hadoop/HBase automated
>> deploys, and Fabric looks a lot simpler and more descriptive.  I'm
>> curious how well Fabric would work in its own right, without Puppet,
>> for automated installs?
>
> I'll let my puppet masters answer that.

Hi!  :)

Using puppet for hadoop/hbase deploys is a fine option.

The downside of puppet is that it's not very efficient at deploying.
It checksums every file it is told about, and since it's written in Ruby,
checksumming isn't fast.  If you want rapid pushes, rsync will be
faster (but less elegant).  But if you can kick off puppet runs in
parallel, you probably don't care too much how slow an individual box
is to update.
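For what it's worth, the rsync push described above is easy to script. Here is a minimal sketch; the release path, destination, and host names are hypothetical placeholders, not anyone's actual setup:

```python
import subprocess

def rsync_push(release_dir, hosts, dest="/opt/hbase/", dry_run=True):
    """Build the rsync command for each host; run them when dry_run is False.

    release_dir, hosts, and dest are illustrative placeholders.
    """
    cmds = []
    for host in hosts:
        cmd = ["rsync", "-az", "--delete",
               release_dir.rstrip("/") + "/",    # trailing slash: sync contents
               "%s:%s" % (host, dest)]
        if not dry_run:
            subprocess.check_call(cmd)  # sequential; parallelize if needed
        cmds.append(cmd)
    return cmds
```

Run with dry_run=True it only builds the commands, which makes it easy to eyeball what a push would do before letting it loose on a cluster.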

Having a pull model is nice because you can just bring a new node
online and bam, you know the correct files will be there.  You
don't have to remember to push them.

The other downside of puppet is of course that you can't really control
when changes go out, and you can't easily test changes without adding
complexity at the puppet level.

--Dave

Re: About puppet and fabric (WAS: operational overhead for HBase) Aravind Gottipati 8/18/11 7:01 AM
On Wed, Aug 17, 2011 at 4:45 PM, Jean-Daniel Cryans <jdcr...@apache.org> wrote:
>> I'm currently evaluating the use of Puppet for Hadoop/HBase automated
>> deploys, and Fabric looks a lot simpler and more descriptive.  I'm
>> curious how well Fabric would work in its own right, without Puppet,
>> for automated installs?
>
> I'll let my puppet masters answer that.

We use puppet to sync out (hadoop) config files and other static OS
settings (sysctl, fstab, etc.).  We could use puppet to distribute
hbase and hadoop builds as well, but an hbase deployment is usually
followed by a rolling restart of the cluster.  imo, puppet is okay
with configuration management tasks, but not really great at
orchestrating a sequence of steps across multiple machines.  Fab, on
the other hand, works well for that kind of stuff.  We use fab to manage
rolling restarts, draining nodes in a cluster, running quick checks of
versions across the cluster, etc.  Here is the list of fab tasks we
use to manage our cluster.  Doing something similar with puppet would
take more puppet wrangling than I am comfortable with.

$ fab -l
Available commands:

    assert-configs    Check that all the region servers have the same config...
    assert-regions    Check that all the regions have been vacated from the ...
    assert-release    Check the release running on the server.
    deploy-hbase      Deploy the new hbase release to the regionserver.
    disable-balancer  Disable the balancer.
    dist-hadoop       Rsyncs the hadoop release to the region servers.
    dist-hbase        Rsyncs the hbase release to the region servers.
    dist-release      Rsyncs the release to the region servers.
    enable-balancer   Balance regions and enable the balancer.
    hadoop-start      Start hadoop.
    hbase-start       Start hbase.
    hbase-stop        Stop hbase (WARNING: does not unload regions).
    jmx-kill          Kill JMX collectors.
    prep-release      Copies the tar file from face and extract it.
    rolling-restart   Rolling restart of the whole cluster.
    thrift-restart    Re-start thrift.
    thrift-start      Start thrift.
    thrift-stop       Stop thrift.
    unload-regions    Un-load HBase regions on the server so it can be shut ...

$
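The orchestration gap described above is easy to see in code: a rolling restart is just an ordered loop over hosts, which is natural imperative Python in a fabfile but awkward in Puppet's declarative model. Below is a sketch of the per-host plan such a rolling-restart task might execute; the paths and the region_mover-based drain/reload are assumptions for illustration, not the actual fabfile, and in real Fabric each string would be handed to run() with env.hosts set to the cluster:

```python
def rolling_restart_plan(regionservers, hbase_home="/opt/hbase"):
    """Return the ordered shell commands a rolling restart would run,
    one region server at a time so regions stay available throughout.
    """
    plan = []
    for host in regionservers:
        plan.append((host, [
            # drain: move regions off this server before stopping it
            "%s/bin/hbase org.jruby.Main %s/bin/region_mover.rb unload %s"
                % (hbase_home, hbase_home, host),
            "%s/bin/hbase-daemon.sh stop regionserver" % hbase_home,
            "%s/bin/hbase-daemon.sh start regionserver" % hbase_home,
            # reload: move its regions back once it is up again
            "%s/bin/hbase org.jruby.Main %s/bin/region_mover.rb load %s"
                % (hbase_home, hbase_home, host),
        ]))
    return plan
```

Expressing the same "do A, then B, then C, on this host before the next one" sequencing in Puppet is exactly the kind of wrangling the post describes.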

Re: About puppet and fabric (WAS: operational overhead for HBase) Andrew Purtell 8/18/11 10:16 AM
> From: Aravind Gottipati <ara...@freeshell.org>

> imo, puppet is okay with configuration management tasks, but not really
> great at orchestrating a sequence of steps across multiple machines.

This is our experience as well.

We use Puppet to maintain a synchronized static configuration for the OS, Hadoop, and HBase but found that doing more requires contortions. We also found the Puppet language to be brittle, error messages to be unhelpful, and node variable inheritance to be directly counterintuitive.

Fabric sounds interesting. For orchestrating startups, rolling restarts, shutdowns, etc. we use hand-crafted shell scripts.

Best regards,


    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

Re: About puppet and fabric (WAS: operational overhead for HBase) Time Less 8/22/11 3:41 PM
Getting it working with Chef was about 10 days of full-time work, including
setting up a Chef server, bootstrapping the cluster, and writing recipes.
Then maybe another 2-3 days for extending it in various ways beyond basic
functionality.

--
Tim