puppet modules for rdbms

Chris Price

May 25, 2012, 1:44:01 PM
to puppe...@googlegroups.com
Hey folks,

I'm surveying the landscape of existing puppet modules that deal with relational database functionality.  I've talked with a few folks now, and there seems to be a fairly broad range of opinions on what "ideal" would look like with respect to managing RDBMSs with puppet.

At one extreme, there is the opinion that each RDBMS should have its own specific puppet module... so, one module for mysql, one for postgres, one for sqlite, etc.  With this approach, the possibility exists that there will be redundant code and divergent models from one module to the next.  However, depending on how much functionality the module is trying to provide, it's entirely possible that this divergence is inevitable (because the deeper you get into an individual database server, the more their implementation and behaviors differ).

An obvious upside of this approach is that module authors and contributors don't need to worry about implementation details of *other* rdbms's.  If you know mysql and you want to contribute to the mysql module, you can just do it.

At the other end of the spectrum is the idea of having a sort of uber-module for rdbms; this would contain some defined types that were meaningful across several rdbms platforms, and providers for each supported rdbms.  The main upside of this approach would be that it would be easy to switch between your choice of rdbms by simply toggling a parameter in your manifests.  Another upside would be the ability to share common code / logic where applicable (and hopefully including a lot of the test logic, meaning that all implementations would be more thoroughly and consistently tested).
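As a rough sketch of the "toggle a parameter" idea: the `database` and `database_user` types, their attributes, and the `$db_provider` variable below are hypothetical and not taken from any existing module. They only illustrate what a provider-agnostic manifest might look like.

```puppet
# Hypothetical: 'database' and 'database_user' are assumed to be types
# shipped by a unified rdbms module, each with 'mysql' and 'postgres' providers.
$db_provider = 'mysql'   # switch to 'postgres' to change the backend

database_user { 'fred':
  ensure   => present,
  provider => $db_provider,
}

database { 'appdb':
  ensure   => present,
  owner    => 'fred',
  provider => $db_provider,
  require  => Database_user['fred'],
}
```

Only the value of `$db_provider` changes between backends; the resource names and relationships stay the same.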

The biggest downside here would be that the barrier to contribution would be a bit higher; if you wanted to add a feature for a specific rdbms instance, you'd potentially need to consider how the other several rdbms providers would cope with this new feature.

There are obviously some intermediate options in between those two extremes.  At the moment, I think that I'm reasonably convinced that the barrier to contribution to the modules is the most compelling variable in this equation, and thus I'm leaning towards just keeping the modules separated and isolated... though I'm struggling with the decision to give up the possibility of re-usable test logic between the modules.

Would love to hear any ideas, suggestions, thoughts that anyone has on the topic!

Thanks
Chris

Ashley Penney

May 25, 2012, 2:49:41 PM
to puppe...@googlegroups.com
I would vote in favor of a single module with multiple providers.  As Puppet grows up and becomes more powerful I think this model is going to become more common anyway - once you move beyond basic modules you start relying on more and more functions and "real code".  The people who are likely to contribute to community modules like these are the people who either already use these more advanced features or will be interested in moving in that direction anyway.  I think the number of people who would contribute to a per-database module but not a provider is relatively small, and it is better to provide the best solution rather than the one that snares the most people.

I'm mostly basing my experiences on the puppetlabs-firewall module.  I recently attempted to add pkttype to the provider and was able to get it to the stage that the tests pass and the code seems to work.  Anyone who knows me from #puppet knows that if I can stumble my way through this then so can the majority of puppet users.

The other reason I vote in favor of a generic database module is because it's much easier to support in other forge modules.  Nothing is more difficult than trying to make a module that works with three different database modules, all of which have different names for similar functionality as they've become closely entwined with their database of choice.

We don't have different package{} resources for each kind of package and the same logic applies here.  The more we can abstract away, the easier it is to build modules that can be used in a variety of environments without telling end users "Well sorry, you'll have to go through and replace all the database stuff with the way the postgres module handles adding users."
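For comparison, a minimal sketch of how the existing package type already behaves; the package names are just examples. One resource type covers several packaging systems, and the provider is either chosen by platform or set explicitly.

```puppet
# Same resource type; the default provider is picked per platform (yum, apt, ...).
package { 'openssh-server':
  ensure => installed,
}

# Same type with the provider chosen explicitly; only that attribute differs.
package { 'rake':
  ensure   => installed,
  provider => 'gem',
}
```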

So yes, it's definitely harder and puppetlabs will probably end up carrying more of the load of building this module up front than you would with database specific ones, but the benefits are definitely worth it.

--
You received this message because you are subscribed to the Google Groups "Puppet Developers" group.
To post to this group, send email to puppe...@googlegroups.com.
To unsubscribe from this group, send email to puppet-dev+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-dev?hl=en.

Chris Price

May 25, 2012, 2:59:02 PM
to puppe...@googlegroups.com
Thanks so much for the reply, Ashley!

It's interesting that you brought up the example of the "package" provider.  I brought that one up internally as well, because it seems like it has some of the same challenges.  The package type currently has some parameters that are only relevant for specific providers, and to some degree it feels to me like that lessens the luster of the unified type, because you might still end up with the same problem that you're describing where if you need to migrate from one provider to another, you may still be forced to tweak your manifests to account for the provider-specific details.
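To make the provider-specific parameters concrete, here is a hedged sketch. The package names and file paths are invented for illustration; `source`, `adminfile` and `responsefile` are attributes of the package type, but they are only meaningful to some providers.

```puppet
# rpm provider: 'source' is needed because rpm installs from a given file
# and does not fetch packages from a repository on its own.
package { 'examplepkg':
  ensure   => installed,
  provider => 'rpm',
  source   => '/tmp/examplepkg-1.0-1.x86_64.rpm',
}

# Solaris 'sun' provider: adminfile is Solaris-specific, and responsefile is
# honoured by only a few providers. All paths here are hypothetical.
package { 'EXMPpkg':
  ensure       => installed,
  provider     => 'sun',
  source       => '/var/spool/pkg/EXMPpkg.pkg',
  adminfile    => '/var/sadm/install/admin/puppet',
  responsefile => '/var/tmp/EXMPpkg.response',
}
```

Moving the second resource to another provider would mean dropping or replacing the Solaris-only attributes, which is the manifest tweaking described above.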

I guess the flip side of that coin is that without the unified type/model, you are *guaranteed* to need to deal with this implementation-specific stuff; whereas *with* the unified type, you are only *probably* going to have to deal with provider-specific stuff.  :)

Thoughts?

Thanks again for the helpful feedback.

Ashley Penney

May 25, 2012, 3:26:58 PM
to puppe...@googlegroups.com
That's true, there are definitely some specific differences between the different providers for packages that you have to tweak if you're moving from one to another.  It'll be the same for the firewall ones; some properties just won't apply to other kinds of firewalls.  I guess the biggest difference is that you're still using the same resource names and it's really just properties that change, which are easier to deal with.  It's easier to deal with a few properties via variables/hiera than it is to deal with wildly different resources.

I'm thinking of the difference between:

database { 'blah':
  ensure => present,
  property_mysql => blah,
}

and:

database { 'blah':
  ensure => present,
  property_postgres => blah2,
}

Versus something like:

mysql::database::create { 'name':
  owner => 'fred',
  require => Mysql::Database::Create_user['fred'],
}

And:

postgres::create_db { 'name':
  user => 'fred',
  require => Postgres::User::Create['fred'],
}

Those kinds of differences between class names make it significantly harder to deal with.  These probably weren't even great examples.  If I use some real examples from my puppet::server class we have:

  @@mysql::rights { "puppetmaster-$hostname":
    ensure   => present,
    database => 'puppet',
    user     => 'puppet',
    password => "$puppet::params::database_password",
    host     => "$::fqdn",
    tag      => "$puppet::params::database_server",
  }

  @@mysql::rights { "puppetmaster-$ipaddress":
    ensure   => present,
    database => 'puppet',
    user     => 'puppet',
    password => "$puppet::params::database_password",
    host     => "$::ipaddress",
    tag      => "$puppet::params::database_server",
  }

  @@mysql::database {"puppet-$hostname":
    ensure   => present,
    database => 'puppet',
    tag      => "$puppet::params::database_server",
  }

In the postgres module we have:

postgres::createuser {'name':
  password => 'blah',
}

and:

postgres::createdb { 'name':
  owner => 'name',
}

I don't even know if it has an equiv to mysql::rights.  You can see how different these are and how much they'd benefit from hiding behind a single interface in terms of being able to create users/databases/rights in other modules.
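A minimal sketch of what "hiding behind a single interface" could look like with the defines quoted above. The `rdbms::db` define and its `backend` parameter are hypothetical, and the parameters passed to `mysql::database` and `postgres::createdb` are assumed from the snippets in this message rather than checked against either module.

```puppet
# Hypothetical wrapper define; other modules would declare rdbms::db and
# never mention the backend-specific defines directly.
define rdbms::db (
  $owner,
  $backend = 'mysql'
) {
  case $backend {
    'mysql': {
      # Ownership is not set here; in the mysql module quoted above,
      # access is granted separately through mysql::rights.
      mysql::database { $name:
        ensure   => present,
        database => $name,
      }
    }
    'postgres': {
      postgres::createdb { $name:
        owner => $owner,
      }
    }
    default: {
      fail("rdbms::db: unsupported backend '${backend}'")
    }
  }
}

# Usage: the calling module only deals with one resource name.
rdbms::db { 'puppet':
  owner   => 'puppet',
  backend => 'postgres',
}
```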

Chris Price

May 25, 2012, 6:08:40 PM
to puppe...@googlegroups.com
Thanks, that's helpful too.  Out of curiosity, are you using modules for the mysql/pg stuff that you've copied above?  Or is it something that you built up from scratch?  If it's from scratch, can you provide some insight as to what was lacking from existing modules that kept you from using them?

Ashley Penney

May 25, 2012, 6:29:22 PM
to puppe...@googlegroups.com
I'm using existing ones.  The biggest reason I never shifted to puppetlabs-mysql was the lack of replication handling.  The one we're using is a little clumsy in that it requires augeas but it let us easily override various settings.  (The postgres one is currently unused).

We have a class that does:

  $serverid = hiera('mysql_server-id')

  Augeas["my.cnf/performance"] {
    changes => [
     "set mysqld/key_buffer 384M",
     "set mysqld/max_allowed_packet 40M",
     "set mysqld/table_cache 512",
     "set mysqld/sort_buffer_size 2M",
     "set mysqld/read_buffer_size 2M",
     "set mysqld/read_rnd_buffer_size 8M",
     "set mysqld/net_buffer_length 8K",
     "set mysqld/myisam_sort_buffer_size 64M",
     "set mysqld/thread_cache_size 8",
     "set mysqld/query_cache_size 32M",
     "set mysqld/thread_concurrency 8",
     "set mysqldump/max_allowed_packet 40M",
     "set isamchk/key_buffer 256M",
     "set isamchk/sort_buffer_size 256M",
     "set isamchk/read_buffer 2M",
     "set isamchk/write_buffer 2M",
     "set myisamchk/key_buffer 256M",
     "set myisamchk/sort_buffer_size 256M",
     "set myisamchk/read_buffer 2M",
     "set myisamchk/write_buffer 2M",
     "set mysqld/innodb_thread_concurrency 8",
     "set mysqld/innodb_file_per_table 1",
     "set mysqld/innodb_additional_mem_pool_size 16M",
     "set mysqld/innodb_buffer_pool_size 2000M"
    ]
  }

  Augeas["my.cnf/replication"] {
    changes => [
      "set server-id $serverid",
    ]
  }

So the two biggest things we were looking for were "an easy way to customize my.cnf settings per role or host without massive amounts of hiera work" and "a better way to handle replication".  At the time we looked, the official mysql module lacked both of those abilities.
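For readers unfamiliar with the `Augeas["..."] { ... }` form used above: it is Puppet's resource-override syntax, which is normally used from a class that inherits from the one declaring the resource. The sketch below assumes hypothetical class names (`mysql_base`, `mysql_dbserver`), an assumed `context` path and an assumed default value; whether the `mysqld/...` paths resolve depends on the Augeas lens the module ships.

```puppet
# Hypothetical base class standing in for the third-party mysql module.
class mysql_base {
  augeas { 'my.cnf/performance':
    context => '/files/etc/mysql/my.cnf',
    changes => [
      'set mysqld/key_buffer 16M',
    ],
  }
}

# Role-specific class: inherits the base and overrides only 'changes'.
class mysql_dbserver inherits mysql_base {
  Augeas['my.cnf/performance'] {
    changes => [
      'set mysqld/key_buffer 384M',
      'set mysqld/max_allowed_packet 40M',
    ],
  }
}

include mysql_dbserver
```

Each role or host can get its own inheriting class, so the tuning lives in manifests rather than in a large hiera hierarchy.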

David Schmitt

May 27, 2012, 4:18:30 PM
to puppe...@googlegroups.com
On 2012-05-25 19:44, Chris Price wrote:
> I'm surveying the landscape of existing puppet modules that have to do
> with relational database functionality. I've talked with a few folks
> now and it seems like there is a fairly broad range of opinions on what
> "ideal" would look like w/rt managing rdbms with puppet.

Hi,

I believe that rdbms are complex enough they warrant implementing both
layers separately: One module for each implementation that provides for
fine-grained control over all available features and tunings, as well as
a common integration module that provides a simple, unified interface.

The first is needed because I need all the knobs for tuning my system.

The second is needed because I just want to say "class { 'zenoss':
database => 'postgres' }" and have it work.

Using the first set of modules in the common integration module ensures
that tuning is still possible.
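A rough sketch of the two layers, under stated assumptions: `postgres::server` and `mysql::server` stand in for whatever the per-implementation modules would provide, and the body of the `zenoss` class is invented for illustration.

```puppet
# Hypothetical simple layer: the application class only exposes a backend choice.
class zenoss (
  $database = 'postgres'
) {
  case $database {
    # The implementation-specific modules do the real work.
    'postgres': { include postgres::server }
    'mysql':    { include mysql::server }
    default:    { fail("zenoss: unsupported database '${database}'") }
  }
}

# Simple case: pick a backend and accept the defaults.
class { 'zenoss':
  database => 'postgres',
}
```

Anyone who needs the tuning knobs would go to the per-implementation module directly instead of through the unified interface.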


Common code can always be placed into a base module, used by everyone.


Best Regards, David

Chris Price

May 29, 2012, 2:38:37 PM
to puppe...@googlegroups.com
Thanks, David.  I think that something along these lines is where we will likely end up at some point.  One question that has come up a few times, though, is whether or not to package the implementation-specific stuff as separate modules or as part of a single, larger module.

We will definitely need to draw a line between simple, common functionality and db-specific functionality, but there are a few different options with respect to packaging.  If we package the basic functionality as one module, and then the db-specific functionality as separate modules, then we end up with bits of our mysql support in two separate places, bits of our postgres support in two separate places, etc.

An alternative would be to package the "common" stuff and the db-specific stuff all in one larger module together; then everything lives in the same place, but you do get into some other possible issues with clutter, module size, etc.

It's starting to feel like the best way to address this may be to do some work to improve our postgres capabilities--as a separate module for the time being--and bring them into alignment with our current mysql stuff; if we do this with an eye towards bringing the common parts together at some point, hopefully the overlap will become more obvious and we can take the appropriate next steps afterwards.

Thanks a ton for the input... if anyone has any additional thoughts they would be welcomed!



David Schmitt

May 30, 2012, 2:46:15 AM
to puppe...@googlegroups.com
On Tue, 29 May 2012 11:38:37 -0700, Chris Price <ch...@puppetlabs.com>
wrote:
> Thanks, David. I think that something along these lines is where we will
> likely end up at some point. One question that has come up a few times,
> though, is whether or not to package the implementation-specific stuff as
> separate modules or as part of a single, larger module.

I think a single _project_ is appropriate. That is, the same set of
people, a single repo, a single tracker. This enables lock-step changes
across the complete set of classes. I also think that separate modules,
even if generated from the same repo, are the way to go for people who only
want the plumbing, but not the porcelain.



Best Regards, David

Chris Price

May 30, 2012, 9:33:10 PM
to puppe...@googlegroups.com
Interesting idea.  I don't think we've done much experimentation with a single-vcs-repo => multiple modules setup.  Do you draw a firm distinction between modules and classes here?  E.g., if you could elect to only use certain (pg-specific or mysql-specific) classes from within a single module, your plumbing vs. porcelain issue would be satisfied, would it not?





David Schmitt

May 31, 2012, 3:27:40 AM
to puppe...@googlegroups.com
On 31.05.2012 03:33, Chris Price wrote:
> Interesting idea. I don't think we've done much experimentation with a
> single-vcs-repo => multiple modules setup. Do you draw a firm
> distinction between modules and classes here? E.g., if you could elect
> to only use certain (pg-specific or mysql-specific) classes from within
> a single module, your plumbing vs. porcelain issue would be satisfied,
> would it not?

The last problem would be autoloading/naming. Classes from a common
rdbms module would all have to be prefixed with the module name to
make autoloading work.
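To illustrate the naming constraint: the autoloader maps a class name to a file path under the module's manifests directory, so every class in a single module must start with that module's name. The module name `rdbms` and the class names below are hypothetical; each class would live in its own file, shown in the comment above it.

```puppet
# modules/rdbms/manifests/init.pp
class rdbms { }

# modules/rdbms/manifests/mysql/server.pp
class rdbms::mysql::server { }

# modules/rdbms/manifests/postgres/server.pp
class rdbms::postgres::server { }

# Usage from a node definition or another module:
include rdbms::postgres::server
```

With separate modules the same class could simply be `postgres::server`; in a combined module the `rdbms::` prefix is unavoidable.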

But this is only a minor nit and shouldn't keep you from doing what you
think is right.


Best Regards, David
