We at Reductive Labs keep running into clients who need something like
an attribute class - that is, for a given module, they want a single
class that handles all of the variable setting and overriding, and
then they want that attribute class to be merged into all or some of
the classes in the module. (Teyo has especially been pushing me to
solve this problem.)
This looks a lot like composition, which Puppet doesn't currently
support - you include one class's behaviour in another, rather than
inheriting. This lack of composition is, I think, partially why
people keep trying to do variable inheritance and getting frustrated
that it doesn't work well.
While doing a code audit for a client this week, though, I found their
solution to this problem to be close enough that I was able to take it
almost the rest of the way. They had what amounts to a 'data'
module with a class for each class that needed external data, and then
they defined all of the attributes in that data class. That is, if
you have an 'apache' class, you'd also have a 'data::apache' class
with a bunch of attributes. Then all of your usage of those
attributes would say '$data::apache::variable'.
The big benefit of this model for them is that it allows clean change
control of all of the important configuration data (as opposed to
manifest structure and resources), and, again, it looks a lot like
composition. In their case, they branch this module for all of their
environments, but none of the other modules.
As soon as I saw it, though, I thought of a way to improve on it, so I
wrote a function that implements the idea (and a converter for their
existing data).
Basically, I created (and have attached) a simple function that knows
how to find and load a yaml file from a data directory, and it loads
that file as a hash of parameters that should be set as local
variables in the class.
For instance, say you have this class apache; you'd create this file:
data/apache.yaml
And put all of the attributes you care about in that file.
Then, in your apache class, you'd say:
    class apache {
        load_data()
        ...
    }
It would pick the right file based on the class name (although the
current function allows you to specify the class, also), load it, and
set all of the contained attributes as local variables in the class.
So, if your apache.yaml file looks like this:
---
host: myhost.com
port: 80
Then this 'load_data' call is equivalent to this Puppet code:
    class apache {
        $host = "myhost.com"
        $port = 80
    }
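
The lookup described above can be sketched in plain Ruby. This is not the
attached function: the name load_class_data and its arguments are
hypothetical, and it only returns the hash rather than setting variables in
a Puppet scope, which is left out here.

```ruby
require "yaml"

# Hypothetical helper (not the function from the original post):
# map a class name to <data_dir>/<class_name>.yaml and parse it into a hash.
def load_class_data(class_name, data_dir)
  path = File.join(data_dir, "#{class_name}.yaml")
  # File.read raises Errno::ENOENT if the data file is missing.
  data = YAML.safe_load(File.read(path)) || {}
  raise ArgumentError, "#{path} must contain a YAML mapping" unless data.is_a?(Hash)
  data
end
```

Called as load_class_data("apache", "data") with the apache.yaml shown
above, it would return a hash with the keys "host" and "port"; each pair
would then become one local variable in the class.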
The function is currently set up to support one big 'data' directory
for all of your modules. One could argue that it should instead
support a data file per module, but the benefit of this one big data
directory is that it makes it *much* easier to write sharable modules
- you extract all of your site-specific data into this data dir, and
you share the module with essentially no site data. It's probably
most reasonable to support an in-module data file and a site-wide data
directory to make it easy to provide defaults.
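
One way the in-module defaults plus site-wide directory could fit together
is sketched below. The file layout (a data.yaml inside the module, and
<class>.yaml in the site directory) and both function names are assumptions
made for illustration, as is treating a missing file as empty.

```ruby
require "yaml"

# Read a YAML mapping; a missing file is treated as "no data" (an assumption).
def read_yaml_hash(path)
  return {} unless File.exist?(path)
  YAML.safe_load(File.read(path)) || {}
end

# Hypothetical layout: <module_dir>/data.yaml holds the module's defaults,
# <site_data_dir>/<class_name>.yaml holds the site-specific values.
def class_data_with_defaults(class_name, module_dir, site_data_dir)
  defaults = read_yaml_hash(File.join(module_dir, "data.yaml"))
  site     = read_yaml_hash(File.join(site_data_dir, "#{class_name}.yaml"))
  defaults.merge(site) # site values win when both files define a key
end
```

With this shape the shared module carries only defaults, and everything
site-specific stays in the one data directory.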
I'm beginning to think that this, or a function a lot like this,
should be included directly into Puppet, and data should get loaded
automatically, rather than requiring the call to the 'load_data'
function.
What do you think?
--
Fallacies do not cease to be fallacies because they become fashions.
--G. K. Chesterton
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Interesting..
Three questions about this:
1. Could these variables be overridden by a node variable? I assume no.
2. What happens if the data file is missing?
3. Could there be optional and mandatory variables?
This solves variables but does not solve the other issue I mentioned,
about files or templates that are customized per site install.
>
> I'm beginning to think that this, or a function a lot like this,
> should be included directly into Puppet, and data should get loaded
> automatically, rather than requiring the call to the 'load_data'
> function.
Hmm, are there cases where you wouldn't want the data to load?
-L
> Three questions about this:
> 1. Could these variables be overridden by a node variable? I assume
> no.
As currently implemented, no, but it would be easy to make it so they
could be - just don't set a given variable if it's already set.
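
A minimal sketch of that "don't set it if it's already set" rule, using
plain hashes to stand in for the variables in scope. The function name is
hypothetical and nothing here touches Puppet's real scope objects.

```ruby
# scope_vars: variables already set (for example, by the node).
# loaded:     values read from the data file.
# Returns a new hash in which existing variables are never overwritten.
def set_unless_defined(scope_vars, loaded)
  loaded.each_with_object(scope_vars.dup) do |(name, value), vars|
    vars[name] = value unless vars.key?(name)
  end
end
```

So a node that already sets "port" keeps its own value, while "host" is
still filled in from the data file.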
The thing to keep in mind here is that this is a bare proof of
concept; I'm much more interested in how it should behave than how it
does.
>
> 2. What happens if the data file is missing?
Kerplow, at this point.
Really, I see there being a minimum of one, probably two, and quite
possibly more, data files per class. I'd expect any module that used
this to ship with a default data file, but then most sites would
override the data in that file with a separate data file.
Where you could get into even more data files per class is if you
wanted to provide branching on platforms or something similar.
Following our previous apache class, you could have:
data/apache.yaml
data/apache-debian.yaml
data/apache-rhel.yaml
...
I don't love this, but it can make some things easier. It could
especially help in calculating whether a given module supports a given
platform. Although, of course, you'd almost definitely still need
some per-platform modifications in a module (e.g., when you need to
install two packages instead of one).
Most likely, this should be done as a search path - don't hard code
that you're looking for operating system, but allow a module to
specify that it should search os, os version, then hardware;
basically, any fact or series of facts.
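
The search-path idea could look roughly like this. The function name, the
"<class>-<value>.yaml" naming (taken from the file list above), and the
fact names used in the usage note are assumptions; whether the files found
are merged or the first match wins is left open.

```ruby
# facts:  hash of fact name => value for the current host.
# search: ordered list of fact names the module wants to branch on.
# Returns candidate data files, the generic file first, then one per fact.
def data_file_candidates(class_name, data_dir, facts, search)
  names = [class_name]
  search.each do |fact|
    value = facts[fact]
    next if value.nil? || value.to_s.empty? # skip facts this host lacks
    names << "#{class_name}-#{value.to_s.downcase}"
  end
  names.map { |name| File.join(data_dir, "#{name}.yaml") }
end
```

For example, a search list of ["operatingsystem", "hardwaremodel"] on a
host whose operatingsystem fact is "Debian" and which has no hardwaremodel
value yields data/apache.yaml followed by data/apache-debian.yaml.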
>
> 3. Could there be optional and mandatory variables?
Well, at this point your module would need to handle that; all this
function does is load data.
This clearly points to being able to declare, discover, and define
class attributes, so you'd think that it would have something like
that. But I'd say that's getting pretty far ahead of ourselves at
this point.
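
Until something like declared attributes exists, the check a module would
have to do for itself is small. This is only a sketch; the function name
and the idea of passing a list of required names are assumptions, not part
of the proposed function.

```ruby
# data:     hash loaded from the data file(s).
# required: attribute names the module cannot work without.
# Returns the data unchanged, or raises naming every missing attribute.
def check_required(data, required)
  missing = required.reject { |name| data.key?(name) }
  unless missing.empty?
    raise ArgumentError, "missing required attributes: #{missing.join(', ')}"
  end
  data
end
```

Anything not in the required list is simply optional, and the module is
free to fall back to its own default for it.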
>
> This solves variables but does not solve the other issue I mention
> about files or templates that are custom per site install.
I think you've convinced me that a given module should search all of the
module path directories, rather than just the first found directory.
This would allow you to have your site module dir in front of the dist
module dir and override templates or files just by putting them in the
earlier path.
>> I'm beginning to think that this, or a function a lot like this,
>> should be included directly into Puppet, and data should get loaded
>> automatically, rather than requiring the call to the 'load_data'
>> function.
>
> Hmm, are there cases where you wouldn't want the data to load?
Me? No. Someone? Assuredly. Will I support that? I highly doubt
it. Variety is the spice of thousands of filed bugs.
--
It is curious that physical courage should be so common in the world and
moral courage so rare. -- Mark Twain
Could you explain in detail how this would work? This does not currently
work in Puppet, does it?
-L
Currently Puppet searches through your module path for a name, and the
first directory that has a subdirectory with that name is considered
to be the entire module; any other modules with the same name in later
directories will get ignored by Puppet. This ignoring is why you
can't override actual files.
My proposed change would have Puppet no longer ignore those later
modules - you could have your dist modules in a later search path,
your site module in an early search path, and Puppet would prefer
files from the module earlier in the path.
That work?
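
As a rough illustration of the proposed behaviour (not of what Puppet does
today), a per-file lookup across the whole module path might look like
this. The function name and the example paths are hypothetical.

```ruby
# module_path: ordered list of module directories, site dir first.
# Returns the first matching file under <dir>/<module_name>/<relative>,
# checking every directory instead of stopping at the first module found.
def find_module_file(module_path, module_name, relative)
  module_path.each do |dir|
    candidate = File.join(dir, module_name, relative)
    return candidate if File.file?(candidate)
  end
  nil # no copy of the file exists anywhere in the path
end
```

With a path of ["site", "dist"], a template present in both
site/apache and dist/apache resolves to the site copy, while a file that
exists only in dist/apache is still found.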
--
I happen to feel that the degree of a person's intelligence is directly
reflected by the number of conflicting attitudes she can bring to bear
on the same topic. -- Lisa Alther
A big +1 from me. I'd love to see this - though probably on a
module-by-module basis rather than globally.
Regards
James Turnbull