Roles and profiles dissent


Chris Southall

Aug 1, 2019, 4:01:44 PM
to Puppet Users
Our site is using a collection of puppet modules to manage various Linux components using the roles and profiles model.  While it works OK for the most part, I often find it necessary to update a module or profile for some reason or other.  Modules obtained from puppet forge sometimes don't quite do what is needed, and writing good quality modules on your own can be a challenge.  When using roles and profiles you end up declaring all the module parameters again to avoid losing functionality and flexibility.  You
also need to be familiar with all the classes, types, and parameters from all modules in order to use them effectively.  

To avoid all of the above, I put together the 'basic' module and posted it on the forge:  https://forge.puppet.com/southalc/basic

This module uses the hiera_hash/create_resources model for all the native puppet (version 5.5) types, using module parameters that match the type (exceptions for metaparameters, per the README).  The module also includes the 'file_line' type from puppetlabs/stdlib, the 'archive' type from puppet/archive, and the local defined type 'binary', which together provide a simple and powerful way to create complex configurations from hiera.  All module parameters default to an empty hash and also have a merge strategy of 'hash' to enable a great deal of flexibility.  With this approach I've found it possible to replace many single purpose modules, and it's much faster and easier to get the results I'm looking for.
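As a rough sketch of what driving the module from data looks like (the parameter names here are assumed from the convention described in the README, so treat them as hypothetical), the hiera data might be:

```yaml
# Hypothetical hiera data for the 'basic' module: one hash per native
# type, keyed by resource title, hash-merged across the hierarchy.
basic::package:
  htop:
    ensure: installed
basic::file:
  /etc/motd:
    ensure: file
    content: "Managed by puppet\n"
basic::service:
  sshd:
    ensure: running
    enable: true
```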

Yes, the hiera data can become quite large, but I find it much easier to manage data in hiera than coding modules with associated logic, parameters, templates, etc.  Is this suitable for hyper-scale deployment?  Maybe not, but for a few hundred servers with a few dozen configuration variants it seems to work nicely.  Is everyone else using puppet actually happy with the roles/profiles method?

Luke Bigum

Aug 1, 2019, 5:23:28 PM
to Puppet Users
Hi Chris,

Quite a similar question was posted about two weeks back, you might find that very interesting:



On Thursday, 1 August 2019 17:01:44 UTC+1, Chris Southall wrote:
Our site is using a collection of puppet modules to manage various Linux components using the roles and profiles model.  While it works OK for the most part, I often find it necessary to update a module or profile for some reason or other.  Modules obtained from puppet forge sometimes don't quite do what is needed, and writing good quality modules on your own can be a challenge. 


There was another recent post about using Forge modules or importing the Puppet code into a personal Git repository directly:


If you are a confident Puppet Coder, you might prefer to import the source, patch the module to add your feature, then submit the patch back upstream.

 
When using roles and profiles you end up declaring all the module parameters again to avoid losing functionality and flexibility.

... Not sure I agree with that statement.  That sounds odd.  Why would you be re-declaring module parameters if you're not changing something from the defaults?  And if you are intending to change something, then of course you are supplying different parameters?

 
You also need to be familiar with all the classes, types, and parameters from all modules in order to use them effectively.

Ideally the README page of a module would contain amazing user level documentation of how the module should work... but not that many do.  I often find I have to go read the Puppet code itself to figure out exactly what a parameter does.
 
To avoid all of the above, I put together the 'basic' module and posted it on the forge:  https://forge.puppet.com/southalc/basic

Ok :-) I'm beginning to see what the core of your problem is.  The fact that you've created your own module to effectively do create_resources() hash definitions says to me that you haven't quite grasped the concepts of the Role / Profile design pattern.  I know I have a very strong view on this subject and many others will disagree, but personally I think the Role / Profile pattern and the "do-everything-with-Hiera-data" pattern are practically incompatible.

This module uses the hiera_hash/create_resources model for all the native puppet (version 5.5) types, using module parameters that match the type (exceptions for metaparameters, per the README).  The module also includes the 'file_line' type from puppetlabs/stdlib, the 'archive' type from puppet/archive, and the local defined type 'binary', which together provide a simple and powerful way to create complex configurations from hiera.  All module parameters default to an empty hash and also have a merge strategy of 'hash' to enable a great deal of flexibility.  With this approach I've found it possible to replace many single purpose modules, and it's much faster and easier to get the results I'm looking for.

A Hiera-based, data-driven approach will always be faster to produce a "new" result (just like writing Ansible YAML is faster to produce than Puppet code)...  It's very easy to brain dump configuration into YAML and have it work, and that's efficient up to a certain point.  For your simple use cases, yes, I can completely see why you would be looking at the Role Profile pattern and saying to yourself "WTF for?".  I think the tipping point of which design method becomes more efficient directly relates to how complicated (or how much control) you want over your systems.

The more complicated you go, the more I think you will find that Hiera just doesn't quite cut it.  Hiera is a key value store.  You can start using some neat tricks like hash merging, you can look up other keys to de-duplicate data... When you start to model more and more complicated infrastructure, I think you will find that you don't have enough power in Hiera to describe what you want to describe, and that you need an imperative programming language (eg: if statements, loops, map-reduce).  The Puppet DSL is imperative.
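As a sketch of that point (the data layout and names here are invented, and the firewall type is the one from puppetlabs-firewall), deriving resources from data with filter/each is trivial in the DSL but has no equivalent in plain key-value Hiera:

```puppet
# Hypothetical sketch: derive firewall rules from service data.
# filter/each logic like this cannot be expressed in Hiera alone.
$services = lookup('profile::app::services', Array[Hash], 'unique', [])

$services.filter |$svc| { $svc['public'] }.each |$svc| {
  firewall { "100 allow ${svc['name']} on ${svc['port']}":
    dport  => $svc['port'],
    proto  => 'tcp',
    action => 'accept',
  }
}
```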

Yes, the hiera data can become quite large, but I find it much easier to manage data in hiera than coding modules with associated logic, parameters, templates, etc.  Is this suitable for hyper-scale deployment?  Maybe not, but for a few hundred servers with a few dozen configuration variants it seems to work nicely.  Is everyone else using puppet actually happy with the roles/profiles method?

 If you are only making small-to-medium changes to a standard operating system, and/or your machines are short-lived cloud systems that get thrown away after half an hour, then sure, a Hiera-only approach will work fine at the scale you are suggesting.

I also think team size and composition is a big factor.  If I was in a team of one or two people I'm sure I'd be saying "Yeah! Hiera! I can change anything really really easily!".  If I was in a team of a dozen engineers geographically spread across the world with vastly different levels of Puppet knowledge I think I'd be saying "Oh god... Everything's in Hiera... It's so easy for someone to mess up. What on earth has someone changed now".  If you haven't guessed already, I've been here before.

Personally I think the most useful part of the Role Profile design pattern is the encapsulation of implementation details behind business-specific Profiles.  Jesus, what a mouthful.  How about "hiding away all the details behind an interface that makes sense to me and my team"?

Best demonstrated with a real life example we use here...


The above is the Profile for an LMAX "Statistics Collection Server".  A statistics collection server collects statistics.  If someone wants to collect statistics, all they have to do is put:

   include ::profile::statistics::collection_server

Somewhere in a node definition and set *AT MOST* nine Hiera parameters for that Profile.  That's the real win - an LMAX statistics collection server has only 9 parameters that can be changed.  They don't really have to understand exactly what goes into building a Statistics Collection Server if they don't want to (in practice they might need to browse the code to check what a parameter does though, because we are lazy and don't document our Profiles).

If you go read that profile in detail you'll see I pull in several component modules: Puppetlabs Apache, Influxdb, a private LVM module that's a wrapper for Puppetlabs' LVM, Grafana, and Chronograf.  Apache (with SSL) is set up to proxy Grafana and Chronograf.  Our LVM module creates the file system underneath Influx before it is installed.  Most of the parameters to the component modules are hard coded, and this is a great thing because it means every single one of our Statistics Collection Servers is exactly the same.  I even pull in a (private) Nagios module to define monitoring resources, so when one of my Engineers uses that profile they get the monitoring _automatically_.

I count 81 parameters to component modules in that Profile, so that would be at least 81 lines of Hiera needed to reproduce that functionality in YAML (and even then, good luck ensuring that the LVM disk is there before Influx is installed).  I have condensed that to 9 possible parameters where I think someone should legitimately change something.  Otherwise, you use my defaults, and that keeps things the same, reducing entropy across our estate.  Yes, writing this profile took a lot longer than doing it in YAML, but our engineers shouldn't need to "figure out" how to build an InfluxDB server ever again.
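The profile itself isn't reproduced in the thread, but a heavily trimmed, hypothetical sketch of the shape being described might look like this (all names invented; lvm::logical_volume is the defined type from puppetlabs/lvm):

```puppet
# Hypothetical sketch of a profile with a tiny public interface.
# Most component-module parameters are hard coded; only what a user
# may legitimately change is exposed.
class profile::statistics::collection_server (
  String  $vhost_name   = 'stats.example.com',
  Integer $data_size_gb = 100,
) {
  # The file system must exist before InfluxDB is installed.
  lvm::logical_volume { 'influxdb':
    volume_group => 'data',
    size         => "${data_size_gb}G",
    mountpath    => '/var/lib/influxdb',
  }
  -> class { 'influxdb': }

  class { 'apache':
    default_vhost => false,
  }
  # Apache proxies Grafana behind SSL (details omitted).
  include grafana
}
```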

Another big win for me: testing.  I can write puppet-rspec unit tests for the above Profile to make sure that if someone tries to change it, they don't break any existing functionality.  Our current workflow has our engineers committing onto branches and creating Merge Requests in our private GitLab. All tests must pass before they can merge code to Master.  They usually get notified within minutes if something they've pushed hasn't passed tests.

You can do testing of Hiera-defined infrastructure, however all approaches I've read about seem awfully cumbersome and wasteful.  I won't rant about that today.

So tell me, how did I go at convincing you? :-)

-Luke

Rob Nelson

Aug 1, 2019, 11:59:15 PM
to Puppet Users
I agree with everything Luke said, but would also like to point out 2 other techniques that are useful:

1) create_resources() is a bit of a kludge left over from puppet 3. Starting in puppet 4 (and 3’s future parser), iteration was added. Instead of create_resources('user', $some_hash), you would say $some_hash.each |$title, $options| {} and create each resource inside the block. You can still use hiera to get the hash as an automatic parameter lookup on the class, but the creation of resources is a bit more explicit.

2) you also get the chance to define defaults, which means users don’t necessarily have to provide everything! Create a $defaults hash and assign it plus the defined overrides as (say for a user) user {$title: * => $defaults + $options}. This merges the options and defaults and applies the resulting hash as the parameters and values for the resource. You can keep your hiera tidier by creating sane defaults and only specifying the overrides in hiera. Have a new default? Modify the code once and all the resources in hiera benefit from it, unless they explicitly override it.

A practical example of this might be creating local users on nodes without access to a central auth mechanism, maybe in staging. In your code you create some defaults:

  $defaults = {
    ensure => present,
    password_max_age => 90,
    shell => '/bin/fish',
  }

Your hiera might look like:

profile::linux::local_users:
  rnelson0:
    password: 'hash1'
    groups:
    - wheel
    password_max_age: 180
  root:
    password: 'hash2'
    password_max_age: 9999
  lbigum:
    ensure: absent

In your code, you iterate over the class parameter local_users and combine your defaults with the specific options:

  $local_users.each |$title, $options| {
    user { $title:
      * => $defaults + $options,
    }
  }

Now my user is created, root’s password is changed and set to basically never expire, and Luke’s account is deleted if it exists.

This is a good way to combine the power of hiera with the predictability of puppet DSL, maintain unit and acceptance tests, and make it easy for your less familiar puppet admins to manage resources without having to know every single attribute that's required (or even available), all without going too far down the road of recreating a particular well known CM system. It’s always a bit of a balancing act, but I find this is a comfortable boundary for me and one that my teammates understand.

There's a lot more power to iteration that can be found in the puppet documentation, and particularly in this article by RI that I still reference frequently: https://www.devco.net/archives/2015/12/16/iterating-in-puppet.php

Hope that helps.

Chris Southall

Aug 3, 2019, 1:03:29 AM
to Puppet Users
Hi Luke.  Thanks for a thoughtful and detailed response.

Quite a similar question was posted about two weeks back, you might find that very interesting:

I saw this, and have been kicking around the idea leading to this post

If you are a confident Puppet Coder, you might prefer to import the source, patch the module to add your feature, then submit the patch back upstream.

This is likely part of my problem.  I am not a confident puppet coder, probably closer to barely competent.
  
When using roles and profiles you end up declaring all the module parameters again to avoid losing functionality and flexibility.

... Not sure I agree with that statement.  That sounds odd.  Why would you be re-declaring module parameters if you're not changing something from the defaults?  And if you are intending to change something, then of course you are supplying different parameters?

Let's say a module has 10 parameters and supplies defaults for most of them.  When writing a profile you have to choose how many of the class parameters can remain defaults, how many to override, and how many to expose as profile parameters.  It sounds fine to limit the number of parameters at the profile, right up until you hit an edge case that doesn't work with the default values and the parameter you need to change now requires a profile update...
 
You also need to be familiar with all the classes, types, and parameters from all modules in order to use them effectively.

Ideally the README page of a module would contain amazing user level documentation of how the module should work... but not that many do.  I often find I have to go read the Puppet code itself to figure out exactly what a parameter does.

Ditto on the documentation.  Some modules are better than others, and of course you can review the manifests, but with my admitted weakness in Puppet DSL it's not always immediately apparent to me what some classes are doing.
 
To avoid all of the above, I put together the 'basic' module and posted it on the forge:  https://forge.puppet.com/southalc/basic

Ok :-) I'm beginning to see what the core of your problem is.  The fact that you've created your own module to effectively do create_resources() hash definitions says to me that you haven't quite grasped the concepts of the Role / Profile design pattern.  I know I have a very strong view on this subject and many others will disagree, but personally I think the Role / Profile pattern and the "do-everything-with-Hiera-data" pattern are practically incompatible.

I'd like to think I grasp the roles/profiles concept, but am just not convinced it's a better approach.  Abstracting away configuration details and exposing a limited set of parameters results in uniform configurations.  In doing so it also seems it limits flexibility and ensures that you'll continue to spend a good deal of time maintaining your collection of profiles/modules.
 
This module uses the hiera_hash/create_resources model for all the native puppet (version 5.5) types, using module parameters that match the type (exceptions for metaparameters, per the README).  The module also includes the 'file_line' type from puppetlabs/stdlib, the 'archive' type from puppet/archive, and the local defined type 'binary', which together provide a simple and powerful way to create complex configurations from hiera.  All module parameters default to an empty hash and also have a merge strategy of 'hash' to enable a great deal of flexibility.  With this approach I've found it possible to replace many single purpose modules, and it's much faster and easier to get the results I'm looking for.

A Hiera-based, data-driven approach will always be faster to produce a "new" result (just like writing Ansible YAML is faster to produce than Puppet code)...  It's very easy to brain dump configuration into YAML and have it work, and that's efficient up to a certain point.  For your simple use cases, yes, I can completely see why you would be looking at the Role Profile pattern and saying to yourself "WTF for?".  I think the tipping point of which design method becomes more efficient directly relates to how complicated (or how much control) you want over your systems.

A number of people I've talked to like Ansible because of the easy learning curve and great time-to-results.
 
The more complicated you go, the more I think you will find that Hiera just doesn't quite cut it.  Hiera is a key value store.  You can start using some neat tricks like hash merging, you can look up other keys to de-duplicate data... When you start to model more and more complicated infrastructure, I think you will find that you don't have enough power in Hiera to describe what you want to describe, and that you need an imperative programming language (eg: if statements, loops, map-reduce).  The Puppet DSL is imperative.

Speaking of hiera tricks, I created an exec resource with the command defined as a multi-line script to include variables and function declarations.  I use this to collect data and create local facts.  The next puppet run creates additional resources based on the presence of these facts.  This is basically the same as creating a module with external facts, but doesn't require a module.  An upside is that the fact script doesn't need to execute on every puppet agent run, with the downside being that the host takes a second puppet run to create all resources.  I'm not sure if I should be proud or ashamed of what I did, but it works!
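A sketch of that trick (the script, command names, and fact name are hypothetical; /etc/facter/facts.d is facter's standard external-facts directory):

```puppet
# Hypothetical sketch: an exec whose command is an inline script that
# writes an external fact, which is then consumed on the *next* run.
exec { 'collect_myapp_fact':
  command  => @(SCRIPT),
    appver=$(/usr/local/bin/myapp --version 2>/dev/null)
    echo "myapp_version=${appver}" > /etc/facter/facts.d/myapp.txt
    | SCRIPT
  provider => shell,
  creates  => '/etc/facter/facts.d/myapp.txt',
}
```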
 
Yes, the hiera data can become quite large, but I find it much easier to manage data in hiera than coding modules with associated logic, parameters, templates, etc.  Is this suitable for hyper-scale deployment?  Maybe not, but for a few hundred servers with a few dozen configuration variants it seems to work nicely.  Is everyone else using puppet actually happy with the roles/profiles method?

 If you are only making small-to-medium changes to a standard operating system, and/or your machines are short-lived cloud systems that get thrown away after half an hour, then sure, a Hiera-only approach will work fine at the scale you are suggesting.

In my situation, we have a Red Hat Satellite serving as a Puppet ENC.  We also use automated OS provisioning where we can add custom facts.  With this we're able to target groups of hosts with specific hiera data, creating a fairly comprehensive configuration.
 
I also think team size and composition is a big factor.  If I was in a team of one or two people I'm sure I'd be saying "Yeah! Hiera! I can change anything really really easily!".  If I was in a team of a dozen engineers geographically spread across the world with vastly different levels of Puppet knowledge I think I'd be saying "Oh god... Everything's in Hiera... It's so easy for someone to mess up. What on earth has someone changed now".  If you haven't guessed already, I've been here before.

This may be the greatest factor to influence the decision.  In my case we have 2 people working with puppet, and the system we're building is to be handed over to team with little to no puppet experience.  This system runs at a single site with only a couple hundred managed nodes and maybe a couple dozen unique configurations. 
 
Personally I think the most useful part of the Role Profile design pattern is the encapsulation of implementation details behind business-specific Profiles.  Jesus, what a mouthful.  How about "hiding away all the details behind an interface that makes sense to me and my team"?

While I like the idea of automating business processes, I dislike the idea of hard-coding them.  Business processes change, and the code base will be constantly chasing.  Given the choice, I'd rather change data than code. 

Best demonstrated with a real life example we use here...


The above is the Profile for an LMAX "Statistics Collection Server".  A statistics collection server collects statistics.  If someone wants to collect statistics, all they have to do is put:

   include ::profile::statistics::collection_server

Somewhere in a node definition and set *AT MOST* nine Hiera parameters for that Profile.  That's the real win - an LMAX statistics collection server has only 9 parameters that can be changed.  They don't really have to understand exactly what goes into building a Statistics Collection Server if they don't want to (in practice they might need to browse the code to check what a parameter does though, because we are lazy and don't document our Profiles).

If you go read that profile in detail you'll see I pull in several component modules: Puppetlabs Apache, Influxdb, a private LVM module that's a wrapper for Puppetlabs' LVM, Grafana, and Chronograf.  Apache (with SSL) is set up to proxy Grafana and Chronograf.  Our LVM module creates the file system underneath Influx before it is installed.  Most of the parameters to the component modules are hard coded, and this is a great thing because it means every single one of our Statistics Collection Servers is exactly the same.  I even pull in a (private) Nagios module to define monitoring resources, so when one of my Engineers uses that profile they get the monitoring _automatically_.

I count 81 parameters to component modules in that Profile, so that would be at least 81 lines of Hiera needed to reproduce that functionality in YAML (and even then, good luck ensuring that the LVM disk is there before Influx is installed).  I have condensed that to 9 possible parameters where I think someone should legitimately change something.  Otherwise, you use my defaults, and that keeps things the same, reducing entropy across our estate.  Yes, writing this profile took a lot longer than doing it in YAML, but our engineers shouldn't need to "figure out" how to build an InfluxDB server ever again.

Thanks for sharing your example.  I won't argue that roles/profiles isn't a great way to standardize deployments.  I guess my complaint is that it takes so much investment to get there.
 
Another big win for me: testing.  I can write puppet-rspec unit tests for the above Profile to make sure that if someone tries to change it, they don't break any existing functionality.  Our current workflow has our engineers committing onto branches and creating Merge Requests in our private GitLab. All tests must pass before they can merge code to Master.  They usually get notified within minutes if something they've pushed hasn't passed tests.
You can do testing of Hiera-defined infrastructure, however all approaches I've read about seem awfully cumbersome and wasteful.  I won't rant about that today.
 
When I looked into writing rspec tests it was a bit daunting.  Much worse than beginning with Puppet DSL.  My puppet tests generally consist of applying my changes locally and observing results.  Cumbersome and wasteful are the correct terms.

So tell me, how did I go at convincing you? :-)

Well, you have caused me some guilt that maybe I've taken the easy way out rather than becoming more proficient with puppet.  Once you've had that first hit and instant high from the hiera crack pipe... it's hard not to go back.

Rob Nelson

Aug 3, 2019, 3:09:01 AM
to puppet...@googlegroups.com
On Fri, Aug 2, 2019 at 9:03 PM Chris Southall <sout...@gmail.com> wrote:


Let's say a module has 10 parameters and supplies defaults for most of them.  When writing a profile you have to choose how many of the class parameters can remain defaults, how many to override, and how many to expose as profile parameters.  It sounds fine to limit the number of parameters at the profile, right up until you hit an edge case that doesn't work with the default values and the parameter you need to change now requires a profile update...

Are you using “include classname” or “class { classname: }” in your manifests? It sounds like the latter, which means you’re likely to pass down class parameters which must now be exposed as parameters in your class.

The former is far more flexible. Because of hiera’s automatic parameter lookup, you could write this:

class profile::base::linux {
  include ntp
  include ssh::server
}

Pair it with this hiera data:

---
ntp::servers:
ssh::server::ciphers: "blowfish,aes-128,aes-256"

And now your profile needs to expose zero parameters, which means you don’t need to write the code to accept AND pass the parameters or even really know the modules; your ntp and security team can be the experts on that, instead.

Similarly, a profile::mysql::server class wouldn’t need to pass any parameters down, and your DBA can come up with the various key/value pairs like mysql::root_password: "changeme" to use.

Your class parameters would then be restricted to things like feature flags and site specific details (if $manage_some_feature { include some_feature } or if ($datacenter == 'onprem') { mount { '/nfs': … } }), which are things y’all would know lots about, and in turn let the subject matter experts focus on the component modules that require their expertise.
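Spelled out as a hypothetical profile (all names invented), that shape might look like:

```puppet
# Hypothetical sketch: profile parameters limited to feature flags and
# site specifics; component-module tuning stays in hiera via automatic
# parameter lookup.
class profile::base::linux (
  Boolean $manage_some_feature = false,
  String  $datacenter          = 'onprem',
) {
  include ntp
  include ssh::server

  if $manage_some_feature {
    include some_feature
  }

  if $datacenter == 'onprem' {
    mount { '/nfs':
      ensure => mounted,
      device => 'filer.example.com:/export',
      fstype => 'nfs',
    }
  }
}
```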


As far as tests go, check out puppet-retrospec. It’s a gem that will create (naive) rspec-puppet tests for existing code. It’s a good way to get started and illustrate how it works with the least amount of pain.
--
Rob Nelson

Luke Bigum

Aug 3, 2019, 9:51:58 AM
to Puppet Users

On Saturday, 3 August 2019 02:03:29 UTC+1, Chris Southall wrote:
Hi Luke.  Thanks for a thoughtful and detailed response.


You are most welcome.

 
I'd like to think I grasp the roles/profiles concept, but am just not convinced it's a better approach.  Abstracting away configuration details and exposing a limited set of parameters results in uniform configurations.  In doing so it also seems it limits flexibility and ensures that you'll continue to spend a good deal of time maintaining your collection of profiles/modules.

Absolutely.  One of the key points I never told my team was that I was enforcing a style that'd make it harder for them to change anything.  This is purely entropy reduction.  On a very slow day LMAX Exchange trades US$100,000 a second, so it's very important to me that, say, someone doesn't have the power to easily mess up the indentation on one line of YAML... thereby breaking a big hash in Hiera... which is being used to generate a long list of puppetlabs-firewall rules... which ends up removing half the firewall rules on a machine.  That of course never happened (¬_¬) ... 


Speaking of hiera tricks, I created an exec resource with the command defined as a multi-line script to include variables and function declarations.  I use this to collect data and create local facts.  The next puppet run creates additional resources based on the presence of these facts.  This is basically the same as creating a module with external facts, but doesn't require a module.  An upside is that the fact script doesn't need to execute on every puppet agent run, with the downside being that the host takes a second puppet run to create all resources.  I'm not sure if I should be proud or ashamed of what I did, but it works!

Two-pass Puppet is often hard to get away from 100%.  If it works, and your team can understand it / debug it, that's probably more important.


This may be the greatest factor to influence the decision.  In my case we have 2 people working with puppet, and the system we're building is to be handed over to team with little to no puppet experience.  This system runs at a single site with only a couple hundred managed nodes and maybe a couple dozen unique configurations. 

It sounds like your team size fits your design choice.  There's one aspect of the Role Profile pattern that relates to what you say above that I haven't talked about (because I don't do it).  It's actually one of the core principles in Craig Dunn's early presentations.  When encapsulating business design in Profiles, you create an interface for how that business deliverable can be changed (in my example, a Statistics Collection Server).  It's possible to give people outside the Puppet team the ability to configure that interface in standard ways.  The core Puppet people produce well tested Profiles that, say, the web developers consume and configure for their purposes.  The Web developers only know a little bit of Puppet but they can do basic things like change some web server settings by tweaking the parameters of the Profiles given to them by the core Puppet people.

What this looks like in practice is the web developers either having some level of access to Hiera (eg: they can write to a level of the Hierarchy that's lower priority than the Puppet team), or partial write access to a Puppet ENC.  If you are a team of two building everything right now and handing over to another team, what you might want to do in future is allow this other team a bit more self-service to make their own changes.  You of course still need to be in control of the standard build, but you're not doing every little thing for them.  This would work well in a company where various teams can spin up instances of their own cloud infrastructure.

Well, you have caused me some guilt that maybe I've taken the easy way out rather than becoming more proficient with puppet.  Once you've had that first hit and instant high from the hiera crack pipe... it's hard not to go back.

From what you've explained about your company, I think your choice of style is appropriate right now.  The only thing I can stress is don't let this become a limitation in three years when your company grows.  The cost of an operational fault is also a factor.  If it's relatively inexpensive to fail or break something, fix it, and race on, then optimising for speed of delivery makes perfect sense.

Rob's suggestions of learning the Puppet 4/5/6 DSL functions that replace create_resources() are a great starting point.  It's sometimes a hard thing to grasp (I have to re-read the Puppet Docs on each function quite often), but if you can master the map(), reduce() and each() functions, you'll learn quite a lot of data manipulation tricks. Then if it becomes more efficient for you down the line, you can begin moving some of your business logic out of Hiera and into code.
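For reference, minimal forms of the three functions Luke mentions (the values are trivial placeholders):

```puppet
$nums = [1, 2, 3, 4]

# map: transform each element => [2, 4, 6, 8]
$doubled = $nums.map |$n| { $n * 2 }

# reduce: fold a list down to a single value => 10
$sum = $nums.reduce(0) |$memo, $n| { $memo + $n }

# each: side effects per element, e.g. declaring resources
$nums.each |$n| {
  notify { "number ${n}": }
}
```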

-Luke

Chris Southall

Aug 4, 2019, 12:02:12 AM
to Puppet Users
1) create_resources() is a bit of a kludge left over from puppet 3. Starting in puppet 4 (and 3’s future parser), iteration was added. Instead of create_resources('user', $some_hash), you would say $some_hash.each |$title, $options| {} and create each resource inside the block. You can still use hiera to get the hash as an automatic parameter lookup on the class, but the creation of resources is a bit more explicit.

So you discourage use of create_resources() in favor of each().  I can get on board with that.
 
2) you also get the chance to define defaults, which means users don’t necessarily have to provide everything! Create a $defaults hash and assign it plus the defined overrides as (say for a user) user {$title: * => $defaults + $options}. This merges the options and defaults and applies the resulting hash as the parameters and values for the resource. You can keep your hiera tidier by creating sane defaults and only specifying the overrides in hiera. Have a new default? Modify the code once and all the resources in hiera benefit from it, unless they explicitly override it.

In fairness, create_resources() also lets you set defaults.
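For example, create_resources() takes an optional third argument that is applied as defaults for every resource in the hash:

```puppet
# Defaults apply to every user unless overridden per-title in $local_users
$defaults = {
  'ensure' => present,
  'shell'  => '/bin/bash',
}
create_resources('user', $local_users, $defaults)
```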
 
A practical example of this might be creating local users on nodes without access to a central auth mechanism, maybe in staging. In your code you create some defaults:

  $defaults = {
    ensure           => present,
    password_max_age => 90,
    shell            => '/bin/fish',
  }

Your hiera might look like:

profile::linux::local_users:
  rnelson0:
    password: 'hash1'
    groups:
      - wheel
    password_max_age: 180
  root:
    password: 'hash2'
    password_max_age: 9999
  lbigum:
    ensure: absent

In your code, you iterate over the class parameter local_users and combine your defaults with the specific options:

  $local_users.each |$title, $options| {
    user { $title:
      * => $defaults + $options,
    }
  }

Now my user is created, root’s password is changed and set to basically never expire, and Luke’s account is deleted if it exists.

This is a good way to combine the power of hiera with the predictability of puppet DSL, maintain unit and acceptance tests, and make it easy for your less familiar puppet admins to manage resources without having to know every single attribute required or even available in order to use them without going too far down the road of recreating a particular well known CM system. It’s always a bit of a balancing act, but I find this is a comfortable boundary for me and one that my teammates understand.

Good points and a nice example.  In the case of my basic module I'm currently using a separate create_resources line for each class parameter.  Is there a way to iterate over all class parameters using each() so I can use a single nested loop to create everything?

 
There's a lot more power to iteration, which can be found in the puppet documentation and particularly in this article by R.I. Pienaar that I still reference frequently: https://www.devco.net/archives/2015/12/16/iterating-in-puppet.php

Thanks for sharing!
 

Rob Nelson

unread,
Aug 4, 2019, 1:48:04 AM8/4/19
to Puppet Users
> Good points and a nice example. In the case of my basic module I'm currently using a separate create_resources line for each class parameter. Is there a way to iterate over all class parameters using each() so I can use a single nested loop to create everything?

You can - add an extra tier to the hash with the first level being the resource name and then create a default hash with a key for each type you use - but I simply don’t think it scales, especially once you need to merge data from multiple layers of hiera. Even the deepest merge will, to my knowledge, end up replacing and not augmenting the hash values under each key. For example:

#default.yaml
---
profile::base::all_resources:
  user:
    rnelson0: {}
    appuser: {}

#clientcert/somenode.yaml
---
profile::base::all_resources:
  user:
    localbackups: {}
  package:
    tree: {}

A deep merge will merge in the new key ‘package’, but *replace* the ‘user’ key, resulting in rnelson0 and appuser everywhere but only localbackups on node ‘somenode’. Because of this, it’s not as flexible as you’d think. You can see more detail at https://puppet.com/docs/puppet/5.0/hiera_merging.html (can’t find the 6.x link but to the best of my knowledge, it works the same).
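For what it’s worth, the nested iteration itself is simple enough with abstract resource types (a rough sketch, assuming a two-tier hash shaped like the above):

```puppet
# Outer loop over resource types, inner loop over titles.
# Resource[$type] is Puppet's abstract resource type syntax (4.9+).
$all_resources.each |$type, $resources| {
  $resources.each |$title, $attributes| {
    Resource[$type] { $title:
      * => $attributes,
    }
  }
}
```

The mechanism works; it’s the data merging and ordering that bite you.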



It also doesn’t scale because you’re writing YAML not code, as Luke suggested earlier. Testing is difficult, and troubleshooting is difficult, and ordering is even more difficult. If you want to, say, add a repo and make sure it’s managed prior to any packages, you’re gonna have to spell out the ordering in your YAML, whereas something like ‘Repo <| tag == “class” |> -> Package <| tag == “class” |>’ within a class can set that ordering only for the related resources much more easily.

The last thing I’d point out is that composition is a really good pattern, and a one-class-does-it-all is an anti-pattern to that. Doing just what you need in a series of single, small classes allows you to easily compose a desired state through a role that includes the relevant, and just the relevant, classes. Within each profile, you should be able to delineate much of the specifics, rather than dynamically determine them at runtime via a superclass.

Perhaps a question to ask is, how opinionated are your profiles, and how opinionated should they be? IMO, very, and that would probably lower the number of resources you need to dynamically define.

Chris Southall

unread,
Aug 24, 2019, 4:52:19 AM8/24/19
to Puppet Users
> Good points and a nice example.  In the case of my basic module I'm currently using a separate create_resources line for each class parameter.  Is there a way to iterate over all class parameters using each() so I can use a single nested loop to create everything?

You can - add an extra tier to the hash with the first level being the resource name and then create a default hash with a key for each type you use - but I simply don’t think it scales, especially once you need to merge data from multiple layers of hiera. Even the deepest merge will, to my knowledge, end up replacing and not augmenting the hash values under each key.
...  
A deep merge will merge in the new key ‘package’, but *replace* the ‘user’ key, resulting in rnelson0 and appuser everywhere but only localbackups on node ‘somenode’. Because of this, it’s not as flexible as you’d think. You can see more detail at https://puppet.com/docs/puppet/5.0/hiera_merging.html (can’t find the 6.x link but to the best of my knowledge, it works the same).

I thought about the extra tier to the hash approach, but decided against it due to the merge behavior.  A simple merge at the top level provides good enough flexibility and predictable results.

It also doesn’t scale because you’re writing YAML not code, as Luke suggested earlier. Testing is difficult, and troubleshooting is difficult, and ordering is even more difficult. If you want to, say, add a repo and make sure it’s managed prior to any packages, you’re gonna have to spell out the ordering in your YAML, whereas something like ‘Repo <| tag == “class” |> -> Package <| tag == “class” |>’ within a class can set that ordering only for the related resources much more easily.
 
This is more to my original point.  I'd just as soon avoid writing code and define my environment in data, although you do need to define resource dependencies explicitly this way, and testing/troubleshooting is a concern.  I've found troubleshooting to be fairly straightforward so far, although the environment is growing and complexity with it.  For testing I generally sacrifice a goat (an expendable system) to verify that my changes do what's expected before releasing to the full target audience.
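To illustrate what defining dependencies explicitly in data looks like (the parameter names here are hypothetical; 'require' is the standard resource metaparameter):

```yaml
# Hypothetical hiera data: ordering expressed per-resource via 'require'
basic::yumrepos:
  internal:
    baseurl: 'http://repo.example.com/el7'
basic::packages:
  tree:
    ensure: installed
    require: Yumrepo[internal]
```

It's more verbose than a collector in code, but the ordering is at least visible right next to the data it applies to.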

The last thing I’d point out is that composition is a really good pattern, and a one-class-does-it-all is an anti-pattern to that. Doing just what you need in a series of single, small classes allows you to easily compose a desired state through a role that includes the relevant, and just the relevant, classes. Within each profile, you should be able to delineate much of the specifics, rather than dynamically determine them at runtime via a superclass.

 
Perhaps a question to ask is, how opinionated are your profiles, and how opinionated should they be? IMO, very, and that would probably lower the number of resources you need to dynamically define.

The profiles we currently use have a significant number of parameters to customize behavior, so we do have a good amount of data in hiera.  This is what led me to think: "if I'm putting this much in hiera, why not put everything in hiera?".   I couldn't really come up with a good reason not to go this route, so I started this thread.

Since reading the reasoning here I've continued to think about this off and on and still have a hard time with the idea of hard-coding configuration.  It seems like a bit of a paradox within puppet.  When writing modules it is generally accepted to separate any configuration data from the module code, but when writing profiles go ahead and hard code as many values as possible.  I've been trained to think that separating data from code is a "Good Thing", so going counter to that makes me question my own existence.

For those who may be interested, I've re-visited my module on the forge and updated it to use proper iteration with abstract types instead of create_resources() per the previous points made in this thread.  My software comes with no warranty expressed or implied.

Chadwick Banning

unread,
Aug 24, 2019, 12:42:54 PM8/24/19
to Puppet Users
Since reading the reasoning here I've continued to think about this off and on and still have a hard time with the idea of hard-coding configuration.  It seems like a bit of a paradox within puppet.  When writing modules it is generally accepted to separate any configuration data from the module code, but when writing profiles go ahead and hard code as many values as possible.  I've been trained to think that separating data from code is a "Good Thing", so going counter to that makes me question my own existence.

Just a little more to think on -- separating data from code is a really good thing, but it's not universally a good thing. There are times when putting data in code is the right choice (for instance, a small amount of static data required by an application). In Puppet, this choice can also come down to a "pets vs. cattle" situation. I often find myself advising people to put data in a Puppet profile and NOT in Hiera precisely because the data should not change across Hiera hierarchy levels. If I have profiles for a shared service/platform in multiple operational environments, I don't want unique snowflake versions of it all across the Hiera hierarchy. Putting the data in the profile manifest ensures it stays the same for every instance of the profile.