Status of Data in modules


Eric Sorenson

Oct 11, 2013, 2:09:23 PM
to puppe...@googlegroups.com, puppet...@googlegroups.com

Thanks to everyone who kicked the tires on the experimental data in modules feature included in Puppet 3.3.0. We got a lot of feedback, some cool proof-of-concept modules, and a definitive conclusion to the experiment.

The idea of including a module-specific hiera backend is centered around one primary use case: replacing the 'params class pattern', a common idiom in Puppet modules that's described in the [Using Parameterized Classes][param-classes] guide. The problem that most testers ran into though is that for non-trivial modules they ended up having to re-implement the Puppet DSL logic encoded in their params.pp in convoluted, non-obvious ways. The solutions to this led to more contortions until we'd ended up with the ability to execute parser functions in the right-hand-side of a yaml value. So something which started out trying to help separate data from code ended up putting code back into data!

Additionally, even after multiple attempts to simplify the surface area and user experience with the bindings system (described in ARM-9) that underlay the data-in-modules implementation, users still found its complexity daunting. There are some important bits of scaffolding (like an actual type system for Puppet!) that will prove valuable as more of the future parser and evaluator work that Henrik is building makes its way into the product, but in the final analysis the data in modules feature was the wrong vehicle to introduce them.

Refocusing on the problems users were trying to solve (and here I have to give shout-outs to Ashley Penney for his [puppetlabs-ntp][] branch and the dynamic duo of Spencer Krum/William van Hevelingen for their [startrek][] module) and the problems with the 'params' pattern lent some clarity. We've gotten into a situation of disparity with regard to hiera and data bindings, because data bindings enable module _users_ to use their site-wide hiera data but don't provide module _authors_ the same affordance. But rather than introduce additional complexity, we can close the gap for existing code patterns.

So the proposed solution at this point is:
- enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.
- remove the user-facing '--binder' functionality
- fix known problems with the hiera-puppet lookups ([Redmine 15746][15746], namely, but if there are others that are important to you please speak up!)

To show how this would work, I'll rework the ['smart parameter defaults' example][param-classes] I linked above, with my commentary behind `##` comments:

    # /etc/puppet/modules/webserver/manifests/params.pp

    class webserver::params {    ## nothing changes here...
      $packages = $operatingsystem ? {
        /(?i-mx:ubuntu|debian)/        => 'apache2',
        /(?i-mx:centos|fedora|redhat)/ => 'httpd',
      }
      $vhost_dir = $operatingsystem ? {
        /(?i-mx:ubuntu|debian)/        => '/etc/apache2/sites-enabled',
        /(?i-mx:centos|fedora|redhat)/ => '/etc/httpd/conf.d',
      }
    }

    # /etc/puppet/modules/webserver/manifests/init.pp

    class webserver(    ## inheritance is gone, and
      $packages,        ## data bindings look up the defaults
      $vhost_dir        ## as webserver::params::vhost_dir
    ) {

      package { $packages: ensure => present }

      file { 'vhost_dir':
        path   => $vhost_dir,
        ensure => directory,
        mode   => '0750',
        owner  => 'www-data',
        group  => 'root',
      }
    }

    # /etc/puppet/manifests/site.pp

    node default {
      class { 'webserver': }    ## no params needed, they're in hiera
    }

    ## then in one of my site-wide hiera layers, I can override
    ## the value without modifying the module or class declaration

    # /etc/puppet/hieradata/snowflake.domain.com.yaml
    webserver::vhost_dir: '/some/other/dir'

This way the module author (who probably has the most work to do and needs the expressiveness of the DSL) can provide default data, but site administrators can still override it using mechanisms they're already using.
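
To make the proposed resolution order concrete, here is a minimal sketch in Python (it is a model, not Puppet's implementation). It assumes the order is: a value given in the class declaration, then site-wide hiera under 'classname::variable', then the variable of the same name in the module's params class. The function name `resolve_parameter` and the dictionaries standing in for hiera and params.pp are hypothetical.

```python
def resolve_parameter(class_name, param, declared, site_hiera, params_class):
    """Return (value, source) for one class parameter.

    Order modeled here (an assumption based on the proposal text):
      1. value passed explicitly in the class declaration
      2. site-wide hiera key 'classname::param'
      3. variable of the same name in the module's params class
    """
    if param in declared:
        return declared[param], "declaration"
    key = f"{class_name}::{param}"
    if key in site_hiera:
        return site_hiera[key], "site hiera"
    if param in params_class:
        return params_class[param], f"{class_name}::params"
    raise KeyError(f"no value for {key}")


# Values that params.pp would compute on a Debian host in the example.
params_class = {"packages": "apache2", "vhost_dir": "/etc/apache2/sites-enabled"}
# The override from snowflake.domain.com.yaml.
site_hiera = {"webserver::vhost_dir": "/some/other/dir"}

print(resolve_parameter("webserver", "vhost_dir", {}, site_hiera, params_class))
print(resolve_parameter("webserver", "packages", {}, site_hiera, params_class))
```

In this model the site-wide override wins for `vhost_dir`, while `packages` falls through to the module author's default.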

Note too that this is the next iteration, not necessarily the end state. It's super important to get this right because the whole community is going to have to live with it for a long time; those of you out here on the bleeding edge willing to risk some skin to make something awesome are critical to making that happen.


Eric Sorenson - eric.s...@puppetlabs.com - freenode #puppet: eric0
puppet platform // coffee // techno // bicycles


[puppetlabs-ntp]: https://github.com/apenney/puppetlabs-ntp/tree/data-in-modules
[startrek]: https://github.com/pro-puppet/puppet-module-startrek
[param-classes]: http://docs.puppetlabs.com/guides/parameterized_classes.html#appendix-smart-parameter-defaults
[15746]: https://projects.puppetlabs.com/issues/15746

Nan Liu

Oct 11, 2013, 2:50:08 PM
to puppet-dev, puppet...@googlegroups.com
On Fri, Oct 11, 2013 at 1:09 PM, Eric Sorenson <eric.s...@puppetlabs.com> wrote:

> Thanks to everyone who kicked the tires on the experimental data in modules feature included in Puppet 3.3.0. We got a lot of feedback, some cool proof-of-concept modules, and a definitive conclusion to the experiment.

Thanks for sending a summary.
 
> The idea of including a module-specific hiera backend is centered around one primary use case: replacing the 'params class pattern', a common idiom in Puppet modules that's described in the [Using Parameterized Classes][param-classes] guide. The problem that most testers ran into though is that for non-trivial modules they ended up having to re-implement the Puppet DSL logic encoded in their params.pp in convoluted, non-obvious ways. The solutions to this led to more contortions until we'd ended up with the ability to execute parser functions in the right-hand-side of a yaml value. So something which started out trying to help separate data from code ended up putting code back into data!
>
> Additionally, even after multiple attempts to simplify the surface area and user experience with the bindings system (described in ARM-9) that underlay the data-in-modules implementation, users still found its complexity daunting. There are some important bits of scaffolding (like an actual type system for Puppet!) that will prove valuable as more of the future parser and evaluator work that Henrik is building makes its way into the product, but in the final analysis the data in modules feature was the wrong vehicle to introduce them.

Yep, in trivial cases the hiera data layer can approximate the conditionals in params.pp, but I can see how the complexity ramps up rapidly.

> Refocusing on the problems users were trying to solve (and here I have to give shout-outs to Ashley Penney for his [puppetlabs-ntp][] branch and the dynamic duo of Spencer Krum/William van Hevelingen for their [startrek][] module) and the problems with the 'params' pattern lent some clarity. We've gotten into a situation of disparity with regard to hiera and data bindings, because data bindings enable module _users_ to use their site-wide hiera data but don't provide module _authors_ the same affordance. But rather than introduce additional complexity, we can close the gap for existing code patterns.
>
> So the proposed solution at this point is:
> - enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.

So this is limited to class variables only? And is this still compatible with classes that inherit a params class (to ease migration)?
Totally understand the need for a proof of concept. Should there be an experimental branch vs. a production branch (as with the Linux kernel)? I would appreciate an official notice when a final pattern is decided on for long-term support.

Thanks,

Nan 

Spencer Krum

Oct 11, 2013, 3:01:12 PM
to puppe...@googlegroups.com, puppet...@googlegroups.com
Thanks for sending this out, Eric. When will there be a release of Puppet with this functionality? I'm excited to kick the tires on it.

Thanks,
Spencer



Dan Bode

Oct 11, 2013, 3:01:19 PM
to puppe...@googlegroups.com, puppet...@googlegroups.com
On Fri, Oct 11, 2013 at 11:09 AM, Eric Sorenson <eric.s...@puppetlabs.com> wrote:


> So the proposed solution at this point is:
> - enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.

+1

Really happy to see this solved in a way that will not lead to complex migrations to Puppet 4.

Although, to play devil's advocate, two concerns:

The special nature of params as a namespace suffix:
- How do users know not to use this namespace for anything else?
- What if a user declares resources in params? Does this fail? Do they always get realized when anything else from that namespace is applied?

The magic mapping from <x>::parameter to <x>::params::parameter may be hard to grok for new users who are not already familiar with the params pattern. This is probably solvable with documentation and --debug logging, but still worth noting.
 

Chuck

Oct 11, 2013, 7:26:09 PM
to puppe...@googlegroups.com, puppet...@googlegroups.com

I see the best aspect of data in modules is that it allows the clear separation of variables per module in hiera.  This is important because module developers don't need access to a central global hiera that is "static".  For our use we need to break variables down by environment and datacenter, hiera is great for this.  And data in modules created a very clear scope of control for module authors.  We were basically treating data in modules as a distributed hiera v1 which works really well.  The only addition that would have been nice is the adding of the classname in front of all variables as you are proposing for your params.pp.

Managing a "central" hiera directory structure can be painful when you have 20 - 50 developers that need to create hiera variables.  I really think data in modules helps this considerably.  I agree that moving logic into hiera is not beneficial, as you just end up with more code.

Chuck

Oct 11, 2013, 8:12:32 PM
to puppe...@googlegroups.com, puppet...@googlegroups.com

I see hiera data in modules just being a very useful extension of the current hiera implementation.

Why:

1)  It separates out the variables my dev teams use into the modules, so that it is easier to package their changes.  Promoting their module includes all of their hiera data.

2)  In an enterprise environment this allows other modules to access that data when the modules are not totally independent.  E.g. I have an apache_module_v1 and an apache_business_unit_v2 that uses the apache module.

3)  Access controls: by adding hiera into modules, as opposed to a main hiera directory structure, the developers just need access to their modules.

4)  We version our modules so that they can all be active at the same time and the old modules only have maintenance updates.  Hiera in modules means the central hiera is not touched by the developers.

        e.g.  apache_module_v1
              apache_module_v2
              apache_module_v3

What would be nice but not necessary:

1) Defined variables automatically have the classname added to avoid global conflicts.

e.g.

Module: apache
  variable: port
  becomes global hiera:   apache::port
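
As a rough illustration of the prefixing idea above, here is a Python sketch. It does not describe an existing hiera feature; the function names are made up, and letting existing global keys win over module data is an assumption.

```python
def namespace_module_data(module_name, module_data):
    """Prefix every key in a module's local data with 'modulename::'."""
    return {f"{module_name}::{key}": value for key, value in module_data.items()}


def merge_into_global(global_data, module_name, module_data):
    """Place namespaced module data beneath the existing global data.

    Assumption: keys already present in the global data take precedence,
    so site-wide settings are never clobbered by a module.
    """
    merged = namespace_module_data(module_name, module_data)
    merged.update(global_data)
    return merged


print(namespace_module_data("apache", {"port": 80}))
print(merge_into_global({"apache::port": 8080}, "apache", {"port": 80}))
```

Because each key carries its module name, two modules can both define `port` without colliding in the global namespace.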

Alessandro Franceschi

Oct 13, 2013, 6:40:23 AM
to puppet...@googlegroups.com, puppe...@googlegroups.com
Thanks for the update Eric, very useful to understand the ongoing works on data in modules.


On Friday, October 11, 2013 9:01:19 PM UTC+2, Dan Bode wrote:



On Fri, Oct 11, 2013 at 11:09 AM, Eric Sorenson <eric.s...@puppetlabs.com> wrote:


> > So the proposed solution at this point is:
> > - enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.

> +1
>
> Really happy to see this solved in a way that will not lead to complex migrations to Puppet 4.
>
> Although, to play devil's advocate, two concerns:
>
> The special nature of params as a namespace suffix:
> - How do users know not to use this namespace for anything else?
> - What if a user declares resources in params? Does this fail? Do they always get realized when anything else from that namespace is applied?
>
> The magic mapping from <x>::parameter to <x>::params::parameter may be hard to grok for new users who are not already familiar with the params pattern. This is probably solvable with documentation and --debug logging, but still worth noting.

Yes, worth noting, but as you said, proper documentation might suffice.
Anyway, +1 from me too on the lookup to params.pp (Dan, this, plus Puppet 3 data bindings, reminds me of https://github.com/example42/puppi/blob/master/lib/puppet/parser/functions/params_lookup.rb ;-)

I'd like to add to the complexity cases, in case you haven't sorted it out yet, the situations where some of a module's params change according to user-provided values for other params.
For example, look here:
https://github.com/stdmod/puppet-elasticsearch/blob/master/manifests/init.pp#L124
The path of the configuration dir (the $dir param here) changes according to the value of the $install param,
and the code at the linked line is necessary to manage this.
I wonder how such a case could be managed with data in modules (without replicating similar logic).
 
al

Erik Dalén

Oct 14, 2013, 3:19:10 AM
to Puppet Developers, puppet...@googlegroups.com
Isn't that possible to solve by allowing one hiera value to use another hiera value for interpolation?



--
Erik Dalén

Alessandro Franceschi

Oct 14, 2013, 5:28:40 AM
to puppet...@googlegroups.com, Puppet Developers
I didn't know of this function, which actually looks interesting.
Is the hiera value usable in the same hierarchy structure?
I mean, would it be possible to have a hiera.yaml like:
---
version: 3
hierarchy:
  - category: 'osfamily'
  - category: 'operatingsystem'
  - category: '^{install}'
  - category: 'environment'
  - category: 'common'
    paths:
      - 'is_virtual/${is_virtual}'
      - 'common'

If so, then it might actually work.

jcbollinger

Oct 14, 2013, 10:16:05 AM
to puppet...@googlegroups.com


On Friday, October 11, 2013 1:09:23 PM UTC-5, Eric Sorenson wrote:

> Thanks to everyone who kicked the tires on the experimental data in modules feature included in Puppet 3.3.0. We got a lot of feedback, some cool proof-of-concept modules, and a definitive conclusion to the experiment.
>
> The idea of including a module-specific hiera backend is centered around one primary use case: replacing the 'params class pattern', a common idiom in Puppet modules that's described in the [Using Parameterized Classes][param-classes] guide.


I guess I wasn't following this closely enough to realize that getting rid of the "params" class pattern was an objective.  I thought this was a somewhat more general initiative.

[...]

> So the proposed solution at this point is:
> - enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.


Do I understand correctly that you set out to get rid of the ::params class pattern, but now you favor an approach that depends on that pattern?  Why is that better than being more general: enable an implicit lowest-priority hierarchy level for values of form 'modulename::variable', drawing on data from per-module data files such as modules/modulename/data.yaml?
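
A minimal Python sketch of that more general alternative follows. It is only a model: the function name `lookup`, the dictionaries standing in for parsed hierarchy levels, and the assumption that a module's data file stores fully qualified 'modulename::variable' keys are all hypothetical.

```python
def lookup(key, site_levels, module_data_by_name):
    """Look up a 'modulename::variable' key.

    site_levels: list of dicts, highest priority first (the site hierarchy).
    module_data_by_name: {modulename: dict parsed from that module's data file},
        consulted last as an implicit lowest-priority level.
    """
    for level in site_levels:
        if key in level:
            return level[key]
    module_name, separator, _variable = key.partition("::")
    if separator:
        module_data = module_data_by_name.get(module_name, {})
        if key in module_data:
            return module_data[key]
    raise KeyError(key)


site_levels = [
    {"webserver::vhost_dir": "/some/other/dir"},  # e.g. a per-node level
    {},                                           # e.g. a common level
]
module_data_by_name = {
    # stands in for modules/webserver/data.yaml
    "webserver": {
        "webserver::vhost_dir": "/etc/apache2/sites-enabled",
        "webserver::packages": "apache2",
    },
}

print(lookup("webserver::vhost_dir", site_levels, module_data_by_name))
print(lookup("webserver::packages", site_levels, module_data_by_name))
```

Nothing in this model is specific to a class named params; the module's data is simply the last place consulted.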

 
John

David Schmitt

Oct 15, 2013, 1:52:39 AM
to puppet...@googlegroups.com
On 14.10.2013 16:16, jcbollinger wrote:
> So the proposed solution at this point is:
> - enable an implicit data-binding lookup against the hiera-puppet
> backend for a value of 'classname::variable' in the file
> 'modules/classname/manifests/params.pp', which simplifies class
> definition and provides consistency with other hiera backends. As a
> module author, you'd still leave your logic for variables in
> params.pp, but they'd be implicitly looked up via data bindings as
> the class is declared, after consulting site-wide hiera.
>
>
>
> Do I understand correctly that you set out to get rid of the ::params
> class pattern, but now you favor an approach that depends on that
> pattern? Why is that better than being more general: enable an implicit
> lowest-priority hierarchy level for values of form
> 'modulename::variable', drawing on data from per-module data files such
> as modules/modulename/data.yaml?

AIUI, prototyping revealed that many params classes contain logic to
calculate default values that cannot be expressed in .yaml. By making the
fallback lookup check the params class, the logic can be preserved while
avoiding verbose boilerplate code like the params_lookup calls in
Alessandro's modules.


Regards, David

jcbollinger

Oct 15, 2013, 9:08:10 AM
to puppet...@googlegroups.com


I'm not saying that categorically getting rid of ::params classes is a viable target.  In fact, I don't really understand why it was ever an objective in the first place.  On the other hand, I don't see why it makes sense for Puppet to give special significance to that pattern, either.  A more general data-in-modules feature such as I describe would give users the option to avoid ::params classes in some cases, and I'm inclined to think that it would be easier to implement, to understand, and to use.


John

R.I.Pienaar

Oct 15, 2013, 9:35:00 AM
to puppet...@googlegroups.com
there are many reasons to avoid params.pp. It's *code* not *data* and it's
one file that tends to include data for many different roles/sources/uses.

You have to consider the main reasons for separating data from code in order
to understand the motivation.

When you have a params.pp you end up with stuff like this:

https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/params.pp#L28-140

No-one would call that maintainable or readable vs say having AIX.json,
Debian.json and so forth.

For a community member who wants to add support for a new OS this
simplifies things a LOT. They can see what operating systems are supported
already and they can easily add a new one by dropping a single file.

This improves the contributor life cycle significantly:

* Adding FooOS support will not break existing supported OS support.
  FooOS.json is only going to be read on FooOS machines.
* They do not have to worry about complex merge conflicts on busy modules,
  such as the ones you'd find internal to large companies, vs many team
  members editing a single params.pp
* There's no syntax and stuff to bother about; it's pretty easy to evaluate
  the data and to pre/post commit check this stuff. A contributor doesn't have
  to test extensively to ensure he didn't accidentally mess up params.pp's
  complex nested statements in some subtle manner
* In large environments, if you have strict change control etc, the previous
  points help things a lot: you can easily reason about the implications and you
  can be sure they won't affect existing systems. It's just data that will affect
  a small subset of users.

This improves the maintainer's life because:

* They can find it easier to merge new OS support because the change is contained
  in separate files and easy to evaluate
* Fewer complex merge commits and an easier, cleaner commit history
* The code is simpler and generally easier to maintain in the long term

This improves the module user's life because:

* He can just look at existing data files and know, without having to parse complex
  nested case statements, what the available overridable data is and what the keys
  would be etc.

There are more but these are the basics
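
The one-file-per-OS layout described above could be modeled roughly like this in Python. This is a sketch, not how hiera itself loads data; the function names, the key `ntp::package_name`, and the values in the data files are placeholders invented for the example.

```python
import json
import tempfile
from pathlib import Path


def load_os_data(data_dir, os_name):
    """Return the data in '<data_dir>/<os_name>.json', or {} if no such file exists."""
    path = Path(data_dir) / f"{os_name}.json"
    if not path.is_file():
        return {}
    with path.open() as handle:
        return json.load(handle)


def supported_operating_systems(data_dir):
    """List supported operating systems by checking which data files exist."""
    return sorted(path.stem for path in Path(data_dir).glob("*.json"))


# Build a throwaway data directory holding two placeholder data files.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "Debian.json").write_text(json.dumps({"ntp::package_name": "placeholder-debian"}))
(data_dir / "AIX.json").write_text(json.dumps({"ntp::package_name": "placeholder-aix"}))

print(supported_operating_systems(data_dir))
print(load_os_data(data_dir, "Debian"))
print(load_os_data(data_dir, "FooOS"))  # no FooOS.json yet, so no data
```

Adding FooOS support here means dropping in a FooOS.json; the Debian and AIX files are never read or modified in the process.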

From a Puppet system perspective it's important that this feature behave consistently
and predictably with the current more or less universally accepted standard of data
separation - hiera. And hiera is all about data; the hiera puppet backend has been
broken for ages and not been missed because it does not provide a solution that solves
the above points. And so the data separation around params.pp will not solve the real
problems either.

Data simply should not be mixed with logic - because then it becomes code again with
all the related problems.

Chuck

Oct 15, 2013, 1:13:21 PM
to puppet...@googlegroups.com
I totally agree with R.I. on this.

John Julien

Oct 15, 2013, 10:37:25 PM
to puppet...@googlegroups.com


On Tuesday, October 15, 2013 8:08:10 AM UTC-5, jcbollinger wrote:


> I'm not saying that categorically getting rid of ::params classes is a viable target.  In fact, I don't really understand why it was ever an objective in the first place.  On the other hand, I don't see why it makes sense for Puppet to give special significance to that pattern, either.  A more general data-in-modules feature such as I describe would give users the option to avoid ::params classes in some cases, and I'm inclined to think that it would be easier to implement, to understand, and to use.
>
> John


I agree with this. It seems the proposed new solution adds complexity and non-intuitive value lookups that could confuse users.  I think the data-in-modules feature sounds useful.  Perhaps as John suggests a change in scope is needed.  Instead of getting rid of params.pp the scope should be to move hierarchical data out of params.pp and leave variables derived from logic in params.pp.
 

John Julien

Oct 15, 2013, 10:45:31 PM
to puppet...@googlegroups.com, puppe...@googlegroups.com


On Friday, October 11, 2013 7:12:32 PM UTC-5, Chuck wrote:

> What would be nice but not necessary:
>
> 1) defined variable automatically have classname added to avoid global conflicts.
>
> eg.
>
> Module: apache
>   variable: port
>   becomes global hiera:   apache::port

+1
Defining a variable in a module but having its scope be global seems counterintuitive and could lead to conflicts.  If someone wants a topscope variable they should probably define it in a topscope location (ENC, centralized hiera files, etc.)

Alessandro Franceschi

Oct 16, 2013, 5:54:14 AM
to puppet...@googlegroups.com
It's difficult to disagree with such statements, and actually I do agree with all of them.

I'd just like to point out a pair of notes, not necessarily in contradiction with what you wrote:
- The params pattern was probably the best choice up to now. Data in modules probably provides an alternative and conceptually better way to do the same things, which will also have the practical benefits you pointed out well, but, to my understanding, it has some issues that have to be addressed:
-- It is more difficult to write modules, especially in some specific cases. As Eric said, we can cope with that, but it's worth considering, because complexity is never a welcome word when talking about code.
-- It's not clear, at least to me, whether some real use cases are covered, such as the ones where some class parameters or internal vars change according to the value provided to other parameters.
The install case, I think, expresses this quite clearly, and I still haven't understood if it's possible to have in the module's hiera.yaml something like this:
---
version: 3
hierarchy:
  - category: 'osfamily'
  - category: 'operatingsystem'
  - category: '^{install}' # Is this possible? Is this the correct syntax?
  - category: 'environment'
  - category: 'common'
    paths:
      - 'is_virtual/${is_virtual}'
      - 'common'

-- If a case like the above can't be expressed as "pure" hiera data, we'll have to go back to having some data in code (incidentally, I don't consider it a mortal sin, as we have been doing this in modules all the time up to now), either reverting to hiera_puppet or with the usual code gymnastics with selectors, cases and ifs.

my always naive 2c
al

R.I.Pienaar

Oct 16, 2013, 6:14:12 AM
to puppet...@googlegroups.com


----- Original Message -----
> From: "Alessandro Franceschi" <a...@lab42.it>
> To: puppet...@googlegroups.com
> Sent: Wednesday, October 16, 2013 10:54:14 AM
> Subject: Re: [Puppet Users] Re: Status of Data in modules
>
>
>
indeed, params.pp is fine in the absence of something better. The goal has to
be to separate data from code though. To ask why data in params.pp is a problem
or to ask why it's the right approach is simply to ask why hiera exists at all.

> -- Is more difficult to write modules, especially in some specific cases,
> as Eric said, we can cope with that but that's worth considering, because
> complexity is never a welcomed word when talking about code.

indeed - in cases where you have to associate some deriving logic with some
data item the temptation is there to put that logic in the data. This is a
mistake.

There's nothing wrong with a hybrid model where you have data - pure data -
in a data file and then a class similar in spirit to params.pp to take that data
and massage it and create derived data.
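
A sketch of that hybrid split, in Python for brevity rather than Puppet DSL: the data stays pure, and a separate step derives values from it. The names `derive_settings`, `config_dir`, `install` and `install_destination_dir`, and the paths used, are hypothetical stand-ins for the install case mentioned in this thread.

```python
def derive_settings(os_data, install="package", install_destination_dir=None):
    """Combine pure per-OS data with logic that depends on other parameters."""
    settings = dict(os_data)  # copy, so the pure data is never modified
    if install != "package":
        if install_destination_dir is None:
            raise ValueError(
                "install_destination_dir is required unless install is 'package'"
            )
        # Paths follow the chosen destination instead of the OS defaults.
        settings["config_dir"] = f"{install_destination_dir}/config"
    return settings


# Pure data, as it might come from a per-OS data file.
os_data = {"config_dir": "/etc/elasticsearch"}

print(derive_settings(os_data))
print(derive_settings(os_data, install="source",
                      install_destination_dir="/opt/elasticsearch"))
```

The data file never contains a conditional; the only logic lives in the deriving step, which plays the role a slimmed-down params class would.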

The missing thing here, and probably the elephant in the room, is validation
of that data. The proposed type implementation in the current thing that's
in 3.3.x is a mistake.

It's a trojan horse to get some half baked type system into Puppet via the
back door, it's at odds with everything else in Puppet and simply not the right
way to go about it, additional complexity that doesn't seem to have any place
in Puppet. As a means of providing data validation it's very naive - saying
data should be of type Integer is not enough of a validation.

We should rather approach this in a way that there be some descriptive language
that describes the data - ie. the foo::bar key has to be an Integer between 10
and 20 - this should be something a module author provides and that any data,
be it from parameterised classes, data bindings or otherwise, is subject to
this validation.
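
To illustrate what such a description of the data might look like, here is a small Python sketch. The rule format (a dict with `type`, `min` and `max`) and the function name `validate` are invented for this example and do not correspond to anything in Puppet or hiera.

```python
def validate(data, rules):
    """Return a list of error messages; an empty list means the data is valid."""
    errors = []
    for key, rule in rules.items():
        if key not in data:
            errors.append(f"{key}: missing")
            continue
        value = data[key]
        expected = rule["type"]
        # bool is a subclass of int in Python, so do not accept True/False as an int.
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if not isinstance(value, expected) or is_stray_bool:
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{key}: {value} is below the minimum of {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{key}: {value} is above the maximum of {rule['max']}")
    return errors


# The module author's description: foo::bar must be an integer between 10 and 20.
rules = {"foo::bar": {"type": int, "min": 10, "max": 20}}

print(validate({"foo::bar": 15}, rules))
print(validate({"foo::bar": 25}, rules))
print(validate({"foo::bar": "15"}, rules))
```

The same rules apply no matter where a value came from, which is the point being made: validation belongs to the key, not to the source of the data.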

> -- It's not clear , at least to me, if some real use cases are covered,
> such as the ones where some class parameters or internal vars change
> according to the value provided to other parameters.
> The install case I think is quite clear to express this, and I still
> haven't understood if it's possible to have in the module's hiera.yaml
> something like this:
>
> ---
> version: 3
> hierarchy:
> - category: 'osfamily'
> - category: 'operatingsystem'
>
> - category: '^{install}' # Is this possible? Is this the correct syntax?

hiera does support interpolation of data, yes - I think that's what you mean?
not sure where install comes from, I am guessing it's an in-scope variable?


>
> - category: 'environment'
> - category: 'common'
> paths:
> - 'is_virtual/${is_virtual}'
> - 'common'
>
>
> -- If a case like the above can't be expressed as "pure" hiera data, we'll
> have to turn back in having some data in code (incidentally I don't
> consider it a mortal sin, as we have been doing this in modules all the
> time up to now), either reverting to hiera_puppet or with the usual code
> gymnastics with selectors, cases and ifs.


It's a critical feature that's always been there - if I understand what you
mean.


R.I.Pienaar

Oct 16, 2013, 6:15:20 AM
to puppet...@googlegroups.com
Sorry - it's critical but also missing in the proposed data in module implementation,
which is an oversight.




Alessandro Franceschi

Oct 16, 2013, 7:06:40 AM
to puppet...@googlegroups.com
Yes, it is a parameter of the class. It allows users to decide how to install the class/module's application: via package, by downloading and extracting a tarball from the official site, or whatever.
When used, some of the module's parameters change (for example the paths of files), and therefore the values of these paths are no longer the operatingsystem-specific ones but depend on other variables (something like $install_destination_dir, for example).
So the issue here is: if $install == 'upstream' (retrieve the software as a zip/tarball from the upstream site and do not use the OS package) and $install_destination_dir == '/opt', for example, then the paths of the configuration files are different.
To my understanding, in order to reproduce this logic in hiera data we have to add a hierarchy level according to the value of $install and, in the relevant yaml file (or whatever datastore), interpolate $install_destination_dir in order to provide the correct $config_file_path.

Actually my sample hiera.yaml could be just:


--- 
version: 3 
hierarchy: 
  - category: 'osfamily' 
  - category: 'operatingsystem' 
  - category: 'install'  # Class scope variable; what's the correct way to reference it?
...

so maybe it's not that hard to do.
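
To make the idea concrete, here is a small Python sketch of how a hierarchy level keyed on $install, plus interpolation of $install_destination_dir inside a value, could resolve $config_file_path. It is only a model of the lookup logic, not hiera itself: the `HIERARCHY` list, the `DATA` table, the file paths and both helper functions are assumptions made up for the example, the levels are assumed to be ordered most specific first, and unknown variables are assumed to interpolate to an empty string. Only the `%{name}` token syntax is borrowed from hiera.

```python
import re

# Hypothetical hierarchy, most specific level first (an assumption of this sketch).
HIERARCHY = [
    "install/%{install}",
    "operatingsystem/%{operatingsystem}",
    "osfamily/%{osfamily}",
    "common",
]

# Hypothetical data sources, standing in for one yaml file per level.
DATA = {
    "install/upstream": {
        "config_file_path": "%{install_destination_dir}/app/conf/app.conf",
    },
    "osfamily/Debian": {"config_file_path": "/etc/app/app.conf"},
    "common": {"config_file_path": "/usr/local/etc/app.conf"},
}

TOKEN = re.compile(r"%\{([^}]+)\}")


def interpolate(text, scope):
    """Replace %{name} tokens with scope values; unknown names become ''."""
    return TOKEN.sub(lambda match: str(scope.get(match.group(1), "")), text)


def lookup(key, scope):
    """Return the first value found while walking the hierarchy top-down."""
    for level in HIERARCHY:
        source = interpolate(level, scope)
        values = DATA.get(source, {})
        if key in values:
            return interpolate(values[key], scope)
    return None


package_scope = {"install": "package", "osfamily": "Debian"}
upstream_scope = {
    "install": "upstream",
    "install_destination_dir": "/opt",
    "osfamily": "Debian",
}

print(lookup("config_file_path", package_scope))
print(lookup("config_file_path", upstream_scope))
```

With install set to 'package' no install-level source exists, so the lookup falls through to the osfamily data; with install set to 'upstream' the install level wins and the destination directory is interpolated into the path.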

jcbollinger

Oct 16, 2013, 11:23:17 AM
to puppet...@googlegroups.com


On Tuesday, October 15, 2013 8:35:00 AM UTC-5, R.I. Pienaar wrote:


Eric observes -- and this comes as no surprise to me -- that the reason it didn't work well to eliminate params.pp was that sometimes you need code to help set appropriate default parameter values.  Or at least that is the practical conclusion that arose from field testing the original approach.  I am all for pushing the actual data out of manifests, but you still need a place to hang code.  Pushing code out into the data to allow ::params classes to be dropped does not achieve the objective of separating data from code.

 
> one file that tends to include data for many different roles/sources/uses.


That's a fair observation, though a bit usage-oriented.  Nothing prevents users from creating a separate class::params class for every single class, which would at least partially address that.  In any case, the proposed new data in modules approach does nothing to address that concern -- if anything, it magnifies that problem.  If the objective is still to avoid params.pp, then why does the revised data in modules proposal provide data only to ::params classes?

 

> You have to consider the main reasons for separating data from code in order
> to understand the motivation.
>
> When you have a params.pp you end up with stuff like this:
>
>    https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/params.pp#L28-140



That sort of thing is not really the issue.  I think we pretty much all agree that pushing that out to an external data source is a win.  The issue is with cases where the rules for choosing parameter defaults are more complex than a simple switch based on one fact.  As I understand Eric, field testing showed that users wanted to be able to perform essentially arbitrary logic to compute the data provided to modules.  I agree with his conclusion that rolling such logic into the data is as bad as putting the data into ::params classes.  Thus both are needed, and ::params classes or an equivalent serve a useful continuing role.



> From a Puppet system perspective it's important that this feature behave consistently
> and predictably with the current more or less universally accepted standard of data
> separation - hiera.  And hiera is all about data, the hiera puppet backend has been
> broken for ages and not been missed because it does not provide a solution that solves
> the above points.  And so the data separation around params.pp will not solve the real
> problems either.


 
We agree on most of what you said, but it doesn't seem very responsive to the comments to which they ostensibly reply.  I am in no way arguing against the idea of the data in modules subsystem.  It is a fantastic idea, and long past due.  I am concerned, however, about the new approach Eric proposed.  I suggested a more general approach than (my understanding of) the one he described, one not tied specifically to ::params classes.  Inasmuch as you disfavor ::params classes, I would think that you would find much to like about my counterproposal.  Indeed, I think my proposal is very much like the original prototype you floated.

I do think it is a mistake to focus on eliminating all need for ::params classes as a goal of the initiative, however.  Likely most need for them can be redirected to a relatively simple data-in-modules subsystem, and that would be well, but the initiative does not fail if some need for the ::params class pattern remains.


John

R.I.Pienaar

Oct 16, 2013, 11:33:43 AM
to puppet...@googlegroups.com


----- Original Message -----
> From: "jcbollinger" <John.Bo...@stJude.org>
> To: puppet...@googlegroups.com
> Sent: Wednesday, October 16, 2013 4:23:17 PM
> Subject: Re: [Puppet Users] Re: Status of Data in modules
>
>
>
> didn't work well to eliminate params.pp was that sometimes you *need* code
> to help set appropriate default parameter values. Or at least that is the
> practical conclusion that arose from field testing the original approach.
> I am all for pushing the actual data out of manifests, but you still need a
> place to hang code. Pushing code out into the data to allow ::params
> classes to be dropped does not achieve the objective of separating data
> from code.
>
>
>
> > one file that tends to include data for many different roles/sources/uses.
> >
>
>
> That's a fair observation, though a bit usage-oriented. Nothing prevents
> users from creating a separate class::params class for every single class,
> which would at least partially address that. In any case, the proposed new
> data in modules approach does nothing to address that concern -- if
> anything, it magnifies that problem. If the objective is still to avoid
> params.pp, then why does the revised data in modules proposal provide data *
> only* to ::params classes?
>
>
>
> >
> > You have to consider the main reasons for separating data from code in
> > order
> > to understand the motivation.
> >
> > When you have a params.pp you end up with stuff like this:
> >
> >
> > https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/params.pp#L28-140
> >
> >
>
> That sort of thing is not really the issue. I think we pretty much all
> agree that pushing that out to an external data source is a win. The issue
> is with cases where the rules for choosing parameter defaults are more
> complex than a simple switch based on one fact. As I understand Eric,
> field testing showed that users wanted to be able to perform essentially
> arbitrary logic to *compute* the data provided to modules. I agree with
> his conclusion that rolling such logic into the data is as bad as putting
> the data into ::params classes. Thus both are needed, and ::params classes
> or an equivalent serve a useful continuing role.
>
>
>
> > From a Puppet system perspective it's important that this feature behave
> > consistently
> > and predictably with the current more or less universally accepted
> > standard of data
> > separation - hiera. And hiera is all about data, the hiera puppet backend
> > has been
> > broken for ages and not been missed because it does not provide a solution
> > that solves
> > the above points. And so the data separation around params.pp will not
> > solve the real
> > problems either.
> >
> >
>
> We agree on most of what you said, but it doesn't seem very responsive to
> the comments to which they ostensibly reply. I am in no way arguing
> against the idea of the data in modules subsystem. It is a fantastic idea,
> and long past due. I *am* concerned, however, about the new approach Eric
> proposed. I suggested a more general approach than (my understanding of)
> the one he described, one not tied specifically to ::params classes.
> Inasmuch as you disfavor ::params classes, I would think that you would
> find much to like about my counterproposal. Indeed, I think my proposal is
> very much like the original prototype you floated.

Your comments are good and addressed in later replies, especially related to
data mangling. This is a common problem in all languages, data almost never
arrives in the final form and all programming languages have patterns for
retrieving data, validating and mangling it. We just need to introduce
similar patterns.

I, obviously, share your concern with the current round of proposals. Data
in module querying only params.pp is literally the worst possible suggestion
one can make in this regard. It would be a massive step backward. Might as
well just go ahead and unmerge hiera if the goal is to not learn anything from
its design and incredibly wide adoption.


> I do think it is a mistake to focus on eliminating all need for ::params
> classes as a goal of the initiative, however. Likely *most* need for them
> can be redirected to a relatively simple data-in-modules subsystem, and
> that would be well, but the initiative does not fail if some need for the
> ::params class pattern remains.

yeah, as per the other replies - eliminate *storing data* in params.pp but
validate/mangle in something like params.pp. That is in the event that
no-one delivers a layer of data validation around data bindings and hiera.

Eric Sorenson

Oct 21, 2013, 9:14:59 PM
to puppet...@googlegroups.com
Another round of thanks for the replies to this thread. I apologize that almost as soon as I posted it, I got pulled off onto another project and wasn't able to follow up until now. Replies inline below, and there are probably a couple more coming to different branches (damn I miss Usenet threading!)


John Bollinger wrote:
> We agree on most of what you said, but it doesn't seem very responsive to
> the comments to which they ostensibly reply.  I am in no way arguing
> against the idea of the data in modules subsystem.  It is a fantastic idea,
> and long past due.  I *am* concerned, however, about the new approach Eric
> proposed.  I suggested a more general approach than (my understanding of)
> the one he described, one not tied specifically to ::params classes.
> Inasmuch as you disfavor ::params classes, I would think that you would
> find much to like about my counterproposal.  Indeed, I think my proposal is
> very much like the original prototype you floated.

John, I didn't see a more detailed description of what you're proposing; is this section (quoted from upthread) what you're referring to?

> Do I understand correctly that you set out to get rid of the ::params class pattern, but now you favor an approach that depends on that pattern?

Heh, well when you put it that way...
 
> Why is that better than being more general: enable an implicit lowest-priority hierarchy level for values of form 'modulename::variable', drawing on data from per-module data files such as modules/modulename/data.yaml?

If I understand this correctly this is slightly different (and probably inadequate from RI's standpoint), because it just adds another 'category' (in the ARM-9 sense) to the end of each lookup, and what RI and others propose is to have another _complete hiera invocation_ inside the module owning a class parameter's namespace at the end of each unsuccessful site-hiera lookup: a separate hiera.yaml config file with its own hierarchy defined, and a tree of data files. (params.pp does this by letting old-school puppet DSL logic determine your "hierarchy")

I also talked to a user today who wants data from modules (by doing hash key merge on a parameter's class::subclass::varname) from *any* module in the modulepath to contribute, say, sudoers rules to the sudo module from other site-written modules that require particular sudoers stanzas. So I'm trying to consider how to pull all of this together without making a O(n^n) complexity explosion.
 

RI replied:

> Your comments are good and addressed in later replies, especially related to
> data mangling.  This is a common problem in all languages, data almost never
> arrives in the final form and all programming languages have patterns for
> retrieving data, validating and mangling it.  We just need to introduce
> similar patterns.

This is really interesting, and not something that's come up so far AFAIK. It ties in somewhat to https://projects.puppetlabs.com/issues/20199 (needing a way to indicate the data type of something that's looked up implicitly with data bindings), but introduces another layer around retrieving and modifying data as it flows back towards puppet, which I hadn't considered.  That is what the "code-in-data" people are asking for, like https://github.com/puppetlabs/hiera/pull/152 that ended up with arbitrary puppet functions inside hiera curly brace expansion.  Would love thoughts on how to do that in a generally useful, lightweight way.
 
> I, obviously, share your concern with the current round of proposals.  Data
> in module querying only params.pp is literally the worst possible suggestion
> one can make in this regard.  It would be a massive step backward. Might as
> well just go ahead and unmerge hiera if the goal is to not learn anything from
> its design and incredibly wide adoption.

Oh surely there's way worse suggestions out there  :)
 
> I do think it is a mistake to focus on eliminating all need for ::params
> classes as a goal of the initiative, however.  Likely *most* need for them
> can be redirected to a relatively simple data-in-modules subsystem, and
> that would be well, but the initiative does not fail if some need for the
> ::params class pattern remains.
 
> yeah, as per the other replies - eliminate *storing data* in params.pp but
> validate/mangle in something like params.pp.  That is in the event that
> no-one delivers a layer of data validation around data bindings and hiera.


So it doesn't seem helpful to get data-bindings integration via puppet code, even as a first step?

I definitely agree hiera data in general needs a way to do validation, but the semantics you described of "requiring an integer between 10 and 20" would be additional complexity on top of Henrik's type system. (That work was foundational BTW, not specific to the data-in-modules binder as you said up-thread, so it can be reused independently of ARM-9)

--eric0

R.I.Pienaar

Oct 22, 2013, 4:37:41 AM
to puppet...@googlegroups.com


----- Original Message -----
> From: "Eric Sorenson" <eric.s...@puppetlabs.com>
> To: puppet...@googlegroups.com
> Sent: Tuesday, October 22, 2013 2:14:59 AM
> Subject: Re: [Puppet Users] Re: Status of Data in modules
>
No, I don't think it's much more work to get to a better place - as is evident
from a PR I sent that does this in about 10% of the code of what's shipped in
3.3.0.

Given appropriate data in modules the puppetlabs/ntp module removes the params.pp
entirely and becomes just this: http://p.devco.net/437/ combined with a number
of json files - one per OS.

The key point here is that the data layer provides just the data and that - as
now - the DSL layer provides validation and mangling. Not much mangling in
this example but you can see how here you'd derive data and everything would
just access $ntp::foo

This is a massive simplification and improvement over having this duplication:

https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/init.pp#L2-21

AND

https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/params.pp#L2-21

and well the whole of params.pp really.

It's clear here that there's no big hurdle in taking pure data, validating it and
deriving new data from it. We already have a DSL that's reasonably ok at this
and the new parser makes it even better at that task - though not being able to
reassign variables in the scope of a class still sucks.

In the case where you'd like the validation/mangling in a different place you
can move that to a class that is delegated for this purpose, this example is
the first thing I tried, it's not great but enough to show the pattern one
might come up with... http://p.devco.net/438

This receives the external data into the ntp class and then has a class that
takes that data, validates and mangles it and exposes the resulting data as
$ntp::model::foo. It's similar in spirit to what params.pp is today: it provides
validation etc. separate from the data and achieves the same end goal as embedding
logic in the data without doing that. Errors are richer and clearer, the complexity
of what data you can derive is greater, and overall it's just superior to trying
to embed even this simple bit of logic into the data. This is valid for both this
weird model class and the first example ( http://p.devco.net/437/ )
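
For readers who cannot see the pastes, the shape of the pattern (pure data in, validation and derived data out, all in code) can be sketched in a few lines of Python. This is not the content of either paste and not Puppet DSL: the `RAW` values, the `ModelError` class, the `build_model` function and the derived `server_lines` key are all hypothetical, chosen only to show where validation and mangling would live.

```python
# Pure data, as it might arrive from a per-OS data file (values are made up).
RAW = {
    "servers": ["0.pool.ntp.org", "1.pool.ntp.org"],
    "driftfile": "/var/lib/ntp/drift",
}


class ModelError(ValueError):
    """Raised when the raw data fails validation."""


def build_model(raw: dict) -> dict:
    """Validate raw data and return it together with derived values."""
    servers = raw.get("servers")
    if not isinstance(servers, list) or not servers:
        raise ModelError("servers must be a non-empty list")
    if not all(isinstance(server, str) and server for server in servers):
        raise ModelError("every server must be a non-empty string")

    driftfile = raw.get("driftfile", "")
    if not isinstance(driftfile, str) or not driftfile.startswith("/"):
        raise ModelError(f"driftfile must be an absolute path, got {driftfile!r}")

    return {
        "servers": servers,
        "driftfile": driftfile,
        # Derived value: computed in code, never stored in the data file.
        "server_lines": [f"server {server} iburst" for server in servers],
    }


model = build_model(RAW)
print(model["server_lines"])
```

Because the data file holds nothing but plain values, it stays readable by any other tool, while the error messages and derived values come from ordinary code that can be unit tested.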

This shows validation and data mangling that exceeds what could comfortably
be done with code inside the data. It also means existing approaches wrt unit tests
etc. remain valid. You did plan to come up with a unit test framework, linter,
parser validator etc. for your data, right, since now it would contain logic? I'd
love to see how these would look and I hope the answer to the code-in-data crowd
is not that they don't think it's needed.

So to be clear, when I said patterns for validation and mangling of data need
to be come up with, I did not say we need to invent new things in puppet, just
use puppet. Eventually, sure, we might add some schema to the data, but as a
first iteration on data in modules I think this is a suitable starting point and
would naturally progress towards data schema etc.

A bit more on the code in data model that people are saying is indispensable.
You're basically saying you could never use a database that did not have stored
procedures. The stored procedure approach has been discredited far and wide. I
won't get into it; google it, many people have written much on the subject, and
if you look at any of the big frameworks today stored procs just don't feature.
In the case of Puppet they would trap your data into a form only parsable by
Puppet. Hiera has always had the ability to be used as a gem to use your
data outside of puppet. The pluggable backend has been a major win in its
adoption because people want outside data in Puppet (something we're losing in
the current round of suggestions). Suggesting the only usable data would be
data that only Puppet could parse is quite short-sighted.

> I definitely agree hiera data in general needs a way to do validation, but
> the semantics you described of "requiring an integer between 10 and 20"
> would be additional complexity on top of Henrik's type system. (That work

correct, that's why I think bringing in the type system as part of this is a
mistake.

> was foundational BTW, not specific to the data-in-modules binder as you
> said up-thread, so it can be reused independently of ARM-9)

this has such wide implications that I do not believe enough work has been
done to see how a type system would affect the rest of puppet, so bringing
it in as part of this work is not the right time. It's an overcomplex distraction
when what puppet desperately needs is to focus on the data problem, stop
punting on it, and stop over-engineering and trying to solve 100 problems in
one go. It's not needed. The PR I sent does this in a very focussed manner,
in line with the current design and in a future-proof way.

It's no secret the future parser is far from ready for general use and the
current trajectory you're taking with data in modules is to postpone their
usefulness even further into the future till a time where the future parser
is usable, performant and bug free. Meanwhile every user of Puppet is being
held back by a module system that lacks a clear data system.

jcbollinger

Oct 22, 2013, 10:13:14 AM
to puppet...@googlegroups.com


On Monday, October 21, 2013 8:14:59 PM UTC-5, Eric Sorenson wrote:
> Another round of thanks for the replies to this thread. I apologize that almost as soon as I posted it, I got pulled off onto another project and wasn't able to follow up until now. Replies inline below, and there are probably a couple more coming to different branches (damn I miss Usenet threading!)
>
> John Bollinger wrote:
> > We agree on most of what you said, but it doesn't seem very responsive to
> > the comments to which they ostensibly reply.  I am in no way arguing
> > against the idea of the data in modules subsystem.  It is a fantastic idea,
> > and long past due.  I *am* concerned, however, about the new approach Eric
> > proposed.  I suggested a more general approach than (my understanding of)
> > the one he described, one not tied specifically to ::params classes.
> > Inasmuch as you disfavor ::params classes, I would think that you would
> > find much to like about my counterproposal.  Indeed, I think my proposal is
> > very much like the original prototype you floated.
>
> John, I didn't see a more detailed description of what you're proposing; is this section (quoted from upthread) what you're referring to?

Yes.
 

> > Do I understand correctly that you set out to get rid of the ::params class pattern, but now you favor an approach that depends on that pattern?
>
> Heh, well when you put it that way...
 


Let's also keep in mind that the purpose of the ::params class pattern is not really to serve as a per-module general data repository.  Rather, it is specifically to provide a means for indirection of class parameter defaults.  To the extent that ::params classes now do serve as data repositories, it is -- or should be -- in service to that purpose, not to a broader one.  Data in modules is a complementary, but more general, approach whereby default values expressed in DSL code can in some cases be replaced by default values drawn from per-module data.  Where data are consumed by a module in other ways or for other purposes, there is no particular reason why a ::params class should be involved.

 
> > Why is that better than being more general: enable an implicit lowest-priority hierarchy level for values of form 'modulename::variable', drawing on data from per-module data files such as modules/modulename/data.yaml?
>
> If I understand this correctly this is slightly different (and probably inadequate from RI's standpoint), because it just adds another 'category' (in the ARM-9 sense) to the end of each lookup, and what RI and others propose is to have another _complete hiera invocation_ inside the module owning a class parameter's namespace at the end of each unsuccessful site-hiera lookup: a separate hiera.yaml config file with its own hierarchy defined, and a tree of data files. (params.pp does this by letting old-school puppet DSL logic determine your "hierarchy")



I don't have any particular objection to implementing data-in-modules as a separate full-fledged lookup against a per-module fallback hierarchy, but the qualitative differences from what I suggested are subtle.  For the most part, I think it's just a question of how many levels you can or do add to the bottom of the logical hierarchy, whether it's implemented via one call to the hiera subsystem or two.  There is a difference, however, in the behavior of lookups that collected data from across the hierarchy, i.e. hiera_hash() and hiera_array().  Those aren't relevant to class parameter binding (at this point), but it is worth considering what semantics are wanted there, and whether there might be a way for the caller to choose.
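
One possible reading of that difference, sketched in Python. Nothing here is hiera code: the two data tables and all four functions are hypothetical, and the merge is a plain key-by-key hash merge in which higher-priority levels win. Variant A appends the module data as extra lowest levels of a single hierarchy; variant B runs a second, separate lookup against the module only when the site lookup finds nothing.

```python
# Hypothetical data, highest-priority level first in each list.
SITE_LEVELS = [
    {"sudo::rules": {"admins": "ALL"}},
    {"ntp::servers": ["time.example.com"]},
]
MODULE_LEVELS = [
    {"sudo::rules": {"deploy": "/usr/bin/deploy"}, "sudo::package": "sudo"},
]


def first_found(levels, key):
    """Priority lookup: return the value from the first level that has the key."""
    for level in levels:
        if key in level:
            return level[key]
    return None


def merged_hash(levels, key):
    """Merge lookup: combine hashes from every level, higher levels winning."""
    merged = {}
    for level in reversed(levels):  # lowest priority first, so later updates override
        merged.update(level.get(key, {}))
    return merged


def lookup_single(key, merge=False):
    """Variant A: module data forms the bottom levels of one hierarchy."""
    levels = SITE_LEVELS + MODULE_LEVELS
    return merged_hash(levels, key) if merge else first_found(levels, key)


def lookup_fallback(key, merge=False):
    """Variant B: consult the module only if the site lookup found nothing."""
    find = merged_hash if merge else first_found
    result = find(SITE_LEVELS, key)
    if result is not None and result != {}:
        return result
    return find(MODULE_LEVELS, key)


# Priority lookups agree between the two variants.
print(lookup_single("sudo::package"), lookup_fallback("sudo::package"))

# Merge lookups differ: only variant A sees the module's 'deploy' rule.
print(lookup_single("sudo::rules", merge=True))
print(lookup_fallback("sudo::rules", merge=True))
```

In this model a first-match lookup cannot tell the two designs apart, while a merging lookup can, which is why the wanted semantics for hiera_hash() and hiera_array() style calls would need to be decided explicitly.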

 
> I also talked to a user today who wants data from modules (by doing hash key merge on a parameter's class::subclass::varname) from *any* module in the modulepath to contribute, say, sudoers rules to the sudo module from other site-written modules that require particular sudoers stanzas. So I'm trying to consider how to pull all of this together without making a O(n^n) complexity explosion.


I'm with R.I. in suggesting that you get something solid and fundamentally sound out soon, even if it doesn't address every user request on the first go (or ever).  I understand how a confederated data source such as you now describe could be useful, but I think such a feature would require a significant effort in its own right.

Furthermore, I think you are fast approaching the point where the data subsystem cannot automagically do the right thing in every case.  I don't think it would be a sin to require some features to be explicitly declared or invoked by DSL code.  For example, perhaps you want a data access function that allows the caller to somehow specify the scope of the data to search.  Maybe a couple of releases down the line.


John


R.I.Pienaar

Oct 22, 2013, 10:46:45 AM
to puppet...@googlegroups.com


----- Original Message -----
> From: "jcbollinger" <John.Bo...@stJude.org>
> To: puppet...@googlegroups.com
> Sent: Tuesday, October 22, 2013 3:13:14 PM
> Subject: Re: [Puppet Users] Re: Status of Data in modules
>
>
>
> broader one. Data in modules is a *complementary*, but more general,
Absolutely, there will never be a single data solution that solves 100% of problems,
but luckily we have multiple approaches - hiera, ENC, node termini etc. This user
can also write his own hiera backend to achieve his goal. That's the point.

Aim for a solid fit for the 80% of problems, inform and educate through the solution
by being prescriptive, but with extension points for the 20% of users who have really
complex problems and who can carry the cost of the complexity that solving them brings.

For the rest the design of the tool informs how they approach writing manifests
and for green fields even how they build infrastructures.

There is no chance that a 100% solution will be found for the data problem, just
give up. Sometimes it's ok - as in this use case Eric mentions - to just say no
that is not in the interest of the larger % of community and we're prioritizing
shipping something over pleasing 100% of people.

Just say no, point to extension points, ship something - as long as that something
isn't impossible to extend for mortals like the ARM-9 work, because then you have to
solve 100% of the problems.

Jon Shanks

Nov 27, 2013, 5:31:32 AM
to puppet...@googlegroups.com, puppe...@googlegroups.com
Hey, 

Just to jump in at the end: I have been following the thread and looked at the implementation of data in modules, but found that the complexity surrounding it was a bit much for people who were not experienced. Also, troubleshooting issues with data, i.e. some outcome of a puppet run that didn't match what was expected, created extra overhead in trying to identify where the issue lay. We also really only want to return data for modules which are included on a host.

I therefore wrote a wrapper function which essentially utilises the hierarchy but with an addition

 - modules/%{module_name}/%{klass}

It retrieves the classes from the API, using the node indirection.

%{klass} gets defined in the scope as it iterates (ephemeral_from) and merges the data it finds. So if a module has a dependency on another module, e.g. 'foreman' has sudo rules it requires, you could place those within modules/sudo/foreman.yaml so that you are containing data relating to a module without managing different hierarchical configurations throughout your module structure.

For defaults, we defined that as modules/%{module_name}/defaults, which is the last in the hierarchy. 
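
A rough Python model of the lookup described above, for anyone trying to picture it. This is not the actual wrapper function: the `DATA` table, the rule strings and the `module_data` function are hypothetical, and the list of classes is passed in directly instead of being retrieved through the node indirection. It only shows the iteration over %{klass}, the merge, and the defaults acting as the lowest-priority source.

```python
# Hypothetical contents of the yaml files under modules/<module_name>/.
DATA = {
    "modules/sudo/foreman": {
        "sudo::rules": {"foreman": "ALL=(root) NOPASSWD: /usr/sbin/puppet"},
    },
    "modules/sudo/backup": {
        "sudo::rules": {"backup": "ALL=(root) NOPASSWD: /usr/bin/rsync"},
    },
    "modules/sudo/defaults": {
        "sudo::rules": {"root": "ALL=(ALL) ALL"},
    },
}


def module_data(module_name, key, node_classes):
    """Merge hash data for `key` from the defaults and from one source per class.

    Only classes actually present on the node contribute, so data for
    modules that are not included on the host is never returned.
    """
    defaults = DATA.get(f"modules/{module_name}/defaults", {})
    merged = dict(defaults.get(key, {}))  # defaults are the lowest priority
    for klass in node_classes:
        source = DATA.get(f"modules/{module_name}/{klass}", {})
        merged.update(source.get(key, {}))
    return merged


# A node that includes the foreman and sudo classes, but not backup.
rules = module_data("sudo", "sudo::rules", ["foreman", "sudo"])
print(sorted(rules))
```

Here the node gets the default root rule plus the foreman rule, and the backup rule stays out because the backup class is not on the node.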

This probably would have been nicer if the data sources could utilise the returned classes in the hierarchy and naturally iterate over the array, to save needing the wrapper functionality to do it.

Obviously this wouldn't solve all the problems faced with modules or the principles of the forge modules being usable easily out of the box, but it solved the issue we had in categorising data within modules. Thought I would mention our use case.

Jon


On Friday, 11 October 2013 19:09:23 UTC+1, Eric Sorenson wrote:

So the proposed solution at this point is:
- enable an implicit data-binding lookup against the hiera-puppet backend for a value of 'classname::variable' in the file 'modules/classname/manifests/params.pp', which simplifies class definition and provides consistency with other hiera backends. As a module author, you'd still leave your logic for variables in params.pp, but they'd be implicitly looked up via data bindings as the class is declared, after consulting site-wide hiera.

R.I.Pienaar

Dec 8, 2013, 5:19:37 PM
to puppet...@googlegroups.com
I have released a solution to this @ http://www.devco.net/archives/2013/12/08/better-puppet-modules-using-hiera-data.php

Need as wide as possible feedback and testing, let's move forward.

Alessandro Franceschi

Dec 8, 2013, 5:40:55 PM
to puppet...@googlegroups.com
Wow, this looks promising: sane, plain and easy to use. 
Going to test it soon.
Looking at things in perspective, how do you think this approach will go along with the one implemented in Puppet 3.3 ?

al

R.I.Pienaar

Dec 8, 2013, 7:37:54 PM
to puppet...@googlegroups.com


----- Original Message -----
> From: "Alessandro Franceschi" <a...@lab42.it>
> To: puppet...@googlegroups.com
> Sent: Sunday, December 8, 2013 10:40:55 PM
> Subject: Re: [Puppet Users] Re: Status of Data in modules
>
> Wow, this looks promising: sane, plain and easy to use.
> Going to test it soon.
> Looking at things in perspective, how do you think this approach will go
> along with the one implemented in Puppet 3.3 ?

In another thread it was mentioned by Henrik that the 3.3.x one is being removed,
which prompted me to make my code into a module.

Fabio Sangiovanni

Dec 31, 2013, 3:31:06 AM
to puppet...@googlegroups.com, puppe...@googlegroups.com
Hi everybody,

is there any news about this topic?
I think it would be great to know Puppetlabs' official position on the matter (if any), and the status of activities about it (issues on tickets.puppetlabs.com don't say much).

Thanks!

Trevor Vaughan

Jan 1, 2014, 5:45:53 PM
to puppet...@googlegroups.com, puppe...@googlegroups.com
I also would like to know if this is going to get mainlined.

It would be an excellent addition for overall module reuse.

Trevor



--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvau...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --

Fabio Sangiovanni

Jan 3, 2014, 8:11:50 AM
to puppet...@googlegroups.com, puppe...@googlegroups.com
Ok, I get it. First rule of data in modules: you don't talk about data in modules.

Eric Sorenson

Jan 3, 2014, 12:09:54 PM
to puppet...@googlegroups.com, puppe...@googlegroups.com


On Jan 3, 2014, at 5:11 AM, Fabio Sangiovanni <fsangi...@gmail.com> wrote:

> Ok, I get it. First rule of data in modules: you don't talk about data in modules.
>

Hah! No, I just don't like to reply until I have something substantial to report. Did you see this thread from a couple of weeks ago? It was only on puppet-dev so if you're following this in puppet-users you may have missed it: https://groups.google.com/d/topic/puppet-dev/f0KrpOtfKRY/discussion

Next steps out of that were that I'm pulling together a google doc, which I haven't finished yet due to holidays.

I'm excited that people are working with the module_data implementation. The main technical concern I have is forward and backward compatibility: how should a module express that it needs a particular implementation of DIM?

Fabio Sangiovanni

Jan 3, 2014, 12:23:02 PM
to puppet...@googlegroups.com, puppe...@googlegroups.com
Hi,

thanks for your answer :)
No, I'm not subscribed to puppet-dev, just to puppet-users, so I definitely missed it, my bad.
Glad to hear things are moving!

Thanks again for the follow-up and keep up the good work :)

Gregory Orange

Mar 24, 2015, 3:53:36 AM
to puppet...@googlegroups.com
I've been trying to use the module_data module, but cannot get it to
bring in values. Is there a worked example somewhere, including site.pp
and hiera.yaml?

https://github.com/zipkid/puppet3-hiera_data_in_module hasn't helped - I
get the class defaults from init.pp, not the data from the yaml files.

Cheers,
Greg.

Fraser Goffin

Apr 2, 2015, 7:56:11 AM
to puppet...@googlegroups.com
Here's a simple example that works for me :-

First, you obviously need to have the ripienaar/module_data module installed and available to your puppet apply (I'm using masterless Puppet).

My folder structure :-

- puppet
|---- modules
|------- sonarqube
|---------- data
|------------- hiera.yaml
|------------- common.yaml
|------------- osfamily
|---------------- windows.yaml
|---------------- ubuntu.yaml
|------------- version
|---------------- 4.5.yaml
|---------------- 5.0.yaml

..\puppet\modules\sonarqube\data\hiera.yaml

:hierarchy:
- "osfamily/%{::osfamily}"
- "version/%{sonarqube::sonarqube_majmin_version}"
- common

:backends:
- yaml

:yaml:
  :datadir: puppet://sonarqube/data

common.yaml :-

sonarqube::sonar_runner_version: '2.4'
sonarqube::sonar_runner_zip: "sonar-runner-dist-%{hiera('sonarqube::sonar_runner_version')}.zip"
sonarqube::sonarqube_server_hostname: 'localhost'
sonarqube::sonarqube_server_port: '9000'
sonarqube::sonarqube_database_type: 'mysql'
...

4.5.yaml :-

sonarqube::sonarqube_full_version: '4.5.2'
sonarqube::sonarqube_installer_zip: "sonarqube-%{hiera('sonarqube::sonarqube_full_version')}.zip"
...

windows.yaml :-

sonarqube::sonarqube_package_location: 'C:/puppet-installs/sonarqube'
sonarqube::sonarqube_apps_install_basefolder: 'C:/Apps'
...


Then in  ...\puppet\modules\profiles\manifests\sonarqube.pp   (I'm following the roles and profiles pattern) :-

  class {'::sonarqube':
    #sonarqube_install_path       => $sonarqube_install_path,
    #sonarqube_package_location   => $sonarqube_package_location,
    #sonarqube_installer_zip      => $sonarqube_installer_zip,
    sonarqube_majmin_version     => $sonarqube_majmin_version,
    #sonar_runner_version         => $sonar_runner_version,
    #sonarqube_proxy_host        => $sonarqube_proxy_host,
    #sonarqube_proxy_port        => $sonarqube_proxy_port
     ...
   }

Now, how you set the values you pass to the parameterised class (::sonarqube) is up to you (note: do hiera lookups only in your PROFILE class). First, you could set some or all of them explicitly, either by assigning a literal or by making a hiera call. Assigning a literal takes hiera completely out of the equation, since an explicitly passed value has the highest precedence when resolving the class params :-

$sonarqube_server_port = '9000'

or (given the above common.yaml) :-

$sonarqube_server_port = hiera('sonarqube::sonarqube_server_port')

You might take the view that you are only going to pass explicit values for those that you want to override (i.e. all others will be resolved by automatic parameter lookup). Keep in mind that the resolution order in the target class is :-

1. Explicit param value passed.
2. Automatic param lookup (Hiera ... which goes to SITE level hiera FIRST and then to MODULE level)
3. Default param value (I would recommend NOT doing that in most cases - better to catch config errors rather than the potential of a difficult to find false positive)
4. Error
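
The resolution order above can be sketched with a small class. The class name `example` and the parameter `port` are hypothetical names chosen for illustration; the step numbers in the comments refer to the list above.

```puppet
# Hypothetical class; 'example' and 'port' are illustrative names only.
class example (
  # (3) Default value: used only when steps (1) and (2) produce nothing.
  #     Remove the default and a failed lookup becomes step (4), an error.
  $port = '8080'
) {
  notice("example::port resolved to ${port}")
}

# (1) An explicit value always wins, whatever hiera contains.
class { 'example':
  port => '9000',
}

# (2) With no explicit value, declaring the class as below would instead
#     trigger an automatic lookup of the key 'example::port' in hiera.
#     A class can only be declared once, so this line stays commented out.
# include example
```

Dropping the default in step (3), as recommended, means a missing key fails the catalog compilation instead of silently falling back to '8080'.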

Not sure where your lookup is falling down; perhaps it's step (2) of the resolution order above?

HTH

Fraser.

Gregory Orange

Apr 6, 2015, 11:49:58 PM
to puppet...@googlegroups.com
Hi Fraser,

On 02/04/15 19:56, Fraser Goffin wrote:
> Here's a simple example that works for me :-
--snip--

Wonderful, thank you. Using your work plus
https://github.com/zipkid/puppet3-hiera_data_in_module/tree/master/modules/test,
I've got it working. I _think_ my only problem was that I was using a
define - behaviour is different to classes. I wonder if that is on purpose.

Greg.
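
On the define question: automatic parameter lookup applies to class parameters only, so a defined type never has its parameters resolved from hiera implicitly. A common workaround is an explicit `hiera()` call in the parameter default. The sketch below uses a hypothetical defined type `example::instance` and a made-up key; whether an explicit call also reaches module-level data depends on the backend in use, which is an assumption to verify rather than something confirmed in this thread.

```puppet
# Hypothetical defined type; 'example::instance' and its key are illustrative.
define example::instance (
  # Data bindings are not consulted for defines, so look the value up
  # explicitly and fall back to '9000' when the key is absent.
  $port = hiera('example::instance::port', '9000')
) {
  notice("instance ${title} uses port ${port}")
}

# Uses the hiera value if present, otherwise '9000'.
example::instance { 'first': }

# An explicitly passed value still takes precedence.
example::instance { 'second':
  port => '9001',
}
```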