Puppet 4 API, questions about custom functions and "data binding" in modules

Francois Lafont

Jun 11, 2015, 5:24:31 PM
to puppet...@googlegroups.com
Hi,

I'm learning Puppet 4 and especially the Puppet 4 API for custom
functions and "data binding" in modules. Currently, my sources are:

- http://puppet-on-the-edge.blogspot.fr/2015/01/puppet-40-data-in-modules-and.html
- http://puppet-on-the-edge.blogspot.be/2015/02/puppet-40-data-in-modules-part-ii.html
- https://github.com/puppetlabs/puppet-specifications/blob/master/language/func-api.md#function-api

But I still have a few questions. ;)

1. About "data binding" in a module: if I understand correctly, the
`./mymodule/lib/puppet/functions/mymodule/data.rb` function completely
replaces the "params" pattern, so the "params" pattern is no longer
needed in a module that uses the Puppet 4 API. Is that correct?

2. Still on "data binding" in a module: is it correct to say that the
`data.rb` function also replaces the default values in the declaration
of a class? For instance, with:

class mymodule (
  $param1 = 'default1',
  $param2 = 'default2',
) {
  ...
}

I can remove the default values from the class and put them in the
`data.rb` function. Is that correct?

3. In a module, if I have 2 public classes foo.pp and bar.pp, is it
possible to have 2 different `data.rb` functions, one for foo.pp and one
for bar.pp? Or maybe I can have only one `data.rb` function which
provides simultaneously the default values for foo.pp *and* bar.pp?

4. In a custom function with the Puppet 4 API, in the `dispatch` method,
is it possible to provide more complex types than 'Array' or 'String',
for instance types like "an array of strings" or "a non-empty array of
non-empty strings"?

5. In a custom function with the Puppet 4 API, is it possible to get the
value of a variable in a module or of a fact? With the Puppet 3 API,
it's possible with `lookupvar('xxx')`, but that no longer works with the
Puppet 4 API.

6. In a custom function with Puppet 4 API, what is the difference between:

call_function(:fail, "Error, blabla blabla")

and

call_function("fail", "Error, blabla blabla") ?

That's all. ;)
Thanks in advance for your help.

--
François Lafont

Henrik Lindberg

Jun 11, 2015, 11:54:30 PM
to puppet...@googlegroups.com
On 2015-11-06 19:24, Francois Lafont wrote:
> Hi,
>
> I'm learning Puppet 4 and especially the Puppet 4 API for custom
> functions and "data binding" in modules. Currently, my sources are:
>
> -
> http://puppet-on-the-edge.blogspot.fr/2015/01/puppet-40-data-in-modules-and.html

Cool, you are an early user, we expect to add more to this, and the road
may be a bit bumpy at the beginning :-)

>
> 1. About the "data binding" in a modules, if I understand well the
> `./mymodule/lib/puppet/functions/mymodule/data.rb` function replaces
> completely the "params" pattern, so that the "params" pattern is
> completely useless in a module using the Puppet 4 API. Is is correct?
>

It is an alternative to using the "params" pattern, yes. Now, since it
is possible to write the "mymodule::data" function in the puppet
language it should also be much easier and the code should look very
similar to the puppet logic for the "params" pattern.

(One more related tip: take a look at the deep case match feature when
writing logic that has many if/then/else branches and nested expressions
of that kind.)
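
As a rough illustration of the deep case match tip, here is a minimal sketch in the Puppet language. It assumes the `$facts` hash is available and that the legacy facts `osfamily` and `operatingsystemmajrelease` exist on the node; the variable name and the service names are chosen only for the example.

```puppet
# Sketch only: the fact names and service names are assumptions.
$platform = [$facts['osfamily'], $facts['operatingsystemmajrelease']]

case $platform {
  # Every element of the array must match; a regex can stand in for an element.
  ['Debian', '8']:      { $ntp_service = 'ntp' }
  ['RedHat', /^[67]$/]: { $ntp_service = 'ntpd' }
  default:              { $ntp_service = 'ntpd' }
}

notice("Selected service: ${ntp_service}")
```

A single case over an array like this can replace a nested if/else on each fact in turn.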

> 2. Always about the "data binding" in a modules, is it correct to say
> that the `data.rb` function replaces too the default values in the
> declaration of a class? For instance with:
>
> class mymodule (
> $param1 = 'default1',
> $param2 = 'default2',
> ) {
> ...
> }
>
> I can remove the default values in the class and put it in the `data.rb`
> function. Is it correct?
>
Yes, you can remove the defaults if you are sure the data function
supplies values for them, because the parameters will then get their
values automatically.

A reason to keep them is for documentation purposes; they may help
increase the understanding of how to use the class.
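
A minimal sketch of what this could look like, written in the Puppet language. It assumes the module's "data in modules" binding is already set up as described in the linked blog posts; the file locations in the comments and the `notice` call are assumptions made for the example.

```puppet
# <module>/functions/data.pp (assumed location for a Puppet-language function)
function mymodule::data() {
  {
    'mymodule::param1' => 'default1',
    'mymodule::param2' => 'default2',
  }
}

# <module>/manifests/init.pp
# No defaults here: the values are expected to come from mymodule::data().
class mymodule (
  $param1,
  $param2,
) {
  notice("param1=${param1}, param2=${param2}")
}
```

The keys returned by the function are the fully qualified parameter names, which is what lets them be bound to the class parameters.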

> 3. In a module, if I have 2 public classes foo.pp and bar.pp, is it
> possible to have 2 different `data.rb` functions, one for foo.pp and one
> for bar.pp? Or maybe I can have only one `data.rb` function which
> provides simultaneously the default values for foo.pp *and* bar.pp?
>

There is only one function that is called by the "data in modules"
framework. There is nothing stopping you from implementing it by calling
other functions and composing the result, thus modularizing the design.

Say something like:

function mymodule::data() {
  mymodule::data::classa() + mymodule::data::classb()
}

function mymodule::data::classa() {
  $prefix = "mymodule::data::classa"
  { "${prefix}::param1" => valuea1,
    "${prefix}::param2" => valuea2,
  }
}

function mymodule::data::classb() {
  $prefix = "mymodule::data::classb"
  { "${prefix}::param1" => valueb }
}

> 4. In a custom function with the Puppet 4 API, in the `dispatch` method,
> is it possible to provide more complex types than 'Array' or 'String',
> for instance is it possible to provide types like "an array of strings"
> or "an non empty array of non empty string"?
>
Yes, types have parameters for things like that: min and max lengths,
contained data types. There is a rich type system.
(See the documentation, or my blog posts on the topic).

Array[String[1]]     # array of non-empty strings; the array may be empty
Array[String]        # array of strings (possibly empty strings); the array may be empty
Array[String[1], 1]  # non-empty array of non-empty strings

Read more here:
http://puppet-on-the-edge.blogspot.se/2014/02/the-puppet-type-system-blog-posts.html

> 5. In a custom function with the Puppet 4 API, is it possible to get the
> value of a variable in a module or of a fact? With the Puppet 3 API,
> it's possible with `lookupvar('xxx')` but it works no longer with the
> Puppet 4 API.
>

Access to the content of the calling scope is a bad idea for general
purpose code - it is far better to have pure functions (they process
what they are given as arguments and return a value).

It is possible to get either the closure_scope (the scope where the
function is defined) or the calling_scope. It is, however, required to
write a so-called "system function", because the API of the system
functions is more advanced and we may have reasons to change it in
the future. It is specified, though (see the language specification, and
look at the implementation of the more advanced functions, like the
iterative functions: each etc.).

Note that closure_scope is enough to lookup fully qualified variables
(and global variables). You invoke lookupvar on the closure_scope.

e.g. if you need a variable say $::osfamily

closure_scope.lookupvar('::osfamily')

If you really, really need the calling_scope, this shows you how:
https://github.com/puppetlabs/puppet-specifications/blob/master/language/func-api.md#calling-scope-support


> 6. In a custom function with Puppet 4 API, what is the difference between:
>
> call_function(:fail, "Error, blabla blabla")
>
> and
>
> call_function("fail", "Error, blabla blabla") ?
>

You should use the string form 'fail', not the symbol form (:fail).
There is no gain using the symbol form since the search for the function
will need to split up the name based on namespace anyway.

I also noted that you posted questions on my blog - I responded there as
well:
http://puppet-on-the-edge.blogspot.se/2015/01/puppet-40-data-in-modules-and.html?showComment=1434066591043#c2269583057836628025

You asked if it was a good idea to call hiera from within a data
function. And my response there was:

"There is a new function named lookup that combines the behavior of all
hiera functions, it is also aware of data in modules. It is a bad idea
to call hiera from within a data function because your default data is
then no longer default and your module has a dependency on hiera.
Instead, you let users simply override values either in their
environment, or in their global hiera. When you lookup a key
"class_a::param1", if it is defined in hiera, it wins over the default
data provided in the module. You can use the "lookup" function though -
it always looks up using both hiera and data in modules. Say your
module_a depends on module_b, and you want some default values in
module_a to use data configured for module_b - then you can use the
lookup function in your module_a::data() function. It will get the
correct (possibly overridden) value in the user's configuration. (If you
use hiera to do the same, it will never look inside the "data in
modules/environment", since it is a singleton global implementation
across all environments and modules.) Alternatively, if your module_a is
designed to work with module_b, and both are using data functions, you
can call module_b's data function directly (as a function), and pick up
values that you use as defaults in your module_a. This way you get the
defaults from module_b (not overridden by data in the environment, nor
in hiera). This relies on module_b::data() function being considered
part of module_b's API."

I should have added that you need to be careful not to create circular
lookups (both with hiera and with data in modules).

> That's all. ;)
> Thanks in advance for your help.
>
Hope my answers will help you out - if not ask again...

- henrik

--

Visit my Blog "Puppet on the Edge"
http://puppet-on-the-edge.blogspot.se/

Francois Lafont

Jun 16, 2015, 2:52:56 AM
to puppet...@googlegroups.com
Hi,

Sorry Henrik for my late answer.

On 12/06/2015 01:54, Henrik Lindberg wrote:

> Cool, you are an early user, we expect to add more to this, and the road
> may be a bit bumpy at the beginning :-)

Generally, I'm not an "early user", but I currently need to build a new
Puppet infrastructure from scratch, and it seemed more logical to go
straight to Puppet 4.

>> 1. About the "data binding" in a modules, if I understand well the
>> `./mymodule/lib/puppet/functions/mymodule/data.rb` function replaces
>> completely the "params" pattern, so that the "params" pattern is
>> completely useless in a module using the Puppet 4 API. Is is correct?
>>
>
> It is an alternative to using the "params" pattern, yes. Now, since it
> is possible to write the "mymodule::data" function in the puppet
> language it should also be much easier and the code should look very
> similar to the puppet logic for the "params" pattern.

OK, I see. But if the data() function is written in the Puppet language,
should I change the name of the file? Should I use
`./mymodule/lib/puppet/functions/mymodule/data.pp` (.pp extension) instead
of `./mymodule/lib/puppet/functions/mymodule/data.rb`? In that case, should
I modify the content of the file `./mymodule/lib/puppet/bindings/mymodule/default.rb`?

In fact, when I just want to manipulate data, I think I prefer the
Ruby language, because the immutability of variables in the Puppet
language sometimes causes me problems (for example when I want to
merge two hashes, search a hash to retrieve a specific value, etc.).

> (One more related tip is to take a look at the deep case match feature
> when writing logic that have many if/then/else and nested such expressions).
>
>> 2. Always about the "data binding" in a modules, is it correct to say
>> that the `data.rb` function replaces too the default values in the
>> declaration of a class? For instance with:
>>
>> class mymodule (
>> $param1 = 'default1',
>> $param2 = 'default2',
>> ) {
>> ...
>> }
>>
>> I can remove the default values in the class and put it in the `data.rb`
>> function. Is it correct?
>>
> Yes you can remove the defaults if you are sure you have values for them
> in the data function - this because they will automatically get values.
>
> A reason to keep them is for documentation purposes; they may help
> increase the understanding of how to use the class.

Ok, I see.

>> 3. In a module, if I have 2 public classes foo.pp and bar.pp, is it
>> possible to have 2 different `data.rb` functions, one for foo.pp and one
>> for bar.pp? Or maybe I can have only one `data.rb` function which
>> provides simultaneously the default values for foo.pp *and* bar.pp?
>>
>
> There is only one function that is called by the "data in modules"
> framework. There is nothing stopping you from implementing it by calling
> other functions and composing the result. Thus, modularizing the design.
>
> Say something like:
>
> function mymodule::data() {
> mymodule::data::classa() + mymodule::data::classb()
> }
>
> function mymodule::data::classa() {
> $prefix = "mymodule::data::classa"
> { "${prefix}::param1" => valuea1,
> "${prefix}::param2" => valuea2,
> }
> }
> function mymodule::data::classb() {
> $prefix = "mymodule::data::classb"
> { "${prefix}::param1" => valueb }
> }

Ok, I see.

>> 4. In a custom function with the Puppet 4 API, in the `dispatch` method,
>> is it possible to provide more complex types than 'Array' or 'String',
>> for instance is it possible to provide types like "an array of strings"
>> or "an non empty array of non empty string"?
>>
> Yes, types have parameters for things like that; min, max lenghts,
> contain data types. There is a rich type system.
> (See the documentation, or my blog posts on the topic).
>
> Array[String[1]] # array of non empty strings, array may be empty
> Array[String] # array of strings (possibly empty string), -"-
> Array[String[1], 1] # non empty array of non empty strings
>
> Read more here:
> http://puppet-on-the-edge.blogspot.se/2014/02/the-puppet-type-system-blog-posts.html

Ah, OK. So I can use the same syntax as in the parameters of a Puppet
class declaration. That's cool and avoids some painful checking in the
code. :)
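
For illustration, a minimal sketch of those type expressions used on class parameters. The class name and the parameter names are hypothetical.

```puppet
# Hypothetical class; the names are made up for the example.
class mymodule::users (
  Array[String[1], 1] $names,         # non-empty array of non-empty strings
  Array[String]       $aliases = [],  # any array of strings, may be empty
) {
  $names.each |$name| {
    notice("user: ${name}")
  }
}
```

With the types declared on the parameters, a value such as an empty array for `$names` should be rejected at compilation without any hand-written validation code.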

>> 5. In a custom function with the Puppet 4 API, is it possible to get the
>> value of a variable in a module or of a fact? With the Puppet 3 API,
>> it's possible with `lookupvar('xxx')` but it works no longer with the
>> Puppet 4 API.
>>
>
> Access to the content of the calling scope is a bad idea for general
> purpose code - it is far better to have pure functions (they process
> what they are given as arguments and return a value).

OK, I understand the "function" paradigm, but if it's only in order
to *read* a fact, for instance to read the 'lsbdistcodename' fact
and put it in a variable, is it really a bad idea? I could add a parameter
to my function, `fct(a, b, lsbdistcodename)`, but if I can avoid this and just
have `fct(a, b)` and get the 'lsbdistcodename' fact value inside the
function (just for reading), where is the problem in this specific case?

> It is possible to get either the closure_scope (the scope where the
> function is defined), or the calling_scope. It is however required to
> write a so called "system function", this because the API of the system
> functions is more avanced, and that we may have reasons to change it in
> the future. It is specified though (see language specification, an look
> at the implementation of the more advanced functions (like the iterative
> functions; each etc.)
>
> Note that closure_scope is enough to lookup fully qualified variables
> (and global variables). You invoke lookupvar on the closure_scope.
>
> e.g. if you need a variable say $::osfamily
>
> closure_scope.lookupvar('::osfamily')

Ok, that will be perfect for me. :)
In my mind, I just want to use facter or global variables, just for
reading, in the body of the custom function and typically use the
arguments of the function for the other types of variables.

> If you really, really need the calling_scope, this shows you how:
> https://github.com/puppetlabs/puppet-specifications/blob/master/language/func-api.md#calling-scope-support
>
>
>> 6. In a custom function with Puppet 4 API, what is the difference between:
>>
>> call_function(:fail, "Error, blabla blabla")
>>
>> and
>>
>> call_function("fail", "Error, blabla blabla") ?
>>
>
> You should use the string form 'fail', not the symbol form (:fail).
> There is no gain using the symbol form since the search for the function
> will need to split up the name based on namespace anyway.

Ok. It's recorded. ;)

> I also noted that you posted questions on my blog - I responded there as
> well:
> http://puppet-on-the-edge.blogspot.se/2015/01/puppet-40-data-in-modules-and.html?showComment=1434066591043#c2269583057836628025

Yes, sorry again for the duplication.

> You asked if it was a good idea to call hiera from within a data
> function. And my response there was:
>
> "There is a new function named lookup that combines the behavior of all
> hiera functions, it is also aware of data in modules. It is a bad idea
> to call hiera from within a data function because your default data is
> then no longer default and your module has a dependency on hiera.

Ok, I see, the `data()` function is just for "standard" default values.

> Instead, you let users simply override values either in their
> environment, or in their global hiera. When you lookup a key
> "class_a::param1", if it is defined in hiera, it wins over the default
> data provided in the module.

Maybe I'm wrong (and if so I would be very interested to hear the arguments),
but I don't really like this pattern of "class_a::param1" keys in hiera. I
prefer organizing my data in hiera in separate logical themes:

- a "network" theme (i.e. a hash entry in hiera),
- a "snmp" theme (a hash entry in hiera too),
- a "ntp" theme (a hash entry in hiera too),
- etc.

so that my data in hiera doesn't necessarily match the parameters
of my modules (no "class_a::param1" key in hiera). In fact, I do not
want hiera data that is tied to the structure of the parameters of
the modules; I prefer to organize the structure of my hiera data
independently of the module parameters. I find that, with the pattern
of "class_a::param1" keys, sometimes (often) the same data is needed
by "class_a" and by "class_b", so that "class_a::data_foo" and
"class_b::data_foo" would hold a duplicated value. I know that
interpolation is possible in hiera, but I prefer to organize my data
in separate logical themes. Of course, then, when a module retrieves
data from different "themes" in hiera, I need to build a new hash that
matches the structure of the parameters of the module, and I often
need a Ruby function to build that new hash, which is a merge of data
from the "network" theme, the "snmp" theme, etc. Sometimes the
construction of the new hash is more complex than a simple merge.

Am I wrong? Should I use the "class_a::param1" keys pattern
instead, and use hiera interpolation when I have the same
value in "class_a::param1" and in "class_b::param1" (to avoid
duplication)? In fact, I have no firm opinion on this point.

> You can use the "lookup" function though -
> it always looks up using both hiera and data in modules. Say your
> module_a depends on module_b, and you want some default values in
> module_a to use data configured for module_b - then you can use the
> lookup function in your module_a::data() function.

You mean: I can use `lookup('module_b::param1')` in the module_a::data()
function. Is it correct?

> It will get the
> correct (possibly overridden value) in the user's configuration. (If you
> use hiera to do the same, it will not ever lookup inside the "data in
> modules/environment" since it is a singleton global implementation
> across all environments and modules.

Ok, I think I have understood. But, in the example you describe here,
you are always in the "class_a::param1"-keys pattern. Correct?

> Alternatively, if you module_a is
> designed to work with module_b, and both are using data functions you
> can call module_b's data function directly (as a function), and pick up
> values that you use a defaults in your module_a. This way you get the
> defaults from module_b (not overridden by data in the environment, nor
> in hiera). This relies on module_b::data() function being considered
> part of module_b's API."

Ok, if I understand well with lookup function I allow overriding but
with module_b::data() this is not the case.

> I should have added that you need to be careful not to create circular
> lookups (both with hiera and with data in modules).

Yes, for instance with a `lookup('class_a::param1')` in the class_a::data()
function if I understand well.

Sorry, but I have one more question. Could the approach explained below
be good practice?

For a module:

1. In `./mymodule/lib/puppet/functions/mymodule/data.rb` I define the
default "standard" values of the parameters of the mymodule/init.pp class.

2. I create a class `./mymodule/profile.pp` where I do something like
this:

---------------------------------------
class mymodule::profile {

  # In this class I use hiera lookups.
  $h1  = hiera_hash('nagios')
  $h2  = hiera_hash('snmp')
  $foo = hiera('foo')
  $bar = hiera('bar')

  # I use an internal function of mymodule to merge and maybe
  # to do some more complex operations in order to build a new
  # hash which is suitable for the public class of mymodule.
  $h = ::mymodule::internal_function($h1, $h2)

  class { '::mymodule':
    foo => $foo,
    bar => $bar,
    var => $h,
  }
}
---------------------------------------

In other words, I create in the module a specific public class
::mymodule::profile which retrieves data from hiera in a way
which matches well with the organization of my hiera data.
Could it be a good practice?

> Hope my answers will help you out

Oh yes. Thx a lot Henrik for your explanations which are
very helpful for me. ;)

François Lafont

Henrik Lindberg

Jun 17, 2015, 5:01:52 PM
to puppet...@googlegroups.com
On 2015-16-06 4:52, Francois Lafont wrote:
> Hi,
>
> Sorry Henrik for my late answer.
>
> On 12/06/2015 01:54, Henrik Lindberg wrote:
>
>> Cool, you are an early user, we expect to add more to this, and the road
>> may be a bit bumpy at the beginning :-)
>
> Generally, I'm not a "early user" but currently I need to create a new
> Puppet infra from scratch and it seemed to me more logical to choose
> directly Puppet 4.
>
>>> 1. About the "data binding" in a modules, if I understand well the
>>> `./mymodule/lib/puppet/functions/mymodule/data.rb` function replaces
>>> completely the "params" pattern, so that the "params" pattern is
>>> completely useless in a module using the Puppet 4 API. Is is correct?
>>>
>>
>> It is an alternative to using the "params" pattern, yes. Now, since it
>> is possible to write the "mymodule::data" function in the puppet
>> language it should also be much easier and the code should look very
>> similar to the puppet logic for the "params" pattern.
>
> Ok, I see but if the data() function is a function in the puppet language,
> should I change the name of the file? Should I take
> `./mymodule/lib/puppet/functions/mymodule/data.pp` (.pp extension) instead
> of `./mymodule/lib/puppet/functions/mymodule/data.rb`? In this case, should
> I modify the content of the file `./mymodule/lib/puppet/bindings/mymodule/default.rb`?
>
Functions written in the puppet language live under <module>/functions/
and not under <module>/lib/puppet/functions (where only ruby functions
should live). You do not have to change the bindings - the framework just
calls modulename::data() and does not know or care whether it is
implemented in ruby or puppet.

> In fact, when I want to just manipulate data, I think I prefer the
> Ruby language because the fact that variables are immutable in the Puppet
> language causes me problems sometimes (just when I want to manipulate
> data, to merge 2 hashes, make a look in a hash to retrieve a specific
> value etc. etc).
>

Understood. With puppet 4 you should be able to do most things (+ merges
hashes, you can concatenate arrays, etc.). The thing you cannot do is
change variables, but you can always use local scopes, use the iterative
functions, etc. This reduces the need for spaghetti logic and code
that requires variables in the first place - i.e. a more functional
approach.
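
A small sketch of that style in the Puppet language, with no variable ever reassigned. The hash contents and variable names are illustrative only.

```puppet
# Illustrative values only.
$defaults  = { 'method' => 'dhcp' }
$overrides = { 'method' => 'static', 'address' => '10.0.0.2' }

# '+' merges hashes; keys from the right-hand side win.
$merged = $defaults + $overrides

# '+' concatenates arrays.
$ports = [80] + [443]

# Iterate without mutating anything: build "key=value" strings.
$pairs = $merged.map |$key, $value| { "${key}=${value}" }

notice($merged)
notice($ports)
notice($pairs)
```

Each step produces a new value under a new name, which is the usual way to work around immutable variables.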
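
> Ok, I understand well the "function" paradigm but if it's just in order
> to *read* a fact, for instance to just read the 'lsbdistcodename' fact
> and put it in a variable, is it really a bad idea? Ok I can add a parameter
> to my function `fct(a, b, lsbdistcodename)` but if I can avoid this and just
> have `fct(a, b)` and just get the 'lsbdistcodename' fact value in the
> function (just for reading), where is the problem in this specific case?
>
That is not a problem, it is a global variable so does not depend on
calling context. In puppet:

function fct($a, $b) {
  # get a fact
  $facts[lsbdistcodename]
}

What you want to avoid is creating dependencies - you want your modules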
to work independently (just it, and its dependencies) without making
assumptions that certain data must be bound in hiera in a particular way
across all environments.

As an example if you have a data function for modulea and want to give
the default value - you could do:

function mymodule::data() {
  { thekey => lookup('network::ip') }
}

Now you depend on 'network::ip' existing in hiera and having a value.
You could specify a default value if the key is missing. Now you have a
default value embedded in your module and if you need to change the
default, you need to find all such places to change it.

If you instead make modulea depend on a module 'networking' and let
networking contain the default values, you have specified something that
is consistent in terms of dependencies (no need to add a default to the
lookup). If you want to update the defaults for all usage of
network::ip, create a new version of the network module.

If you need to override that in one environment, add an
environment::data() function that returns the ip specific for that
environment.

And, finally, in a pinch, if you need to change something across all
environments Right Now (you need to do it right away and do not have
time to make the code changes/check in etc.), then change the
network::ip key in hiera.

>> You can use the "lookup" function though -
>> it always looks up using both hiera and data in modules. Say your
>> module_a depends on module_b, and you want some default values in
>> module_a to use data configured for module_b - then you can use the
>> lookup function in your module_a::data() function.
>
> You mean: I can use `lookup('module_b::param1')` in the module_a::data()
> function. Is it correct?
>
You can look up that key everywhere, yes.

>> It will get the
>> correct (possibly overridden value) in the user's configuration. (If you
>> use hiera to do the same, it will not ever lookup inside the "data in
>> modules/environment" since it is a singleton global implementation
>> across all environments and modules.
>
> Ok, I think I have understood. But, in the example you describe here,
> you are always in the "class_a::param1"-keys pattern. Correct?
>
>> Alternatively, if you module_a is
>> designed to work with module_b, and both are using data functions you
>> can call module_b's data function directly (as a function), and pick up
>> values that you use a defaults in your module_a. This way you get the
>> defaults from module_b (not overridden by data in the environment, nor
>> in hiera). This relies on module_b::data() function being considered
>> part of module_b's API."
>
> Ok, if I understand well with lookup function I allow overriding but
> with module_b::data() this is not the case.
>
Not sure I understand what you are saying. The lookup function looks up
the value in hiera, environment, module - the looked up key with the
highest precedence wins (by default). So overrides are possible. The
data function just specifies the data for what it is in (a module, or
the environment)

>> I should have added that you need to be careful not to create circular
>> lookups (both with hiera and with data in modules).
>
> Yes, for instance with a `lookup('class_a::param1')` in the class_a::data()
> function if I understand well.
>
Yes.

If you want to modularize in a good way, and use the style of "theme"
and functional decomposition, then you want to avoid using hiera since
it only looks up data across all environments. You have to start
including the environment in your hierarchy and have hiera "dip into"
environments to find data there.

You then only use hiera for installation specific overrides, and panic
changes, everything else is data in modules and environments.

(At some point I should write some blog posts about this, but I don't
have time at the moment).

Regards
- henrik

>> Hope my answers will help you out
>
> Oh yes. Thx a lot Henrik for your explanations which are
> very helpful for me. ;)
>
> François Lafont
>


Francois Lafont

Jun 25, 2015, 4:19:27 AM
to puppet...@googlegroups.com
Hi,

Sorry again for my late answer.

On 17/06/2015 19:01, Henrik Lindberg wrote:

> Functions in puppet are under <module>/functions/ and not under <module>/lib/puppet/functions (where only ruby functions should live).
> You do not have to change the bindings - it just called modulename::data() and does not know or care if it is implemented in ruby or puppet.

Ok.

> Clear, with puppet 4 you should be able to do most things (+ merges hashes, you can concatenate arrays etc), the thing you cannot do is change variables, but you can always use local scopes, use the iterative functions etc. This reduces the need to have spaghetti logic and code that requires variables in the first place - i.e. a more functional approach.

OK, Puppet 4 is indeed more flexible, but sometimes
I find it simpler to use Ruby code with mutable variables.

>> Ok, I understand well the "function" paradigm but if it's just in order
>> to *read* a fact, for instance to just read the 'lsbdistcodename' fact
>> and put it in a variable, is it really a bad idea? Ok I can add a parameter
>> to my function `fct(a, b, lsbdistcodename)` but if I can avoid this and just
>> have `fct(a, b)` and just get the 'lsbdistcodename' fact value in the
>> function (just for reading), where is the problem in this specific case?
>>
> That is not a problem, it is a global variable so does not depend on calling context. In puppet:
>
> function fct($a, $b) {
> # get a fact
> $facts[lsbdistcodename]
> }

Ok, that's perfect.

> What you want to avoid is creating dependencies - you want your modules to work independently (just it, and its dependencies) without making assumptions that certain data must be bound in hiera in a particular way across all environments.
>
> As an example if you have a data function for modulea and want to give the default value - you could do:
>
> function mymodule::data() {
> { thekey => lookup('network::ip') }
> }
>
> Now you depend on 'network::ip' existing in hiera and having a value.
> You could specify a default value if the key is missing. Now you have a default value embedded in your module and if you need to change the default, you need to find all such places to change it.
>
> If you instead make modulea depend on a module 'networking' and let networking contain the default values, you have specified something that is consistent in terms of dependencies (no need to add a default to the lookup). If you want to update the defaults for all usage of network::ip, create a new version of the network module.
>
> If you need to override that in one environment, add an environment::data() function that returns the ip specific for that environment.
>
> And, finaly, in a pinch, if you need to change something across all environments Right Now (you need to do it right away and do not have time to make the code changes/check in etc.), then change the network::ip key in hiera.

Err... I'm not sure I understand correctly. I will explain below with a real example.

>> Ok, if I understand well with lookup function I allow overriding but
>> with module_b::data() this is not the case.
>>
> Not sure I understand what you are saying. The lookup function looks up the value in hiera, environment, module
> - the looked up key with the highest precedence wins (by default). So overrides are possible. The data function
> just specifies the data for what it is in (a module, or the environment)

If I understand correctly, with lookup('foo::var') in the ::foo class, there will be
a lookup:
- in hiera, with the foo::var entry;
- in the environment, with the environment::data() function (looking up the value
  of 'foo::var');
- in the foo module, with the foo::data() function (looking up the value of 'foo::var').

Is that correct?

But what happens if I have just lookup('var') (i.e. the key is unqualified) in the
::foo class? Is there only a lookup in hiera? Because an unqualified key does not
seem to be allowed in a *::data() function?

> If you want to modularize in a good way, and use the style of "theme" and functional decomposition, then you want to avoid using hiera since it only looks up data across all environments. You have to start including the environment in your hierarchy and have hiera "dip into" environments to find data there.
>
> You then only use hiera for installation specific overrides, and panic changes, everything else is data in modules and environments.

In fact, I'm not sure I understand well. Let me take a real example.
I have a "network" module with the ::network class:

-------------------------------------
class network ( $interfaces, ) {
# Configure the file /etc/network/interfaces of a Debian host.
}
-------------------------------------

If I understand well I can set default values in the "data" function
of the "network" module and set the default like this (for instance):

-------------------------------------
function network::data() {
  { 'network::interfaces' => {
      'eth0' => { 'method' => 'dhcp' }
    }
  }
}
-------------------------------------

And if I want to have other parameters for a specific node, I can put them
in the $fqdn.yaml of the node:

-------------------------------------
network::interfaces:
  eth0:
    method: 'static'
    options:
      address: '172.31.1.2'
      netmask: '255.255.0.0'
      gateway: '172.31.0.1'
  eth1:
    method: 'static'
    options:
      address: '10.0.0.2'
      netmask: '255.0.0.0'
-------------------------------------

Is it correct?

But now, imagine I want to improve the "network" module. Currently, for each
node which has an address in the "172.31.0.0/16" network, I will probably use
the same gateway, i.e. 172.31.0.1, and I want to avoid repeating it. So I put
this in the "common.yaml" file:

-------------------------------------
# The list of all my networks with some specific values such as
# the gateway (if it exists) or the dns etc.
network-inventory:
  mgt-network:
    address: '172.31.0.0/16'
    gateway: '172.31.0.1'
    dns: [ '172.31.0.1', '172.31.0.2' ]
  web-network:
    address: '192.168.0.0/24'
    gateway: '192.168.0.1'
  nfs-network:
    address: '10.0.0.0/8'
-------------------------------------

And I want to just put something like this in the $fqdn.yaml files:

-------------------------------------
# For each node, I don't want to repeat the same gateway, the
# same netmask etc. 'default' will be replaced by the correct
# value in the corresponding network in "network-inventory".
network::interfaces:
  eth0:
    method: 'static'
    options:
      address: '172.31.1.2/16'
      netmask: 'default'
      gateway: 'default'
  eth1:
    method: 'static'
    options:
      address: '10.0.0.2/8'
      netmask: 'default'
-------------------------------------

And to "auto-complete" the 'default' values above, I have a '::network::profile'
class in the 'network' module with something like that:

-------------------------------------
class network::profile {

  $network_inventory = hiera('network-inventory')
  $interfaces = hiera('network::interfaces')

  class { '::network':
    interfaces => network::complete_interfaces($interfaces, $network_inventory)
  }

}
-------------------------------------

where 'network::complete_interfaces' is a "smart" function which replaces each
'default' value by the right value (the function uses the CIDR address to
find the matching network in the $network_inventory hash and replaces each
'default' value with the value from that network).
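
A minimal sketch of what such a function could do, written as plain Ruby outside Puppet so the logic is easy to try. Everything here is an assumption made for illustration: the name complete_interfaces, the rule that a 'default' netmask is derived from the prefix length of the matching inventory network, and the error raised when no network contains the address.

```ruby
require 'ipaddr'

# Hypothetical stand-in for network::complete_interfaces: replace every
# 'default' option with a value taken from the inventory network that
# contains the interface's address.
def complete_interfaces(interfaces, inventory)
  interfaces.transform_values do |iface|
    options = iface.fetch('options', {})
    address = options['address']
    next iface if address.nil?

    host = address.split('/').first
    network = inventory.values.find do |net|
      IPAddr.new(net['address']).include?(IPAddr.new(host))
    end
    raise ArgumentError, "no network in inventory contains #{address}" if network.nil?

    completed = options.to_h do |name, value|
      next [name, value] unless value == 'default'

      if name == 'netmask'
        # Derive the netmask from the prefix length of the inventory network.
        prefix = network['address'].split('/').last.to_i
        [name, IPAddr.new('255.255.255.255').mask(prefix).to_s]
      else
        # Any other option (gateway, dns, ...) must exist in the inventory entry.
        [name, network.fetch(name)]
      end
    end
    iface.merge('options' => completed)
  end
end

inventory = {
  'mgt-network' => { 'address' => '172.31.0.0/16', 'gateway' => '172.31.0.1' },
  'nfs-network' => { 'address' => '10.0.0.0/8' }
}
interfaces = {
  'eth0' => { 'method' => 'static',
              'options' => { 'address' => '172.31.1.2/16',
                             'netmask' => 'default', 'gateway' => 'default' } }
}
puts complete_interfaces(interfaces, inventory)['eth0']['options']['gateway']
```

The sketch returns a new hash and leaves its inputs untouched, which keeps it usable as a pure function from a data function or a class.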

How can I do that if I want to modularize in a good way (with the "data binding"),
and use the style of "theme" and functional decomposition etc.? If I understand well,
it's not a good idea to have hiera lookups in a module, and that is exactly what I have
done in the ::network::profile example above. So how can I correctly do what I want
in a way that respects the Puppet 4 patterns?

> (At some point I should write some blog posts about this, but I don't have time at the moment).

The lack of time is an argument that I really can understand. ;)
No problem, your help has already been very useful for me.
Thx Henrik.

François Lafont

Henrik Lindberg

Jul 4, 2015, 12:45:44 AM
to puppet...@googlegroups.com
>> And, finally, in a pinch, if you need to change something across all environments Right Now (you need to do it right away and do not have time to make the code changes/check in etc.), then change the network::ip key in hiera.
>
> Err... I'm not sure I understand well. I will explain below with a real example.
>
>>> Ok, if I understand well with lookup function I allow overriding but
>>> with module_b::data() this is not the case.
>>>
>> Not sure I understand what you are saying. The lookup function looks up the value in hiera, environment, module
>> - the looked up key with the highest precedence wins (by default). So overrides are possible. The data function
>> just specifies the data for what it is in (a module, or the environment)
>
> If I understand well, with lookup('foo::var') in the ::foo class, there will be
> a lookup:
> - in hiera with the foo::var entry;
> - in the environment with the environment::data() function (look up the value
> of 'foo::var');
> - in the foo module with the foo::data() function (look up the value of 'foo::var').
>
> Is it correct?
>

Yes, that is correct.
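
As a rough illustration of that order, here is a toy model in plain Ruby. It is only a sketch of "the highest-precedence layer that defines the key wins": the three hashes and the toy_lookup name are made up, and the real lookup function does much more (merging, type checks, defaults).

```ruby
# Toy model of the lookup order discussed above: global hiera first,
# then the environment's data function, then the module's data function.
# The three hashes are made-up stand-ins, not real Puppet data structures.
HIERA_DATA       = { 'foo::var' => 'from hiera' }.freeze
ENVIRONMENT_DATA = { 'foo::var' => 'from environment',
                     'foo::other' => 'env only' }.freeze
MODULE_DATA      = { 'foo::var' => 'from module',
                     'foo::other' => 'module default',
                     'foo::third' => 'module only' }.freeze

# Return the value from the highest-precedence layer that defines the key.
def toy_lookup(key, layers = [HIERA_DATA, ENVIRONMENT_DATA, MODULE_DATA])
  layer = layers.find { |data| data.key?(key) }
  raise KeyError, "no value bound for '#{key}'" if layer.nil?

  layer[key]
end

puts toy_lookup('foo::var')    # hiera wins
puts toy_lookup('foo::other')  # the environment wins over the module
puts toy_lookup('foo::third')  # only the module defines it
```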

> But what happens if I have just lookup('var') (i.e. the key is unqualified) in the
> ::foo class? Is there only a lookup in hiera? Because an unqualified key does not
> seem to be allowed in a *::data() function?
>
Correct, not allowed in a module. It can only bind to names in the
module's own namespace. In your environment you may bind any key.
Your environment can naturally also call functions in modules and
arrange those contributions in any way it likes.
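
To make the namespace rule concrete, here is a small Ruby sketch of a check one could run over the hash a module's data function returns. The foreign_keys helper is hypothetical and not part of Puppet; it only shows which keys fall outside "<module>::".

```ruby
# Hypothetical check, not part of Puppet: a module's data function may only
# bind keys inside the module's own namespace ("<module>::...").
def foreign_keys(module_name, data)
  prefix = "#{module_name}::"
  data.keys.reject { |key| key.start_with?(prefix) }
end

module_data = {
  'network::interfaces' => { 'eth0' => { 'method' => 'dhcp' } },
  # Unqualified key: it belongs in the environment or in hiera instead.
  'network-inventory'   => {}
}

p foreign_keys('network', module_data)
```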

Yes (about overriding network::interfaces per node in $fqdn.yaml): your global
hiera data wins over what is in the defaults for your module. But with automatic
data binding it replaces the entire key, since automatic data binding does not
do deep merging and there is no way to give it deep merge options.

You can get a deep merge if you do explicit lookups (use the lookup function).

To avoid repeating the gateway, you can use additional functions in your
environment to compute the default gateway given the 'address' for the node.
You can define that function in a module, or in your environment. Here I used
the 'network' module (that may not make sense if your network module is generic
and the ip addresses are specific to a site).

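function network::default_gateway(String $ip) {
  # '172.31.1.2' becomes [172, 31, 1, 2]; adding 0 turns each numeric string into an Integer
  $octets = $ip.split('[.]').map |$x| { $x + 0 }
  case $octets {
    # any address in 172.31.0.0/16
    [172, 31, Integer[0, 255], Integer[0, 255]]: { '172.31.0.1' }
    # some other range: { 'n.n.n.n' }
    # etc
    default: { 'n.n.n.n' } # default if not matching any other range
  }
}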

This uses the 4.x deep matching ability in a case expression together
with the type system to match against a range after first having
converted an ip address as a string to an array of Integer values.

You can then call that with an address to get the gateway. Since you
know that you do not have circular lookups you can call that function
from your environment's data function.
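
The matching idea in network::default_gateway can also be tried outside Puppet. The following is a plain Ruby sketch under stated assumptions: the GATEWAYS table and its ranges are illustrative values taken from the inventory example in this thread, and returning nil for "no match" is a choice made here, not something Puppet does.

```ruby
# Made-up table: each entry pairs a four-element pattern with a gateway.
# A pattern element is either an exact Integer or a Range of Integers.
GATEWAYS = [
  [[172, 31, 0..255, 0..255], '172.31.0.1'],
  [[192, 168, 0, 0..255],     '192.168.0.1']
].freeze

# Return the gateway for a dotted-quad address, or nil when no pattern matches.
def default_gateway(ip)
  octets = ip.split('.').map { |part| Integer(part, 10) }
  match = GATEWAYS.find do |pattern, _gateway|
    pattern.length == octets.length &&
      # Integer === Integer tests equality, Range === Integer tests membership.
      pattern.zip(octets).all? { |expected, octet| expected === octet }
  end
  match&.last
end

puts default_gateway('172.31.1.2')
```

Using === element by element is what lets one table mix exact octets and ranges, which is close in spirit to matching an array against types in a Puppet case expression.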

In your environment data function:

function environment::data() {
  { 'network::interfaces' => {
      'eth0' => {
        'method'  => 'dhcp',
        'options' => {
          'gateway' => network::default_gateway(
            lookup('network::interfaces::eth0::address'))
        }
      }
    }
  }
}


> And to "auto-complete" the 'default' values above, I have a '::network::profile'
> class in the 'network' module with something like that:
>
> -------------------------------------
> class network::profile {
>
>   $network_inventory = hiera('network-inventory')
>   $interfaces = hiera('network::interfaces')
>
>   class { '::network':
>     interfaces => network::complete_interfaces($interfaces, $network_inventory)
>   }
>
> }
> -------------------------------------
>
Almost, but don't use hiera calls if you also want to use the data bound
in the environment and in modules - use lookup instead. You also want
to specify how a lookup should merge values. Without merging, you need
to re-specify everything "at the higher level", whereas you only want the
"higher level" to override certain parts of a structure at a lower
level. Look at the merge options available in the lookup function.
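
To show what merging buys you, here is a plain Ruby sketch of a deep merge of two levels. It is not Puppet's implementation; the deep_merge helper and the 'mtu' option are made up for the example.

```ruby
# Recursively merge two hashes; on conflicts the "higher" level wins,
# except that nested hashes are merged key by key instead of replaced.
def deep_merge(lower, higher)
  lower.merge(higher) do |_key, low_value, high_value|
    if low_value.is_a?(Hash) && high_value.is_a?(Hash)
      deep_merge(low_value, high_value)
    else
      high_value
    end
  end
end

module_default = { 'eth0' => { 'method' => 'dhcp',
                               'options' => { 'mtu' => 1500 } } }
node_override  = { 'eth0' => { 'method' => 'static',
                               'options' => { 'address' => '172.31.1.2' } } }

# Without merging, the higher level replaces the whole value, so 'mtu' is lost.
p node_override
# With a deep merge, only the overridden parts change and 'mtu' survives.
p deep_merge(module_default, node_override)
```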

> where 'network::complete_interfaces' is a "smart" function which replaces each
> 'default' value by the right value (the function uses the CIDR address to
> find the matching network in the $network_inventory hash and replaces each
> 'default' value with the value from that network).
>
ok, so maybe you can call such a function instead of the function I
invented.

> How can I do that if I want to modularize in a good way (with the "data binding"),
> and use the style of "theme" and functional decomposition etc.? If I understand well,
> it's not a good idea to have hiera lookups in a module, and that is exactly what I have
> done in the ::network::profile example above. So how can I correctly do what I want
> in a way that respects the Puppet 4 patterns?
>

You have to consider what a module really knows: it can use anything
from any other module it depends on - such as functions or keys that
that module defines (i.e. you know those functions and keys exist, they
can be documented, found, etc.). The bad thing to do is to just look up
arbitrary keys that a user is expected to configure in their global hiera.

I think I would (I have not thought this through) have a module with all
the defaults data bound in it, with some defaults computed via functions.
For values where it is impossible to define a sensible default, I would use a
default that the module's logic considers an error and inform the user:
"You must override the binding of the key ...."
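
A sketch of that last idea in plain Ruby, assuming a sentinel string chosen by the module author. Both the REQUIRED constant and the check_required! helper are hypothetical names used only for illustration.

```ruby
# Hypothetical sentinel that a module could bind as the default for keys
# that have no sensible default value.
REQUIRED = '__must_be_overridden__'

# Raise a descriptive error listing every key still bound to the sentinel;
# return the data unchanged when everything has been overridden.
def check_required!(data)
  missing = data.select { |_key, value| value == REQUIRED }.keys
  return data if missing.empty?

  raise ArgumentError,
        "You must override the binding of the key(s): #{missing.join(', ')}"
end

begin
  check_required!('network::interfaces' => REQUIRED)
rescue ArgumentError => e
  puts e.message
end
```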


>> (At some point I should write some blog posts about this, but I don't have time at the moment).
>
> The lack of time is an argument that I really can understand. ;)
> No problem, your help has already been very useful for me.

I hope what I put together above makes sense and that you can make use
of that to come up with a composition that works for you. I am certainly
interested in what you come up with, and happy to help with further questions.

Regards
- henrik

> Thx Henrik.