Node key merging/overloading - node inheritance vs hiera

Bostjan Skufca

Mar 8, 2015, 2:55:03 PM

to puppet...@googlegroups.com

Hi,

I am currently looking to move away from node inheritance towards hiera, and I have a question about how to achieve merge/overloading functionality with hiera.

I have written an elaborate example below, but let me just quickly summarize all that into a question:

With hiera:

- How would you handle the case where certain nodes need data merged from all scopes, while other nodes need data from just the last scope?

- What backend would you use?

- How would you best mimic the behaviour of node inheritance (regarding array appends and replacements)?

- I am probably looking at a custom backend, right? :)

Thank you for your opinions,

b.

Full example goes like this:

- there is a 'tpl_base' node template definition with all default variables

- there is a 'tpl_base_dc1' node template which appends to inherited values from 'tpl_base'

- there is a 'tpl_base_dc1_special' node template which REPLACES certain values from 'tpl_base_dc1'

Let's implement this with node inheritance:

-----------------------------------------------------------------------------------------
node 'tpl_base' {
  #...(other vars)...
  $syslog_servers = [ '9.9.9.51', '9.9.9.52' ] # Global syslog servers
}

node 'tpl_base_dc1' inherits 'tpl_base' {
  #...(other vars)...
  $syslog_servers += [ '1.1.1.53', '1.1.1.54' ] # Additional syslog servers for nodes in DC1
}

node 'tpl_base_dc1_special' inherits 'tpl_base_dc1' {
  #...(other vars)...
  $syslog_servers = [ '1.1.1.55', '1.1.1.56' ] # REPLACE syslog servers (note the = vs += operator)
}

node 'srv-0.no-dc' inherits 'tpl_base' {
  include 'syslog_ng'
}

node 'srv-1.dc1' inherits 'tpl_base_dc1' {
  include 'syslog_ng'
}

node 'srv-2-special.dc1' inherits 'tpl_base_dc1_special' {
  include 'syslog_ng'
}
-----------------------------------------------------------------------------------------

The result is:

- nodes from all datacenters log to 9.9.9.51 and 9.9.9.52 syslog servers

- nodes from dc1 additionally log to dc1-specific log servers, 1.1.1.53 and 1.1.1.54

- SPECIAL nodes from dc1 log to specially designated log servers (.55 and .56) and not to other log servers (consider that they are logging security-sensitive data which must not be visible on common log servers)

- this aligns neatly with module/class definitions, as they do not have to care about how data arrays are constructed (defined, appended, replaced, whatever), they just use whatever is given to them

Now let's remodel this into hiera scopes:

-----------------------------------------------------------------------------------------

# /etc/hiera.yaml
---
:hierarchy:
  - "%{::clientcert}"
  - "tpl_%{::domain}"   # one way to include dcX-specific configuration
  - "tpl_base"

# tpl_base.yaml
syslog_servers:
  - 9.9.9.51
  - 9.9.9.52

# tpl_dc1.yaml
syslog_servers:
  - 1.1.1.53
  - 1.1.1.54

# tpl_dc1-special.yaml
syslog_servers:
  - 1.1.1.55
  - 1.1.1.56

-----------------------------------------------------------------------------------------

When data is ported into hiera, there are two options available for retrieving data:

a) hiera() (priority lookup)

b) hiera_array() (merging lookup)

These would be the results:

1. hiera() would work fine for srv-0.no-dc (just global syslog servers)

2. hiera() would work fine for srv-2-special (just specific servers for special nodes)

3. hiera_array() would work fine for srv-0

4. hiera_array() would work fine for srv-1 (merges base and dc1-specific syslog servers)

5. hiera() would NOT work for srv-1 (gets just dc1-specific syslog servers, as it is the most specific match)

6. hiera_array() would NOT work for srv-2 (gets ALL syslog servers, although only the last two are required)

The last two cases are the problematic ones, and (as it seems) they are not supported by current hiera backends. Or am I wrong?
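
To make the two problematic cases concrete, here is a minimal sketch in Python that models both lookup styles over the three data files above. It is only a toy model, not Hiera's real implementation: the names `priority_lookup` and `array_merge_lookup` are hypothetical stand-ins for a first-match lookup and a merging lookup, the data lives in dicts rather than YAML files, the per-node hierarchies are assumed for illustration, and the merged result is ordered base-first to mirror the node inheritance example (real Hiera may order it differently).

```python
# Toy model of the two lookup styles; not Hiera's real implementation.
DATA = {
    "tpl_base": {"syslog_servers": ["9.9.9.51", "9.9.9.52"]},
    "tpl_dc1": {"syslog_servers": ["1.1.1.53", "1.1.1.54"]},
    "tpl_dc1-special": {"syslog_servers": ["1.1.1.55", "1.1.1.56"]},
}

# Hierarchy per node, most specific level first (assumed for illustration).
HIERARCHIES = {
    "srv-0.no-dc": ["tpl_base"],
    "srv-1.dc1": ["tpl_dc1", "tpl_base"],
    "srv-2-special.dc1": ["tpl_dc1-special", "tpl_dc1", "tpl_base"],
}


def priority_lookup(node, key):
    """Return the value from the most specific level that defines the key."""
    for level in HIERARCHIES[node]:
        if key in DATA.get(level, {}):
            return DATA[level][key]
    raise KeyError(key)


def array_merge_lookup(node, key):
    """Collect values from every level, base first, dropping duplicates."""
    merged = []
    for level in reversed(HIERARCHIES[node]):
        for item in DATA.get(level, {}).get(key, []):
            if item not in merged:
                merged.append(item)
    return merged


# Case 5: the first-match lookup loses the global servers for srv-1.
srv1_priority = priority_lookup("srv-1.dc1", "syslog_servers")
# Case 6: the merging lookup leaks the common servers into the special node.
srv2_merged = array_merge_lookup("srv-2-special.dc1", "syslog_servers")
```

In this model neither function alone gives the desired result for every node, which is the gap described above.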

b.

Christopher Wood

Mar 9, 2015, 9:45:38 AM

to puppet...@googlegroups.com

On Sun, Mar 08, 2015 at 11:55:03AM -0700, Bostjan Skufca wrote:
> Hi,
> I am currently looking to move away from node inheritance towards hiera,
> and I have a question how to achieve merge/overloading functionality with
> hiera.

If I read these questions correctly, first look into the hiera functions:

https://docs.puppetlabs.com/references/latest/function.html#hiera
https://docs.puppetlabs.com/references/latest/function.html#hieraarray
https://docs.puppetlabs.com/references/latest/function.html#hierahash

> I have written an elaborate example below, but let me just quickly
> summarize all that into a question:
> With hiera:
> - How would you go about when certain nodes need data merged from all
> scopes, but other nodes need data from just the last scope?

I've usually had a "classname::merge: true" key in hiera, controlling whether I use hiera() or hiera_hash() to obtain the data I need.

> - What backend would you use?

Using plain old yaml here, use whatever you feel like really.

> - How would you best mimic the behaviour or node inheritance (regarding
> array appends and replacements)?

Conditionals in the puppet manifests to figure out whether I should use hiera() or hiera_array()/hiera_hash() (str2bool in the example is from stdlib).

class myclass ( $merge = false ) {
  if str2bool($merge) {
    # merging lookup: combine the hash from all hierarchy levels
    $data = hiera_hash('myclass::data')
  }
  else {
    # priority lookup: take the most specific match only
    $data = hiera('myclass::data')
  }
  file { '/tmp/important.txt':
    content => template('myclass/imp.erb'), # $data used here
  }
}
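
The same "merge flag lives in the data" idea can be sketched outside Puppet. The Python below is a toy model under assumptions, not Hiera's code: `lookup` is a hypothetical helper, the two levels and their contents are invented for illustration, and the merge is a shallow one in which the more specific level wins on key conflicts.

```python
# Toy model of a data-driven merge flag; levels are listed most specific first.
def lookup(levels, key, merge=False):
    """Return the first match, or all matching hashes combined when merge is True."""
    found = [level[key] for level in levels if key in level]
    if not found:
        raise KeyError(key)
    if not merge:
        return found[0]
    combined = {}
    for value in reversed(found):  # least specific first, so specific levels win
        combined.update(value)
    return combined


# Hypothetical data: the node level turns merging on for this class.
node_level = {"myclass::merge": True, "myclass::data": {"port": 8080}}
common_level = {"myclass::data": {"port": 80, "user": "www"}}
levels = [node_level, common_level]

merge = lookup(levels, "myclass::merge")
data = lookup(levels, "myclass::data", merge=merge)
```

The flag itself is fetched with a plain first-match lookup, so whichever level is most specific decides how the real data is retrieved.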

In your position I might try doing hiera('myclass::data', fail()) to mimic a class parameter with no default, in case there was no sensible default and catalog compilation should fail without this data. If I recall correctly, a failed hiera*() lookup function just means a variable set to undef. That's if a default isn't fine, of course.

As a last thing, have you possibly considered codifying your various include rules in a single template? Right now your "rules" on which syslog servers go where are all over the place (different hiera levels, puppet manifests) and it seems unnecessarily difficult to keep track of. If you can reduce your ruleset down to things like:

machines in datacenter A get syslogs X, Y, Z
machines of type B get syslogs F, G
machines in datacenter A of type E get syslogs R, T

Then you might find it easier to manage in code.

Of course, maybe the disparate syslog infrastructure is a sign that things have become tangly and you need to prune syslog listeners a bit? Or, to rephrase, maybe spend the time correcting your syslog infrastructure rather than dealing with it in puppet?


Bostjan Skufca

Mar 10, 2015, 10:59:41 PM

to puppet...@googlegroups.com, christop...@pobox.com

On Monday, 9 March 2015 14:45:38 UTC+1, Christopher Wood wrote:

On Sun, Mar 08, 2015 at 11:55:03AM -0700, Bostjan Skufca wrote:
> With hiera:
> - How would you go about when certain nodes need data merged from all
> scopes, but other nodes need data from just the last scope?

I've usually had a "classname::merge: true" key in hiera, controlling whether I use hiera() or hiera_hash() to obtain the data I need.

And this hits the nail on the head, even if unknowingly :)

The problem I see here, and which I am only now able to articulate, is the clash of two contradictory elements:

1. Puppet development is pushed towards decoupling code (manifests) from data, a noble goal

2. Puppet provides two functions, hiera() and hiera_array(), and the very existence of more than one function to retrieve data destroys the notion that code should be unaware of underlying data storage details.

Though your solution is actually quite nice.

In your position I might try doing hiera('myclass::data', fail()) to mimic a class parameter with no default, in case there was no sensible default and catalog compilation should fail without this data. If I recall correctly, a failed hiera*() lookup function just means a variable set to undef. That's if a default isn't fine, of course.

This one is useful, thanks!

Of course, maybe the disparate syslog infrastructure is a sign that things have become tangly and you need to prune syslog listeners a bit? Or, to rephrase, maybe spend the time correcting your syslog infrastructure rather than dealing with it in puppet?

Forget the word syslog please, the example is fictitious.

It was meant to illustrate a more general problem, but as it seems it was probably not the best approach I could have taken :)

b.

jcbollinger

Mar 11, 2015, 9:01:39 AM

to puppet...@googlegroups.com, christop...@pobox.com

On Tuesday, March 10, 2015 at 9:59:41 PM UTC-5, Bostjan Skufca wrote:

On Monday, 9 March 2015 14:45:38 UTC+1, Christopher Wood wrote:
On Sun, Mar 08, 2015 at 11:55:03AM -0700, Bostjan Skufca wrote:
> With hiera:
> - How would you go about when certain nodes need data merged from all
> scopes, but other nodes need data from just the last scope?

I've usually had a "classname::merge: true" key in hiera, controlling whether I use hiera() or hiera_hash() to obtain the data I need.

And this hits the nail on the head, even if unknowingly :)

The problem I see here, and which I am only now able to articulate, is the clash of two contradictory elements:
1. Puppet development is pushed towards decoupling code (manifests) from data, a noble goal
2. Puppet provides two functions, hiera() and hiera_array(), and the very existence of more than one function to retrieve data destroys the notion that code should be unaware of underlying data storage details.

Puppet in fact provides three functions for lookups: there is also hiera_hash().

In any case, you are quite right. Which sort of lookup is intended is an attribute of the data -- part of the definition of each key -- but it is not represented in or alongside the data. Each user of the data somehow has to know. That could be tolerated, inconvenient as it is, except that it is incompatible with automated data binding. This is an issue that has been recognized and acknowledged, though I'm uncertain whether it is actively being addressed.

John

Christopher Wood

Mar 11, 2015, 9:57:00 AM

to puppet...@googlegroups.com

(Replying to two people in one email, hum.)

On Wed, Mar 11, 2015 at 06:01:39AM -0700, jcbollinger wrote:
> On Tuesday, March 10, 2015 at 9:59:41 PM UTC-5, Bostjan Skufca wrote:
>
> On Monday, 9 March 2015 14:45:38 UTC+1, Christopher Wood wrote:
>
> On Sun, Mar 08, 2015 at 11:55:03AM -0700, Bostjan Skufca wrote:
> > With hiera:
> > - How would you go about when certain nodes need data merged from
> all
> > scopes, but other nodes need data from just the last scope?
>
> I've usually had a "classname::merge: true" key in hiera, controlling
> whether I use hiera() or hiera_hash() to obtain the data I need.
>
> And this hits the nail on the spot, even if unknowingly:)
> The problem I am seeing here and which I am only now being able to
> articulate, is the clash of two contradictory elements:
> 1. Puppet development is pushed towards decoupling code (manifest) from
> data, a noble goal
> 2. Puppet provides two functions, hiera() and hiera_array(), and the
> very existence of more than one function to retrieve data destroys the
> notion, that code should be unaware of underlying data storage details.

I rather take your point, but isn't the requirement for different data handling just another data item? Is any code unaware of the underlying data structure? Even if you have a single type of data (plain string-like variables) your code is implicitly aware that it can treat them as that type.

I'm not really sure there's a way to automagically distinguish

"this is an array, do not retrieve its contents from all levels"
"this is an array, do retrieve its contents from all levels"

while still preserving our sanity.

(I've had some nasty run-ins with merging lookups and have decided they're mostly not for me, maybe the smarter people on this list are having better results.)

> Puppet in fact provides three functions for lookups: there is
> also hiera_hash().
>
> In any case, you are quite right. Which sort of lookup is intended is an
> attribute of the data -- part of the definition of each key -- but it is
> not represented in or alongside the data. Each user of the data somehow
> has to know. That could be tolerated, inconvenient as it is, except that
> it is incompatible with automated data binding. This is an issue that has
> been recognized and acknowledged, though I'm uncertain whether it is
> actively being addressed.

Could you possibly expound on the "Each user of the data somehow has to know" part? I'm having trouble with the notion that people would use puppet manifests and hiera data without knowing what's in them.

> John


Luke Bigum

Mar 11, 2015, 10:59:12 AM

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, March 11, 2015 at 1:57:00 PM UTC, Christopher Wood wrote:

> Puppet in fact provides three functions for lookups: there is
> also hiera_hash().
>
> In any case, you are quite right. Which sort of lookup is intended is an
> attribute of the data -- part of the definition of each key -- but it is
> not represented in or alongside the data. Each user of the data somehow
> has to know. That could be tolerated, inconvenient as it is, except that
> it is incompatible with automated data binding. This is an issue that has
> been recognized and acknowledged, though I'm uncertain whether it is
> actively being addressed.

Could you possibly expound on the "Each user of the data somehow has to know" part? I'm having trouble with the notion that people would use puppet manifests and hiera data without knowing what's in them.

I can't speak for John but I think I get his meaning, but if I don't, here's my own opinion ;-)

If a user of a module is reading that module's documentation and parameters, it seems a bit nasty to assume that the user must also go read the Puppet module code in great detail to find out what type of Hiera call is being used. Passing data to the module should be simply defined, e.g. "this parameter takes an array" or "this parameter is a comma separated string". For a module to assume that it can or should attempt to do some sort of deep merging seems overly complicated, and it shifts the focus away from the user providing the right data to a well written module. Rather than have "classname::merge => true" I would advocate something like this, which puts the user in complete control of the data reaching its modules in a correct and easily testable manner:

class profile::dns {
  # look up my DNS data
  $hiera_dns_server_array = hiera_array('dns::server')
  $common_dns_server = '127.0.0.1'

  class { 'resolv':
    dns_servers => [ $hiera_dns_server_array, $common_dns_server ],
  }
}

Something like this seems like I'm telling a module *how* to look up my own data, rather than passing the right data to the module:

class resolv (
  $dns_servers_key_name = 'dns_servers',
  $dns_servers_key_merge = false,
) {
  if ($dns_servers_key_merge) {
   $dns_servers = hiera_array($dns_servers_key_name)
  } else {
   $dns_servers = hiera($dns_servers_key_name)
  }
}

class { 'resolv': dns_servers_key_merge => true }

I'd also have to code it to selectively use Hiera or not (some people don't) and that would get even worse. The second example of module design may be super awesomely flexible in terms of how I can structure my Hiera data, but it doesn't fit the direction the community is moving in terms of module design.

To answer Bostjan's original example, you have 3 "profiles" of syslog: one base, one dc1 and one dc1_special, and you assign those profiles to whatever node needs them.

-Luke

Bostjan Skufca

Mar 11, 2015, 12:25:05 PM

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, 11 March 2015 14:57:00 UTC+1, Christopher Wood wrote:

(Replying to two people in one email, hum.)

I rather take your point, but isn't the requirement for different data handling just another data item? Is any code unaware of the underlying data structure? Even if you have a single type of data (plain string-like variables) your code is implicitly aware that it can treat them as that type.

Yes, some dependency always exists; that cannot be denied. But the coupling should be kept to a minimum.

Consider programming languages, for example functions that return arrays, which is the closest match to our current discussion.

function callee ()
{
    ...
    return $arrayOfData;
}

function caller ()
{
    $newDataArray = callee();
}

The caller() gets very messy if it is its responsibility to figure out whether the array returned from callee() is:

- an array of keys and values

- an array of arrays of keys and values

This is what I am talking about: if callee() just returns an array of arrays, it is not behaving very nicely :)

And this is exactly what hiera does.

If I think about this a little further, this is what hiera backends do. The hiera*() functions, on the other hand, either do a rather poor job of abstracting the provider's internal data, or do a good and simple job at the cost of some of the flexibility we had with class inheritance.

Anyhow, it seems writing custom backend providers is the way out.

I'm not really sure there's a way to automagically distinguish

"this is an array, do not retrieve its contents from all levels"
"this is an array, do retrieve its contents from all levels"

while still preserving our sanity.

Agreed. But if you use hiera with multiple scopes (common, dc, row, rack, node), each layer usually knows if data from parent scope should be merged with, or replaced.

Again, maybe it is just that default hiera backends do not allow for such flexibility. It should not be hard to switch that to custom provider, whose data model actually allows for such flexibility.

(I've had some nasty run-ins with merging lookups and have decided they're mostly not for me, maybe the smarter people on this list are having better results.)

Care to elaborate a bit, especially on how you overcame them (by defining all data for each node)?

b.

Bostjan Skufca

Mar 11, 2015, 12:35:36 PM

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, 11 March 2015 15:59:12 UTC+1, Luke Bigum wrote:

On Wednesday, March 11, 2015 at 1:57:00 PM UTC, Christopher Wood wrote:

Could you possibly expound on the "Each user of the data somehow has to know" part? I'm having trouble with the notion that people would use puppet manifests and hiera data without knowing what's in them.

I can't speak for John but I think I get his meaning, but if I don't, here's my own opinion ;-)

If a user of a module is reading that module's documentation and parameters, it seems a bit nasty to assume that the user must also go read the Puppet module code in great detail to find out what type of Hiera call is being used. Passing data to the module should be simply defined, e.g. "this parameter takes an array" or "this parameter is a comma separated string". For a module to assume that it can or should attempt to do some sort of deep merging seems overly complicated, and it shifts the focus away from the user providing the right data to a well written module.

Spot on, I believe.

Rather than have "classname::merge => true" I would advocate something like this, which puts the user in complete control of the data reaching its modules in a correct and easily testable manner:

class profile::dns {
  # look up my DNS data
  $hiera_dns_server_array = hiera_array('dns::server')
  $common_dns_server = '127.0.0.1'

  class { 'resolv':
    dns_servers => [ $hiera_dns_server_array, $common_dns_server ],
  }
}

Something like this seems like I'm telling a module *how* to look up my own data, rather than passing the right data to the module:

class resolv (
  $dns_servers_key_name = 'dns_servers',
  $dns_servers_key_merge = false,
) {
  if ($dns_servers_key_merge) {
   $dns_servers = hiera_array($dns_servers_key_name)
  } else {
   $dns_servers = hiera($dns_servers_key_name)
  }
}

class { 'resolv': dns_servers_key_merge => true }

I'd also have to code it to selectively use Hiera or not (some people don't) and that would get even worse. The second example of module design may be super awesomely flexible in terms of how I can structure my Hiera data, but it doesn't fit the direction the community is moving in terms of module design.

This is almost what I am looking for. I have an alternate approach: what if merging vs nonmerging is decided based on a hiera key?

class resolv (
  $dns_servers_key_name = 'dns_servers',
) {
  if (hiera('dns_servers_key_merge')) { # <--- hiera is responsible for merging decision
    $dns_servers = hiera_array($dns_servers_key_name)
  } else {
    $dns_servers = hiera($dns_servers_key_name)
  }
}

Though I can foresee a problem in this case:

- layer 1 data, merge value has no effect here

- layer 2 data, merge true <--- correct, data from layer 1 and layer 2 is merged

- layer 3 data, merge false <--- if we stop here, only data from layer 3 is used

- layer 4 data, merge true <--- this would cause merging of data from all layers
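
One way around the layer 4 problem is to let each layer carry its own merge flag and stop walking upwards at the first layer that says "replace". The Python below is a sketch of that rule as a custom backend might apply it; it is not an existing Hiera feature, and the `resolve` function, the layer structure and the sample values are all hypothetical.

```python
# Sketch of a per-layer merge decision; layers are ordered least specific first.
def resolve(layers):
    """Walk from the most specific layer towards the base, stopping at the
    first layer whose merge flag is false, then combine what was collected."""
    collected = []
    for layer in reversed(layers):
        collected.append(layer["values"])
        if not layer.get("merge", False):
            break  # this layer replaces everything less specific than itself
    result = []
    for values in reversed(collected):  # restore least-specific-first order
        result.extend(values)
    return result


layers = [
    {"values": ["a"]},                  # layer 1: flag has no effect, nothing below
    {"values": ["b"], "merge": True},   # layer 2: append to layer 1
    {"values": ["c"], "merge": False},  # layer 3: replace layers 1 and 2
    {"values": ["d"], "merge": True},   # layer 4: append to layer 3 only
]

up_to_layer_2 = resolve(layers[:2])
up_to_layer_3 = resolve(layers[:3])
all_layers = resolve(layers)
```

Under this rule, layer 4 merges only with layer 3, because layer 3 already cut off layers 1 and 2, which matches how = and += behave with node inheritance.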

b.

Bostjan Skufca

Mar 11, 2015, 12:39:50 PM

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, 11 March 2015 15:59:12 UTC+1, Luke Bigum wrote:

The second example of module design may be super awesomely flexible in terms of how I can structure my Hiera data, but it doesn't fit the direction the community is moving in terms of module design.

What do you mean by this? I am curious where you think the community is moving. Are modules getting dumbed down? :)

b.

Luke Bigum

Mar 11, 2015, 6:31:38 PM

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, March 11, 2015 at 4:35:36 PM UTC, Bostjan Skufca wrote:

Something like this seems like I'm telling a module *how* to look up my own data, rather than passing the right data to the module:

class resolv (
  $dns_servers_key_name = 'dns_servers',
  $dns_servers_key_merge = false,
) {
  if ($dns_servers_key_merge) {
   $dns_servers = hiera_array($dns_servers_key_name)
  } else {
   $dns_servers = hiera($dns_servers_key_name)
  }
}

class { 'resolv': dns_servers_key_merge => true }

I'd also have to code it to selectively use Hiera or not (some people don't) and that would get even worse. The second example of module design may be super awesomely flexible in terms of how I can structure my Hiera data, but it doesn't fit the direction the community is moving in terms of module design.

This is almost what I am looking for. I have an alternate approach: what if merging vs nonmerging is decided based on hiera key?

That is my approach, that class would do an implicit Hiera lookup for those class parameters, I just illustrated the point with a resource-like declaration as an example. While the above method would work, I don't think I've made my point about not putting this personalised logic in the "resolv" module itself. The above example is not so good. Gary Larizza explains it very well here if you haven't seen it (https://www.youtube.com/watch?v=v9LB-NX4_KQ). That video should answer your questions in your second reply to me too, BTW.

The above code example is a bad idea for these reasons:

- the resolv module is tightly coupled to the data, it's in control of how it should look up data, rather than just be *given* data

- you won't be able to replace that resolv module with the super awesome puppetlabs_resolv module because of your custom way of handling data

- it makes a *very* bad assumption that everyone uses Hiera; it is not compatible with setups where an ENC supplies all class parameters, for example

- there's a higher barrier to entry on understanding the module, some people would have to read the body of the resolv module code to figure out what's going on (or there would be a long README)

- it's more complicated to test because the range of data it can take is more complicated

Now expand on my first example:

********************
class puppetlabs_resolv($dns_servers) {
  file { '/etc/resolv.conf': content => template(...) }
}

class profile::dns_base {
  # look up my DNS data from Hiera
  $hiera_dns_server_array = hiera_array('dns::server')
  # and add a global DNS server I have
  $common_dns_server = '127.0.0.1'

  class { 'puppetlabs_resolv':
    dns_servers => [ $hiera_dns_server_array, $common_dns_server ],
  }
}

class profile::dns_special {
  # don't do a hiera lookup, DNS here is special
  $special_dns = '10.1.1.1'

  class { 'puppetlabs_resolv':
    dns_servers => [ $special_dns ],
  }
}

node dc1 { include profile::dns_base }
node dc1_special { include profile::dns_special }
********************

The puppetlabs_resolv module I downloaded from GitHub does one thing well, resolv.conf, in a simple and easily understood manner, and it comes with Rspec tests, so I don't have to reinvent the wheel.

All of my business logic about how I get IP addresses into that resolv module is in my profile::dns* classes. These are *my* profile classes, I can do whatever crazy Hiera lookups and string manipulation I want/need to get the data into a format that puppetlabs_resolv takes. In other words my profiles are the "glue" between my data and the "building block" puppetlabs_resolv module. At any time I can replace puppetlabs_resolv with lukebigum_resolv (which is obviously better) with a few tweaks to my profiles. If I replace my data backend or get rid of Hiera entirely, my profile might have to be adjusted but I don't have to stop using that awesome lukebigum_resolv I downloaded.

Why the use of a second profile, profile::dns_special? It takes complexity out of Hiera. I don't need a complicated hierarchy when I've got profiles, and I rarely need inheritance at all. I've got my "tpl_%{::domain}", which is where my profile::dns looks up data from, and anything that's special is actually a different implementation of how I usually do DNS, so it gets its own profile, hence profile::dns_special. It is better to handle these exceptions in Puppet code because it's an *actual* language, rather than trying to model something complex in Hiera, which is just a key-value store.

Your Hiera example where you have tpl_dc1.yaml and tpl_dc1-special.yaml is going to bite you. Your joke about mimicking node inheritance functionality in Hiera worries me a little, because it reminds me of some of my colleagues. Just because it can be modelled in Hiera, doesn't mean it should be. To give you an example, at my work place we can build an entire platform where each node's Hiera file looks like this:

---

ip_address_fourth_octet: 10

And the rest is abstracted, inherited and hidden away. In some ways it's really awesome, but it is also very hard to debug, and extraordinarily hard to understand. I once spent 2 hours tracing a string in a configuration file through too many Hiera files, each with over a dozen levels of dictionary/hash depth, about 7 create_resources() calls, several exported resources and luckily only 3-4 recursive Hiera lookups. I was not happy by the end of that. Not long after, my team lead forced us to re-read the Roles and Profiles design pattern and to watch that video ;-)

My recommendation to you is to seriously look at why you're relying on inheritance and merging so much, I think you could simplify a lot more. If you post a more relevant example and your Hierarchy, I'd be happy to discuss.

-Luke


Bostjan Skufca

Mar 11, 2015, 10:06:29 PM3/11/15

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, 11 March 2015 23:31:38 UTC+1, Luke Bigum wrote:

On Wednesday, March 11, 2015 at 4:35:36 PM UTC, Bostjan Skufca wrote:

Something like this seems like I'm telling a module *how* to look up my own data, rather than passing the right data to the module:

class resolv (
  $dns_servers_key_name = 'dns_servers',
  $dns_servers_key_merge = false,
) {
  if ($dns_servers_key_merge) {
   $dns_servers = hiera_array($dns_servers_key_name)
  } else {
   $dns_servers = hiera($dns_servers_key_name)
  }
}

class { 'resolv': dns_servers_key_merge => true }

I'd also have to code it to selectively use Hiera or not (some people don't) and that would get even worse. The second example of module design may be super awesomely flexible in terms of how I can structure my Hiera data, but it doesn't fit the direction the community is moving in terms of module design.

This is almost what I am looking for. I have an alternate approach: what if merging vs nonmerging is decided based on hiera key?

That is my approach, that class would do an implicit Hiera lookup for those class parameters, I just illustrated the point with a resource-like declaration as an example. While the above method would work, I don't think I've made my point about not putting this personalised logic in the "resolv" module itself. The above example is not so good. Gary Larizza explains it very well here if you haven't seen it (https://www.youtube.com/watch?v=v9LB-NX4_KQ). That video should answer your questions in your second reply to me too, BTW.

Tnx for the pointer, I will watch it soon.

The above code example is a bad idea for these reasons:

- the resolv module is tightly coupled to the data, it's in control of how it should look up data, rather than just be *given* data
- you won't be able to replace that resolv module with the super awesome puppetlabs_resolv module because of your custom way of handling data
- it makes a *very* bad assumption that everyone uses Hiera, it is not compatible for people who use ENCs that supply all class parameters for example
- there's a higher barrier to entry on understanding the module, some people would have to read the body of the resolv module code to figure out what's going on (or there would be a long README)
- it's more complicated to test because the range of data it can take is more complicated

Agreed.

Now expand on my first example:

********************
class puppetlabs_resolv($dns_servers) {
  file { '/etc/resolv.conf': content => template(...) }
}

class profile::dns_base {
  # look up my DNS data from Hiera
  $hiera_dns_server_array = hiera_array('dns::server')
  # and add a global DNS server I have
  $common_dns_server = '127.0.0.1'
  class { 'puppetlabs_resolv':
    # flatten() (assumes puppetlabs-stdlib is installed) avoids passing a nested array
    dns_servers => flatten([ $hiera_dns_server_array, $common_dns_server ]),
  }
}

class profile::dns_special {
  # don't do a hiera lookup, DNS here is special
  $special_dns = '10.1.1.1'
  class { 'puppetlabs_resolv':
    dns_servers => [ $special_dns ],
  }
}

node dc1 { include profile::dns_base }
node dc1_special { include profile::dns_special }
********************

The puppetlabs_resolv module I downloaded from GitHub does one thing well, resolv.conf, in a simple and easily understood manner, and it comes with Rspec tests, so I don't have to reinvent the wheel.

All of my business logic about how I get IP addresses into that resolv module is in my profile::dns* classes. These are *my* profile classes, I can do whatever crazy Hiera lookups and string manipulation I want/need to get the data into a format that puppetlabs_resolv takes. In other words my profiles are the "glue" between my data and the "building block" puppetlabs_resolv module. At any time I can replace puppetlabs_resolv with lukebigum_resolv (which is obviously better) with a few tweaks to my profiles. If I replace my data backend or get rid of Hiera entirely, my profile might have to be adjusted but I don't have to stop using that awesome lukebigum_resolv I downloaded.

This introduces another layer into the system, but it makes sense. Especially if you rely on third-party modules.

Why the use of a second profile, profile::dns_special? It takes complexity out of Hiera. I don't need a complicated Hierarchy when I've got profiles, and I rarely need inheritance at all. I've got my "tpl_%{::domain}" which is where my profile::dns looks up data from, and anything that's special is actually a different implementation of how I usually do DNS, so it gets its own profile, hence profile::dns_special. It is better to handle these exceptions in Puppet code because it's an *actual* language, rather than trying to model something complex into Hiera which is just a key-value store.

This is a nice articulation of the problem - hiera is not a language.

Your Hiera example where you have tpl_dc1.yaml and tpl_dc1-special.yaml is going to bite you. Your joke about mimicking node inheritance functionality in Hiera worries me a little, because it reminds me of some of my colleagues. Just because it can be modelled in Hiera, doesn't mean it should be. To give you an example, at my work place we can build an entire platform where each node's Hiera file looks like this:

---
ip_address_fourth_octet: 10

And the rest is abstracted, inherited and hidden away. In some ways it's really awesome, but it is also very hard to debug, and extraordinarily hard to understand. I once spent 2 hours tracing a string in a configuration file through too many Hiera files each with over a dozen levels of dictionary/hash depth, about 7 create_resources() calls, several exported resources and luckily only 3-4 recursive Hiera lookups. I was not happy by the end of that. Not long after my team lead forced us to re-read the Roles and Profiles design pattern and to watch that video ;-)

Nice horror story :)

b.

jcbollinger

Mar 12, 2015, 9:32:21 AM3/12/15

to puppet...@googlegroups.com, christop...@pobox.com

On Wednesday, March 11, 2015 at 8:57:00 AM UTC-5, Christopher Wood wrote:

(Replying to two people in one email, hum.)

On Wed, Mar 11, 2015 at 06:01:39AM -0700, jcbollinger wrote:
> On Tuesday, March 10, 2015 at 9:59:41 PM UTC-5, Bostjan Skufca wrote:
>
> On Monday, 9 March 2015 14:45:38 UTC+1, Christopher Wood wrote:
>
> On Sun, Mar 08, 2015 at 11:55:03AM -0700, Bostjan Skufca wrote:
> > With hiera:
> > - How would you go about when certain nodes need data merged from
> all
> > scopes, but other nodes need data from just the last scope?
>
> I've usually had a "classname::merge: true" key in hiera, controlling
> whether I use hiera() or hiera_hash() to obtain the data I need.
>
> And this hits the nail on the spot, even if unknowingly:)
> The problem I am seeing here and which I am only now being able to
> articulate, is the clash of two contradictory elements:
> 1. Puppet development is pushed towards decoupling code (manifest) from
> data, a noble goal
> 2. Puppet provides two functions, hiera() and hiera_array(), and the
> very existence of more than one function to retrieve data destroys the
> notion, that code should be unaware of underlying data storage details.

I rather take your point, but isn't the requirement for different data handling just another data item?

No, it is metadata. The metadata could be lumped in with the regular data -- and in fact, the default back end provides no other alternative if you want to provide that metadata at all -- but that's untidy, and it doesn't play nicely with automated data binding.

Is any code unaware of the underlying data structure? Even if you have a single type of data (plain string-like variables) your code is implicitly aware that it can treat them as that type.

You're commingling two different concepts: the structure of the data provided by Hiera to Puppet, and the structure of the data in the external storage on which Hiera relies. Puppet needs to know about the former, but it shouldn't have to know or care about the latter. THAT's the whole point. The fact that there are three different Hiera lookup functions, and that they can return different data for the same key -- even data with different structure -- makes Puppet sensitive to the internal layout of Hiera's data files.

I'm not really sure there's a way to automagically distinguish

"this is an array, do not retrieve its contents from all levels"
"this is an array, do retrieve its contents from all levels"

while still preserving our sanity.

Well Hiera doesn't offer either, so your sanity is safe.

Seriously, although hiera_array('my::key') does return an array value, that does not necessarily mean that hiera('my::key') will do so too. Neither function says "the data for 'my::key' is an array". The latter is not looking up the same thing as the former, and again, that's the problem. Puppet should need only to know the key, not which type of lookup to use.
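To make that concrete, here is a minimal sketch of one key giving differently shaped results depending on the lookup function. The two-level hierarchy (node above common), the file names and the addresses are assumptions made up for illustration; only hiera() and hiera_array() themselves come from the discussion.

```puppet
# Assumed Hiera data (hypothetical files; node level is searched before common):
#
#   node/srv-1.dc1.yaml        common.yaml
#   -------------------        -----------
#   my::key: '1.1.1.53'        my::key:
#                                - '9.9.9.51'
#                                - '9.9.9.52'

# Priority lookup: the first level that defines the key wins, so this
# yields the plain string from the node-level file, not an array.
$priority = hiera('my::key')

# Array-merge lookup: values found at every level are collected and
# flattened into one array, so this yields an array of three addresses.
$merged = hiera_array('my::key')
```

A manifest written against $merged would misbehave if it were handed $priority instead, even though both come from the same key.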

And it would be possible. For example, the YAML back end could be modified to refer to an ancillary metadata file that flagged certain keys for array or hash-merge lookup. That's a bit ugly, but sometimes ugly happens when you have to retrofit.

> Puppet in fact provides three functions for lookups: there is
> also hiera_hash().
>
> In any case, you are quite right. Which sort of lookup is intended is an
> attribute of the data -- part of the definition of each key -- but it is
> not represented in or alongside the data. Each user of the data somehow
> has to know. That could be tolerated, inconvenient as it is, except that
> it is incompatible with automated data binding. This is an issue that has
> been recognized and acknowledged, though I'm uncertain whether it is
> actively being addressed.

Could you possibly expound on the "Each user of the data somehow has to know" part? I'm having trouble with the notion that people would use puppet manifests and hiera data without knowing what's in them.

Each user of the data (generally a Puppet class or defined type) has to know whether he is supposed to use an ordinary priority lookup, an array lookup, or a hash-merge lookup for each key, because the value retrieved has the intended content only if the correct form of lookup is used. Generally speaking, the value obtained via a different form of lookup has little, if any, significance. It's not a question of choosing which one you want for a particular purpose, but rather of divining which is the (only) one appropriate for the data.

People run into this issue in practice when they try to use automated data binding with data set up for array or hash-merge lookup.

John

Christopher Wood

Mar 13, 2015, 12:40:50 PM3/13/15

to puppet...@googlegroups.com

(I will agree to a point, this is often situational based on the company culture.)

> Again, maybe it is just that default hiera backends do not allow for such
> flexibility. It should not be hard to switch that to custom provider,
> whose data model actually allows for such flexibility.
>
> (I've had some nasty run-ins with merging lookups and have decided
> they're mostly not for me, maybe the smarter people on this list are
> having better results.)
>
> Care to elaborate a bit, especially how did you overcome them (define all
> data for each node)?
> b.

Abstracting the details a bit, I had a key in common.yaml which was fine at first but as time went on was not appropriate for all nodes. That was fine because some subsets of nodes overrode the setting at a higher level. Many months later I felt it would simplify our hiera configuration (fewer duplicate hash elements that could be consolidated farther down the tree) if I used a hiera_hash(). I did not recall the existence of this default setting, nor did I grep for it (which would have saved me). A stack of nodes got an incorrect setting and much sadface was had. I had a few more in the lab but this was the first one that escaped. The fact that I failed to check for existing hiera keys in other levels is quite obviously my fault, however, it's also important that the underlying architecture (deep levels, complex merged hiera data) and culture (a default for everything) enabled the resulting error condition.
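A failure of this kind can be sketched roughly as follows. The key name, file names and values are hypothetical and are not the actual data from the incident; the sketch assumes Hiera's default (native) hash-merge behaviour.

```puppet
# Assumed Hiera data (hypothetical key and files):
#
#   datacentre/dc1.yaml          common.yaml
#   -------------------          -----------
#   app::settings:               app::settings:
#     port: 8080                   port: 80
#                                  debug: true    # long-forgotten default

# Before the change, a priority lookup: only the dc1 hash is returned,
# so the 'debug' entry from common.yaml never reaches the node.
$before = hiera('app::settings')

# After the change, a hash-merge lookup: top-level entries from all
# levels are merged, with the higher-priority level winning on conflicts.
# 'port' is still 8080, but the forgotten 'debug' entry now appears too.
$after = hiera_hash('app::settings')
```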

Explaining the fix needs an example (fictional): I'm licenced per-datacentre for Bespoke Proprietary Edition (BPE) plus some testing licences which activate additional features, and I have a stub class and short hiera tree.

common (true for everything)
type (true for hosts of that type, e.g. frontend, db, core)
datacentre (true for hosts in that location)
node (true for individual hosts)

class bpe ( $licencing ) {
# stuff happens here
}

My desirable behaviour for BPE nodes which do not exist in a licenced datacentre is to not apply any licence key and thus to not run the daemon. (Your mileage may vary, my rationale is that my company agreed to these restrictions and I will stick to them.)

My desirable behaviour for the puppetmaster compiling the catalog for an unlicenced host is to error out and fail the catalog compilation, highlighting the missing data at the earliest possible stage of the build. (Your mileage may vary, my rationale is that a server without all its requirements in place should not build. The "Finished catalog run" seems to instill a bit of false confidence that everything worked.)

Putting these together, my bpe::licencing key would go no lower than the datacentre level. If we roll out a new location somebody would have to add the key to newdc.yaml before BPE hosts will build there. If somebody wants to try out the test licensing they can add all licences to a node in one of the datacentres. The error message signals the admin to go get the required licences, or at least to ask why it's erroring on licencing when building other hosts works.

In short, abandon the concept of a default where there isn't a legitimate default and be stricter about when servers build.
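As a rough sketch of that strictness: the stub class below is the one from the example, while the profile::bpe wrapper is a hypothetical name added only to show the explicit-lookup variant.

```puppet
# The parameter has no default, so a node that declares this class
# without a value for 'licencing' fails catalog compilation.
class bpe ($licencing) {
  # stuff happens here
}

# Hypothetical wrapper showing the same behaviour with an explicit lookup.
class profile::bpe {
  # hiera() called without a default argument raises an error when no
  # hierarchy level defines the key, which aborts compilation early.
  $licencing = hiera('bpe::licencing')

  class { 'bpe':
    licencing => $licencing,
  }
}

include profile::bpe
```

With bpe::licencing defined no lower than the datacentre level, a host in a datacentre without that key never gets a catalog at all.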

All that said, we do have a remaining hiera_hash() where it has never done any harm, our yum module. We've had a few build failures in the lab finding out which repos were mutually incompatible but this one has never been any problem in production.


Christopher Wood

Mar 13, 2015, 1:12:12 PM3/13/15

to puppet...@googlegroups.com

I grant that I'm not seeing the whole picture; I'm perfectly fine with the notion that code/data/metadata/structure are all subsets of the information required to correctly manage a host. I presume structure has to go somewhere and if it's not in the pp file it's just somewhere else I will have to know about and account for so I'm not really seeing what difference it makes. For instance, what breaks with the current thing that wouldn't if puppet just got data and the hiera_array vs hiera_hash determination was made elsewhere?

> I'm not really sure there's a way to automagically distinguish
>
> "this is an array, do not retrieve its contents from all levels"
> "this is an array, do retrieve its contents from all levels"
>
> while still preserving our sanity.
>
> Well Hiera doesn't offer either, so your sanity is safe.
>
> Seriously, although hiera_array('my::key') does return an array value,
> that does not necessarily mean that hiera('my::key') will do so too.
> Neither function says "the data for 'my::key' is an array". The latter is
> not looking up the same thing as the former, and again, that's the
> problem. Puppet should need only to know the key, not which type of
> lookup to use.
>
> And it would be possible. For example, the YAML back end could be
> modified to refer to an ancillary metadata file that flagged certain keys
> for array or hash-merge lookup. That's a bit ugly, but sometimes ugly
> happens when you have to retrofit.

I don't know that this is better or worse than having structural information about hiera in my pp files. I go from:

having two places where things go (hiera and puppet)
having structural information in each (yaml anchor/alias etc., puppet data bindings and hiera functions)

To:

having three places where things go (hiera, hiera metadata, puppet)
having structural information in two (yaml anchor/alias etc., hiera key flagging)

I've added a place and now I have more to think about, plus it's not obvious from my puppet code where my data is coming from, and I have a lab host where I don't actually want things tagged as merge-only to be merged while I'm experimenting. Ouch my brain.


Bostjan Skufca

Mar 13, 2015, 1:43:25 PM3/13/15

to puppet...@googlegroups.com, christop...@pobox.com

On Friday, 13 March 2015 17:40:50 UTC+1, Christopher Wood wrote:

On Wed, Mar 11, 2015 at 09:25:04AM -0700, Bostjan Skufca wrote:
> On Wednesday, 11 March 2015 14:57:00 UTC+1, Christopher Wood wrote:
> (I've had some nasty run-ins with merging lookups and have decided
> they're mostly not for me, maybe the smarter people on this list are
> having better results.)
>
> Care to elaborate a bit, especially how did you overcome them (define all
> data for each node)?
> b.

My desirable behaviour for the puppetmaster compiling the catalog for an unlicenced host is to error out and fail the catalog compilation, highlighting the missing data at the earliest possible stage of the build. (Your mileage may vary, my rationale is that a server without all its requirements in place should not build. The "Finished catalog run" seems to instill a bit of false confidence that everything worked.)

A sane behaviour, which I also generally use, if applicable (and not just with puppet).

b.

jcbollinger

Mar 16, 2015, 11:29:31 AM3/16/15

to puppet...@googlegroups.com

On Friday, March 13, 2015 at 12:12:12 PM UTC-5, Christopher Wood wrote:

On Thu, Mar 12, 2015 at 06:32:21AM -0700, jcbollinger wrote:

> No, it is metadata. The metadata could be lumped in with the regular
> data -- and in fact, the default back end provides no other
> alternative if you want to provide that metadata at all -- but that's
> untidy, and it doesn't play nicely with automated data binding.
>
>
>
> Is any code unaware of the underlying data structure? Even if you have a
> single type of data (plain string-like variables) your code is
> implicitly aware that it can treat them as that type.
>
> You're commingling two different concepts: the structure of the data
> provided by Hiera to Puppet, and the structure of the data in the external
> storage on which Hiera relies. Puppet needs to know about the former, but
> it shouldn't have to know or care about the latter. THAT's the whole
> point. The fact that there are three different Hiera lookup functions,
> and that they can return different data for the same key -- even data with
> different structure -- makes Puppet sensitive to the internal layout of
> Hiera's data files.

I grant that I'm not seeing the whole picture; I'm perfectly fine with the notion that code/data/metadata/structure are all subsets of the information required to correctly manage a host. I presume structure has to go somewhere and if it's not in the pp file it's just somewhere else I will have to know about and account for so I'm not really seeing what difference it makes. For instance, what breaks with the current thing that wouldn't if puppet just got data and the hiera_array vs hiera_hash determination was made elsewhere?

The most prominent thing that breaks is automatic data binding. If the physical layout of your data for some key is designed for service via the hiera_array() or hiera_hash() function, then you cannot use automated data binding with that key. Or to put it the other way around, if you have a class parameter whose value you want to provide via automatic data binding (as you should), then you must structure the associated data for priority lookup, not for array or hash-merge lookup.

In a sense, this is an encapsulation issue: the physical layout of the data is the implementation, and the keys are the interface. You shouldn't need to know anything about the implementation to use the interface, but currently, you do.
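A minimal sketch of the mismatch, assuming a hypothetical resolv class and a resolv::dns_servers key whose data is spread over several hierarchy levels:

```puppet
# Hypothetical class: with automatic data binding, the parameter is
# filled by a priority lookup of the key 'resolv::dns_servers'.
class resolv ($dns_servers) {
  # ... manage /etc/resolv.conf from $dns_servers ...
}

# No parameter given: data binding behaves like hiera('resolv::dns_servers'),
# so only the highest-priority level that defines the key is used.
include resolv

# If the data was laid out for merging across levels, the caller has to
# bypass data binding and do the merge lookup itself, for example:
#
#   class { 'resolv':
#     dns_servers => hiera_array('resolv::dns_servers'),
#   }
```

The commented-out form is an alternative to the include, not an addition to it; declaring the class both ways in one catalog would be a duplicate declaration.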

> And it would be possible. For example, the YAML back end could be
> modified to refer to an ancillary metadata file that flagged certain keys
> for array or hash-merge lookup. That's a bit ugly, but sometimes ugly
> happens when you have to retrofit.

I don't know that this is better or worse than having structural information about hiera in my pp files. I go from:

having two places where things go (hiera and puppet)
having structural information in each (yaml anchor/alias etc., puppet data bindings and hiera functions)

To:

having three places where things go (hiera, hiera metadata, puppet)
having structural information in two (yaml anchor/alias etc., hiera key flagging)

I've added a place and now I have more to think about, plus it's not obvious from my puppet code where my data is coming from, and I have a lab host where I don't actually want things tagged as merge-only to be merged while I'm experimenting. Ouch my brain.

The design I presented was for proof-of-concept purposes. There are probably better alternatives.

Even with that simple design, though, the information complexity does not increase. If you want to use any array- or hash-merge lookups at all then you already have to worry about manifests, data, and metadata, wherever each of those lives. The gain is in separation of concerns: when you're working with your manifests, you don't need to pay any attention to metadata, and when you're working with the data, you don't need to be worried (as much) about breaking manifests.

This may even have impacted you personally, in the real-life failure case you described. I speculate that if the physical data layout had more clearly been associated directly with the data, then you would have been more likely to look for (and find) the problematic default value when you changed to hash-merge lookups. Having separate lookup functions for array- and hash-merge lookup styles can be a distraction from the fact that the physical data layout is important. It is not, in general, safe to switch from one mode to another for any given key.

John
