Referencing a value from one level to another.


Jim Donnellan

Apr 14, 2014, 2:41:42 PM
to puppet...@googlegroups.com
Puppeteers,

I'm trying to get something done in puppet/hiera, and I'm curious if it's possible. A bit of background:

We're using puppet and hiera to build out and maintain Apache Solr, which we run in a cloud structure. Configuration-wise, this means our data is split into shards, and each server hosts replicas of several shards for redundancy. As a simplified example, say it looks like this...

Server1 hosts:
  shard1, replica 1
  shard2, replica 1

Server2 hosts:
  shard1, replica 2
  shard2, replica 2

Server3 hosts:
  shard3, replica 1
  shard4, replica 1

...and so on. So each shard exists on multiple hosts, each host has multiple shards, but not every shard is on every host.

We were able to handle this just fine by using a hiera array to list the shards at the _host_.yaml level. No big deal. Done and done, works great in production.


The issue that has come up is that some of the shards (which are JVMs) have grown a bit and require a heap size greater than the default. Obviously this would be something we'd want to wrangle with puppet and hiera. We've come up with some initial attempts that seem to work, which involve just enhancing the hiera data when we declare the shards at a host level. So instead of:

Server1.yaml:
shards:
- shard1
- shard2

We have something like this:

Server1.yaml:
shards:
- shard1: 5G
- shard2: 7G

...which is workable. The thing I don't like about it is that I'm defining the heap size at the host level, even though they should be consistent for any given shard across servers. This is redundant at best, and leaves things open for inconsistency across servers at worst. I kind of want to raise the heap size declarations up above the host level, up to the application or environment level I guess. But I would still need to declare which shards are where at the host level. In short, I guess I need the deployment to look for what shards should be on a host at the host level, and then look up the chain a bit to see what heap size that shard should have.

Does this sound doable?

Thanks in advance,
Jim

jcbollinger

Apr 15, 2014, 9:11:36 AM
to puppet...@googlegroups.com


At that level of abstraction, yes, it sounds doable, but the Devil is in the details.  The per-host shard data probably need to be references (by name) to shard details in some more general level of your hierarchy.  You can then use a defined type to declare all the shards for each server, based on the shared details for each shard.  From the data structure you describe, I suppose you probably already have something going in this direction.  So your data might look more like this:

Server1.yaml:
shards:
- shard1
- shard2

Server5.yaml:
shards:
- shard1
- shard5

common.yaml:
shard_details:
  shard1:
    max_heap: 5G
  shard2:
    max_heap: 7G
  shard5:
    max_heap: 4G


There are any number of ways you could tweak the data structure, but that general approach seems sound to me.
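
For illustration, here is a minimal sketch of how a defined type could consume that data. The names solr::shard and solr::shards and the details parameter are hypothetical, not taken from any existing module, and the notify resource is only a stand-in for whatever resources actually configure a shard's JVM. It assumes every shard named at the host level has an entry under shard_details, and it does no error handling for a missing one.

```puppet
# Hypothetical defined type: one instance per shard hosted on this node.
define solr::shard ($details = {}) {
  # $name is the shard name taken from the host-level 'shards' array.
  $shard    = $details[$name]
  $max_heap = $shard['max_heap']

  # Placeholder for the real resources (init script, JVM options, ...).
  notify { "solr shard ${name}: max heap ${max_heap}": }
}

# Hypothetical wrapper class that ties the two hiera keys together.
class solr::shards {
  # Host-level list, e.g. ['shard1', 'shard2'] from Server1.yaml.
  $shards        = hiera('shards')
  # Shared per-shard settings from common.yaml.
  $shard_details = hiera('shard_details')

  # An array used as the title declares one solr::shard per element.
  solr::shard { $shards:
    details => $shard_details,
  }
}
```

In a real module the define and the class would normally live in separate files so the autoloader can find them. On newer Puppet versions, lookup() takes the place of hiera().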


John

Jim Donnellan

Apr 17, 2014, 1:38:48 PM
to puppet...@googlegroups.com
John,

Thanks so much; this worked as expected.
