How best to use hiera for Java options? (with hash_merge)


Steven Post

Nov 17, 2014, 10:11:09 AM
to puppet...@googlegroups.com
Hi,

I've been using Puppet for over a year, but I've now run into a problem and I'm not seeing a solution.
I have a hash in Hiera that sets up Java application servers; it is merged with the 'deeper' strategy so we don't need duplicate declarations of every little option.
Inside the manifest, the create_resources() function is called.

Now the problem here is that one of the options inside the hash is an array of Java options, such as heap size, the garbage collector, the OOM behaviour, etc.
Since arrays get merged by hiera_hash in a 'deeper' merge, these options would be merged as well.
The heap and permsize are actually already separate options, so they don't have this problem, but if I set '-XX:NewRatio=2' and I want to override this for some nodes or clusters, I would set it as '-XX:NewRatio=3'. The option is then passed down twice, with different values.
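
To make the duplication concrete, here is a minimal Ruby sketch of what a 'deeper'-style merge does to an array of options. The `deeper_merge` helper and the `common`/`node` data are hand-written for illustration (assumptions, not Hiera's actual implementation): hashes are merged recursively and arrays are combined as a union, which is the behaviour described above.

```ruby
# Hand-written approximation of a 'deeper' merge, for illustration only:
# hashes are merged key by key, arrays are combined as a union.
def deeper_merge(high, low)
  if high.is_a?(Hash) && low.is_a?(Hash)
    # The block receives (key, value from low, value from high) for shared keys.
    low.merge(high) { |_key, low_val, high_val| deeper_merge(high_val, low_val) }
  elsif high.is_a?(Array) && low.is_a?(Array)
    high | low
  else
    high
  end
end

# Hypothetical data for two hierarchy levels.
common = { 'additional_java_opts' => ['-XX:+UseConcMarkSweepGC', '-XX:NewRatio=2'] }
node   = { 'additional_java_opts' => ['-XX:NewRatio=3'] }

merged = deeper_merge(node, common)
puts merged['additional_java_opts'].inspect
```

Both '-XX:NewRatio=2' and '-XX:NewRatio=3' survive the merge, because as array elements they are just two different strings.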

How to solve this? Does anyone else have any experience to share about this?

Regards,
Steven

jcbollinger

Nov 17, 2014, 3:06:02 PM
to puppet...@googlegroups.com
I'm not quite following.  Perhaps a bit more detail about your data structure would be helpful.

If I were setting up an options hash then I would use option names as the keys (e.g. 'XX:NewRatio') and option values as the values (e.g. 3).  If you were doing it that way, however, then you couldn't express different values for the same option in your hash, so you couldn't have the same option expressed twice.


John

Steven Post

Nov 18, 2014, 12:21:33 PM
to puppet...@googlegroups.com


Hi,

I've been thinking about this for a day now, and I think I definitely need a hash here, not an array.

An example of the outer hash, simplified for this example:
jboss_application::app_version::jvm_defaults:
  additional_java_opts:
    - '-XX:+UseConcMarkSweepGC'
    - '-XX:NewRatio=2'
    - '-XX:+CMSClassUnloadingEnabled'
    - '-XX:OnOutOfMemoryError=''kill -9 %p'''
    - '-XX:+HeapDumpOnOutOfMemoryError'

If you would change the array into a hash, how would you go about it?
The 'NewRatio' for example is easy, since it actually is a key + value ('NewRatio' + '2'); a maximum heap size would be easy as well, using 'Xmx' as a key.
However, '+UseConcMarkSweepGC' or '+HeapDumpOnOutOfMemoryError' seem a bit trickier, especially if you want to disable the heap dump on certain nodes, or use a different GC algorithm (like G1 for example).
Also, some options are related to each other, such as 'NewRatio', which doesn't work when using an adaptive policy (different GC), or 'CMSClassUnloadingEnabled', which is not needed in conjunction with the default GC (parallel mark sweep).
Perhaps I need to split the options into different variables?

Regards,
Steven

jcbollinger

Nov 18, 2014, 6:33:49 PM
to puppet...@googlegroups.com


I would do this:

  additional_java_opts:
    '-XX:+UseConcMarkSweepGC': null
    '-XX:NewRatio': 2
    '-XX:+CMSClassUnloadingEnabled': null
    '-XX:OnOutOfMemoryError': 'kill -9 %p'
    '-XX:+HeapDumpOnOutOfMemoryError': null

or perhaps this:

  additional_java_opts:
    'UseConcMarkSweepGC': true
    'NewRatio': 2
    'CMSClassUnloadingEnabled': true
    'OnOutOfMemoryError': 'kill -9 %p'
    'HeapDumpOnOutOfMemoryError': true

If the YAML parser doesn't recognize null as the primitive it is (in YAML 1.2) then perhaps it would be better to use an empty string instead, or else use the second form.  The other part of the equation is how those data are used.  Since you are at liberty to change the form of the data, you must also be at liberty to change the template, resource type, or whatever that consumes the data.  Changes certainly will be needed there, and your choice for the form of the data may be influenced by how you choose to update their consumer.
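
As a minimal sketch of what an updated consumer for the second form might look like, the Ruby below turns such a hash into JVM flags. The `jvm_flags` helper and its mapping are assumptions made for illustration, not something from the thread: `true` becomes `-XX:+Name`, `false` becomes `-XX:-Name`, and any other value becomes `-XX:Name=value`.

```ruby
# Turn an options hash in the second form into JVM command-line flags.
# Assumed convention (not from the thread): true => -XX:+Name,
# false => -XX:-Name, any other value => -XX:Name=value.
def jvm_flags(opts)
  opts.keys.sort.map do |name|
    value = opts[name]
    case value
    when true  then "-XX:+#{name}"
    when false then "-XX:-#{name}"
    else            "-XX:#{name}=#{value}"
    end
  end
end

# Hypothetical merged data for one node.
opts = {
  'UseConcMarkSweepGC'         => true,
  'NewRatio'                   => 2,
  'HeapDumpOnOutOfMemoryError' => false
}

puts jvm_flags(opts)
```

With this shape, a higher-priority level can switch a boolean option off by setting it to `false` rather than trying to remove it. Values that contain spaces, such as 'kill -9 %p', would still need shell quoting, which this sketch does not attempt.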

 
> The 'NewRatio' for example is easy, since it actually is a key + value ('NewRatio' + '2'); a maximum heap size would be easy as well, using 'Xmx' as a key.
> However, '+UseConcMarkSweepGC' or '+HeapDumpOnOutOfMemoryError' seem a bit trickier, especially if you want to disable the heap dump on certain nodes, or use a different GC algorithm (like G1 for example).
> Also, some options are related to each other, such as 'NewRatio', which doesn't work when using an adaptive policy (different GC), or 'CMSClassUnloadingEnabled', which is not needed in conjunction with the default GC (parallel mark sweep).
> Perhaps I need to split the options into different variables?


There is a tension between flexibility and precision.  What you have now, or something related to it, is very flexible, but allows nonsensical combinations of options to be specified.  If you don't like that then you don't have to do things that way, but to do otherwise will require more complicated code.


John

Steven Post

Nov 19, 2014, 9:55:15 AM
to puppet...@googlegroups.com
Hi,

So if I understand correctly, I can use 'null' as a value, and it will be used instead of the value somewhere lower on the hierarchy?
If that is the case, my problem is solved (I think). Completely preventing the nonsensical stuff is not the goal here, but it should be possible in hiera to avoid it by being able to remove already set options.

I can indeed modify the consumer of the data as well, but since this is already used in production, I need to be a bit careful with the changes I make.
I don't know if our setup already supports YAML 1.2; our Hiera version is currently 1.3.1.

I'll test this out and get back with my findings.

Regards,
Steven

jcbollinger

Nov 19, 2014, 2:14:48 PM
to puppet...@googlegroups.com


On Wednesday, November 19, 2014 3:55:15 AM UTC-6, Steven Post wrote:
> Hi,
>
> So if I understand correctly, I can use 'null' as a value, and it will be used instead of the value somewhere lower on the hierarchy?


YAML 1.2 defines the (unquoted) token "null" to be a scalar value representing the same (no)thing as Ruby nil.  A YAML parser that doesn't understand that will take that token as a scalar representing the four-character string "null".  I don't happen to know which way Hiera goes, which might in fact vary with the underlying version of Ruby, but either way, null is a value of some kind.
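
A quick way to check how a given Ruby treats these tokens is to load a small document and inspect the result. The sketch below uses the `yaml` library bundled with current Ruby versions; the key names are made up for the example, and the parser behind an older Ruby or Hiera install may behave differently.

```ruby
require 'yaml'

# Four ways of writing "nothing" in YAML; key names are illustrative only.
doc = <<~YAML
  unquoted: null
  tilde: ~
  quoted: 'null'
  empty_string: ''
YAML

data = YAML.safe_load(doc)

# Print each value with inspect so nil and "null" can be told apart.
data.each { |key, value| puts "#{key}: #{value.inspect}" }
```

With the bundled parser, the unquoted `null` and `~` load as nil, while the quoted forms load as strings.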

When you perform a "deeper" hash merge you should find that corresponding hashes at different levels of your data hierarchy are merged, where "corresponding" is defined with respect to nested hashes according to the chain of keys required to drill down through the data to each hash.  In such a merge, you should find that the (key, value) pairs from the higher-priority level are retained, and where the hash from the lower-priority level has keys that do not appear in the higher level hash, those keys and their associated values end up in the merged result.
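
The merge rule described above can be sketched in a few lines of Ruby. The `merge_by_priority` helper and the sample data are hypothetical and only mirror the description; this is not Hiera's own code.

```ruby
# Illustrative recursive merge: nested hashes are merged key by key, and for
# any other value the higher-priority level wins. Not Hiera's own code.
def merge_by_priority(high, low)
  low.merge(high) do |_key, low_val, high_val|
    if low_val.is_a?(Hash) && high_val.is_a?(Hash)
      merge_by_priority(high_val, low_val)
    else
      high_val
    end
  end
end

# Hypothetical data for a common level and a node level.
common = {
  'jvm_defaults' => {
    'additional_java_opts' => { 'NewRatio' => 2, 'UseConcMarkSweepGC' => true }
  }
}
node = {
  'jvm_defaults' => {
    'additional_java_opts' => { 'NewRatio' => 3 }
  }
}

merged = merge_by_priority(node, common)
puts merged.inspect
```

Because 'NewRatio' is now a key rather than an array element, the node-level value replaces the common one, while 'UseConcMarkSweepGC' is carried over from the lower-priority level.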

 
> If that is the case, my problem is solved (I think). Completely preventing the nonsensical stuff is not the goal here, but it should be possible in hiera to avoid it by being able to remove already set options.



Hash merging always results in a hash whose key set is the union of the key sets of the merged hashes.  The value associated with each key is the question, and with hiera the questions are
  1. Whether nested hashes are merged at all (vs. hashes appearing as values in higher-priority levels completely replacing lower-level values), and
  2. If nested hashes are merged, which value appears in the result for keys that appear in both original hashes.
At this time, those questions can be answered only globally in the Hiera configuration or, to some extent, on a whole-lookup basis via explicit lookup functions (hiera() vs hiera_hash()).  That is a recognized flaw in Hiera, in that the appropriate form of lookup is (should be) a characteristic of the data, not of the query.  With that said, I don't think YAML is a particularly good vehicle for expressing data that must carry such fine distinctions, so I suspect that the current limitations will be with us for a long time.

 
> I can indeed modify the consumer of the data as well, but since this is already used in production, I need to be a bit careful with the changes I make.
> I don't know if our setup already supports YAML 1.2; our Hiera version is currently 1.3.1.


You're missing my point there, I think: if you modify the form of the data, then almost surely you must modify the data consumer.  How else will it know what to do with your modified data structure?

 

> I'll test this out and get back with my findings.



I look forward to hearing how it turns out.


John

Steven Post

Nov 19, 2014, 3:55:11 PM
to puppet...@googlegroups.com
Hi,

My problem is solved now, I'll see if I can change the topic title to reflect that.

The final solution looks like this:
jboss_application::app_version::jvm_defaults:
  additional_java_opts:
    useconcmarksweepgc: '-XX:+UseConcMarkSweepGC'
    newratio: '-XX:NewRatio=2'
    cmsclassunloadingenabled: '-XX:+CMSClassUnloadingEnabled'
    onoutofmemoryerror: '-XX:OnOutOfMemoryError=''kill -9 %p'''
    heapdumponoutofmemoryerror: '-XX:+HeapDumpOnOutOfMemoryError'

I know it looks a bit like some kind of 'Frankenstein' solution, but it does the job.
Setting a value to 'null' causes the value from the same key (if present) from the next step in the hierarchy to be used, so that is not very useful.
However this is overcome by using 'false' as a boolean value (so no quotes) and checking that in the template like this:
<%- if @additional_java_opts and ! @additional_java_opts.empty? -%>
  <%- @additional_java_opts.keys.sort.each do |key| -%>
    <%- if @additional_java_opts[key] -%>
      <%- %>JAVA_OPTS="$JAVA_OPTS <%= @additional_java_opts[key] %>"
    <%- end -%>
  <%- end -%>
<%- end -%>

This does not prevent anything weird occurring when someone makes a typo in a key, but at least I can be really specific in my configurations and don't need to alter the template or manifest for every possible option of the JVM.
The 'sort' is needed because the order of the hash entries is not predetermined in Ruby 1.8 (it is in Ruby 1.9).
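
For readers who want to try the idea outside Puppet, here is a plain-Ruby sketch of the same logic as the template loop: keys are sorted, and entries whose value is false (or nil) are skipped. The `java_opts_lines` helper and the node-level data are hypothetical, and `Hash#merge` only stands in for the Hiera merge.

```ruby
# Plain-Ruby equivalent of the template loop: sorted keys, falsy values skipped.
def java_opts_lines(additional_java_opts)
  return [] if additional_java_opts.nil? || additional_java_opts.empty?

  additional_java_opts.keys.sort
                      .select { |key| additional_java_opts[key] }
                      .map { |key| %(JAVA_OPTS="$JAVA_OPTS #{additional_java_opts[key]}") }
end

defaults = {
  'useconcmarksweepgc'         => '-XX:+UseConcMarkSweepGC',
  'newratio'                   => '-XX:NewRatio=2',
  'heapdumponoutofmemoryerror' => '-XX:+HeapDumpOnOutOfMemoryError'
}

# Hypothetical node-level data: change one option, switch another off.
node = { 'newratio' => '-XX:NewRatio=3', 'heapdumponoutofmemoryerror' => false }

# Hash#merge stands in for the Hiera merge here; the argument's values win.
merged = defaults.merge(node)
puts java_opts_lines(merged)
```

The node-level `false` removes the heap dump option from the output, and 'newratio' appears only once, with the overriding value.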


> That is a recognized flaw in Hiera, in that the appropriate form of lookup is (should be) a characteristic of the data, not of the query.

Very true. For this reason I already use an (ugly) work-around in this case: https://www.2realities.com/blog/2014/07/05/puppet-hiera-hash-merge-and-automatic-parameter-lookup/


> You're missing my point there, I think: if you modify the form of the data, then almost surely you must modify the data consumer.

I was merely pointing out I need to be careful when I change it, not that I can ignore it.

Thanks for your insights.

Regards,
Steven