In the larger env it takes about 70 minutes, if it manages to finish at all. Initially, as a "quick" test, I was running puppetdb without postgres and had to give it 5GB for it to finish at all (70 mins). With postgres 8.4, load on the puppetmaster is significantly reduced, but with 512MB for puppetdb (128MB + 1MB per node, then doubled for good measure) puppetdb still runs out of memory. I set it to 1GB and puppetdb just crashed again (I've got dumps). Trying with 2GB now.
I haven't fiddled with thread settings, but my puppet agents aren't daemonized or run from cron; I run them using mcollective or manually. So there's only a single puppet agent running during this test, on the core nagios server. It seems that there's a ruby process taking 100% of one core during this run, and nothing else "dramatic" seems to be happening (except for puppetdb dying, of course).
The culprits are these two lines in two manifest files:
    ./nsca/server.pp: #File <<| tag == $get_tag |>> -> Nagios_host <<| tag == $get_tag |>>
    ./nrpe/server.pp: #File <<| tag == $get_tag |>> -> Nagios_host <<| tag == $get_tag |>>
Replacing them with the unchained form:
    File <<| tag == $get_tag |>>
    Nagios_host <<| tag == $get_tag |>>
lets the run complete in under 10 minutes even with only 1GB for puppetdb (still a 16GB VM), which is acceptable.
This is tracked under bug #18804.
Seems that chaining exported resources might not be too efficient and produces lots of data that could be the reason for puppetdb crashing.
On Thursday, January 24, 2013 5:24:58 AM UTC-6, Daniel wrote:
> Seems that chaining exported resources might not be too efficient and produces lots of data that could be the reason for puppetdb crashing. The culprits being these two lines in two manifest files:
> ./nsca/server.pp: #File <<| tag == $get_tag |>> -> Nagios_host <<| tag == $get_tag |>>
> ./nrpe/server.pp: #File <<| tag == $get_tag |>> -> Nagios_host <<| tag == $get_tag |>>
And there may be your combinatorial problem. Supposing that the tags are not specific to the exporting node, the numbers of Files and Nagios_hosts increase linearly with the number of nodes, but the number of relationships among them increases with the square of the number of nodes. With 100 nodes each exporting one File and one Nagios_host, for instance, that is 200 resources but 10,000 relationships.
Do you in fact need the order of application constraints that the chaining declared? All of them?
Depending on what relationships you actually need, you have various options:
1) If it doesn't actually matter whether Puppet syncs given Files before or after it syncs any particular Nagios_host, then you never needed any relationships at all, and simply removing the chaining as you did is the best solution.
2) If specific Files must be synced before specific Nagios_hosts, then you should express those relationships and not any others. In particular, if you only need relationships among Files and Nagios_hosts exported by the same node, then you should declare the relationships on the exporting side instead of on the collecting side. (And if they always come in such pairs then perhaps you should wrap them in a defined type and export that instead.)
3) If you really need something along the lines that the chaining expressed -- all Files with a given tag synced before all Nagios_hosts bearing that tag -- then you can break the combinatorial problem by introducing an artificial barrier resource for each tag:
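A minimal sketch of what such a barrier could look like, not a definitive implementation. It assumes the puppetlabs-stdlib module is installed, since that is where the no-op anchor type comes from, and the barrier title used here is a made-up name. The idea is that every collected File points at the one barrier, and the barrier points at every collected Nagios_host:

```puppet
# Assumption: puppetlabs-stdlib is available and provides the no-op `anchor` type.
# The title "nagios_barrier_${get_tag}" is hypothetical; any unique name per tag works.
anchor { "nagios_barrier_${get_tag}": }

# Every collected File is applied before the barrier ...
File <<| tag == $get_tag |>> -> Anchor["nagios_barrier_${get_tag}"]

# ... and the barrier is applied before every collected Nagios_host.
Anchor["nagios_barrier_${get_tag}"] -> Nagios_host <<| tag == $get_tag |>>
```

With N Files and M Nagios_hosts carrying the tag, this declares N + M relationships instead of N x M, while still guaranteeing that all of the Files are synced before any of the Nagios_hosts.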