Al,
I'm sorry to bug you; I know you're fresh off PuppetConf and have a lot on your to-do list. I have two topics I'm hoping you can shed some light on...
A) Could you recommend a tool that works within the confines of your monitoring modules and provides RRD/trending of the data collected by the monitors (I'm using Nagios)? I've been looking at pnp4nagios, but it's not playing as well as I had hoped. This is a situation where we have two separate packages, and one (pnp4nagios) requires modifications to the other's (Nagios) config. Can I use an exported resource to tell Nagios to use a pnp4nagios config template? Otherwise we'd include the pnp4nagios class (which might provide a nagios.cfg template) and then source the custom template in the nagios class, which seems confusing compared with your Site Class concept for storing site-specific files.
B) Do you have any thoughts on how to deal with the need for more fine-grained monitors for Nagios? I have spent some time adding a few files manually, and while this works, it's clearly not optimal, especially for service checks on hosts that the E42 monitor/nagios/nrpe modules manage. Consider monitors for things like:
- DHCPD where we want to monitor the process and port, but we also might want to monitor the DHCP scope usage.
- Apache where we want to monitor idle workers, SSL certificate status, process, port, etc.
- Tomcat where we want to monitor a host of JMX attributes beyond process and port.
I still haven't solved my Nagios Contact management issue yet, but I'm getting hammered by not having monitors on servers that have higher level monitoring needs that nagios can support.
Hi Sean
For trending I've used the Munin module, which works well.
It doesn't need to be set as a monitor tool; you just have to include it in your monitored nodes and set the IP of the Munin server.
On the Munin server, set the parameter $munin_server_local to true. The module automatically runs a daily cron job (you can disable it) to autoconfigure the plugins. It should work out by itself what there is to monitor; in some cases you might need to configure single plugins with credentials and other params (munin::params).
Munin's main drawback is that it doesn't scale well, though the newest versions should improve on this (for example with graphing on demand).
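A rough sketch of that setup, assuming the Munin module reads plain variables set at node or top scope. Only $munin_server_local comes from the description above; the $munin_server variable name, the node names, and the IP address are assumptions for illustration, so check the module's params before relying on them.

```puppet
# Monitored node: point it at the Munin server and include the module.
node 'web01.example.com' {
  $munin_server = '10.0.0.10'   # assumed variable name, placeholder IP
  include munin
}

# The Munin server itself: also collect and graph data locally.
node 'munin.example.com' {
  $munin_server       = '10.0.0.10'   # assumed variable name, placeholder IP
  $munin_server_local = true          # parameter named in the text above
  include munin
}
```

With an ENC such as Foreman, the same two values would presumably be set as host or host-group parameters rather than in node blocks.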
In the relevant role classes (or in any class included by your nodes) you can place custom defines like monitor::plugin, which may use any kind of Nagios plugin and, if needed, configure it both on the central Nagios server and locally for puppi checks (if you have $monitor_tool = [ 'puppi' , 'nagios' ]). If you use only Nagios and are not interested in the puppi checks, you can use the Nagios define directly: nagios::service. Note that you might also need to configure NRPE, so it might make sense to use monitor::plugin in any case, as it can configure the relevant NRPE entries.
Take a look at the options and the code of https://github.com/example42/puppet-monitor/blob/master/manifests/plugin.pp for some details.
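As an illustration only, a role class using monitor::plugin for the DHCP scope case might look roughly like this. The parameter names (plugin, arguments, tool), the plugin name, and the thresholds are assumptions and not taken from the module, so confirm them against plugin.pp before use.

```puppet
# Hypothetical role class for a DHCP server.
class role::dhcp {
  include dhcp   # hypothetical class that manages dhcpd itself

  # Parameter names below are assumptions; verify them in plugin.pp.
  monitor::plugin { 'dhcp_scope_usage':
    plugin    => 'check_dhcp_pools',   # hypothetical Nagios plugin name
    arguments => '-w 80 -c 90',        # hypothetical warning/critical thresholds
    tool      => $::monitor_tool,      # e.g. [ 'puppi', 'nagios' ]
  }
}
```

Keeping the check in the role class means every node that gets the role also gets the monitor, without hand-placed files.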
Hope this info helps.
About contacts management, is the problem somewhere in the Nagios module (missing or buggy features)?
Note that in any case you can add custom Nagios configurations by placing the files you want, without using a dedicated define.
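A minimal sketch of placing such a file with a plain file resource. The target path, the site module name, and the Service['nagios'] reference are assumptions about the local layout, not something the Nagios module is known to guarantee.

```puppet
# Drop a hand-written Nagios object file into a directory that nagios.cfg
# is assumed to load (for example via a cfg_dir entry).
file { '/etc/nagios/conf.d/apache_custom.cfg':   # assumed path
  ensure => file,
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  # Hypothetical file shipped from a site-specific module.
  source => 'puppet:///modules/site/nagios/apache_custom.cfg',
  # Assumes a Service['nagios'] resource is declared elsewhere in the catalog.
  notify => Service['nagios'],
}
```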
Good luck
al
Al,
Thanks for your speedy response! I have more comments inline below...
On Tuesday, September 3, 2013 5:19:39 PM UTC-4, Alessandro Franceschi wrote:
> Hi Sean
> For trending I've used the Munin module, which works well.
> It doesn't need to be set as a monitor tool; you just have to include it in your monitored nodes and set the IP of the Munin server.
> On the Munin server, set the parameter $munin_server_local to true. The module automatically runs a daily cron job (you can disable it) to autoconfigure the plugins. It should work out by itself what there is to monitor; in some cases you might need to configure single plugins with credentials and other params (munin::params).
> Munin's main drawback is that it doesn't scale well, though the newest versions should improve on this (for example with graphing on demand).
I guess I was spoiled at my last place of employment by BigBrother and later Xymon in terms of graphs. [1] The ~2 hour, ~2 day, ~2 week, ~2 month, ~2 year roll-up graphs were great for studying and understanding the behaviour of servers. Like Nagios, it's config-file based, but unlike Nagios the trending is built in. That said, I am not a fan of Xymon other than the trending feature. You mention that Munin runs daily; I'm wondering if it can build graphs regularly at a more granular interval than days and weeks. I'm not sure if the feature is gone in Xymon or if we had a custom hack, but I don't see the ~2 hour graph on their page.
Both pnp4nagios and nagiosgraph (which I just discovered today) integrate with the host and service templates to add the action_url entry for graphs on mouseover. They also add to /etc/nagios/objects/command.cfg and /etc/nagios/nagios.cfg to enable data collection. It would seem that some of this could be done with custom baseservices, host, and nagios cfg templates, but I would have to add a service_template parameter, overridable from the top scope as you do with host_template, for any other service not in the base.
> In the relevant role classes (or in any class included by your nodes) you can place custom defines like monitor::plugin, which may use any kind of Nagios plugin and, if needed, configure it both on the central Nagios server and locally for puppi checks (if you have $monitor_tool = [ 'puppi' , 'nagios' ]). If you use only Nagios and are not interested in the puppi checks, you can use the Nagios define directly: nagios::service. Note that you might also need to configure NRPE, so it might make sense to use monitor::plugin in any case, as it can configure the relevant NRPE entries.
> Take a look at the options and the code of https://github.com/example42/puppet-monitor/blob/master/manifests/plugin.pp for some details.
I was thinking that might be the way to go... I have a wrapper module to manage monitoring that simply includes nrpe and sets allowed_hosts. Perhaps I should subclass that for various custom monitors. I am curious, though: since nagios::plugin includes nrpe, would that cause a failure on duplicate class definition when the agent runs?
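On the duplicate-declaration question, general Puppet behaviour (independent of what any particular module does internally) is that include can be repeated safely, while a resource-like class declaration may appear only once per catalog. A small sketch with hypothetical class names:

```puppet
# Safe: include is idempotent, so several classes can each include nrpe.
class monitoring::apache {
  include nrpe
}

class monitoring::tomcat {
  include nrpe
}

# Fails at compile time if both classes below are applied to the same node,
# because a resource-like declaration of a class may appear only once.
class monitoring::bad_a {
  class { 'nrpe': }
}

class monitoring::bad_b {
  class { 'nrpe': }
}
```

So whether a wrapper module conflicts depends on which declaration style each side uses for nrpe; two plain include statements do not collide.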
> Hope this info helps.
> About contacts management, is the problem somewhere in the Nagios module (missing or buggy features)?
> Note that in any case you can add custom Nagios configurations by placing the files you want, without using a dedicated define.
No, not a problem with the module. At the moment the only servers in Foreman/Puppet/Nagios are mine, so I haven't needed to deal with multiple contacts or how to apply them to hosts/services. It's one of many to-dos to resolve as I roll all of this stuff out.
Thanks again for your thoughts.
Ok, for munin, the demo looked like it didn't provide a very short term granularity. Anyway, munin might be a nice option, but what I don't want at the moment is "one more place to go" for data to manage systems. Other tools in the mix are Splunk, Foreman, and now Nagios. The main Nagios server needs to be the one stop shop for system/service availability and notifications (as soon as I wrap up the contacts config :) )
In the meantime, what I've done is use your standard42 template to create a pnp4nagios module, and modify your nagios module to integrate with it. It's a total KLUDGE at the moment, and doesn't accomplish the end result as elegantly as I'd like, but it works. Feel free to take a look at the two git repos...
I was hoping to be able to make a conditional using $enablepnp to reset various template params from the default to the pnp4nagios-provided version, but I just learned that params are not allowed to be changed within the same scope once they're defined. As a result, I exposed all the templates to be overridden from the top scope. In my case, that means Foreman params, which I can live with.
I'm not sure how to accomplish what I was looking to do. I want the module to say, "If enablepnp = true, then use this other set of params and resources," but I'm not sure I can do this via myclass or a set subclass.
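One common way around the no-reassignment rule is to assign each variable exactly once and let a selector choose its value, rather than setting a default and overriding it later. The following is only a sketch: the class name, template paths, and file path are hypothetical, and only $enablepnp comes from the description above.

```puppet
# Hypothetical wrapper class; names and paths are for illustration only.
class nagios::pnp_example (
  $enablepnp = false
) {
  # Each variable is assigned once; the selector picks the value.
  # The string 'true' is matched as well, since some ENCs pass parameters
  # as strings rather than booleans.
  $nagios_cfg_template = $enablepnp ? {
    true    => 'pnp4nagios/nagios.cfg.erb',   # hypothetical template
    'true'  => 'pnp4nagios/nagios.cfg.erb',
    default => 'nagios/nagios.cfg.erb',       # hypothetical template
  }

  file { '/etc/nagios/nagios.cfg':
    ensure  => file,
    content => template($nagios_cfg_template),
  }

  # Extra resources that only apply when pnp4nagios is enabled.
  if $enablepnp == true or $enablepnp == 'true' {
    include pnp4nagios
  }
}
```

The same pattern extends to the host and service templates: one selector per template variable, all keyed on $enablepnp, so the top scope only needs to set a single switch.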