First attempt to combine check_mk and salt


Dan Garthwaite

May 29, 2013, 7:51:52 PM
to salt-...@googlegroups.com
This is heavily borrowed from Tor Hveem.  He's awesome.

Lots of us use nagios.  check_mk was written to make nagios configuration and forking sane.  The admin just focuses on the data structures (python dicts), and check_mk creates the nagios configuration files, generates individual python scripts (one execution per host), and is installed as an agent on each host that is called only once per host for all services.  Oh, and it has auto-inventory of services on hosts.  It does so many things, and does them better, that the sheer scope turns people off.

But the one thing it doesn't do is transport the text stream from the agent to the nagios server.  I wanted to both invoke the remote agent via salt, and transport the stream back to the nagios server.  I'm using salt's publish system.

Here is the gist of it:

Gist put all the files out of order, so let me paste my notes here:

On salt master


Create check_mk module

GOAL: Return the output of /usr/bin/check_mk_agent if present.

Create /srv/salt/_modules/check_mk.py

#!/usr/bin/env python
'''Support for running check_mk_agent over salt'''
import os

# Location of the agent binary on the minion
AGENT = '/usr/bin/check_mk_agent'

def __virtual__():
    ''' Only load the module if check_mk_agent is installed '''
    if os.path.exists(AGENT):
        return 'check_mk'
    return False

def agent():
    ''' Return the output of check_mk_agent '''
    # __salt__ is injected by the salt loader; cmd.run returns the command output as a string
    return __salt__['cmd.run'](AGENT)
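
The module above returns whatever cmd.run prints, so a failing agent looks the same as a working one. Here is a minimal sketch of a more defensive variant, assuming cmd.run_all is used instead (it returns a dict with 'stdout', 'stderr' and 'retcode'). The names agent_checked and fake_run_all are made up for illustration, and the function takes the salt call as a parameter only so it can be tried outside of salt:

```python
AGENT = '/usr/bin/check_mk_agent'

def agent_checked(run_all, path=AGENT):
    """Return the agent's stdout, or raise if the agent exited non-zero.

    `run_all` stands in for __salt__['cmd.run_all'] (hypothetical wiring).
    """
    result = run_all(path)
    if result['retcode'] != 0:
        raise RuntimeError('%s exited %d: %s'
                           % (path, result['retcode'], result['stderr']))
    return result['stdout']

def fake_run_all(cmd):
    """Stub that mimics the shape of a cmd.run_all result (made-up data)."""
    return {'stdout': '<<<check_mk>>>\nVersion: demo', 'stderr': '', 'retcode': 0}

output = agent_checked(fake_run_all)
```

Inside a real module the call would be agent_checked(__salt__['cmd.run_all']), or the check could simply be inlined into agent().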

Push new module out to minions

salt \* saltutil.sync_modules

On icinga server


Give user nagios sudo salt access

GOAL: Allow the 'nagios' user to run 'sudo salt-call' commands.

Cmnd_Alias SALT_PUBLISH = /usr/bin/salt-call
nagios ALL=NOPASSWD: SALT_PUBLISH

Change /etc/check_mk/main.mk

GOAL: Configure check_mk to use check_mk.agent to retrieve stats for all hosts tagged with 'salt'

all_hosts += [
  "web001|vbox|salt",
]

datasource_programs += [
  ( "sudo /usr/bin/salt-call publish.publish <HOST> check_mk.agent -l quiet | sed 1d ", ['salt'], ALL_HOSTS ),
]
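
For anyone who would rather not rely on the shell pipeline, the datasource command can be sketched as a small Python wrapper. This is only an illustration: it runs the same salt-call command, drops the first line of output the way 'sed 1d' does, and adds a hard timeout. The function names and the 30 second timeout are my own assumptions, not something check_mk or salt requires.

```python
import subprocess

def build_command(host):
    """Same salt-call invocation as the datasource_programs entry."""
    return ['sudo', '/usr/bin/salt-call', 'publish.publish',
            host, 'check_mk.agent', '-l', 'quiet']

def drop_first_line(text):
    """Equivalent of `sed 1d`: discard everything up to the first newline."""
    _, _, rest = text.partition('\n')
    return rest

def fetch_agent_output(host, timeout=30):
    """Run the command and return the agent output.

    Raises subprocess.TimeoutExpired if salt does not answer in time.
    The 30 second default is an arbitrary, hypothetical value.
    """
    proc = subprocess.run(build_command(host), stdout=subprocess.PIPE,
                          universal_newlines=True, timeout=timeout)
    return drop_first_line(proc.stdout)
```

A script that prints fetch_agent_output(host) for the host given on its command line could then be referenced from datasource_programs in place of the pipeline.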

Test

$ sudo -u nagios check_mk -v -II web001
Removing 1 checks from /var/lib/check_mk/autochecks/tcp_conn_stats-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/ntp.time-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/uptime-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/kernel.util-2013-05-21_18.52.27.mk.
Removing 2 checks from /var/lib/check_mk/autochecks/df-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/mem.used-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/mem.vmalloc-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/cpu.loads-2013-05-21_18.52.27.mk.
Removing 3 checks from /var/lib/check_mk/autochecks/kernel-2013-05-21_18.52.27.mk.
Removing 2 checks from /var/lib/check_mk/autochecks/ps.perf-2013-05-21_18.52.27.mk.
Removing 1 checks from /var/lib/check_mk/autochecks/cpu.threads-2013-05-21_18.52.27.mk.
Removing 2 checks from /var/lib/check_mk/autochecks/mounts-2013-05-21_18.52.27.mk.
Skipping web001, not an snmp host
Calling external programm sudo /usr/bin/salt-call publish.publish web001 check_mk.agent -l quiet | sed 1d
cpu.loads         1 new checks
cpu.threads       1 new checks
df                2 new checks
web001: Invalid output from agent or invalid configuration: list index out of range
kernel            3 new checks
kernel.util       1 new checks
web001: Invalid output from agent or invalid configuration: 'index'
mem.used          1 new checks
mem.vmalloc       1 new checks
mounts            2 new checks
ntp.time          1 new checks
ps.perf           2 new checks
tcp_conn_stats    1 new checks
uptime            1 new checks
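
The two "Invalid output from agent" lines suggest some sections arrive in a shape check_mk does not expect. One way to look into that is to split the agent output into its <<<section>>> blocks and compare what comes back over salt with what comes back over ssh. The following is a rough sketch, not part of check_mk; split_sections is a made-up helper, the sample text is invented, and the header pattern assumes headers of the form <<<name>>> with an optional ':option' suffix.

```python
import re

# Section headers look like <<<name>>> or <<<name:option(...)>>> (assumed form)
HEADER = re.compile(r'^<<<([^:>]+)(?::[^>]*)?>>>$')

def split_sections(agent_output):
    """Group agent output lines under the section header that precedes them."""
    sections = {}
    current = None
    for line in agent_output.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections

# Invented sample data, only to show the shape of the result
sample = ("<<<check_mk>>>\n"
          "Version: demo\n"
          "<<<df:sep(9)>>>\n"
          "/dev/sda1 ext4 100 50 50 50% /\n")
sections = split_sections(sample)
```

Running this on both transports and diffing the section names and line counts should show which sections get mangled on the way through salt.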

Dan Garthwaite

May 29, 2013, 7:53:06 PM
to salt-...@googlegroups.com
Ultimately I had to revert to 'ssh': within 30 minutes I started receiving a stream of alerts because the checks were timing out.

Corey Quinn

May 29, 2013, 8:00:51 PM
to salt-...@googlegroups.com
Are you on IRC? I'll be attempting to replicate this tomorrow. Your post is most fortunately timed.

-- Corey


Dan Garthwaite

May 29, 2013, 8:51:10 PM
to salt-...@googlegroups.com
If I remember, I'll be on, as dang or FSP_Dan.

You should review what inspired me: http://hveem.no/salt-icinga-nrpe-replacement

Mrten

May 30, 2013, 2:56:12 AM
to salt-...@googlegroups.com

Keep us in the loop, most certainly interested!

M.