Percona RDS status/load/storage commands return null after nagrestconf restarts nagios

Cullen Philippson

May 16, 2017, 5:30:26 PM
to nagrestconf-users
Hi Mark,

This is a strange one. I'm using v1.174.6, and my AWS RDS commands return null after I do an Apply Config (which restarts Nagios) via nagrestconf. When I stop and start the service from the command line, things return to a working state and the AWS RDS commands return valid results. As you can see below, the first restart (PID=2434) is from nagrestconf and the second (PID=3837) is my manual restart. Let me know what else I can provide for more detail.


[05-16-2017 21:11:49] SERVICE ALERT: RDS-instance2;pmp_check_aws_rds_status;OK;SOFT;2;OK mysql 5.5.53. Status: available
[05-16-2017 21:10:19] Nagios 3.5.1 starting... (PID=3837)
[05-16-2017 21:10:16] Caught SIGTERM, shutting down...
[05-16-2017 21:09:52] SERVICE ALERT: RDS-instance2;pmp_check_aws_rds_load;WARNING;SOFT;1;(null)
[05-16-2017 21:09:52] SERVICE ALERT: RDS-instance2;pmp_check_aws_rds_status;UNKNOWN;SOFT;1;UNK Unable to get RDS instance
[05-16-2017 21:09:02] Nagios 3.5.1 starting... (PID=2434)
[05-16-2017 21:09:01] Caught SIGTERM, shutting down...

Here is the link to the python script I am using:

Code:
https://github.com/percona/percona-monitoring-plugins/tree/master/nagios/bin

Docs:
https://www.percona.com/doc/percona-monitoring-plugins/LATEST/nagios/pmp-check-aws-rds.py.html

Thank you!



Mark Clarkson

May 17, 2017, 3:54:12 AM
to Cullen Philippson, nagrestconf-users
On Tue, 2017-05-16 at 14:30 -0700, Cullen Philippson wrote:
Docs:
https://www.percona.com/doc/percona-monitoring-plugins/LATEST/nagios/pmp-check-aws-rds.py.html

Hi Cullen,
I think it's to do with this bit: "This plugin that is supposed to be run by Nagios, i.e. under nagios user,
should have permissions to read the config /etc/boto.cfg or ~nagios/.boto."

Relax the permissions on boto.cfg, wherever you put it, as in 'chmod 666 /path/to/boto.cfg', and see if it works.

If it doesn't work after relaxing permissions, then look at the cron job that restarts nagios ('su root' then 'crontab -l' to see it). Some environment variables may be missing when cron restarts it.

In '/usr/bin/restart_nagios' you have some lines:

# Restart nagios straight away.
$NAGIOSBIN -v $NAG_DIR/nagios.cfg &> /dev/null && /etc/init.d/$NAG_INITD restart

Insert a line in there so it's:

# Restart nagios straight away.
env >/tmp/env.delme
$NAGIOSBIN -v $NAG_DIR/nagios.cfg &> /dev/null && /etc/init.d/$NAG_INITD restart

Compare the env that you use to restart nagios to the env that cron restarts nagios with.
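
One way to do that comparison is sketched below. It is only a rough illustration: env_diff is a hypothetical helper name, and /tmp/env.manual is an assumed file that you would write yourself with 'env > /tmp/env.manual' from the shell where a manual restart works.

```shell
# Hypothetical helper: show variables that differ between two `env` dumps.
# Assumes each file holds one NAME=value per line, as written by `env`.
env_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort "$1" > "$a"
    sort "$2" > "$b"
    # comm -3 hides lines common to both files.
    # Unindented lines appear only in the first file,
    # tab-indented lines appear only in the second.
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (file names are assumptions):
#   env > /tmp/env.manual                      # in your interactive root shell
#   env_diff /tmp/env.manual /tmp/env.delme    # env.delme is written by restart_nagios
```

Variables such as HOME are worth checking first, since the plugin documentation mentions ~nagios/.boto.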

Pretty sure the problem is in that sort of area.

Cheers!
Mark.

Cullen Philippson

May 18, 2017, 12:40:24 PM
to nagrestconf-users, cullen.p...@gmail.com
Mark,

That led me to the solution. When Nagios is restarted by nagrestconf, the nagios home directory (where .boto resides) was not included in the environment. Neither directory is in the PATH environment variable, but this worked. My initial permission setting of 600 didn't seem to matter in this case.

I created a symlink at /etc/boto.cfg pointing to the .boto file:

# ln -s /var/spool/nagios/.boto /etc/boto.cfg
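
A quick way to confirm that the config is readable at either location is sketched here. check_readable is a hypothetical helper, and the paths in the usage comment are the ones from this thread; run it as the nagios user for a meaningful result.

```shell
# Hypothetical helper: report whether each given path is readable
# by the user running the check.
check_readable() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "readable: $f"
        else
            echo "NOT readable: $f"
        fi
    done
}

# Usage (paths taken from this thread):
#   check_readable /etc/boto.cfg /var/spool/nagios/.boto
```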

Thank you!

Cullen

Mark Clarkson

May 19, 2017, 3:54:02 AM
to nagrestconf-users, cullen.p...@gmail.com
On Thursday, 18 May 2017 17:40:24 UTC+1, Cullen Philippson wrote:

I created a sym link to /etc/boto.cfg

# ln -s /var/spool/nagios/.boto /etc/boto.cfg

Nice simple fix! Many thanks for reporting back.