The simplest way to do this, in my mind, is probably to use
check_cmd_output(), something like this:
* || check_cmd_output -t 10 -m '!/offset -?[0-9]{3,}\./' ntpdate -q pool.ntp.org
This will capture the output of the command above (ntpdate -q
pool.ntp.org) and fail the check if either ntpdate returns non-zero or
the output contains an offset with 3 or more digits before the decimal
point (i.e., 100 seconds or more in either direction). You could use
the same technique with your "chronyc tracking" command:
* || check_cmd_output -t 2 -m '!/: [0-9]{3,}\.[0-9]+ seconds [a-z]+ of NTP time/' chronyc tracking
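If you want to sanity-check the ntpdate regex outside of NHC first, here is a minimal sketch using grep -E. The helper name offset_too_large and the sample lines are made up for illustration (they are not captured from a real server), and grep is only standing in for NHC's own matching here:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the text contains an offset
# with 3 or more digits before the decimal point.
offset_too_large() {
    printf '%s\n' "$1" | grep -Eq 'offset -?[0-9]{3,}\.'
}

# Illustrative sample lines in the general shape of "ntpdate -q" output.
offset_too_large 'server 192.0.2.1, stratum 2, offset 0.001234, delay 0.02600' \
    && echo BAD || echo OK     # prints OK
offset_too_large 'server 192.0.2.1, stratum 2, offset -123.456789, delay 0.02600' \
    && echo BAD || echo OK     # prints BAD
```

In the NHC config lines, the leading '!' negates the match, so the check fails exactly when the pattern is found.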
If you wanted to use your specific command, you should still be able
to use check_cmd_output(), but you may need to play around a bit with
quoting and such, and possibly invoke it via /bin/bash -c '...' or
split it up into 2 separate lines (one that sets a variable and one
that checks its value). I haven't tested it, but something like this
might work:
* || export NHC_NTP_TIME_DELTA=$(chronyc tracking|grep "System time"|cut -d ":" -f2|cut -d " " -f2)
* || check_cmd_output -m 1 expr $NHC_NTP_TIME_DELTA '<' 100
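One caveat with the expr approach: expr only compares numerically when both operands are integers, and otherwise falls back to a string comparison, so a fractional value such as 5.2 may not compare against 100 the way you expect. A rough sketch of a numeric comparison using awk instead is below; the helper name delta_within is hypothetical, and I haven't tried wiring it into an NHC config line:

```shell
#!/bin/sh
# Hypothetical helper: prints 1 if the absolute value of the delta is
# below the limit, 0 otherwise. The "+ 0" forces numeric comparison.
delta_within() {
    awk -v d="$1" -v max="$2" 'BEGIN {
        d = d + 0; max = max + 0
        if (d < 0) d = -d
        print ((d < max) ? 1 : 0)
    }'
}

delta_within 0.000001234 100   # prints 1
delta_within 5.2 100           # prints 1
delta_within 250.7 100         # prints 0
```

The printed 1 or 0 mirrors what expr would print, so the same "-m 1" style of match could in principle still be used.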
So yeah, there are lots of ways to accomplish what you want, I think! :-)
Also, if you run ntpd locally on the nodes, you'll likely want to have
NHC check that too, via check_ps_service().
Hopefully something in the above will be helpful to you! :-)
Michael
--
Michael Jennings (KainX)   https://medium.com/@mej0/   <m...@eterm.org>
Linux/HPC Systems Engineer, LANL.gov       Author, Eterm (www.eterm.org)
-----------------------------------------------------------------------
"The trouble with doing something right the first time is that nobody
appreciates how difficult it was." -- Walt West