Use of node_time to track NTP sync?


eva...@plu.edu

Jul 17, 2017, 1:19:45 PM
to Prometheus Users
We're trying to monitor the time difference between our many VM servers and our main NTP server so we can alert on major differences. The query in the following rule seems to do this, but once the alert starts firing it never clears. I'm sure something simple is wrong, but I can't find an acceptable alternative. I've tried using scalar() and offset in various forms without success. Any suggestions on how to modify the alert so that it clears, or on a different, better approach?

Dan Evans

ALERT TimeDrift
  IF abs(scalar(node_time{instance="ntp.mydomain.com:9100"}) - node_time{job="prometheus"}) > 10
  FOR 5m

Brian Brazil

Jul 17, 2017, 1:27:07 PM
to eva...@plu.edu, Prometheus Users

Each node_time sample is the target's clock at the moment it was scraped, and scrapes may only happen every 15 or 60 seconds depending on your setup, so values compared across two targets will appear to drift.

In Prometheus 2.0 what you can do is:

abs(node_time - timestamp(node_time))

which will give you a reasonably accurate answer.
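
For anyone already on Prometheus 2.0 or later, the expression above could be wrapped in an alerting rule along these lines. This is only a sketch: the group name and the annotation text are assumptions, the 10-second threshold and 5m hold are carried over from the original rule, and node_exporter 0.16 and later expose the metric as node_time_seconds rather than node_time.

```yaml
groups:
  - name: time-drift   # hypothetical group name
    rules:
      - alert: TimeDrift
        # Compare each host's reported clock with the timestamp Prometheus
        # stored for that same sample, so scrape timing cancels out.
        expr: abs(node_time - timestamp(node_time)) > 10
        for: 5m
        annotations:
          # Illustrative wording only.
          summary: "Clock on {{ $labels.instance }} is more than 10s off the Prometheus server's clock"
```

Note that this measures each host against the Prometheus server's clock rather than against the NTP server, so the Prometheus host itself needs to be well synchronized.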

Brian
 


Ben Kochie

Jul 18, 2017, 1:55:06 AM
to eva...@plu.edu, Prometheus Users
If you have NTPd running on your systems, I suggest you use the helper script to gather NTP metrics.  This will give you the full state of NTP metrics.


I would like to make a real exporter for this, but there is no client library for Go that implements ntpq.


eva...@plu.edu

Jul 19, 2017, 4:24:02 PM
to Prometheus Users
Thank you to both Brian and Ben for the replies. Unfortunately, Brian, we've not yet upgraded to version 2, but when we do we'll test your suggestion. Ben, we'll be looking at your Python script to see if it will work for us.

Dan

najee...@gmail.com

Aug 3, 2017, 3:51:24 PM
to Prometheus Users
What are you using for instance="ntp.mydomain.com:9100"? Is that an NTP server running in a container?