NTP Metrics.


Yagyansh S. Kumar

May 19, 2020, 3:54:45 AM
to Prometheus Users
Hi. I have my own NTP server configured at x.x.x.x. Now I want to check whether my 10 other servers are synchronized with it. I have gone through a lot of threads and found different opinions and different answers. Also, I guess node_ntp_drift_seconds is an old metric that no longer exists in the updated node_exporter version. What query, or combination of queries, should I use to check the sync? If a server is not in sync, I also want to know the deviation. I am totally confused about what to use for this and what is reliable.

Can someone help?
Thanks in advance!

Brian Candler

May 19, 2020, 5:02:06 AM
to Prometheus Users
The ntp collector is disabled by default: you can turn it on with a command-line flag. However, the timex collector is enabled by default (e.g. node_timex_sync_status, node_timex_estimated_error_seconds)

For a rough idea of how the target clock compares to the prometheus server's clock, you can also just do:

node_time_seconds - timestamp(node_time_seconds)
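To make the arithmetic behind that expression concrete, here is a minimal Python sketch (not from the thread). It assumes each sample is a pair of the value the target reported for node_time_seconds and the timestamp Prometheus attached to the scrape. The instance names, the sample values, and the 1-second threshold are all made up for illustration, and scrape latency is included in the result, so this is only a rough estimate.

```python
def clock_offsets(samples, threshold_seconds=1.0):
    """Return {instance: (offset_seconds, exceeds_threshold)}.

    samples maps an instance name to a pair:
      (value of node_time_seconds, scrape timestamp of that sample).
    The offset mirrors `node_time_seconds - timestamp(node_time_seconds)`.
    threshold_seconds is an arbitrary illustrative limit, not a recommendation.
    """
    report = {}
    for instance, (node_time, scrape_ts) in samples.items():
        offset = node_time - scrape_ts
        report[instance] = (offset, abs(offset) > threshold_seconds)
    return report


# Hypothetical samples: host-a is 0.5 s ahead, host-b is 3 s behind.
samples = {
    "host-a:9100": (1589875200.5, 1589875200.0),
    "host-b:9100": (1589875197.0, 1589875200.0),
}
report = clock_offsets(samples)
for instance, (offset, bad) in sorted(report.items()):
    print(f"{instance}: offset={offset:+.3f}s exceeds_threshold={bad}")
```

Note that the result is relative to the Prometheus server's own clock, so it says nothing about whether that clock is itself correct.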

Yagyansh S. Kumar

May 20, 2020, 12:48:28 AM
to Prometheus Users
Thanks for the response Brian.

I have already enabled the NTP collector on all my servers, but I still cannot see the node_ntp_drift_seconds metric in the output. Apart from that, I have a couple of questions.
Firstly, why are we comparing the target clock with the Prometheus server's clock? What if the Prometheus server itself becomes unsynchronized? The whole idea of alerting goes out the window in that case. Also, what does node_ntp_sanity check? How much clock variation does it take for sanity to become 0? (I know other factors can also make sanity 0, but what is the criterion for calling the clock unsynchronized?) Same question for node_ntp_leap: if leap becomes 3, that means the clock is unsynchronized. Again, what difference in clock time does it take for the clock to be called unsynchronized?

Secondly, in your opinion, which is better for keeping track of clock sync: timex or NTP?
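As background for the node_ntp_leap part of the question, the sketch below (not from the thread) decodes the 2-bit NTP leap indicator as defined in the NTP specification, assuming node_ntp_leap carries that field unchanged. The function name and return shape are hypothetical. The indicator is a status flag carried in the NTP packet, not a measured offset, so the code deliberately contains no time threshold.

```python
# Meanings of the 2-bit leap indicator field in an NTP packet.
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock unsynchronized)",
}


def describe_leap(value):
    """Return (code, meaning, synchronized) for a leap indicator value.

    value may be a float, since Prometheus sample values are floats.
    """
    code = int(value)
    if code != value or code not in LEAP_MEANINGS:
        raise ValueError(f"not a valid leap indicator: {value!r}")
    return code, LEAP_MEANINGS[code], code != 3


print(describe_leap(0.0))
print(describe_leap(3.0))
```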

Brian Candler

May 20, 2020, 3:05:20 AM
to Prometheus Users
On Wednesday, 20 May 2020 05:48:28 UTC+1, Yagyansh S. Kumar wrote:
Thanks for the response Brian.

I have already enabled the NTP collector on all my servers, but I still cannot see the node_ntp_drift_seconds metric in the output.

Looking through git history (git log -p), it was renamed to "node_ntp_offset_seconds":

# HELP node_ntp_offset_seconds ClockOffset between NTP and local clock.
# TYPE node_ntp_offset_seconds gauge
node_ntp_offset_seconds -0.015156364

The change was made in c169b4b1c (Sep 19 2017) when the ntp collector was updated with more metrics.
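For anyone who wants to check the renamed metric outside Prometheus, here is a minimal Python sketch (not from the thread) that pulls node_ntp_offset_seconds out of exposition text like the lines quoted above. The helper name parse_gauge and the 50 ms limit are hypothetical, and the parser only handles unlabelled samples of the form `name value`, which is an assumption that matches the quoted output but not the full exposition format.

```python
# The three lines quoted above, used as sample input.
SAMPLE = """\
# HELP node_ntp_offset_seconds ClockOffset between NTP and local clock.
# TYPE node_ntp_offset_seconds gauge
node_ntp_offset_seconds -0.015156364
"""


def parse_gauge(exposition_text, metric_name):
    """Return the value of an unlabelled metric, or None if it is absent."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == metric_name:
            return float(parts[1])
    return None


MAX_OFFSET_SECONDS = 0.05  # hypothetical limit, pick one that suits your setup

offset = parse_gauge(SAMPLE, "node_ntp_offset_seconds")
within_limit = offset is not None and abs(offset) <= MAX_OFFSET_SECONDS
print(offset, within_limit)

# The old name is simply absent from current output.
print(parse_gauge(SAMPLE, "node_ntp_drift_seconds"))
```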

The old ntp_drift was calculated here:

-       driftSeconds := resp.ClockOffset.Seconds()
-       log.Debugf("Set ntp_drift_seconds: %f", driftSeconds)
-       ch <- c.drift.mustNewConstMetric(driftSeconds)

and the corresponding code in current node_exporter is:

                offset: typedDesc{prometheus.NewDesc(
                        prometheus.BuildFQName(namespace, ntpSubsystem, "offset_seconds"),
                        "ClockOffset between NTP and local clock.",
                        nil, nil,
                ), prometheus.GaugeValue},
...
        ch <- c.offset.mustNewConstMetric(resp.ClockOffset.Seconds())


There is documentation about the ntp and timex metrics here:

Maybe these answer your other questions - or you can look at the source to see where each metric is collected from.

Yagyansh S. Kumar

May 20, 2020, 8:05:30 AM
to Prometheus Users
Thanks a lot for pointing me in the correct direction.