NTP metrics not available while using node exporter


Pooja Chauhan

Oct 20, 2020, 8:39:42 AM
to Prometheus Users
Hi,
I want to monitor whether the time goes out of sync with the NTP server. We have had issues related to this, so I need to address it as soon as possible.
Node exporter command:

/node_exporter --collector.ntp --collector.ntp.server="10.0.0.0" --collector.ntp.server-is-local

I am unable to see any metrics related to NTP.
When I curl port 9100 and grep for ntp, I only see:

node_scrape_collector_duration_seconds{collector="ntp"} 1.00033366
node_scrape_collector_success{collector="ntp"} 0

Node exporter logs this error:
level=error msg="ERROR: ntp collector failed after 1.000249s: couldn't get SNTP reply: read udp 172.00.00.00:23857->10.0.0.0:123: i/o timeout" source="collector.go:132"

Please help.
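
The "i/o timeout" above means no SNTP reply came back from 10.0.0.0 on UDP port 123 within the collector's time limit. One way to check this independently of node_exporter is a minimal Python sketch like the one below. The function names (build_sntp_request, parse_sntp_reply, probe) are hypothetical helpers made up for this illustration, and the 1-second default timeout is only an assumption chosen to roughly match the duration in the log line.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_sntp_request() -> bytes:
    """Return a 48-byte SNTP client request (LI=0, VN=4, Mode=3)."""
    first_byte = (0 << 6) | (4 << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_sntp_reply(data: bytes) -> dict:
    """Extract leap indicator, stratum and transmit time from a reply."""
    if len(data) < 48:
        raise ValueError("short SNTP reply")
    leap = data[0] >> 6
    stratum = data[1]
    # The transmit timestamp is the last 8 bytes of the 48-byte header:
    # 32-bit seconds since 1900 plus a 32-bit binary fraction.
    tx_seconds, tx_fraction = struct.unpack("!II", data[40:48])
    tx_unix = tx_seconds - NTP_EPOCH_OFFSET + tx_fraction / 2**32
    return {"leap": leap, "stratum": stratum, "transmit_unix": tx_unix}


def probe(server: str, timeout: float = 1.0) -> dict:
    """Send one request to server:123/udp; raises socket.timeout if no reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        data, _ = sock.recvfrom(512)
    return parse_sntp_reply(data)
```

Calling probe("10.0.0.0") from the same host would be expected to raise a timeout if the server is unreachable or UDP port 123 is filtered, which would point at the network or the NTP server rather than at node_exporter.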



Brian Candler

Oct 20, 2020, 8:52:08 AM
to Prometheus Users
If you're checking the ntp service on the host where node_exporter is running, then remove '--collector.ntp.server="10.0.0.0" --collector.ntp.server-is-local'
All you need is --collector.ntp

# curl -sS localhost:9100/metrics | grep 'ntp_'
# HELP node_ntp_leap NTPD leap second indicator, 2 bits.
# TYPE node_ntp_leap gauge
node_ntp_leap 0
# HELP node_ntp_offset_seconds ClockOffset between NTP and local clock.
# TYPE node_ntp_offset_seconds gauge
node_ntp_offset_seconds -0.000133765
# HELP node_ntp_reference_timestamp_seconds NTPD ReferenceTime, UNIX timestamp.
# TYPE node_ntp_reference_timestamp_seconds gauge
node_ntp_reference_timestamp_seconds 1.6031982374973688e+09
# HELP node_ntp_root_delay_seconds NTPD RootDelay.
# TYPE node_ntp_root_delay_seconds gauge
node_ntp_root_delay_seconds 0.005783081
# HELP node_ntp_root_dispersion_seconds NTPD RootDispersion.
# TYPE node_ntp_root_dispersion_seconds gauge
node_ntp_root_dispersion_seconds 0.002456665
# HELP node_ntp_rtt_seconds RTT to NTPD.
# TYPE node_ntp_rtt_seconds gauge
node_ntp_rtt_seconds 0.000417611
# HELP node_ntp_sanity NTPD sanity according to RFC5905 heuristics and configured limits.
# TYPE node_ntp_sanity gauge
node_ntp_sanity 1
# HELP node_ntp_stratum NTPD stratum.
# TYPE node_ntp_stratum gauge
node_ntp_stratum 2 

Pooja Chauhan

Oct 20, 2020, 9:35:56 AM
to Prometheus Users
Yes, I had tried that previously, but we have our own NTP servers, and with them it did not show correct information. The output is:

# HELP node_ntp_leap NTPD leap second indicator, 2 bits.
# TYPE node_ntp_leap gauge
node_ntp_leap 0
# HELP node_ntp_offset_seconds ClockOffset between NTP and local clock.
# TYPE node_ntp_offset_seconds gauge
node_ntp_offset_seconds -5.784e-06
# HELP node_ntp_reference_timestamp_seconds NTPD ReferenceTime, UNIX timestamp.
# TYPE node_ntp_reference_timestamp_seconds gauge
node_ntp_reference_timestamp_seconds 1.6032005254001284e+09
# HELP node_ntp_root_delay_seconds NTPD RootDelay.
# TYPE node_ntp_root_delay_seconds gauge
node_ntp_root_delay_seconds 0.097167968
# HELP node_ntp_root_dispersion_seconds NTPD RootDispersion.
# TYPE node_ntp_root_dispersion_seconds gauge
node_ntp_root_dispersion_seconds 0.139892578
# HELP node_ntp_rtt_seconds RTT to NTPD.
# TYPE node_ntp_rtt_seconds gauge
node_ntp_rtt_seconds 5.0566e-05
# HELP node_ntp_sanity NTPD sanity according to RFC5905 heuristics and configured limits.
# TYPE node_ntp_sanity gauge
node_ntp_sanity 1
# HELP node_ntp_stratum NTPD stratum.
# TYPE node_ntp_stratum gauge
node_ntp_stratum 5
node_scrape_collector_duration_seconds{collector="ntp"} 0.000197162
node_scrape_collector_success{collector="ntp"} 1

The stratum value here is 5, but the output of ntpq -p shows stratum 4. I don't think the NTP metrics are correct with respect to our NTP servers.
I need guidance here.

Brian Candler

Oct 20, 2020, 3:02:51 PM
to Prometheus Users
This is no longer a prometheus question, but I'll answer it briefly: ntpq -p lists the servers that you are sync'd from (-p = peers).  Any server which syncs *from* a stratum 4 server will itself be stratum 5, by definition.  The stratum is the number of steps away you are from a true time source: see https://en.wikipedia.org/wiki/Network_Time_Protocol#Clock_strata
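
As a tiny illustration of that rule, here is a sketch in Python. The function name downstream_stratum is hypothetical; the cap at 16 reflects that NTP uses stratum 16 to mean "unsynchronised".

```python
def downstream_stratum(upstream_stratum: int) -> int:
    """Stratum of a host that synchronises from a peer at upstream_stratum."""
    if not 0 <= upstream_stratum <= 15:
        raise ValueError("upstream stratum must be between 0 and 15")
    # One step further from the reference clock; 16 means unsynchronised.
    return min(upstream_stratum + 1, 16)


# The peer listed by ntpq -p is stratum 4, so the local ntpd reports 5.
print(downstream_stratum(4))
```

So node_ntp_stratum 5 on the host and stratum 4 in the ntpq -p peer list are consistent with each other.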

Pooja Chauhan

Oct 21, 2020, 2:22:51 AM
to Prometheus Users
Thank you for explaining this to me :)
So which NTP metric should I alert on to detect out-of-sync time on my servers? And what would the query be?

Ben Kochie

Oct 21, 2020, 2:47:38 AM
to Pooja Chauhan, Prometheus Users
Look at the node timex collector. There was a recent discussion about it on the google group.


Brian Candler

Oct 21, 2020, 3:37:02 AM
to Prometheus Users
The metric "node_ntp_sanity 1" summarises several metrics into "ntp appears to be working properly". I suggest you alert on that as a starting point (i.e. node_ntp_sanity != 1).

Also see https://groups.google.com/d/topic/prometheus-users/DOfNK5ypxPQ/discussion
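
To try the node_ntp_sanity != 1 condition against raw output from curl localhost:9100/metrics before writing an alerting rule, a rough Python sketch could look like the following. The helper names parse_samples and ntp_problems are hypothetical, and the parser assumes each sample line is "name value" with no trailing timestamp, which is an assumption about the scraped text rather than a full implementation of the exposition format. Unlike the PromQL expression, which returns nothing when the metric is absent, this sketch also reports a missing node_ntp_sanity and a failed ntp collector scrape.

```python
def parse_samples(text: str) -> dict:
    """Map each sample name (including any label set) to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def ntp_problems(text: str) -> list:
    """Return a list of human-readable problems found in the scraped text."""
    samples = parse_samples(text)
    problems = []
    if samples.get('node_scrape_collector_success{collector="ntp"}') == 0:
        problems.append("ntp collector scrape failed")
    sanity = samples.get("node_ntp_sanity")
    if sanity is None:
        problems.append("node_ntp_sanity missing")
    elif sanity != 1:
        problems.append("node_ntp_sanity != 1")
    return problems
```

Fed the two lines from the first message (collector success 0 and no node_ntp_* metrics), it reports both the failed scrape and the missing sanity metric; fed the later working output, it returns an empty list.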

Pooja Chauhan

Oct 23, 2020, 2:38:05 AM
to Prometheus Users
Thank you for the replies.
In the case of systemd-timesyncd, which should be used to alert on time being out of sync: node_ntp_sanity, or some metrics from the timex collector?