Hi, I have several instances that had all been working perfectly for months. Over the last few days I have started to see problems related to date/time synchronization and NTPD.
I haven't changed anything in my configurations but here is what I see:
My /etc/ntp.conf is this:
server metadata.google.internal iburst
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
restrict 127.127.1.0
logfile /var/log/ntpd.log
The ntpd service is up and running:
*******:/ # ps daux | grep ntp
root 1352 0.0 0.4 22016 13928 - Ss 8:54AM 0:00.04 |-- /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
The log shows no errors whatsoever:
*******:/ # cat /var/log/ntpd.log
23 Nov 08:54:05 ntpd[1352]: Listen and drop on 0 v6wildcard [::]:123
23 Nov 08:54:05 ntpd[1352]: Listen and drop on 1 v4wildcard 0.0.0.0:123
23 Nov 08:54:05 ntpd[1352]: Listen normally on 2 vtnet0 [**MYLOCALIP**]:123
23 Nov 08:54:05 ntpd[1352]: Listen normally on 3 lo0 [::1]:123
23 Nov 08:54:05 ntpd[1352]: Listen normally on 4 lo0 [fe80::1%2]:123
23 Nov 08:54:05 ntpd[1352]: Listen normally on 5 lo0 127.0.0.1:123
23 Nov 08:54:05 ntpd[1352]: Listening on routing socket on fd #26 for interface updates
where **MYLOCALIP** is my local IP address.
If I stop and start the ntpd service, everything seems to work:
********:/ # ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
*metadata.google 52.117.126.26 2 u 1 64 1 0.328 -272.00 172.488
********:/ # ntpq -c sysinfo
associd=0 status=0618 leap_none, sync_ntp, 1 event, no_sys_peer,
system peer: metadata.google.internal:123
system peer mode: client
leap indicator: 00
stratum: 3
log2 precision: -24
root delay: 1.332
root dispersion: 7991.809
reference ID: 169.254.169.254
reference time: dbdfd9b9.45b11621 Wed, Nov 23 2016 9:02:49.272
system jitter: 0.000000
clock jitter: 18.991
clock wander: 0.000
broadcast delay: -50.000
symm. auth. delay: 0.000
root@test-orologio:/var/log #
but after some time (2-3 minutes) the synchronization stops and I get this:
associd=0 status=0613 leap_none, sync_ntp, 1 event, spike_detect,
(spike_detect)
then this:
********:/ # ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
metadata.google 52.117.126.26 2 u 13 64 3 0.524 -3669.7 3348.20
********:/ # ntpq -c sysinfo
associd=0 status=0018 leap_none, sync_unspec, 1 event, no_sys_peer,
system peer: 0.0.0.0:0
system peer mode: unspec
leap indicator: 00
stratum: 3
log2 precision: -24
root delay: 1.184
root dispersion: 789.028
reference ID: 169.254.169.254
reference time: dbdfd9c3.4d614c5a Wed, Nov 23 2016 9:02:59.302
system jitter: 0.000000
clock jitter: 18.991
clock wander: 0.000
broadcast delay: -50.000
symm. auth. delay: 0.000
No more "*" before the metadata.google server name and no_sys_peer...
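To make the three status words easier to compare, here is a minimal sketch that splits them into fields. It assumes the usual ntpq system status layout (bits 15-14 leap indicator, bits 13-8 sync source, bits 7-4 event count, bits 3-0 last event code). The name tables are deliberately partial and only cover the codes that show up in my output above; decode_status is just a helper name I made up for this.

```python
# Sketch: split an ntpq system status word into its fields.
# Assumed layout: bits 15-14 leap, 13-8 sync source, 7-4 event count, 3-0 last event.
# The tables only list the codes seen in the ntpq output quoted in this post.
SOURCES = {0: "sync_unspec", 6: "sync_ntp"}
EVENTS = {3: "spike_detect", 8: "no_sys_peer"}


def decode_status(word: int) -> dict:
    """Return the four fields packed into a 16-bit system status word."""
    return {
        "leap": (word >> 14) & 0x3,
        "source": SOURCES.get((word >> 8) & 0x3F, "unknown"),
        "event_count": (word >> 4) & 0xF,
        "last_event": EVENTS.get(word & 0xF, "unknown"),
    }


if __name__ == "__main__":
    # The three status words from the output above, in the order they appeared.
    for status in (0x0618, 0x0613, 0x0018):
        print(f"{status:04x}", decode_status(status))
```

Read this way, the only thing that changes between the first and the last status is the sync source going from sync_ntp back to sync_unspec, with spike_detect reported in between.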
The following command
ntpdate -d metadata.google.internal
gives the following result:
transmit(169.254.169.254)
receive(169.254.169.254)
transmit(169.254.169.254)
receive(169.254.169.254)
transmit(169.254.169.254)
receive(169.254.169.254)
transmit(169.254.169.254)
receive(169.254.169.254)
server 169.254.169.254, port 123
stratum 2, precision -20, leap 00, trust 000
refid [169.254.169.254], delay 0.02588, dispersion 0.11003
transmitted 4, in filter 4
reference time: dbdfda89.e993bffe Wed, Nov 23 2016 9:06:17.912
originate timestamp: dbdfda89.e993c001 Wed, Nov 23 2016 9:06:17.912
transmit timestamp: dbdfda96.131fd2bf Wed, Nov 23 2016 9:06:30.074
filter delay: 0.02626 0.02589 0.02588 0.02592
0.00000 0.00000 0.00000 0.00000
filter offset: -11.8329 -11.9424 -12.0527 -12.1624
0.000000 0.000000 0.000000 0.000000
delay 0.02588, dispersion 0.11003
offset -12.052761
23 Nov 09:06:30 ntpdate[1560]: step time server 169.254.169.254 offset -12.052761 sec
So it seems that ntpd is able to reach the server and get the sync info, but it does not apply it...
And my server is already off by 12 seconds after 5-6 minutes... What could be wrong?
I've tried switching to a different NTP server (the NIST one) with no luck. What should I look at? Is anyone else seeing date/time sync problems on Google Cloud instances?
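To put a number on the drift, here is a rough back-of-the-envelope sketch. It assumes the clock was roughly correct 5-6 minutes before the ntpdate run (my own estimate, not something measured) and compares the result with the 500 ppm maximum frequency correction that, as far as I know, ntpd can apply by slewing. drift_ppm is just a helper name for this example.

```python
# Rough drift-rate estimate from the figures quoted in this post.
# Assumption: the clock was about right 5-6 minutes before ntpdate ran.
NTPD_MAX_SLEW_PPM = 500  # ntpd's maximum frequency correction, to my knowledge


def drift_ppm(offset_seconds: float, elapsed_seconds: float) -> float:
    """Average drift in parts per million over the elapsed interval."""
    return abs(offset_seconds) / elapsed_seconds * 1_000_000


if __name__ == "__main__":
    offset = -12.052761  # seconds, from the ntpdate -d output
    for minutes in (5, 6):
        ppm = drift_ppm(offset, minutes * 60)
        ratio = ppm / NTPD_MAX_SLEW_PPM
        print(f"{minutes} min: about {ppm:.0f} ppm, {ratio:.0f}x the slew limit")
```

If those assumptions hold, the drift is in the tens of thousands of ppm, far beyond what ntpd can correct gradually, which would fit with the spike_detect event and the loss of the system peer.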