script to wait for NTP sync at boot

Greg Troxel

Apr 18, 2019, 4:11:21 PM
to weewx...@googlegroups.com
I realize that the standard advice, for good reason, is to use a
computer with a battery-backed RTC. However, for those not doing that,
there is an alternative, which is to wait for NTP to be happily synced.
I have tested this somewhat, and am unsure if there is a window when NTP
syncs and realizes it is far off and then adjusts, so perhaps the "are
things ok" logic should be better. But I send my script anyway in case
it helps someone, or someone can point out why this is confused.

After booting an RPi 3 with the wifi wedged (not a weewx issue) and leaving
it for hours, and then making the network work, I had log lines that
looked like:

UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
UNSYNC 127.0.0.1: stratum 16, offset 0.000000, synch distance 0.000000
OK 127.0.0.1: stratum 3, offset -0.002891, synch distance 0.042083
Thu Apr 18 15:55:19 EDT 2019
associd=0 status=0618 leap_none, sync_ntp, 1 event, no_sys_peer,
system peer: 173.0.156.209:123
system peer mode: client
leap indicator: 00
stratum: 3
log2 precision: -19
root delay: 84.166
root dispersion: 998.679
reference ID: 173.0.156.209
reference time: e06354a6.4182be99 Thu, Apr 18 2019 15:55:18.255
system jitter: 8.636792
clock jitter: 0.965
clock wander: 0.000
broadcast delay: -50.000
symm. auth. delay: 0.000
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 [redacted addrs               ]  3 s    -   64    17  23.996   -1.688   5.556
 [                             ]  2 u    1   64    11  95.937    1.922   0.592
-[                             ]  2 u    7   64    17 224.692  -10.475   1.508
 [                             ]  2 u    1   64    17  84.169    0.769   0.497
+[                             ]  2 u    6   64    17  50.154   -2.841   1.415
*[                             ]  2 s    3   64    17  26.173   -1.535   1.971
 [                             ]  3 s    -   64    17   6.780    0.245   0.657
+[                             ]  3 s    9   64    17  27.120    2.948   1.018



In this case the time was very close because the system had just done a
quick reboot, so it was only 45s slow or something like that.


The script:
----------------------------------------
#!/bin/sh

date
ntpq -c sysinfo
ntpq -pn

while sleep 5; do
    trace=`ntptrace -n | head -1`
    stratum=`echo $trace | sed -e 's/.*stratum \([0-9]*\),.*/\1/'`
    if [ "$stratum" -lt 16 ]; then
        echo OK $trace
        break
    fi
    echo UNSYNC $trace
done

date
ntpq -c sysinfo
ntpq -pn

bin/weewxd weewx.conf
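
The stratum test in the loop above can also be sketched in Python, which makes it easier to look at the offset and synch distance as well. This is only a sketch: the names `parse_trace` and `looks_synced` are made up for illustration, and the regular expression assumes the first line of `ntptrace -n` output has the same shape as the log lines shown earlier.

```python
import re

# Assumed shape of the first `ntptrace -n` line, taken from the log above:
#   127.0.0.1: stratum 3, offset -0.002891, synch distance 0.042083
TRACE_RE = re.compile(
    r"stratum (\d+), offset (-?\d+\.\d+), synch distance (-?\d+\.\d+)"
)


def parse_trace(line):
    """Return (stratum, offset, distance), or None if the line does not match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), float(m.group(3))


def looks_synced(line):
    """Same test as the shell loop: a stratum below 16 counts as synced."""
    parsed = parse_trace(line)
    return parsed is not None and parsed[0] < 16
```

Unlike the shell test, an empty or unparseable line (for example when ntpd is not answering yet) is simply treated as unsynced rather than producing an error from `[`.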

vince

Apr 18, 2019, 8:59:07 PM
to weewx-user
Is there something wrong with the checks in current weewx? It waits for the clock to be newer than the last-modified time of weewx.conf, which seems to be a reasonable way to be current enough (by some definition). All you need to do is ensure that the fake-hwclock kludge is not present on the system, and of course install and enable ntpd (so systemd doesn't break things).
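
For anyone who has not looked at that check, the idea can be sketched as below. This is not weewx's actual code, just an illustration of the comparison described above; the function names, the `now` parameter, and the polling interval are assumptions.

```python
import os
import time


def clock_is_plausible(config_path, now=None):
    """True if the clock reads later than the config file's modification time."""
    if now is None:
        now = time.time()
    return now > os.path.getmtime(config_path)


def wait_for_plausible_clock(config_path, poll_seconds=2, sleep=time.sleep):
    """Block until the system clock has passed the config file's mtime."""
    while not clock_is_plausible(config_path):
        sleep(poll_seconds)
```

The check only shows that the clock is not earlier than the last edit of the file, which is why it is defeated when something restores a saved time at boot.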

Greg Troxel

Apr 18, 2019, 10:03:40 PM
to vince, weewx-user
I am not running Linux.

On NetBSD, by default, in the absence of a hardware clock, the system
clock is initialized to the last sync time of the root filesystem. I
gather that this is similar behavior to what fake-hwclock does. But, I
thought that in general waiting until the clock was known to be right
was good strategy.

(Also, I don't understand why bad times are a problem, because it seems
that archive records from the hardware have hardware timestamps. But,
with a 1.5h power outage, I experienced what I think was recording those
records at the wrong time.)

vince

Apr 19, 2019, 10:27:32 AM
to weewx-user
On Thursday, April 18, 2019 at 7:03:40 PM UTC-7, Greg Troxel wrote:
On NetBSD, by default, in the absence of a hardware clock, the system
clock is initialized to the last sync time of the root filesystem.  I
gather that this is similar behavior to what fake-hwclock does.  But, I
thought that in general waiting until the clock was known to be right
was good strategy.


Sure. You could alternatively hard-set the date to some ancient time as part of your bootup sequence. That would let the normal weewx 'is the date newer than the conf file' mechanism work as-is, although I don't know what it would do to other stuff on your box.

(Also, I don't understand why bad times are a problem, because it seems
that archive records from the hardware have hardware timestamps.  But,
with a 1.5h power outage, I experienced what I think was recording those
records at the wrong time.)


Hardware doesn't emit archive records as far as I know. Weewx periodically aggregates the measurements it received during each time period into archive records, and everything is based on the clock. You would probably get fascinatingly bad archive data and graphs if the clock is moving wildly (for example, a decade of people complaining about DST transitions).
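
A toy illustration of why the clock matters for that aggregation, assuming a 300 second archive interval and readings stamped with the system clock. It is not weewx's code, and the boundary convention in `interval_end` is an assumption made for the example.

```python
def interval_end(timestamp, interval=300):
    """End of the archive interval that a reading stamped `timestamp` falls in."""
    return (int(timestamp) // interval + 1) * interval


def bucket(timestamps, interval=300):
    """Count readings per archive interval, keyed by the interval's end time."""
    counts = {}
    for ts in timestamps:
        end = interval_end(ts, interval)
        counts[end] = counts.get(end, 0) + 1
    return counts


# Two readings taken on either side of an interval boundary (epoch seconds).
true_times = [1190, 1210]
# The same readings stamped by a clock that is running 45 s slow.
slow_times = [t - 45 for t in true_times]
```

With a correct clock the two readings land in different archive records; with the slow clock they are both filed under the earlier one.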

Bottom line is always work off a good clock, no matter how you get it good.

tomn...@frontier.com

Apr 21, 2019, 9:22:06 PM
to weewx-user
Here's an alternative that I'm putting into extensions.py. It seems to do the trick, but could be a little more elegant...
Python's re library isn't as helpful as Perl's about capturing values.

import re
import time
import subprocess

def is_ntp_up():
    ntpq_stat = sybprocess.call_check(["/usr/local/bin/ntpq", "-pn"], buffsize=1024, shell=False, stderr=subprocess.STDOUT)
    for line in ntpq_stat.split('\n'):
        if re.match("\*\d+\..*\s+\S+\s+\d+", line):
            fields = line.split()
            if int(fields[2]) > -1 and int(fields[2]) < 16:
                return 1
    return 0

wait_loop_limit = 60 # wait up to 5 minutes for ntp to sync
wait_loops      =  0
while is_ntp_up == 0:
    time.sleep(5)
    print "Waiting for NTP to come up"
    wait_loops += 1
    if wait_loops > wait_loop_limit:
        raise Exception(" No ntp sync, exiting now")

Chris

tomn...@frontier.com

Apr 22, 2019, 9:14:11 AM
to weewx-user
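Couple of corrections after looking at this again. The subprocess call was misspelled and should be check_output without the buffsize argument, and is_ntp_up has to be called with parentheses in the while test: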

    ntpq_stat = subprocess.check_output(["/usr/local/bin/ntpq", "-pn"], shell=False, stderr=subprocess.STDOUT)

while is_ntp_up() == 0:

Chris
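
For reference, a consolidated sketch of the same idea that also runs on Python 3, where `check_output` returns bytes unless text mode is requested. The names `has_system_peer`, `ntp_is_up`, and `wait_for_ntp` are made up for this sketch, the ntpq path is the one from the post above, and returning False on timeout (instead of raising) is a choice made here, not part of the original.

```python
import re
import subprocess
import time

# In `ntpq -pn` output a leading '*' marks the selected system peer.
# Captured groups: remote address, refid, stratum.
PEER_RE = re.compile(r"^\*(\S+)\s+(\S+)\s+(\d+)\s")


def has_system_peer(ntpq_output):
    """True if the peer table contains a '*' line with stratum below 16."""
    for line in ntpq_output.splitlines():
        m = PEER_RE.match(line)
        if m and int(m.group(3)) < 16:
            return True
    return False


def ntp_is_up(ntpq_path="/usr/local/bin/ntpq"):
    """Run `ntpq -pn` and report whether a system peer has been selected."""
    try:
        out = subprocess.check_output(
            [ntpq_path, "-pn"],
            stderr=subprocess.STDOUT,
            universal_newlines=True,  # decode bytes to str on Python 3
        )
    except (OSError, subprocess.CalledProcessError):
        return False
    return has_system_peer(out)


def wait_for_ntp(check=ntp_is_up, tries=60, delay=5, sleep=time.sleep):
    """Poll `check` up to `tries` times; with the defaults, about 5 minutes."""
    for _ in range(tries):
        if check():
            return True
        sleep(delay)
    return False
```

Keeping the parsing in `has_system_peer` separate from the subprocess call means it can be exercised with pasted `ntpq -pn` output, without a running ntpd.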

Tolip Wen

Jun 11, 2019, 3:21:12 AM
to weewx-user
I don't know if this will help, but:

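I start ntpd with the "-g" switch. This lets it set the clock from the server in one step when it first starts, however large the offset, instead of exiting when the offset exceeds the panic threshold (1000 seconds by default).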
My thinking is that after any power on/reboot situation I don't care if it jumps ahead a lot at one time.

I run weewx reports every 5 minutes so I still have to wait for new data, but my system clock is correct within a minute of boot.

# Start ntpd:
ntpd_start() {
  echo -n "Starting NTP daemon:  /usr/sbin/ntpd -g -u ntp:ntp"
  /usr/sbin/ntpd -g -u ntp:ntp
  echo
}

Andrew Milner

Jun 11, 2019, 3:29:13 AM
to weewx-user
A minute is still too long really. Weewx attempts a catch-up process when it starts, before resuming normal service, and it needs the correct time in order to retrieve any catch-up records, set the station clock, and so on. So although you may have 5 minutes until the first archive record is saved, there may be processing taking place before that.