Stick to PPS, even if the prefer server fails

alko...@googlemail.com

Mar 21, 2009, 9:51:20 AM

Hi
My setup is as follows:

Server A <--- GPS
|
|
Server B <--- HP cesium frequency standard

Server A gets its timestamps from GPS.
Server B gets its timestamps from Server A and PPS from the HP
cesium standard. Server A is marked as prefer because otherwise PPS won't work.

My problem is:
If Server A loses the GPS signal, it falls back to LOCAL and begins to
drift away. At first Server B doesn't care about that and
stays on the PPS. But later it drops the PPS and follows Server A
again, so both servers drift away. In my opinion Server B should stay
on the PPS signal. Is there anything I can do about that?

David J Taylor

Mar 21, 2009, 10:25:07 AM

What other sources do you have configured for server B?

David

David Mills

Mar 21, 2009, 3:11:08 PM

David,

You describe the intended behavior. The PPS should still be enabled,
even if the prefer source goes dark. Check with the ntptime program
and confirm that the PPSTIME bit is lit.

Dave


Dave Hart

Mar 21, 2009, 8:31:52 PM

On Mar 21, 1:51 pm, "alkope...@googlemail.com" <alkope...@googlemail.com> wrote:
>
> If Server A loses the GPS signal, it falls back to LOCAL and begins to
> drift away. In the beginning Server B doesn't care about that and
> stays to the PPS. But later it disconnects PPS and runs on Server A
> again. So both servers drift away. In my opinion Server B should stay
> to the PPS signal. Is there anything I can do about that?

I believe you're limited to options that would keep the prefer peer
surviving NTP's clock selection gauntlet. I could be mistaken, but I
believe PPS with no prefer peer is intended to stop working.

Dr. Mills' comment "the PPS should still be enabled" confused me at
first, but I think he's saying that, assuming your PPS is disciplining the
kernel clock directly, that aspect should continue to keep your local
clock from drifting from its cesium standard during a GPS outage,
despite ntpd eventually losing its PPS(1) peer.

Cheers,
Dave Hart

David Mills

Mar 22, 2009, 11:22:57 PM

Dave,

Once the kernel PPS is lit, it will stay lit even if the daemon loses
the preferred source or even dies. This is the intended behavior, which I
confirmed just now. Verify that ntptime shows PPSTIME lit after the
daemon loses the prefer peer or is stopped.

By the way, with just a remote peer and PPS, the intersection algorithm
is quite picky if the remote peer strays a bit. Set tos mindist to
something higher, like 0.05 s.

Dave


Dave Hart

Mar 23, 2009, 12:00:14 AM

Dr. Mills,

Thank you for your confirmation. The original poster should be
satisfied, assuming their kernel time is being disciplined directly by
their cesium standard. If they are unable to use the kernel
discipline for some reason, could they tinker _max_disp (I don't know)
to keep the upstream server with GPS surviving longer? If there is no
maximum dispersion tinker, "tinker dispersion 5" on the upstream
server with GPS should similarly delay the inevitable. (The default
is 15, in parts per million.)

Cheers,
Dave Hart

David Mills

Mar 23, 2009, 12:08:09 PM

Dave,

Somehow using an HP 5071 to discipline the time without kernel support
seems bizarre, but it could be done in a spacecraft with only
intermittent contact with Earth. At the default rate of dispersion
increase of 15 us/s and distance threshold at the default 1.5 s, the
client will coast about 27.8 h (1.5 s / 15 us/s = 100,000 s). The tos
maxdist 5 command would increase that to about 3.9 days. The downside to
this is that it also reduces the number of measurements when first
synchronizing. Note that the kernel has no distance threshold, so in
principle it could go on forever without a prefer peer.
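
As a rough check of the coast-time arithmetic, here is a minimal Python sketch. It assumes that dispersion growth is the only term contributing to the distance (delay, offset and jitter are ignored), so it gives an upper bound rather than what ntpd would actually do. The function name coast_hours is made up for illustration.

```python
# Rough holdover estimate: how long until dispersion growth alone
# reaches the distance threshold. Other distance terms are ignored.
PHI = 15e-6  # dispersion growth rate in s/s (the 15 us/s quoted above)


def coast_hours(maxdist_s, phi=PHI):
    """Hours until accumulated dispersion alone equals maxdist_s."""
    return maxdist_s / phi / 3600.0


default_h = coast_hours(1.5)  # default threshold quoted above
raised_h = coast_hours(5.0)   # threshold after "tos maxdist 5"

print(f"maxdist 1.5 s: {default_h:.1f} h")
print(f"maxdist 5 s:   {raised_h:.1f} h ({raised_h / 24:.1f} days)")
```

Under these assumptions the default threshold gives roughly 27.8 hours of coasting, and a threshold of 5 s gives roughly 92.6 hours, or about 3.9 days.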

The distance metric is intended as a maximum error statistic and
includes components due to roundtrip delay, dispersion, absolute offset
and RMS jitter. It was never intended for use as a holdover for absentee
prefer peers. There are other ways to do this, all with unintended side
effects. A related problem has occurred with simulated Moon missions, in
which the roundtrip light time is over 2 s. An increase in distance
threshold to 3 s fixes this, but something else is necessary for
missions beyond the Moon.

Dave


alko...@googlemail.com

Mar 24, 2009, 2:10:04 PM

On Mar 23, 5:00 am, Dave Hart <daveh...@gmail.com> wrote:
> Dr. Mills,
>
> Thank you for your confirmation. The original poster should be
> satisfied, assuming their kernel time is being disciplined directly by
> their cesium standard.

What do I have to do to use the kernel discipline?
At the moment I'm using a Linux kernel 2.6.28.1 with LinuxPPS patch
(http://wiki.enneenne.com/index.php/LinuxPPS_support). The PPS signal
is connected via RS232 DCD pin and used by the ntpd refclock_atom
driver.
I don't see the PPSTIME bit in ntptime:

monitor202[~]# ntpq -pn
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+192.168.12.3    .GPSi.           1 u    8   64  377    1.047    2.127   0.194
 192.168.12.5    .INIT.          16 u    -  128    0    0.000    0.000   0.000
o127.127.22.0    .PPS.            1 l    8   16  377    0.000   -0.027   0.028
 192.168.12.11   192.168.12.3    13 u   10   64  377    0.979  -152.47   0.147

monitor202[~]# ntpq -crv
assID=0 status=21b4 leap_none, sync_atomic/PPS, 11 events, event_peer/strat_chg,
version="ntpd 4.2...@1.1541-o Mon Dec 1 14:26:35 UTC 2008 (1)",
processor="i486", system="Linux/2.6.28.1elan", leap=00, stratum=2,
precision=-16, rootdelay=0.000, rootdispersion=0.351, peer=63523,
refid=PPS(0), reftime=cd739da1.0860c62e Tue, Mar 24 2009 17:57:53.032,
poll=4, clock=cd739da3.1eedef80 Tue, Mar 24 2009 17:57:55.120, state=4,
offset=-0.032, frequency=77.428, jitter=0.026, noise=0.092,
stability=0.005, tai=0

monitor202[~]# ntptime
ntp_gettime() returns code 0 (OK)
time cd739daa.7ecb5320 Tue, Mar 24 2009 17:58:02.495, (.495290872),
maximum error 5074 us, estimated error 91 us, TAI offset 0
ntp_adjtime() returns code 0 (OK)
modes 0x0 (),
offset -30.903 us, frequency 77.428 ppm, interval 1 s,
maximum error 5074 us, estimated error 91 us,
status 0x2001 (PLL,NANO),
time constant 4, precision 0.001 us, tolerance 500 ppm,

David Mills

Mar 24, 2009, 5:29:33 PM

alkopedia,

Your PPS signal is working, but not necessarily disciplining the kernel. I
can't help you with Linux.

Dave


alko...@googlemail.com

Mar 24, 2009, 7:53:56 PM

On Mar 24, 10:29 pm, mi...@udel.edu (David Mills) wrote:
> alkopedia,
>
> Your PPS signal is working, but not necessarily disciplining the kernel. I
> can't help you with Linux.
>
> Dave

Thanks for your help anyway. The PPS is working, that's right. But let
me explain my setup in more detail:
I've got two machines, called monitor201 and monitor202. Both are
connected to a cesium frequency standard PPS (which is synced to UTC
because it is part of BIPM's UTC calculation process).
Both monitors get their prefer timestamps from another 2 machines:
gps201 and dcf201 (which get their time from GPS and DCF77).
My problem is as follows: if gps201 and dcf201 go down or
get wrong times from GPS/DCF77 at any time in the future, I want my
monitors to stay synced to PPS (because this one has guaranteed UTC
time). Here is a picture for better understanding:
http://img-up.net/img/ntpmoni-pu5NrlP.png (red: PPS)
In other words: in the long run I want to trust the cesium more than
the GPS/DCF77 timestamps.

If I've understood you right, this would be possible with the PPS
kernel discipline?

status 0x2001 (PLL,NANO) means that the kernel doesn't even see a PPS
signal to use for discipline. After searching the LinuxPPS mailing
list, this seems to be some kind of bug, so I'll try asking there
again.
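
For readers who want to decode the ntptime status word themselves, here is a minimal Python sketch. The bit values are the STA_* constants as they appear in the Linux timex header; only a subset is listed, and the helper name decode_status is made up for illustration.

```python
# Subset of the kernel timex status bits (STA_* values from the Linux
# timex header). decode_status is an illustrative helper, not part of
# ntptime or ntpd.
STA_BITS = {
    0x0001: "PLL",        # phase-locked loop updates enabled
    0x0002: "PPSFREQ",    # PPS frequency discipline enabled
    0x0004: "PPSTIME",    # PPS time discipline enabled
    0x0008: "FLL",        # frequency-locked loop mode
    0x0100: "PPSSIGNAL",  # PPS signal present
    0x2000: "NANO",       # nanosecond resolution
}


def decode_status(status):
    """Return the names of the known bits that are set in status."""
    return [name for bit, name in STA_BITS.items() if status & bit]


flags = decode_status(0x2001)
print(flags)               # names of the bits set in 0x2001
print("PPSTIME" in flags)  # whether the kernel PPS time discipline is on
```

For the 0x2001 value shown in the ntptime output, only PLL and NANO are set, so neither PPSTIME nor PPSSIGNAL is lit.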

David Mills

Mar 25, 2009, 3:23:47 PM

alkopedia,

We do something like that here with two GPS receivers and two busy servers. However, we assign one GPS receiver and PPS signal to each server. In your case, you want to duplex the receivers and PPS signals. Suggestions:

1. Immediately fire the Linux OS and replace it with FreeBSD. It has PPS support out of the box and very good serial drivers. You need to disable the hardware/driver FIFOs to get good serial time. This of course will be secondary to the PPS.

2. Use the parallel printer port rather than a serial port for the PPS. This avoids a level converter and hardware slew delays, as well as the control-lead state machine of the serial driver. Expect time within a few microseconds with the PPS.

3. Configure two serial drivers and one PPS driver for each server. Mark both serial drivers as preferred on both servers. The mitigation algorithm will choose the first selectable prefer peer it finds, so reverse the order of the serial drivers in each configuration file.

I think this configuration matches what you intend in your diagram. If you lose one or both GPS/DCF77 sources, the PPS signal will still wind the clocks. If you lose the PPS, the timing quality will follow the serial ports.

Be sure to increase tos mindist to something that avoids the intersection problem. I imagine 10 ms would be enough, depending on the performance of the GPS and DCF77 receivers.

I don't know how you feel about additional backup, such as running symmetric mode between the servers as we do. NIST doesn't want to do that on the basis that a failure of one NIST server should not result in delivery of stratum-2 time. They argue selection of redundant servers should be done by the clients.

Dave


alko...@googlemail.com

Mar 25, 2009, 4:46:11 PM


Hi David!
Thanks for your suggestions. But there are several other problems :-)
1+2. The hardware I'm using was provided as it is and it consists of
Soekris net4501 embedded boards with only one PCI and one serial port
(no parallel port). The GPS and DCF77 receivers are Meinberg PCI
cards, which provide a Linux kernel driver and the ntpd can directly
use them with the reclock_meinberg driver. Therefore I've chosen Linux
as OS.

2. I don't use any level converters. The MAX3243C rs232 chip seems to
handle TTL levels fine. In the last 12 hours the offset to PPS was
between -5 and +5 microseconds.

Except for the PPSTIME issue the setup is working quite well, but I
have to check out the tos mindist setting you've mentioned.

The complete system doesn't serve time to others. Its purpose is
monitoring. I'm building this as a diploma thesis for a
telecommunication company. They have a large number of ntp servers
with GPS receivers (called SSU) across the country. So my system only
watches the offsets of these SSUs to make sure that they are OK.

I'm not sure about symmetric mode. I guess this means using the
"peer" option. Do you think it's a good idea to use peer between the 2
monitors? At the moment they work independently of each other. They
are duplicated in case one of them breaks down.

Just for your information about LinuxPPS. Today I got response from
their mailing list:
"Kernel discipline has been talked about on the list before. It is an
optional part of the specification and is currently not part of LinuxPPS.
Right now I think the goal is to get the basic PPS stuff into the kernel
and then to work at adding enhancements like kernel discipline.

In addition to basic PPS support PPSKIT had kernel discipline along with
nanokernel capabilities. Unfortunately it never made it into the kernel.
Perhaps because it was too ambitious. Current kernels are now nanosecond
capable and there is a possibility that basic PPS capabilities will be
part of the kernel sometime soon since the LinuxPPS patches were
considered for inclusion in 2.6.29 and there has been some work done to
correct the issues that prevented inclusion. Since it appears that this
needs to be added to the kernel in small incremental steps perhaps the
right thing to do would be to have a separate patch set for enabling
kernel discipline so that work on getting basic PPS into the kernel can
go forward unimpeded."

Greetings,
Thorsten

Unruh

Mar 25, 2009, 5:19:25 PM

"alko...@googlemail.com" <alko...@googlemail.com> writes:

>> 1. Immediately fire the Linux OS and replace with FreeBSD. It has PPS support ex box and very good serial drivers. You need to disable the hardware/driver FIFOs to get good serial time. This of course will be secondary to the PPS.

I have no idea why or whether kernel PPS code is any better (or worse)
than, say, PPS discipline using the shm PPS refclock with a parallel port
interrupt. I.e., both can discipline
to about the 1-2 usec level. The main problem is that the ntp model is too slow
in reacting to temperature-induced drifts.

>>
>> 2. Use the parallel printer port rather than a serial port for the PPS. This avoids a level converter and hardware slew delays, as well as the control-lead state machine of the serial driver. Expect time within a few microseconds with the PPS.

I agree here that parallel seems to be better in that the system reacts
faster to a parallel interrupt than to a serial one, especially if you can write a
special interrupt service routine and not have the driver service routine
imposing its delays on the chain.

>2. I don't use any level converters. The MAX3243C rs232 chip seems to
>handle TTL levels fine. In the last 12 hours the offset to PPS was
>between -5 and +5 microseconds.

>Except for the PPSTIME issue the setup is working quite well, but I
>have to check out the tos mindisp settings you've mentioned.

That has to do with the system allowing the serial nmea to take over in
case the PPS signal gets lost. It should do nothing if everything is
working.

>The complete system doesn't serve time to others. It's intention is
>monitoring. I'm building this as a diploma thesis for a
>telecommunication company. They have a large number of ntp servers
>with GPS receivers (called SSU) across the country. So my system only
>watches the offsets, that these SSUs have to assure that they are OK.

They can monitor only to the msec (or 1/10 msec) level unless the other systems are at the same place.

Dave Hart

Mar 25, 2009, 6:54:15 PM

On Mar 25, 9:19 pm, Unruh <unruh-s...@physics.ubc.ca> wrote:
>
> I have no idea why and whether kernel PPS code is any better ( or worse)
> than say PPS discipline using the shm PPS refclock using parallel port
> interrupt. Ie, both can discipline
> to about 1-2usec level.

Re-read the thread, then. A kernel clock disciplined by PPS allows
PPS to continue to discipline the clock when ntpd's PPS implementation
stops doing so because of a prefer peer problem.

> The main problem is that the ntp model is too slow
> reacting to temperature induced drifts.

I'm sure that's your main problem re ntpd, but it's completely
tangential to the thread.

> >The complete system doesn't serve time to others. It's intention is
> >monitoring. I'm building this as a diploma thesis for a
> >telecommunication company. They have a large number of ntp servers
> >with GPS receivers (called SSU) across the country. So my system only
> >watches the offsets, that these SSUs have to assure that they are OK.
>
> They can monitor only to the msec (or 1/10 msec) level unless the other systems are at the same place.

I think he's monitoring the self-reported offsets between the NTP
disciplined clock and the local GPS receiver. Network delay would not
be a factor if so.

Cheers,
Dave Hart

Unruh

Mar 25, 2009, 7:48:49 PM

Dave Hart <dave...@gmail.com> writes:

>On Mar 25, 9:19 pm, Unruh <unruh-s...@physics.ubc.ca> wrote:
>>
>> I have no idea why and whether kernel PPS code is any better ( or worse)
>> than say PPS discipline using the shm PPS refclock using parallel port
>> interrupt. Ie, both can discipline
>> to about 1-2usec level.

>Re-read the thread, then. A kernel clock disciplined by PPS allows
>PPS to continue to discipline the clock when ntpd's PPS implementation
>stops doing so because of a prefer peer problem.

I must admit that I am completely confused by this prefer peer problem. I
would think that ntp would use the PPS/refclock preferentially no matter
what external servers are doing. But clearly I am not understanding
something. If it does not that would seem to me to be a bug in ntp, not
something one should be rewriting kernel code to overcome.

>> The main problem is that the ntp model is too slow
>> reacting to temperature induced drifts.

>I'm sure that's your main problem re ntpd, but it's completely
>tangential to the thread.

I was not clear -- the main problem with ntp's ability to discipline the
system clock to better than a few microseconds is that ntp reacts too
slowly to temperature-induced drifts. I.e., while ntp might be able to
discipline a clock with a constant rate to sub-microsecond precision, ntp
in a running, temperature-varying system cannot, in part because of ntp's
reaction to drift rate changes. I think that is germane to the thread, but
opinions may vary.
I.e., if you use ntp with the addition of temperature measurements to
estimate the temperature-induced drift rate, you can discipline the clock much
better than with ntp on its own.
However, if the only concern was losing PPS discipline due to prefer peer
problems, then I agree this is tangential.

>> >The complete system doesn't serve time to others. It's intention is
>> >monitoring. I'm building this as a diploma thesis for a
>> >telecommunication company. They have a large number of ntp servers
>> >with GPS receivers (called SSU) across the country. So my system only
>> >watches the offsets, that these SSUs have to assure that they are OK.
>>

>> They can monitor only to the msec (or 1/10 msec) level unless the other systems are at the same place.

>I think he's monitoring the self-reported offsets between the NTP
>disciplined clock and the local GPS receiver. Network delay would not
>be a factor if so.

I got the impression that he was using this machine to monitor a whole
bunch of other machines distributed over the USA to make sure that their
clocks were well disciplined by their own on-board GPS receivers.

alko...@googlemail.com

Mar 25, 2009, 8:51:20 PM

On Mar 26, 12:48 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
> However if the only concern was losing PPS discipline due to prefer peer
> problems, then I agree this is tangential.

Yes, it was ;-)

> >I think he's monitoring the self-reported offsets between the NTP
> >disciplined clock and the local GPS receiver. Network delay would not
> >be a factor if so.
>
> I got the impression that he was using this machine to monitor a whole
> bunch of other machine distributed over the USA to make sure that their
> clocks were well disciplined by their own on board GPS receivers.

You're both right:
When I was talking about the -5 us to +5 us offset, I was talking
about the self-reported offset between the NTP-disciplined clock and the
local PPS signal.
But Unruh is right. The main task of the whole system is to monitor a
bunch of machines (called SSU) distributed over Germany to make sure,
that they _provide_ the right time to their clients. Of course it is
not possible to monitor them with microsecond resolution but that's
not needed. Just imagine one of these situations:
- SSUs have a firmware bug, which leads to wrong timings
- SSUs have a bug and don't notice a leap second
- SSUs have a problem at receiving the GPS signal
All these things may lead to wrong timings in the network and probably
nobody will notice fast enough.

And of course for my monitoring system it is very unlikely that GPS
and DCF77 time signals will fail at the same time, but it is possible.
That is why my initial question was whether it is possible to sync to the PPS
signal even if both preferred peers disappear.
Furthermore, the cesium standard used is part of the UTC calculation of
the BIPM in Paris. So the cesium timings are constantly measured and
verified in Circular T: ftp://ftp2.bipm.fr/pub/tai/publication/cirt/cirt.254
;-)

Greetings,
Thorsten

Unruh

Mar 25, 2009, 10:39:26 PM

"alko...@googlemail.com" <alko...@googlemail.com> writes:

>On Mar 26, 12:48 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
>> However if the only concern was losing PPS discipline due to prefer peer
>> problems, then I agree this is tangential.

>Yes, it was ;-)

>> >I think he's monitoring the self-reported offsets between the NTP
>> >disciplined clock and the local GPS receiver. Network delay would not
>> >be a factor if so.
>>
>> I got the impression that he was using this machine to monitor a whole
>> bunch of other machine distributed over the USA to make sure that their
>> clocks were well disciplined by their own on board GPS receivers.

>You're both right:
>When I was talking about the -5 us to +5 us offset, I was talking
>about the self reported offset of the NTP disciplined clock and the

And my tangential comment was precisely about this ±5 usec offset. I
believe that much of this is caused by the poor response of ntp to
temperature-induced drift changes (but I have nothing better to suggest,
except perhaps the versions of ntp that were changed to also use the
temperature to predict the drift rate and compensate automatically).
Now you may not care, in which case again my comment is tangential.

>local and the local PPS signal.
>But Unruh is right. The main task of the whole system is to monitor a
>bunch of machines (called SSU) distributed over Germany to make sure,
>that they _provide_ the right time to their clients. Of course it is
>not possible to monitor them with microsecond resolution but that's
>not needed. Just imagine one of these situations:
>- SSUs have a firmware bug, which leads to wrong timings
>- SSUs have a bug and don't notice a leap second
>- SSUs have a problem at receiving the GPS signal
>All these things may lead to wrong timings in the network and probably
>nobody will notice fast enough.

>And of course for my monitoring system it is very unlikely that GPS
>and DCF77 time signals will fail at the same time, but it is possible.
>Therefore was my initial question if it is possible to sync to the PPS
>signal, even if both prefered peers disappear.

So, again I am confused.
I have
server tick.usask.ca
server ntp.ubc.ca
server 127.127.28.0 minpoll 4 prefer

where 127.127.28.0 is the PPS via shm and this system seems to have no
problems.
So again my confusion may be irrelevant (well my confusion certainly is
irrelevant but my situation may not be.) Or maybe I simply do not recognize
the problem.
And why is the pps not the preferred peer?
(By GPS I assume you mean an NMEA GPS source).

David Mills

Mar 25, 2009, 11:19:07 PM

Dave,

The NTP discipline executes a frequency correction once per second; the
kernel executes a correction once per timer interrupt. Assume an
intrinsic oscillator frequency error of 100 PPM. Do the math.
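
One reading of that arithmetic, as a minimal Python sketch: it assumes the clock runs uncorrected between two successive corrections, so the error built up in that gap is simply the frequency error times the interval. The timer rate HZ = 1000 is a hypothetical value picked for illustration; the thread does not state one.

```python
# Error accumulated between corrections for a given frequency error,
# assuming the clock free-runs during the interval.
FREQ_ERROR_PPM = 100.0  # intrinsic oscillator error assumed above


def drift_between_corrections_us(interval_s, ppm=FREQ_ERROR_PPM):
    """Microseconds of error built up over one correction interval."""
    return ppm * interval_s  # 1 ppm over 1 s is 1 us


HZ = 1000  # hypothetical timer interrupt rate, not from the thread

daemon_us = drift_between_corrections_us(1.0)       # once per second
kernel_us = drift_between_corrections_us(1.0 / HZ)  # once per tick

print(f"once per second: {daemon_us:.1f} us")
print(f"once per tick:   {kernel_us:.3f} us")
```

Under these assumptions a once-per-second correction lets about 100 us accumulate between updates, while a per-tick correction at the assumed rate keeps it near 0.1 us.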

Dave


David Mills

Mar 25, 2009, 11:28:07 PM

Bill, if you had taken the trouble to look at the documentation, you
would have found the "Mitigation Rules and the Prefer Peer" page
(http://www.eecis.udel.edu/~mills/ntp/html/prefer.html), which clearly
describes how the contraption works.


Dave Hart

Mar 25, 2009, 11:46:17 PM

On Mar 26, 2:39 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
>
> So, again I am confused.
> I have
> server tick.usask.ca
> server ntp.ubc.ca
> server 127.127.28.0 minpoll 4 prefer
>
> where 127.127.28.0 is the PPS via shm and this system seems to have no
> problems.
> So again my confusion may be irrelevant (well my confusion certainly is
> irrelevant but my situation may not be.) Or maybe I simply do not recognize
> the problem.
> And why is the pps not the preferred peer?
> (By GPS I assume you mean an NMEA GPS source).

The original poster is using the PPS refclock driver (127.127.22.u,
refclock_atom.c). The PPS driver knows when the top of the second is,
but does not know which second. The PPS driver starts out unreachable
until a peer marked prefer is selected and the local clock is within
400 ms of that prefer peer; then it trumps the prefer peer and becomes the source:

25 Mar 06:01:56 ntpd[1724]: peer GPS_NMEA(1) event 'event_reach' (0x84) ...
25 Mar 06:02:41 ntpd[1724]: system event 'event_peer/strat_chg' (0x04) ...
25 Mar 06:02:41 ntpd[1724]: synchronized to GPS_NMEA(1), stratum 0
25 Mar 06:02:41 ntpd[1724]: system event 'event_sync_chg' (0x03) ...
25 Mar 06:02:41 ntpd[1724]: system event 'event_peer/strat_chg' (0x04) ...
25 Mar 06:02:43 ntpd[1724]: peer PPS(1) event 'event_reach' (0x84) ...
25 Mar 06:03:31 ntpd[1724]: pps sync enabled
25 Mar 06:03:31 ntpd[1724]: synchronized to PPS(1), stratum 0
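
The gating rule described above (the PPS only counts once a prefer peer is selected and the local clock is within 400 ms of it) can be sketched roughly as follows. This is an illustration in Python only, not ntpd's actual code, which is C in refclock_atom.c. The function name pps_usable and its parameters are hypothetical; only the 400 ms window is taken from the description above.

```python
from typing import Optional

PREFER_WINDOW = 0.4  # seconds; the 400 ms window mentioned above


def pps_usable(prefer_selected: bool, prefer_offset: Optional[float]) -> bool:
    """Return True if the PPS may be used as the source.

    prefer_selected: whether a peer marked prefer survived selection.
    prefer_offset:   local clock offset from that peer in seconds,
                     or None if there is no prefer peer at all.
    """
    if not prefer_selected or prefer_offset is None:
        # No prefer peer: nothing is left to number the seconds.
        return False
    # The PPS only resolves the fraction of the second, so the coarse
    # time must already be well inside half a second.
    return abs(prefer_offset) < PREFER_WINDOW
```

With this rule, losing the prefer peer turns the PPS off even though the pulse itself is still healthy, which is the behavior this thread is about.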

Since my GPS_NMEA(1) and my PPS(1) are using the same serial port, I'm
not concerned about that setup losing the prefer peer taking out the
PPS. I'm following this thread with interest, however, as I have done
work to enable the PPS driver on Windows for serial ports (provided a
custom serial.sys that adds CD timestamping is used). This is not yet
in any NTP release. On Windows, there is no support for directly
disciplining the system clock with a PPS signal, so as with Thorsten
using stock Linux, if the prefer peer goes dark the PPS will follow
before too much longer, no matter that we no longer have ambiguity
about which second. Assuming the PPS is still trusted, such as
Thorsten's, that's suboptimal. At least on Linux patches do exist for
kernel PPS discipline. I could put a gross hack in serialpps.sys to
force the system time to the exact second on each PPS and expose that
via PPSAPI as kernel time discipline, but that's really ugly on
several fronts, a serial driver has no business interfering with the
OS timekeeping, and I'm guessing that's not how a real kernel clock
discipline works, though the effect would be similar.

Cheers,
Dave Hart

Unruh

Mar 26, 2009, 12:46:09 AM

Thank you. Your quote at the top is correct
"Listen carefully to what I say; it is very complicated."
But I have no idea if I have listened carefully.

So, I have PPS running the shm refclock, with the seconds supplied by the
local clock (ie the reading from the system clock) and the usec from the
PPS signal. Since I do not expect the local clock to suddenly shift by a
second, its usec discipline should be fine. I mark it as the preferred
server. (The PPS shm is run only once the local clock is disciplined to
within .1 sec by the other sources).

Does this make sense? Or have I totally misunderstood?

mi...@udel.edu (David Mills) writes:

>Bill, if you had taken the trouble to look at the documentation, you
>would have found the "Mitigation Rules and the Prefer Peer"
>http://www.eecis.udel.edu/~mills/ntp/html/prefer.html page which clearly
>describes how the contraption works.

>Unruh wrote:

>>Dave Hart <dave...@gmail.com> writes:
>>
>>
>>
>>>On Mar 25, 9:19 pm, Unruh <unruh-s...@physics.ubc.ca> wrote:
>>>
>>>
>>>>I have no idea why and whether kernel PPS code is any better ( or worse)
>>>>than say PPS discipline using the shm PPS refclock using parallel port
>>>>interrupt. Ie, both can discipline
>>>>to about 1-2usec level.
>>>>
>>>>
>>
>>
>>
>>
>>>Re-read the thread, then. A kernel clock disciplined by PPS allows
>>>PPS to continue to discipline the clock when ntpd's PPS implementation
>>>stops doing so because of a prefer peer problem.
>>>
>>>
>>
>>I must admit that I am completely confused by this prefer peer problem. I
>>would think that ntp would use the PPS/refclock preferentially no matter
>>what external servers are doing. But clearly I am not understanding
>>something. If it does not that would seem to me to be a bug in ntp, not
>>something one should be rewriting kernel code to overcome.
>>

................

Unruh

Mar 26, 2009, 1:00:31 AM

Dave Hart <dave...@gmail.com> writes:

>On Mar 26, 2:39 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
>>
>> So, again I am confused.
>> I have
>> server tick.usask.ca
>> server ntp.ubc.ca
>> server 127.127.28.0 minpoll 4 prefer
>>
>> where 127.127.28.0 is the PPS via shm and this system seems to have no
>> problems.
>> So again my confusion may be irrelevant (well my confusion certainly is

>> irrelevant but my situation may not be.) Or maybe I simply do not recognize
>> the problem.
>> And why is the pps not the preferred peer?
>> (By GPS I assume you mean an NMEA GPS source).

>The original poster is using the PPS refclock driver (127.127.22.u,
>refclock_atom.c). The PPS driver knows when the top of the second is,
>but does not know which second. The PPS driver starts out unreachable
>until a peer marked prefer is selected and the local clock is within
>400ms of that prefer peer, then it trumps and becomes the source:

OK, once it becomes the source, it should stay the source. Once the local
clock is within half a second, it is not going to suddenly jump by a second
(well not very often).

>25 Mar 06:01:56 ntpd[1724]: peer GPS_NMEA(1) event 'event_reach' (0x84) ...
>25 Mar 06:02:41 ntpd[1724]: system event 'event_peer/strat_chg' (0x04) ...
>25 Mar 06:02:41 ntpd[1724]: synchronized to GPS_NMEA(1), stratum 0
>25 Mar 06:02:41 ntpd[1724]: system event 'event_sync_chg' (0x03) ...
>25 Mar 06:02:41 ntpd[1724]: system event 'event_peer/strat_chg' (0x04) ...
>25 Mar 06:02:43 ntpd[1724]: peer PPS(1) event 'event_reach' (0x84) ...
>25 Mar 06:03:31 ntpd[1724]: pps sync enabled
>25 Mar 06:03:31 ntpd[1724]: synchronized to PPS(1), stratum 0

>Since my GPS_NMEA(1) and my PPS(1) are using the same serial port, I'm
>not concerned about that setup losing the prefer peer taking out the
>PPS. I'm following this thread with interest, however, as I have done
>work to enable the PPS driver on Windows for serial ports (provided a
>custom serial.sys that adds CD timestamping is used). This is not yet
>in any NTP release. On Windows, there is no support for directly
>disciplining the system clock with a PPS signal, so as with Thorsten
>using stock Linux, if the prefer peer goes dark the PPS will follow
>before too much longer, no matter that we no longer have ambiguity

That sounds like a bug. Once a PPS has taken control, it should maintain
control unless there is overwhelming evidence that it has lost control (Ie
jumped by a second). Computer clocks are relatively good flywheels.

alko...@googlemail.com

Mar 26, 2009, 9:20:07 AM

On Mar 26, 5:46 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
> So, I have PPS running the shm refclock, with the seconds supplied by the
> local clock (ie the reading from the system clock) and the usec from the
> PPS signal. Since I do not expect the local clock to suddenly shift by a
> second, its usec discipline should be fine. I mark it as the preferred
> server. (The PPS shm is run only once the local clock is disciplined to
> within .1 sec by the other sources).

This could be a good idea in my opinion, but someone with better
knowledge should comment that.
I'm trying this right now. As soon as the GPS/DCF gets dark, the LOCAL
clock is used as prefer peer and PPS is still used to sync. Only thing
that comes to my mind are leap seconds: As the RTC is only synced by
the kernel every 11 minutes, it could cause troubles when the LOCAL
clock suddenly is 1 s off GPS/DCF. But I guess it will be marked as
false ticker until the next kernelsync.

Unruh

Mar 26, 2009, 11:52:05 AM

"alko...@googlemail.com" <alko...@googlemail.com> writes:

>On Mar 26, 5:46 am, Unruh <unruh-s...@physics.ubc.ca> wrote:
>> So, I have PPS running the shm refclock, with the seconds supplied by the
>> local clock (ie the reading from the system clock) and the usec from the

>> PPS signal. Since I do not expect the local clock to suddenly shift by a
>> second, its usec discipline should be fine. I mark it as the preferred
>> server. (The PPS shm is run only once the local clock is disciplined to
>> within .1 sec by the other sources).

I have now looked at the refclock_atom source and indeed, it demands that a
prefer clocksource is available, and ignores the PPS if it is not. This I
believe is a bug, or at least a design infelicity. You could either hack
the source (put in a flag so that once the PPS has been accepted, it
continues to be accepted even if the prefer source disappears. One could
perhaps set it up to time out, so that if the prefer source has not come
back after, say, 3 days, ntp would finally start to disregard the
PPS). Alternatively you could use a different input route for the PPS,
like the shm refclock with shmpps or gpsd to feed the shm refclock. That
does not have the logic that refclock_atom has of disregarding the
input if another input is not there.
Note that if one does set up the atom driver with the flag, the tolerance
of .4 sec should probably be decreased to, say, .1 sec to make sure that the
ntpd discipline routine does not drift out of the .5 sec bounds before
settling down to properly disciplining the clock, although that should not
happen.

>This could be a good idea in my opinion, but someone with better
>knowledge should comment that.
>I'm trying this right now. As soon as the GPS/DCF gets dark, the LOCAL
>clock is used as prefer peer and PPS is still used to sync. Only thing
>that comes to my mind are leap seconds: As the RTC is only synced by
>the kernel every 11 minutes, it could cause troubles when the LOCAL
>clock suddenly is 1 s off GPS/DCF. But I guess it will be marked as
>false ticker until the next kernelsync.

The local clock is not the rtc. The local clock is simply the system clock,
the same clock you are trying to discipline. It is not clear to me that
making the local clock the prefer clock does what you need for the atom driver,
because I have not disentangled the logic of the "Selection and mitigation
algorithm" whose description David Mills pointed me to. The local clock
appears to be treated specially within this algorithm.
So yes, if the prefer clock went down for days before a leap second, then
your system would have no way of knowing a leapsecond was coming and
would presumably coast on without the leapsecond. That might be a reason to
put in a limit on the length of the coasting for the PPS clock.
Presumably if the leapsecond flag had been set on the leapsecond day
sometime, that leapsecond would actually get dealt with. Or if the
leapsecond file were installed. Even if the PPS were coasting.

Harlan Stenn

Mar 26, 2009, 7:24:57 PM

Bill,

There is an apparent shortcoming in the existing PPS model/implementation
that I am still trying to wrap my head around.

https://support.ntp.org/bugs/show_bug.cgi?id=557 has some information on
this and I am looking for the other thread on that topic.

As I understand it, and I almost certainly have the following wrong in at
least some ways, there are folks out there who want to support multiple PPS
inputs to NTP, and they also want to use a model where ntp's PPS driver
maintains "health state" information about each PPS source. Dave maintains
that the existing infrastructure can already mostly handle this, and while
some work will be needed, the first set of folks are reinventing too much of
the wheel.

What I have been (slowly) working on is figuring out where the two
above-mentioned groups are miscommunicating, and identifying exactly what
needs to be done to come up with a solution.
--
Harlan Stenn <st...@ntp.org>
http://ntpforum.isc.org - be a member!

Harlan Stenn

Mar 26, 2009, 9:40:31 PM

Please see http://support.ntp.org/bin/view/Dev/TestingLab for where I'd like
to see information and discussion about an NTP Testing Lab, the equipment it
should contain, the tests it should be able to run, and various other
related issues.

David Woolley

Mar 27, 2009, 4:09:06 AM

Unruh wrote:

>
> I have now looked at the refclock_atom source and indeed, it demands that a
> prefer clocksource is available, and ignores the PPS if it is not. This I
> believe is a bug, or at least a design infelicity. You could either hack

I suspect the reasoning may be that both normally come from the same
radio clock which will free-run the PPS when it loses a signal, so that
detecting the failure of the NMEA data is the only way of telling that
the PPS data is unreliable.

John Ackermann N8UR

Mar 27, 2009, 8:15:38 AM

More than that, some PPS sources don't have an accompanying timecode, so
they need another source to provide the coarse time. This does cause
some interesting design challenges because it makes the PPS reliant on
an external source.

A couple of years ago I tried defining multiple prefer peers to improve
the reliability, but never got that working in a reliable way.

I'd like to see one of two solutions: (a) the ability to define
multiple prefer peers, such that failure of one would cause a switch to
another; or (b) an option that would require an external sane source for
initial sync, but once the time is known to the second, continue relying
on the PPS even if the prefer server goes away.

BTW -- this is a real-world issue, at least for my definition of
"world". Two of my NTP servers have this problem. One syncs from a
Cesium atomic clock, and the other from a LORAN-C receiver. Both those
sources provide PPS but no accompanying timecode.

John

Unruh

Mar 27, 2009, 1:33:57 PM

David Woolley <da...@ex.djwhome.demon.co.uk.invalid> writes:

>Unruh wrote:

That may be the logic, but it is seriously flawed. It also indicates that
the decision to interpret PPS separately from the other drivers is flawed.
atom should ONLY be used for a single separate PPS source decoupled from
anything else. If your PPS is combined with nmea then that nmea driver
should be running the PPS since it is the combination that gives the time,
and it is the combination that knows whether the PPS is good or not.

Using some heuristic like "is a prefer source available" is a pretty poor
substitute for knowing whether the PPS is good or not.

Unruh

Mar 27, 2009, 1:40:53 PM

j...@febo.com (John Ackermann N8UR) writes:

>David Woolley wrote:
>> Unruh wrote:
>>
>>> I have now looked at the refclock_atom source and indeed, it demands that a
>>> prefer clocksource is available, and ignores the PPS if it is not. This I
>>> believe is a bug, or at least a design infelicity. You could either hack
>>
>> I suspect the reasoning may be that both normally come from the same
>> radio clock which will free-run the PPS when it loses a signal, so that
>> detecting the failure of the NMEA data is the only way of telling that
>> the PPS data is unreliable.

>More than that, some PPS sources don't have an accompanying timecode, so
>they need another source to provide the coarse time. This does cause
>some interesting design challenges because it makes the PPS reliant on
>an external source.

>A couple of years ago I tried defining multiple prefer peers to improve
>the reliability, but never got that working in a reliable way.

>I'd like to see one of two solutions: (a) the ability to define
>multiple prefer peers, such that failure of one would cause a switch to
>another; or (b) an option that would require an external sane source for
>initial sync, but once the time is known to the second, continue relying
>on the PPS even if the prefer server goes away.

The second would be trivial to apply. Just stick in a flag which gets set
once the initial external source has converged to within, say, .1 sec;
once set it stays set, and the PPS source gets used while it is set. If you
really worry, you could also increment a counter every time the prefer peer
was unavailable (resetting it every time it was available) and not use PPS
if that counter got greater than, say, 86400 (one day). This is about a
three-line change to the atom driver.
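
The flag-plus-counter idea above could look roughly like this. It is a sketch in Python under stated assumptions, not a patch to the atom driver: the class name StickyPpsGate, the method tick, and the choice to call it once per second are all hypothetical. The .1 sec convergence limit and the 86400 limit are the example values from the paragraph above.

```python
class StickyPpsGate:
    """Hypothetical 'sticky' PPS gate with a holdover limit."""

    def __init__(self, converge_limit=0.1, max_holdover=86400):
        self.converge_limit = converge_limit  # seconds, the ".1 sec" above
        self.max_holdover = max_holdover      # ticks allowed without a prefer peer
        self.armed = False                    # set once, then stays set
        self.missing = 0                      # consecutive ticks without a prefer peer

    def tick(self, prefer_offset):
        """Call once per second; prefer_offset is None if no prefer peer.

        Returns True if the PPS should be used on this tick.
        """
        if prefer_offset is None:
            self.missing += 1
        else:
            self.missing = 0
            if abs(prefer_offset) < self.converge_limit:
                self.armed = True
        return self.armed and self.missing <= self.max_holdover
```

Note that in this sketch the flag never clears once set, so a prefer peer that comes back with a large offset does not disable the PPS; whether that is desirable is exactly the kind of failure scenario that would need thought.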

>BTW -- this is a real-world issue, at least for my definition of
>"world". Two of my NTP servers have this problem. One syncs from a
>Cesium atomic clock, and the other from a LORAN-C receiver. Both those
>sources provide PPS but no accompanying timecode.

Use the shmpps code with the refclock_shm driver. It will wait until the
time source is reported to have an offset of less than .25 sec, and then
start the shm report of the PPS source. Thereafter it will continue to use
the PPS source as a normal source, which you can make the preferred source.

David Mills

Mar 27, 2009, 4:26:12 PM

David,

True; this is an instance of the Principle of Fate Sharing. There is a
long history over 23 years when something like a faulty antenna caused
the serial timecode and PPS signal to wander away from each other
resulting in a large number of clients to jump the second every few
hours. There were occasions when faulty PPS house wiring resulted in
similar behavior. Keep in mind the following.

1. There are as many sanity checks as the timecode permits. The NMEA
interface provides very little health information, but our Spectracom,
Austron and Arbiter receivers provide much more useful information which
the drivers do interpret.

2. The PPS signal is groomed with a range gate to remove spurs that
might be picked up in the house wiring and to prevent chaos if the PPS
is connected to an incorrect source or has failed for some time (all of
which have happened here on more than one occasion).

3. The PPS is purposefully ignored if the preferred source does not
survive the intersection algorithm or the prefer peer offset is greater
than 0.4 s. Ponder on this and consider the failure scenarios if this is
not done.

I would assume any other method to capture the PPS signal does much the
same thing for the same reasons.

Now consider the case with two or more flakey timecode receivers or even
external servers are involved together with one or more disciplined PPS
sources. First, you have to understand a cesium clock is not really a
clock; the PPS function involves a counter/divider that must be
calibrated. Once upon a time I had three of these animals that required
regular trips to USNO in Washington, DC. Upon arrival, a lab technician
would come to the van towing a cable that plugged into the clock. Then,
he pushed a button to synchronize the clock PPS to USNO PPS. To conform
to hazmat regulations we could not use the Baltimore tunnels; we had to
use the Annapolis bridge.

The point made here is that, if you need to go to the trouble and hazard
of a real cesium farm, you are already running serious infrastructure
and it probably doesn't make much sense to duplex the PPS source. In our
case when we calibrated a clock, we simply unplugged the PPS and plugged
it into another clock. However, if for other reasons you need multiple
PPS sources, read on.

Once upon a time there were three different PPS interfaces and three
flavors of Unix tty interfaces and it got seriously complicated to
support all combinations. So, the Digital guys and I rammed the PPSAPI
interface down your collective throats. The mission in that adventure
was to hide the interface issues in the ppsclock.h header file specific
to each architecture. The version used with Solaris supports only one
PPS abstraction, but the version used with FreeBSD should support more
than one. I have no idea whether Linux does or does not.

The prefer peer issue is richly contentious and probably overstated.
There is indeed only one prefer peer at a time, but there can be more
than one peer so designated. If so, the first one in the configuration
file that passes all sanity checks becomes preferred. The engineering
abstraction is that the PPS signal appears to come from the prefer peer
and inherits its other characteristics like stratum, distance, etc.
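
The "first one in the configuration file that passes all sanity checks" rule above amounts to a simple ordered search. The sketch below is in Python for illustration only; the Peer type, its fields, and pick_prefer_peer are hypothetical names, and the single boolean "sane" stands in for whatever checks ntpd really applies.

```python
from typing import NamedTuple, Optional, Sequence


class Peer(NamedTuple):
    name: str
    prefer: bool   # marked "prefer" in the configuration
    sane: bool     # stand-in for "passes all sanity checks"


def pick_prefer_peer(peers: Sequence[Peer]) -> Optional[Peer]:
    """Return the first prefer-marked peer, in config order, that is sane."""
    for peer in peers:
        if peer.prefer and peer.sane:
            return peer
    return None  # no prefer peer: the PPS would be ignored
```

For example, with peers listed as a failed GPS (prefer), a working DCF receiver (prefer) and a network server (not prefer), this picks the DCF receiver.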

There is no way I can imagine how to discipline a herd of PPS signals
other than to associate each one separately with the associated
seconds-numbering source. One possible way is to use the unit number
with the expectation that the unit number of the reference clock is
associated with the atom driver with the same unit number.

Dave

David Woolley wrote:

>Unruh wrote:
>
>
>
>>I have now looked at the refclock_atom source and indeed, it demands that a
>>prefer clocksource is available, and ignores the PPS if it is not. This I
>>believe is a bug, or at least a design infelicity. You could either hack
>>
>>
>
>I suspect the reasoning may be that both normally come from the same
>radio clock which will free-run the PPS when it loses a signal, so that
>detecting the failure of the NMEA data is the only way of telling that
>the PPS data is unreliable.
>

John Ackermann N8UR

Mar 27, 2009, 5:56:16 PM

David Mills said the following on 03/27/2009 05:44 PM:

> To John: I suspect you know that all LORSTA stations run by the Coast
> Guard include in the weekly announcement series a time-of-coincidence
> (TOC) second at which the epoch of the second is equal to the epoch of
> the GRI. My Astron LORAN-C receiver, which was modified by the Coast
> Guard, flashes a light at the TOC. The TOCs and the intervals between
> them vary up to several minutes depending on the GRI of the chain. I
> made provisions for the TOC in the LORAN-C receiver I built some years
> ago, but never completed the code to exploit them.

Hi Dave --

Yes, the TOC can be used to generate an on-time PPS. The Austron 2100
(sometimes referred to as the "T" model) receiver has that capability
(the more common 2100F version does not; it is useful only for frequency
comparisons).

I drive one of my NTP servers (a modified Soekris SBC running nanoBSD)
tied to the Austron 2100, and its crystal oscillator is locked to an
Austron 2010B disciplined oscillator that is driven from the LORAN
receiver. I suspect that this may be the only LORAN-based NTP system
running anywhere today. It keeps very good time; in fact, its noise is
comparable to a similar Soekris box that gets its PPS and oscillator
from a Z3801A GPSDO.

Of course, the Coast Guard just announced that they plan to turn the US
LORSTAs off in 2010 as part of budget cuts. I don't know if that is
political maneuvering to get a bigger piece of the pie, or a done deal.

John

David Mills

Mar 27, 2009, 5:44:10 PM

Bill and John,

To Bill: Once upon a time several reference clock drivers I wrote had
their own idiosyncratic PPS support. There are over forty now in the
driver collection. Years ago I pulled PPS support from all the drivers I
wrote in favor of the atom driver. Most drivers can be used with the
atom driver; the NMEA driver does use the $GPGSA sentence presumably to
police the PPS, but there is no evidence the other GPS drivers do. The
SHM driver you cite does much the same thing as the atom driver to
monitor the offset, but so far as I can see does not groom the signal
itself. Years ago I accidentally plugged a signal generator in the PPS and
the kernel went nuts.

You have a flawed interpretation of the NTP algorithms with respect to
the synchronization distance and selection threshold. There is no need
to wait for one day in seconds. That went away with NTPv4. As for a
sticky bit, that would not be hard to add once things like flag glut are
overcome, but from my experience here I would not recommend it for use
in a public server. It is not as simple as you think, as things like the
synchronization distance have to be mitigated, etc.

To John: I suspect you know that all LORSTA stations run by the Coast
Guard include in the weekly announcement series a time-of-coincidence
(TOC) second at which the epoch of the second is equal to the epoch of
the GRI. My Astron LORAN-C receiver, which was modified by the Coast
Guard, flashes a light at the TOC. The TOCs and the intervals between
them vary up to several minutes depending on the GRI of the chain. I
made provisions for the TOC in the LORAN-C receiver I built some years
ago, but never completed the code to exploit them.

Dave

Unruh wrote:

>j...@febo.com (John Ackermann N8UR) writes:
>
>
>

>>David Woolley wrote:
>>
>>
>>>Unruh wrote:
>>>
>>>
>>>
>>>>I have now looked at the refclock_atom source and indeed, it demands that a
>>>>prefer clocksource is available, and ignores the PPS if it is not. This I
>>>>believe is a bug, or at least a design infelicity. You could either hack
>>>>
>>>>
>>>I suspect the reasoning may be that both normally come from the same
>>>radio clock which will free-run the PPS when it loses a signal, so that
>>>detecting the failure of the NMEA data is the only way of telling that
>>>the PPS data is unreliable.
>>>
>>>
>
>
>

Unruh

Mar 27, 2009, 7:11:11 PM

mi...@udel.edu (David Mills) writes:

>Bill and John,

>To Bill: Once upon a time several reference clock drivers I wrote had
>their own idiosyncratic PPS support. There are over forty now in the
>driver collection. Years ago I pulled PPS support from all the drivers I
>wrote in favor of the atom driver. Most drivers can be used with the
>atom driver; the NMEA driver does use the $GPGSA sentence presumably to
>police the PPS, but there is no evidence the other GPS drivers do. The
>SHM driver you cite does much the same thing as the atom driver to
>monitor the offset, but so far as I can see does not groom the signal
>itself. Years ago I accidentally plugged a signal generator in the PPS and
>the kernel went nuts.

I am sure it would. However, that is not a failure mode one should be too
concerned about. On my GPS PPS, though, every once in a while it will issue
10 pulses in one second. No idea what is causing this noise, but I have
seen it on two versions of the GPS18. It is certainly true that the driver
should do something about this. (It does not, however, seem to have caused any
great problem with my system running from that source via the shm driver,
perhaps because it throws away 60% of inputs over the 16 sec poll interval.)

>You have a flawed interpretation of the NTP algorithms with respect to
>the synchronization distance and selection threshold. There is no need
>to wait for one day in seconds. That went away with NTPv4. As for a

That was put in there only if the person wanted to allow the PPS to
freewheel but only for a limited time (one day in my example). Ie, the PPS
source is assumed to be fine for 1 day even if no prefer peer is selected.

Whether there is other code in the atom driver which relies on there being
a prefer peer associated with the PPS I do not know.

David Mills

Mar 27, 2009, 8:24:21 PM

John,

Don't get me going on the Austron 21--. I had one and a 2200, too, and
they cost me more in repeated repair bills than new Spectracom GP
receivers, so I junked them. As for the Coasties, they have threatened
to turn LORSTAs off every few years since 1990. As for your suspicion
that you are the only NTP server synchronized to LORAN-C, be advised
Poul-Henning Kamp has a clone of my receiver running in Denmark with a
Rubidium timebase. Apparently, LORAN-C has new life in Europe.

Dave

John Ackermann N8UR wrote:

> David Mills said the following on 03/27/2009 05:44 PM:
>

>> To John: I suspect you know that all LORSTA stations run by the Coast
>> Guard include in the weekly announcement series a time-of-coincidence
>> (TOC) second at which the epoch of the second is equal to the epoch
>> of the GRI. My Astron LORAN-C receiver, which was modified by the
>> Coast Guard, flashes a light at the TOC. The TOCs and the intervals
>> between them vary up to several minutes depending on the GRI of the
>> chain. I made provisions for the TOC in the LORAN-C receiver I built
>> some years ago, but never completed the code to exploit them.
>
>

John Ackermann N8UR

Mar 27, 2009, 9:34:19 PM

Poul-Henning is doing some really neat stuff with a software defined
LORAN receiver, but he's not actually using that one for timekeeping
(that I know of). I didn't know he also had your design running as well.

John
----

David Mills said the following on 03/27/2009 08:24 PM:

David Mills

Mar 27, 2009, 10:17:57 PM

Bill,

The NTPv4 model is that the server does not prejudge what the client is
prepared to accept. The synchronization distance includes all credible
contributions to the maximum error budget, including the increase in
maximum error due to the frequency tolerance of the server timebase.
This is clearly expressed in the specification, along with the expectation
(a requirement for a secondary server) that if a server's distance
exceeds the selection threshold, the client should not consider that server
in the selection algorithm.

The client is free to adjust the selection threshold to fit its
requirements. This model applies to all sources, including the atom
driver. The server distance itself starts at the current value of the
system peer at the last selection and then increases from there, even if
all sources are lost. A dependent client will disregard the server when
the distance exceeds its selection threshold.
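
As a rough worked example of the distance model above: if the server's distance grows only by the frequency tolerance after all sources are lost, the time until a client drops it follows from simple arithmetic. The sketch below is in Python, and both constants are assumptions for illustration rather than values stated in this thread: PHI is taken as 15 ppm (the tolerance used in the NTPv4 specification) and MAXDIST as 1.5 s for the client's selection threshold, which the client is free to change.

```python
PHI = 15e-6      # assumed frequency tolerance, seconds per second
MAXDIST = 1.5    # assumed client selection threshold, seconds


def distance_after(initial_distance: float, elapsed: float,
                   phi: float = PHI) -> float:
    """Server distance `elapsed` seconds after its last update."""
    return initial_distance + phi * elapsed


def holdover_seconds(initial_distance: float, maxdist: float = MAXDIST,
                     phi: float = PHI) -> float:
    """Seconds until the distance reaches the client's threshold."""
    return max(0.0, (maxdist - initial_distance) / phi)
```

Under these assumed values, a server that starts from a distance near zero would be disregarded after about 100,000 s, a little under 28 hours; a client with a tighter threshold would let go sooner.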

I am happy to continue this discussion, but only if you read the
specification.

Dave

Unruh wrote:

>>To John: I suspect you know that all LORSTA stations run by the Coast
>>Guard include in the weekly announcement series a time-of-coincidence
>>(TOC) second at which the epoch of the second is equal to the epoch of
>>the GRI. My Astron LORAN-C receiver, which was modified by the Coast
>>Guard, flashes a light at the TOC. The TOCs and the intervals between
>>them vary up to several minutes depending on the GRI of the chain. I
>>made provisions for the TOC in the LORAN-C receiver I built some years
>>ago, but never completed the code to exploit them.
>>
>>
>
>
>

David Mills

Mar 27, 2009, 10:42:08 PM

John,

I built two of the radios several years ago with a rather expensive hot
rock timebase good to parts in 10^10. Unlike all of the commercial
radios I know about that use a hard-limited approach, mine used a
linear, matched-filter design that was comfortable at well below -10 dB
SNR. I gave one of the radios and a rock to P-H, but he has a better
rock and computer interface. The problem is/was that the radio used the
ISA interface, which of course exists only in museums. Modern designers
would use a fast ADC and FPGA and own the world.

Dave

John Ackermann N8UR wrote:

> Poul-Henning is doing some really neat stuff with a software defined
> LORAN receiver, but he's not actually using that one for timekeeping
> (that I know of). I didn't know he also had your design running as well.
>
> John
> ----
>
> David Mills said the following on 03/27/2009 08:24 PM:
>
>> John,
>>
>> Don't get me going on the Austron 21--. I had one and a 2200, too,
>> and they cost me more in repeated repair bills than new Spectracom GP
>> receivers, so I junked them. As for the Coasties, they have
>> threatened to turn LORSTAs off every few years since 1990. As for
>> your suspicion that you are the only NTP server synchronized to
>> LORAN-C, be advised Poul-Henning Kamp has a clone of my receiver
>> running in Denmark with a Rubidium timebase. Apparently, LORAN-C has
>> new life in Europe.
>>
>> Dave
>>
>> John Ackermann N8UR wrote:
>>
>>> David Mills said the following on 03/27/2009 05:44 PM:
>>>

>>>> To John: I suspect you know that all LORSTA stations run by the
>>>> Coast Guard include in the weekly announcement series a
>>>> time-of-coincidence (TOC) second at which the epoch of the second
>>>> is equal to the epoch of the GRI. My Astron LORAN-C receiver, which
>>>> was modified by the Coast Guard, flashes a light at the TOC. The
>>>> TOCs and the intervals between them vary up to several minutes
>>>> depending on the GRI of the chain. I made provisions for the TOC in
>>>> the LORAN-C receiver I built some years ago, but never completed
>>>> the code to exploit them.
>>>
>>>

>>> Hi Dave --
>>>
>>> Yes, the TOC can be used to generate an on-time PPS. The Austron
>>> 2100 (sometimes referred to as the "T" model) receiver has that
>>> capability (the more common 2100F version does not; it is useful
>>> only for frequency comparisons).
>>>
>>> I drive one of my NTP servers (a modified Soekris SBC running
>>> nanoBSD) tied to the Austron 2100, and its crystal oscillator is
>>> locked to an Austron 2010B disciplined oscillator that is driven
>>> from the LORAN receiver. I suspect that this may be the only
>>> LORAN-based NTP system running anywhere today. It keeps very good
>>> time; in fact, its noise is comparable to a similar Soekris box that
>>> gets its PPS and oscillator from a Z3801A GPSDO.
>>>
>>> Of course, the Coast Guard just announced that they plan to turn the
>>> US LORSTAs off in 2010 as part of budget cuts. I don't know if that
>>> is political maneuvering to get a bigger piece of the pie, or a done
>>> deal.
>>>
>>> John
>>
>>
>>

David Mills

Mar 27, 2009, 10:30:32 PM

Bill,

I have to top-post multiple messages as my eyesight is very poor and the
inline post uses a font too small for me to read even with a magnifying
glass.

The reason why a burst of pulses to the SHM driver does not cause a
problem is that, like the atom driver, the pulse buffer is scanned once
per second and new pulses overwrite old ones during the second. However,
during the one-second scan period, none, one or two pulses can occur
even if there is no bunching. The atom driver checks for these cases and
adjusts the range gate accordingly. While I didn't scrutinize the SHM
code closely, I presume it does this, too. The most important reason the
range gate is there is to protect against random spikes that
occasionally occur in the house wiring when some heavy machinery cranks
up. It is extremely effective in such cases.
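
As a rough illustration of the once-per-second scan and range gate described above, here is a minimal Python sketch. It is not the atom or SHM driver code: the function name `scan_second`, the gate width, and the way the pulse buffer is modelled are all assumptions made for the example.

```python
def scan_second(pulse_offsets, gate=0.0005):
    """Return the pulse offset accepted for this one-second scan, or None.

    pulse_offsets: offsets in seconds, relative to the expected on-time
    point, of the pulses captured since the previous scan (oldest first).
    gate: hypothetical half-width of the range gate, in seconds.
    """
    if not pulse_offsets:
        return None                # no pulse arrived during this scan
    latest = pulse_offsets[-1]     # newer pulses overwrite older ones
    if abs(latest) > gate:
        return None                # outside the gate: treat it as a spike
    return latest
```

A burst of pulses within one second therefore contributes at most one sample, and a spike far from the expected point contributes none.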

Dave

Unruh wrote:

>>To John: I suspect you know that all LORSTA stations run by the Coast
>>Guard include in the weekly announcement series a time-of-coincidence
>>(TOC) second at which the epoch of the second is equal to the epoch of
>>the GRI. My Austron LORAN-C receiver, which was modified by the Coast
>>Guard, flashes a light at the TOC. The TOCs and the intervals between
>>them vary up to several minutes depending on the GRI of the chain. I
>>made provisions for the TOC in the LORAN-C receiver I built some years
>>ago, but never completed the code to exploit them.

Unruh

Mar 27, 2009, 11:13:12 PM

mi...@udel.edu (David Mills) writes:

>Bill,

>The NTPv4 model is that the server does not prejudge what the client is
>prepared to accept. The synchronization distance includes all credible
>contributions to the maximum error budget, including the increase in
>maximum error due to the frequency tolerance of the server timebase.
>This is clearly expressed in the specification, along with the expectation
>for a client (and the requirement for a secondary server) that, if the distance
>exceeds the selection threshold, the server should not be considered by
>the selection algorithm.

>The client is free to adjust the selection threshold to fit its
>requirements. This model applies to all sources, including the atom
>driver. The server distance itself starts at the current value of the
>system peer at the last selection and then increases from there, even if
>all sources are lost. A dependent client will disregard the server when
>the distance exceeds its selection threshold.

Ah, so the atom driver is weird in that it is not regarded as a valid
source; its associated peer is regarded as the source, and if the peer is
lost, the "distance" keeps increasing via ntp's algorithm. Thus after a while
ntp will say that the PPS source has such a large error that clients will
disregard it. That is of course bizarre, since a PPS source typically has a
far, far smaller error than any of the prefer peers.
I think that this is again a design flaw in the atom driver, and it again tips
the balance toward using something like the shm refclock driver instead of
atom. At least the shm refclock is treated like a clock in its own right,
and it leaves it to the user, via whatever feeds the shm, to decide when the
PPS has become unreliable.

>I am happy to continue this discussion, but only if you read the
>specification.

I think this is an issue of how the atom driver treats the specification,
not of the specification itself.
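
The growth of the distance discussed above can be sketched numerically. This is only an illustration under stated assumptions: the 15 PPM frequency tolerance and the 1 second threshold are example values chosen here, the formula is a simplified form of the synchronization distance, and the function names are hypothetical.

```python
PHI = 15e-6        # assumed frequency tolerance, seconds per second
THRESHOLD = 1.0    # assumed client selection threshold, seconds


def distance(root_delay, root_dispersion, seconds_since_update):
    """Simplified synchronization distance after a period without updates."""
    return root_delay / 2 + root_dispersion + PHI * seconds_since_update


def seconds_until_rejected(root_delay, root_dispersion):
    """How long the distance stays at or below the assumed threshold."""
    start = distance(root_delay, root_dispersion, 0)
    return max(0.0, (THRESHOLD - start) / PHI)
```

With these assumed numbers, a server that starts near zero distance and then loses all sources would cross the threshold after roughly 18.5 hours, however good its PPS signal still is.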

Harlan Stenn

Mar 28, 2009, 12:06:26 AM

>>> In article <968zl.19988$PH1.19347@edtnps82>, Unruh <unruh...@physics.ubc.ca> writes:

Unruh> That may be the logic, but it is seriously flawed. It also indicates
Unruh> that the decision to interpret PPS separately from the other drivers
Unruh> is flawed. atom should ONLY be used for a single separate PPS source
Unruh> decoupled from anything else. If your PPS is combined with nmea then
Unruh> that nmea driver should be running the PPS since it is the
Unruh> combination that gives the time, and it is the combination that knows
Unruh> whether the PPS is good or not.

Unruh> Using some heuristic like "is a prefer source available" is a pretty
Unruh> poor substitute for knowing whether the PPS is good or not.

I think I probably agree with you Bill.

From one POV, it seems to me that each "instance" of a PPS source should
come with "health information" about that PPS instance. This "health"
information should include the current expected precision of the PPS signal.

Some PPS devices are "stand-alone" while others are "combined" with, say, a
GPS signal.

For the former case, it makes sense to treat them as separate sources of
time.

For the latter, it makes sense to treat the PPS signal as an "adjunct"
device.

See http://support.ntp.org/bin/view/Dev/GettingNtpdItsConfiguration for more
information.

The bottom line is we need to figure out what else needs to be done to make
the "state of affairs" with PPS sources even more generally useful.

Bill Unruh

Mar 28, 2009, 2:52:21 AM

Harlan Stenn <st...@ntp.org> writes:

>>>> In article <968zl.19988$PH1.19347@edtnps82>, Unruh <unruh...@physics.ubc.ca> writes:

>Unruh> That may be the logic, but it is seriously flawed. It also indicates
>Unruh> that the decision to interpret PPS separately from the other drivers
>Unruh> is flawed. atom should ONLY be used for a single separate PPS source
>Unruh> decoupled from anything else. If your PPS is combined with nmea then
>Unruh> that nmea driver should be running the PPS since it is the
>Unruh> combination that gives the time, and it is the combination that knows
>Unruh> whether the PPS is good or not.

>Unruh> Using some heuristic like "is a prefer source available" is a pretty
>Unruh> poor substitute for knowing whether the PPS is good or not.

>I think I probably agree with you Bill.

>From one POV, it seems to me that each "instance" of a PPS source should
>come with "health information" about that PPS instance. This "health"
>information should include the current expected precision of the PPS signal.

Not sure what the health info would be. The health of PPS is either great,
(if the seconds is good) or atrocious (if the second is bad), and the
driver presumably does not know this or it would have corrected the seconds
(remember that this is like the date being a month out, if your accuracy
was a second).

>Some PPS devices are "stand-alone" while others are "combined" with, say, a
>GPS signal.

>For the former case, it makes sense to treat them as separate sources of
>time.

>For the latter, it makes sense to treat the PPS signal as an "adjunct"
>device.

Well, I would say that the GPS is the adjunct device since the PPS is far
far more accurate (4 orders of magnitude).

>See http://support.ntp.org/bin/view/Dev/GettingNtpdItsConfiguration for more
>information.

>The bottom line is we need to figure out what else needs to be done to make
>the "state of affairs" with PPS sources even more generally useful.

Clearly the PPS needs something auxiliary to determine the seconds. On the
other hand, once that second lock is obtained it is hard to lose it. The system clock
on most computers is not going to drift by 500,000 PPM
(especially not most time servers).
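
The 500,000 PPM figure follows from simple arithmetic: to land in the wrong second between two consecutive pulses, the clock would have to gain or lose half a second within one second. A small Python sketch of that calculation follows; the helper name and the 100 PPM example oscillator are assumptions for illustration only.

```python
def seconds_to_drift(limit_s, rate_ppm):
    """Time for a clock running rate_ppm fast or slow to accumulate limit_s of error."""
    return limit_s / (rate_ppm * 1e-6)


# Half a second of error inside a single one-second pulse interval:
needed_ppm = 0.5 / 1.0 * 1e6

# A hypothetical, completely undisciplined 100 PPM oscillator:
unattended = seconds_to_drift(0.5, 100)
```

Here `needed_ppm` works out to 500,000, while the undisciplined 100 PPM clock needs about 5,000 seconds to drift by half a second, so a pulse arriving every second leaves a very wide margin.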

Now maybe I am not imaginative enough to think of what could go wrong.

Harlan Stenn

Mar 28, 2009, 4:48:35 AM

>>> In article <FOjzl.18914$Db2.8066@edtnps83>, Bill Unruh <un...@physics.ubc.ca> writes:

> Harlan Stenn <st...@ntp.org> writes:
>> From one POV, it seems to me that each "instance" of a PPS source should
>> come with "health information" about that PPS instance. This "health"
>> information should include the current expected precision of the PPS
>> signal.

Bill> Not sure what the health info would be. The health of PPS is either
Bill> great, (if the seconds is good) or atrocious (if the second is bad),
Bill> and the driver presumably does not know this or it would have
Bill> corrected the seconds (remember that this is like the date being a
Bill> month out, if your accuracy was a second).

There are at least two issues here.

In one case, there can be a GPS device that may deliver a PPS second but
that PPS is not really sync'd. The GPS device in this case will usually
have this information ([do not] believe the PPS signal) in there.

In another case, the Stanford Research PRS10 rubidium frequency standard has
an RS-232 output port that has diagnostic info, including letting you know
if the device is in its "warming up" state, which means during those 6 or 7
minutes the pulse is not yet as accurate as it should be.

I suspect there are other, similar situations with other devices.

>> Some PPS devices are "stand-alone" while others are "combined" with, say,
>> a GPS signal.

>> For the former case, it makes sense to treat them as separate sources of
>> time.

>> For the latter, it makes sense to treat the PPS signal as an "adjunct"
>> device.

Bill> Well, I would say that the GPS is the adjunct device since the PPS is
Bill> far far more accurate (4 orders of magnitude).

That may be, but the smarts for the driver are in the GPS side of things,
not on the PPS side of things. We'd probably want the config file to say
things like "I'm an XYZ refclock that has a PPS signal", instead of "I'm a
PPS signal that comes from an XYZ device".

If the final code can handle this either way I don't care. But I figure
there is a good chance it will not be that way.

>> See http://support.ntp.org/bin/view/Dev/GettingNtpdItsConfiguration for
>> more information.

>> The bottom line is we need to figure out what else needs to be done to
>> make the "state of affairs" with PPS sources even more generally useful.

Bill> Clearly the PPS needs something auxiliary to determine the
Bill> seconds. On the other hand, once that second lock is obtained it is
Bill> hard to lose it. The system clock on most computers is not going to
Bill> drift by 500,000 PPM (especially not most time servers).

For me, the issue for a PPS source is how much we can believe that the PPS
signal arrives at the instant of the stroke of the second.

That is a separate issue from "about what time is it".

Bill> Now maybe I am not imaginative enough to think of what could go wrong.

I know I'm not.

John Ackermann N8UR

Mar 28, 2009, 7:34:18 AM

Harlan Stenn said the following on 03/28/2009 04:48 AM:

> Bill> Not sure what the health info would be. The health of PPS is either
> Bill> great, (if the seconds is good) or atrocious (if the second is bad),
> Bill> and the driver presumably does not know this or it would have
> Bill> corrected the seconds (remember that this is like the date being a
> Bill> month out, if your accuracy was a second).
>
> There are at least two issues here.
>
> In one case, there can be a GPS device that may deliver a PPS second but
> that PPS is not really sync'd. The GPS device in this case will usually
> have this information ([do not] believe the PPS signal) in there.
>
> In another case, the Stanford Research PRS10 rubidium frequency standard has
> an RS-232 output port that has diagnostic info, including letting you know
> if the device is in its "warming up" state, which means during those 6 or 7
> minutes the pulse is not yet as accurate as it should be.

I'm not sure if the PRS10 health info is really that helpful, for NTP
purposes. Even when the Rb is warming up, or out of lock, it will still
be far more stable and accurate than any mobo rock. e.g., in the first
minute after power-up it might be parts in 10e6 off, and that error will
close very rapidly as it warms. That's a huge error for the time-nuts
crowd, but still not bad in the NTP world.

The real problem is determining whether a PPS source is accurately
synced to UTC. For example, when the PRS10 is warm and locked, it will
generate a very high quality PPS signal, but by itself that signal will
have a random offset of up to 0.5 second with respect to UTC. It needs
to be sync'd against a UTC source before the PPS is useful for timekeeping.

It seems to me that it should be pretty easy to do a sanity check against
external servers -- if the PPS is more than some fairly large number of
milliseconds (ideally, that number should be configurable) off of some
known sane external servers, we can assume it is not sync'd and ignore
it. Within that window, assume that it's the most accurate clock in the
house.
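
A minimal Python sketch of the sanity check suggested above, assuming offsets are expressed in seconds relative to the system clock. The function name, the use of the median of the external servers, and the 100 ms default window are all assumptions for illustration; none of this is an existing ntpd feature.

```python
from statistics import median


def pps_is_sane(pps_offset, server_offsets, window=0.1):
    """Accept the PPS only if it agrees with the external servers.

    pps_offset: offset of the PPS edge, in seconds.
    server_offsets: offsets of the known sane external servers, in seconds.
    window: configurable limit in seconds (hypothetical default of 100 ms).
    """
    if not server_offsets:
        return False    # nothing to check against, so do not trust the PPS
    return abs(pps_offset - median(server_offsets)) <= window
```

Refusing the PPS when no external server is reachable is one possible policy; a standalone cesium setup might prefer the opposite choice.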

John

Harlan Stenn

Mar 28, 2009, 3:27:46 PM

>>> In article <49CE0B3A...@febo.com>, j...@febo.com (John Ackermann N8UR) writes:

> Harlan Stenn said the following on 03/28/2009 04:48 AM:

>> In one case, there can be a GPS device that may deliver a PPS second but
>> that PPS is not really sync'd. The GPS device in this case will usually
>> have this information ([do not] believe the PPS signal) in there.

This first case is significant in that ntpd needs to know if the PPS should
be believed or not.

>> In another case, the Stanford Research PRS10 rubidium frequency standard
>> has an RS-232 output port that has diagnostic info, including letting you
>> know if the device is in its "warming up" state, which means during those
>> 6 or 7 minutes the pulse is not yet as accurate as it should be.

John> I'm not sure if the PRS10 health info is really that helpful, for NTP
John> purposes. Even when the Rb is warming up, or out of lock, it will
John> still be far more stable and accurate than any mobo rock. e.g., in
John> the first minute after power-up it might be parts in 10e6 off, and
John> that error will close very rapidly as it warms. That's a huge error
John> for the time-nuts crowd, but still not bad in the NTP world.

These two worlds are getting closer and closer all the time, and I'd like to
make that "journey" as easy and useful as possible.

This also goes to "truth in advertising".

I believe there is benefit if I query an NTP server and it replies "here's
my timestamp and I'm 10e6" because its PRS10 is warming up, and a subsequent
query says " ... and I'm 10e9" (or whatever).

Likewise, I think it would be swell if a GPS refclock could also have a
sense of its precision based on the number of satellites it sees and how
well it thinks it knows its position.

This information is a bit dynamic, and I think the better we can do at
"shaving the fuzz" (or even knowing exactly where different types of fuzz
are) the better ntp will be able to do its job.

John> The real problem is determining whether a PPS source is accurately
John> synced to UTC. For example, when the PRS10 is warm and locked, it
John> will generate a very high quality PPS signal, but by itself that
John> signal will have a random offset of up to 0.5 second with respect to
John> UTC. It needs to be sync'd against a UTC source before the PPS is
John> useful for timekeeping.

From an audit perspective I wonder what it would take to track "how long has
it been since we had sync" for devices like this.

All I know about the units is what I have read from their website. Since
they are not far from me when I'm in the Bay area, I figure some day I'll
call them and see if I can visit.

John> It seems to me that it should be pretty easy to do a sanity check against
John> external servers -- if the PPS is more than some fairly large number
John> of milliseconds (ideally, that number should be configurable) off of
John> some known sane external servers, we can assume it is not sync'd and
John> ignore it. Within that window, assume that it's the most accurate
John> clock in the house.

Yup, and then what can/should we do with information like that, as far as
"operational feedback".

Kevin Oberman

Mar 28, 2009, 6:53:13 PM

> From: Harlan Stenn <st...@ntp.org>
> Date: Sat, 28 Mar 2009 08:48:35 +0000
> Sender: questions-bounces+oberman=es....@lists.ntp.org

This has been a real issue for us. We have about 26 ntp servers
scattered across our network using CDMA clocks. There are a few places
where the signal is poor and clock will lose sync.

When this happens, the PPS continues, but will slowly drift with respect
to actual time. Unfortunately, even though the time from the clock is
marked as unsynced and ntpd stops using that time, it continues to train
the time with the drifting PPS signal, so the time slowly drifts away
from where it should be.

Ideally, if the source of the time being trained by the PPS is bad, the
PPS also should be considered bad and kernel PPS should be disabled.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: obe...@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

John Ackermann N8UR

Mar 28, 2009, 7:26:07 PM

Kevin Oberman said the following on 03/28/2009 06:53 PM:

> Ideally, if the source of the time being trained by the PPS is bad, the
> PPS also should be considered bad and kernel PPS should be disabled.

This should only be the behaviour for a refclock that provides both a
PPS and a timecode. If it has a PPS only, this results in lower
reliability because the PPS could be just fine while the independent
prefer peer goes insane.

John

David Mills

Mar 28, 2009, 11:12:54 PM

John,

The intended design to detect and suppress bad reference/PPS clocks is
at least two additional sources, that do not have to be reference
clocks. If the reference/PPS clock sails to the sunset, the selection
algorithm will vote it off and the PPS will follow. The server will
continue at whatever surviving source stratum plus one. This might not be
considered perfect, but it would avoid real disaster.
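
The voting described above can be illustrated with a much simplified interval-intersection sketch in Python. It is not ntpd's selection code: the function name, the data layout, and the plain majority rule are assumptions made for the example. Each source contributes the interval offset ± distance, and the sources whose intervals share the most heavily overlapped region are kept.

```python
def truechimers(sources):
    """Return the names of the largest agreeing group of sources.

    sources: dict mapping name -> (offset, distance), both in seconds,
    with distance >= 0. Returns an empty set if no majority agrees.
    """
    edges = []
    for offset, distance in sources.values():
        edges.append((offset - distance, -1))   # interval opens
        edges.append((offset + distance, +1))   # interval closes
    edges.sort()                                # opens sort before closes on ties

    count = best = 0
    lo = hi = None
    for i, (point, kind) in enumerate(edges):
        if kind == -1:
            count += 1
            if count > best:                    # new deepest overlap starts here
                best, lo, hi = count, point, edges[i + 1][0]
        else:
            count -= 1

    if best * 2 <= len(sources):
        return set()                            # no majority, trust nobody
    return {name for name, (offset, distance) in sources.items()
            if offset - distance <= lo and offset + distance >= hi}
```

For example, a reference/PPS clock that has drifted to an offset of 0.3 s with a tiny claimed distance is outvoted by two network servers that agree with each other near zero, which is the behaviour described for the two additional sources.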

Dave

John Ackermann N8UR wrote:

Unruh

Mar 29, 2009, 12:16:23 AM

obe...@es.net (Kevin Oberman) writes:

You mean the cdma clock will continue to deliver a PPS signal running it
off an internal drifting clock? A PPS is simply a pulse every second. How
does it deliver the information that it is bad? Or you know it is bad
because it starts to drift with respect to other clocks you trust more?
And why do you trust them more?

>to actual time. Unfortunately, even though the time from the clock is
>marked as unsynced and ntpd stops using that time, it continues to train
>the time with the drifting PPS signal, so the time slowly drifts away
>from where it should be.

What refclock driver are you using?

>Ideally, if the source of the time being trained by the PPS is bad, the
>PPS also should be considered bad and kernel PPS should be disabled.

And how does it know that the time being trained by the PPS is bad?

Kevin Oberman

Mar 29, 2009, 1:30:07 AM

> From: Unruh <unruh...@physics.ubc.ca>
> Date: Sun, 29 Mar 2009 04:16:23 GMT
> Sender: questions-bounces+oberman=es....@lists.ntp.org

Yes. The clock free runs and continues to deliver PPS and time from its
internal oscillator (which is quite a bit more accurate than most PC
oscillators, but still drifts over a period of days). It marks the time
as unsynchronized and ntpd treats it as such. It does not use it.

I know it is bad because I have 25 clocks to compare with. Of those, 20
are stratum 1, so it's pretty obvious when the others show time within a
few microseconds and this one is off by milliseconds after a couple of
weeks or so.

>
>
> >to actual time. Unfortunately, even though the time from the clock is
> >marked as unsynced and ntpd stops using that time, it continues to train
> >the time with the drifting PPS signal, so the time slowly drifts away
> >from where it should be.
>
> What refclock driver are you using?

TrueTime (5) and PPS (22). The clock is an Endrun Technologies Præcis Ct.
http://endruntechnologies.com/network-time-source.htm

>
> >Ideally, if the source of the time being trained by the PPS is bad, the
> >PPS also should be considered bad and kernel PPS should be disabled.
>
> And how does it know that the time being trained by the PPS is bad?

It is marked as unsynchronized by the clock.

John Ackermann N8UR

Mar 29, 2009, 7:21:21 AM

David Mills said the following on 03/28/2009 11:12 PM:

> John,
>
> The intended design to detect and suppress bad reference/PPS clocks is
> at least two additional sources, that do not have to be reference
> clocks. If the reference/PPS clock sails to the sunset, the selection
> algorithm will vote it off and the PPS will follow. The server will
> continue at whatever surviving source stratum plus one. This might not be
> considered perfect, but it would avoid real disaster.

Dave, I fully agree with you that the PPS signal should be subject to
the selection algorithm. It's the tyranny of the prefer peer that I
object to.

John

alko...@googlemail.com

Mar 29, 2009, 10:37:57 AM

On Mar 29, 5:12 am, mi...@udel.edu (David Mills) wrote:
> John,
>
> The intended design to detect and suppress bad reference/PPS clocks is
> at least two additional sources, that do not have to be reference
> clocks. If the reference/PPS clock sails to the sunset, the selection
> algorithm will vote it off and the PPS will follow.

In my case I would trust my PPS signal much more than any other
source. Why should I run a caesium frequency normal and not trust it?

Rob

Mar 29, 2009, 10:57:17 AM

The point is that a PPS signal by itself does not provide any information
about its reliability. PPS signals sent by GPS receivers can go
unsynchronized simply because the antenna gets disconnected or covered,
and the receiver will happily send a pulse every second that is drifting
away at a rate determined by the quality of a quartz crystal.

David Mills

Mar 29, 2009, 12:37:06 PM

John,

The intersection algorithm has been documented in several places along
with configuration controls to modify its behavior. Is the tyranny you
cite due to that algorithm or the notion of the prefer peer in the first
place? If the latter, do you have an alternate suggestion?

Dave

Unruh

Mar 29, 2009, 1:24:42 PM

mi...@udel.edu (David Mills) writes:

>John,

>The intersection algorithm has been documented in several places along
>with configuration controls to modify its behavior. Is the tyranny you
>cite due to that algorithm or the notion of the prefer peer in the first
>place? If the latter, do you have an alternate suggestion?

Well, I would suggest that the atom driver have a flag -- a fudge or
something -- which would dissociate it from any prefer peer once initial lock
had been obtained. I.e., it would regard the PPS as the best time source
whether any prefer peer exists or not. That way, if someone wants the
current behaviour they can have it, and if they want the PPS to take
precedence they can have that as well. If you are going to have a single
PPS driver, which you appear to want, then it is a good idea to make it
very flexible and able to be set up to the user's desires.

David Mills

Mar 29, 2009, 1:09:53 PM

alkopedia,

That's not the point. No matter how much you trust the Cs, do you trust
the seconds numbering? Say you reliably number the seconds and then
disconnect the numbering source. Obviously, you have to reestablish
numbering every time you reboot. Would you require renumbering when the
daemon is restarted? Do you assume nothing happens that might torque the
clock to another second, such as a stuck interrupt or any hardware
disruption? How long are you willing to wait before requiring
renumbering? A day, a week, forever?

There is a really simple thing to do exactly as you wish. Use the
configuration I recommended and enable orphan mode. This works only if
the kernel PPS is operating.

Dave

alko...@googlemail.com wrote:

John Ackermann N8UR

Mar 29, 2009, 1:00:21 PM

It's the concept of requiring a single prefer peer in the case where the
PPS signal doesn't provide its own timecode.

Both my LORAN-C and my Cesium sources provide PPS without accompanying
timecode (they are on different servers, so this isn't the multiple-PPS
problem). I have to pick another server to be the prefer peer for each.
If that server fails, the PPS signal is ignored, even if the PPS is
still valid, and even if the PPS is still within a sane value of the
other servers in the selection set.

Most people have a refclock that provides both PPS and timecode, and in
that case treating the two together makes perfect sense. But that's not
the case I'm concerned about.

John
----

David Mills said the following on 03/29/2009 12:37 PM:

Unruh

Mar 29, 2009, 2:25:47 PM

mi...@udel.edu (David Mills) writes:

>alkopedia,

>That's not the point. No matter how much you trust the Cs, do you trust
>the seconds numbering? Say you reliably number the seconds and then
>disconnect the numbering source. Obviously, you have to reestablish
>numbering every time you reboot. Would you require renumbering when the
>daemon is restarted? Do you assume nothing happens that might torque the
>clock to another second, such as a stuck interrupt, or any hardware
>disruption. How long are you willing to wait before requiring
>renumbering? A day, a week, forever?

Clearly a reboot requires a renumbering. Other than that,
let him decide. I.e., have a flag telling the atom driver to trust
the PPS (with the local clock's numbering of the seconds), together with
the number of seconds a free-running PPS will be trusted for.
E.g., no fudge1: do not trust it without a peer to number the seconds;
fudge1 0: trust it forever; fudge1 86400: trust it for a day; etc.
In all cases do not trust it until the system clock has been brought within .25
sec, say, by another clock.
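
The proposed policy can be written out as a small decision function. This Python sketch only illustrates the suggestion above: `fudge1` is the setting proposed in this post, not an option I know to exist in ntpd, and the function and parameter names are hypothetical.

```python
def trust_pps(hold, prefer_alive, seconds_since_prefer, clock_was_set):
    """Decide whether a free-running PPS should still be trusted.

    hold: None when no fudge1 is given, 0 for "forever", or a number of seconds.
    prefer_alive: True while a peer is still numbering the seconds.
    seconds_since_prefer: time since that peer was last usable.
    clock_was_set: True once another clock has brought the system clock
    to within about 0.25 s.
    """
    if not clock_was_set:
        return False          # never trust the PPS before initial lock
    if prefer_alive:
        return True           # current behaviour: a peer numbers the seconds
    if hold is None:
        return False          # no fudge1: require a live peer
    if hold == 0:
        return True           # fudge1 0: trust it forever
    return seconds_since_prefer <= hold
```

With `hold=86400`, for instance, the PPS would keep disciplining the clock for one day after the numbering peer disappears and would be dropped after that.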

alko...@googlemail.com

Mar 29, 2009, 6:10:07 PM

On Mar 29, 8:25 pm, Unruh <unruh-s...@physics.ubc.ca> wrote:
> Eg, no fudge1, do not trust it without a peer to number the seconds, fudge1 0 trust it forever, fudge1 86400
> trust it for a day, etc.
> In all cases do not trust it until the system clock has been brought within .25
> sec, say, by another clock.

In my opinion this is the perfect solution for this kind of problem.
Of course numbering is needed, but as long as the system clock is
synced by PPS it is very unlikely that system time falls to another
second.
For the moment I've configured a test server to use LOCAL as prefer
peer and PPS. This works fine as long as system time is in the right
second at ntpd startup.

Martin Burnicki

Apr 3, 2009, 4:28:14 PM

Unruh wrote:
> mi...@udel.edu (David Mills) writes:
>
>>John,
>
>>The intersection algorithm has been documented in several places along
>>with configuration controls to modify its behavior. Is the tyranny you
>>cite due to that algorithm or the notion of the prefer peer in the first
>>place? If the latter, do you have an alternate suggestion?
>
> Well, I would suggest that the atom driver has a flag-- a fudge or
> something, which would dissociate it from any prefer peer once initial lock
> had been obtained. I.e., it would regard the PPS as the best time source
> whether any prefer peer exists or not. That way, if someone wants the
> current behaviour they can have it, and if they want the PPS to take
> precedence they can have that as well. If you are going to have a single
> PPS driver, which you appear to want, then it is a good idea to make it
> very flexible and able to be set up to the user's desires.

If the PPS input is assigned to a refclock then the current behaviour is to
mark the associated refclock as "prefer". Unfortunately you can mark only
one time source as "prefer", and marking several "prefer" sources would be
at least ambiguous.

Maybe a good way to solve this would be to do it the other way round.
Configure an atom driver and assign the pseudo IP of a refclock to it, e.g.
using a fudge command or a keyword on the configuration line. This could
tell the NTP kernel this PPS source is associated to that refclock, and it
could evaluate the refclock sync status to determine whether the PPS source
is freewheeling because the refclock fell out of sync, or not.

This would also provide a simple way to configure several PPS sources in
parallel, each associated to the configured refclock.

And, if no refclock association is defined for a PPS source, ntpd could
assume this is from an independent PPS source like a cesium.
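
The association proposed above could look roughly like the following Python sketch. It models the idea only; the function name, the data layout, and the example pseudo address are assumptions, and nothing here reflects an existing ntpd configuration option.

```python
def pps_usable(associated_refclock, refclock_synced):
    """Decide whether a PPS source may be used.

    associated_refclock: pseudo IP of the refclock this PPS belongs to,
    for example "127.127.8.0", or None for an independent PPS source.
    refclock_synced: dict mapping pseudo IP -> True if that refclock
    currently reports itself as synchronized.
    """
    if associated_refclock is None:
        return True     # independent source, such as a cesium standard
    # An associated PPS follows the sync status of its own refclock.
    return refclock_synced.get(associated_refclock, False)
```

Because each PPS source carries its own association, several PPS inputs could be evaluated in parallel, each against a different refclock.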

Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany

Martin Burnicki

Apr 3, 2009, 4:42:36 PM

Unruh wrote:
> You mean the cdma clock will continue to deliver a PPS signal running it
> off an internal drifting clock? A PPS is simply a pulse every second. How
> does it deliver the information that it is bad? Or you know it is bad
> because it starts to drift with respect to other clocks you trust more?
> And why do you trust them more?

If you look at this from the refclock's point of view this may make sense.

If the refclock starts up the time is not synchronized, and the PPS output
may be disabled. After the refclock has synchronized to its time source,
e.g. the GPS sats, status changes to synchronized and the PPS output is
enabled.

If the refclock *then* fails to receive the satellites the internal timing
is not immediately bad, since many refclocks provide an oscillator which is
orders of magnitude better than the cheap crystal in a PC. If the refclock
additionally tells its "time consumer" (ntpd or whatever) via some status
flags that it has lost synchronization, then the "time consumer" can decide
whether to still accept the refclock plus PPS as time source, or not.

The now unsynchronized refclock may still be accepted if no other reliable
time source is available, and may be discarded if there's a better time
source.

Configuration of this could be simplified using the proposal in my other
post, i.e. assign a refclock's pseudo IP to a PPS source instead of
marking one refclock as "prefer".

BTW, this is also how ntpd itself works. If an ntpd node syncs to an
upstream time source it declares itself as synchronized with a good
stratum, depending on the stratum of the upstream source. After the
upstream source has become unreachable the ntpd node keeps claiming to be
synchronized, at least as indicated by the leap bits, and it also keeps its
stratum, at least if no other time source has been configured which is
still reachable.