WARNING: someone's faking a leap second tonight

Marco Marongiu

Jul 31, 2012, 4:23:35 PM

Hi all

This is just to warn you that there are now some NTP servers around the
globe spreading a leap second announcement for tomorrow 00:00:00 UTC
(so, basically, in a few hours now).

If you didn't take action before the leapocalypse last month, you'd better
hurry now.

Ciao
-- bronto

jcle...@gmail.com

Jul 31, 2012, 10:40:11 PM

Yes, this affected us. Can someone explain why this was done? Was it designed to be a test of some kind? The Linux leap second kernel bug that was discovered a month ago was only patched on July 17; that patched kernel has presumably not made it to many (most?) people yet. So if it's a test it seems wildly premature.

E-Mail Sent to this address will be added to the BlackLists

Aug 1, 2012, 1:05:20 AM

jcle...@gmail.com wrote:
> Can someone explain why this was done?

(Shrug)

If they were pool clocks, anything is possible;
occasionally the date appears to change on some of them.

--
E-Mail Sent to this address <Blac...@Anitech-Systems.com>
will be added to the BlackLists.

Marco Marongiu

Aug 1, 2012, 4:28:49 AM

On 01/08/12 04:40, jcle...@gmail.com wrote:
> Yes, this affected us. Can someone explain why this was done? Was
> it designed to be a test of some kind? The Linux leap second kernel
> bug that was discovered a month ago was only patched on July 17; that
> patched kernel has presumably not made it to many (most?) people yet.
> So if it's a test it seems wildly premature.

I tried to collect some information from around the globe, but got little
or no feedback. I *suspect* that this could be a rather imaginative
attempt at a worldwide DoS.

Anyway, a colleague of mine is now hunting down some upstreams that
faked the leap second. If we get something out of his research, I'll let
you know.

Ciao
-- bronto

Marco Marongiu

Aug 1, 2012, 8:58:53 AM

On 01/08/12 10:28, Marco Marongiu wrote:
> I tried to collect some information around the globe, but with scarce/no
> feedback. I am *suspecting* that this could be a rather imaginative
> attempt to DOS worldwide.
>
> Anyway, a colleague of mine is now hunting down some upstreams that
> faked the leap second. If we get something out of his research, I'll let
> you know.

While my colleague is working with a stratum 1 timekeeper to investigate
this better, I called the people at INRiM in Italy -- INRiM is the
institution responsible for the official Italian time
(http://www.inrim.it/index.shtml). Mr. Pettiti confirmed there was *no*
leap second scheduled yesterday (as we all suspected, right?), so that
is definitely a fake.

It may well be a DOS attempt, but as another colleague of mine suggests,
it could also be a bug in some upstream servers, which didn't disarm the
leap second after June 30th, and propagated it again yesterday.

Question now is: assuming those servers were running ntpd, was such a
bug reported at some point?

Ciao
-- bronto

demo...@gmail.com

Aug 1, 2012, 10:41:17 AM

Hi, I tried to find some information because there was another leap second, but I didn't find anything about this issue. Does anybody have information about what happened, or know whether it was a DoS attack or a problem with the NTP services (I'm using the Debian NTP pool servers)?

Thanks in advance

Marco Marongiu

Aug 1, 2012, 10:16:09 AM

On 01/08/12 14:58, Marco Marongiu wrote:
> Question now is: assuming those servers were running ntpd, was such a
> bug reported at some point?

Plus, another question. If one uses the leapfile, are spurious leap
second notifications like this one discarded?

From the docs at http://doc.ntp.org/4.2.6/ntpd.html#leap I can't tell
whether the leapseconds file is authoritative, in the sense that a leap
second notification is discarded when it announces a leap second that is
not in the file. If so, that would help to avoid spurious leap seconds
like this one.
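
One way to picture the "authoritative leapfile" policy asked about here is the sketch below. It is not ntpd's implementation, and whether ntpd behaves this way is exactly the open question. It assumes the usual leap-seconds file layout (comment lines starting with `#`, data lines starting with an NTP timestamp in seconds since 1900-01-01 followed by the TAI offset). The function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds from this instant.
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def leap_instants(leapfile_text):
    """Return the set of UTC instants at which the file lists a leap second."""
    instants = set()
    for line in leapfile_text.splitlines():
        # Drop comments (everything after '#') and skip empty lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        ntp_seconds = int(data.split()[0])
        instants.add(NTP_EPOCH + timedelta(seconds=ntp_seconds))
    return instants


def announcement_is_listed(leap_instant, leapfile_text):
    """Hypothetical policy: accept an announced leap only if the file lists it."""
    return leap_instant in leap_instants(leapfile_text)


# One data line in the assumed layout: the leap second at the end of June 2012.
SAMPLE = "# illustrative excerpt\n3550089600\t35\t# 1 Jul 2012\n"

real = datetime(2012, 7, 1, tzinfo=timezone.utc)
fake = datetime(2012, 8, 1, tzinfo=timezone.utc)
print(announcement_is_listed(real, SAMPLE), announcement_is_listed(fake, SAMPLE))
```

Under this policy the announcement for the end of July would be dropped, because the file contains no entry for 1 Aug 2012 00:00:00 UTC.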

Ciao
-- bronto

steven Sommars

Aug 1, 2012, 10:33:25 AM

I've seen no evidence of a denial-of-service attack; bugs are more likely.
Several stratum-one servers have been advertising LI=1 continuously for
the past month. Others alternate between LI=0 and LI=1.
Most servers claim to run ntpd.

There are over 10 stratum-one servers that advertise LI=1 as of Wed Aug 1
14:18:51 UTC 2012. Unless this changes, another false leap second could
occur on August 31, 2012.

Rob

Aug 1, 2012, 11:54:49 AM

steven Sommars <steveso...@gmail.com> wrote:
> I've seen no evidence of a denial of service attack, bugs are more likely.
> Several stratum one servers have been advertising LI=1 continuously for
> the past month. Others alternate between LI=0 and LI=1.
> Most servers claim to run ntpd.
>
> There are over 10 stratum one's that advertise LI=1 as of Wed Aug 1
> 14:18:51 UTC 2012. Unless this changes another false leap second could
> occur on August 31, 2012

When a leap second occurs as a result of these bits, that is a bug in
its own right, because leap seconds can only occur at the end of a quarter.

demo...@gmail.com

Aug 1, 2012, 1:36:15 PM

All,

I found that: http://www.greyware.com/kb/kb2012.717.asp

One of my internal NTP servers has the leap flag set to 01. The fake leap second occurred on the servers that use this NTP server as their preferred time source, so I guess that was my problem.

In order to check the leap flag I used ntpq.

First I checked which NTP server is the preferred one with the "peers" command.
After that I ran the "associations" command to obtain the assids of the NTP servers. Once I had the assids, I checked the status of the NTP servers using the "pstatus" command (usage: pstatus ASSID_NUMBER). In the "pstatus" output I saw that the "leap" flag is set to 01.

$ ntpq
ntpq> peers
 remote           refid           st t when poll reach    delay   offset  jitter
==============================================================================
+ntp0.example.net 76.20.162.121    3 u  783 1024   377    0.296    2.173   0.772
*ntp1.example.net 187.35.120.100   3 u  943 1024   377    0.339   -1.158   0.481
 LOCAL(0)         .LOCL.          10 l  10h   64     0    0.000    0.000   0.000
ntpq> associations
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 12345  9424   yes   yes  none candidate   reachable  2
  2 12346  963a   yes   yes  none  sys.peer    sys_peer  3
  3 12347  8043   yes    no  none    reject unreachable  4
ntpq> pstatus 12346
associd=123456 status=9424 conf, reach, sel_candidate, 2 events, reachable,
srcadr=ntp1.example.net, srcport=123, dstadr=192.168.10.1, dstport=123,
leap=01, stratum=3, precision=-20, rootdelay=124.969, rootdisp=63.766,
refid=187.35.120.100,
reftime=d3c3d39e.c685fc7f Wed, Aug 1 2012 16:11:10.775,
rec=d3c3d7ac.5f9dfe79 Wed, Aug 1 2012 16:28:28.373, reach=377,
unreach=0, hmode=3, pmode=4, hpoll=10, ppoll=10, headway=0, flash=00 ok,
keyid=0, offset=2.173, delay=0.296, dispersion=15.264, jitter=0.772,
xleave=0.012,
filtdelay= 0.30 0.33 0.30 0.36 0.35 0.35 0.35 0.30,
filtoffset= 2.17 2.01 1.80 1.68 1.49 1.25 1.14 1.00,
filtdisp= 0.00 15.75 31.91 47.49 63.14 78.63 94.43 109.95
ntpq> quit
$

Regards.
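
The manual check in the session above can be automated by pulling the leap value out of saved "pstatus" output. This is only a sketch: it parses text you have already captured (for example from `ntpq -c "pstatus ASSID"`), it assumes the two leap bits are printed as binary digits as in the output above, and the function name is made up for illustration.

```python
import re

# Meaning of the two leap bits, as listed in the NTP specification.
LEAP_MEANING = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "unknown (clock unsynchronized)",
}


def leap_from_pstatus(text):
    """Return the leap value (0-3) found in pstatus-style output, or None."""
    match = re.search(r"\bleap=([01]{2})\b", text)
    if match is None:
        return None
    # The two leap bits are shown as binary digits, e.g. leap=01.
    return int(match.group(1), 2)


# Two lines copied from the pstatus output shown in this post.
SAMPLE = """\
srcadr=ntp1.example.net, srcport=123, dstadr=192.168.10.1, dstport=123,
leap=01, stratum=3, precision=-20, rootdelay=124.969, rootdisp=63.766,
"""

value = leap_from_pstatus(SAMPLE)
print(value, LEAP_MEANING[value])
```

A non-zero result outside the run-up to a scheduled leap second is the symptom discussed in this thread.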

steven Sommars

Aug 1, 2012, 1:08:40 PM

The main standard says a leap second is allowed in any month. That's what
the reference ntpd does.
See ITU-R, TF460, STANDARD-FREQUENCY AND TIME-SIGNAL EMISSIONS.
This link may work:
http://www.itu.int/dms_pubrec/itu-r/rec/tf/R-REC-TF.460-6-200202-I!!PDF-E.pdf

On the other hand Bulletin C (
ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat) says December or June.

Take your pick.
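
To make the three readings in this thread concrete (any month, end of a quarter, June or December only), here is a small sketch of a plausibility check. The policy names and the function are hypothetical and do not correspond to any ntpd option.

```python
import calendar
from datetime import date

# Months in which each reading would allow a leap second at month end.
ALLOWED_MONTHS = {
    "any_month": set(range(1, 13)),
    "quarter_end": {3, 6, 9, 12},
    "june_december": {6, 12},
}


def is_plausible_leap_day(day, policy):
    """True if a leap second at the end of `day` is acceptable under `policy`."""
    last_day_of_month = calendar.monthrange(day.year, day.month)[1]
    return day.day == last_day_of_month and day.month in ALLOWED_MONTHS[policy]


for policy in ALLOWED_MONTHS:
    print(
        policy,
        is_plausible_leap_day(date(2012, 6, 30), policy),
        is_plausible_leap_day(date(2012, 7, 31), policy),
    )
```

Only the "any month" reading lets the 31 July announcement through; the two stricter readings would have rejected it.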

praritb...@gmail.com

Aug 1, 2012, 1:24:45 PM

> There are over 10 stratum one's that advertise LI=1 as of Wed Aug 1
> 14:18:51 UTC 2012. Unless this changes another false leap second could
> occur on August 31, 2012

Steven, can you point me to one of those servers? The ones that I've checked all seem to have LI=0.

Thanks!

P.

jcle...@gmail.com

Aug 1, 2012, 2:25:37 PM

(for those seeing this a second time, I apologize)

Hi Steven,

Thanks for the research - very interesting. Which stratum-1 servers are still advertising LI=01? Is it possible to contact their administrators to learn why they might be erroneously advertising? Can you see if those servers have anything in common?

How are the leap-second flags meant to be cleared after a leap second? Is it supposed to be automatic? Is there a bug in some code (ntpd or elsewhere) that is failing to clear the flag in (some versions of) ntp server software? I did check earlier this morning and I was unable to find a bug filed against ntpd regarding this issue - does anyone know if we should go ahead and file a bug? It'd be nice to have more information on whether this is really an ntpd issue.

In general it certainly sounds like there is some brittleness somewhere in the mechanism for clearing the leap-second (LI) flags after the leap second occurs.

Thanks,
--Jeff

Martin Burnicki

Aug 1, 2012, 3:51:01 PM

Hi all,

Right after the real leap second we had some reports from customers who had
observed that the internal leap bits on their NTP servers had not been
cleared after the leap second had occurred.

It turned out this happened with some older versions of ntpd when the
customers had installed e.g. 3 or 4 servers for redundancy, and each NTP
server had the other ones configured as upstream server (personally I know
this is not a good configuration, but they did it anyway). This
configuration seemed to result in a kind of feedback loop for the leap
warning.

I'm assuming e.g. an NTP server didn't clear the internal leap status
immediately after the leap second, and when another server polled this one
it still received the leap warning from its upstream server and thus kept
it. The next one received the leap warning from the other two servers, and
so on.

This state was persistent for several days after the leap date, and of
course even if ntpd was restarted on only one server it immediately
received the leap warning from the other servers again.

The only fix we found was to kill ntpd on *all* involved servers first, so
no ntpd service was available anymore, and then restart the daemons on all
servers one after the other. This caused just a short interruption of the
NTP service for other clients in their networks, but obviously broke up the
leap warning feedback loop.
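
The feedback loop described above can be illustrated with a toy model. This is a sketch under a loud assumption: each server keeps its leap flag, or arms it if any peer still advertises one. That rule is a simplification chosen for illustration and is not ntpd's selection logic; the server names and functions are hypothetical.

```python
def restart(flags, names):
    """Restarting ntpd on the named servers clears their in-memory leap flag."""
    return {name: (False if name in names else flag) for name, flag in flags.items()}


def poll_round(flags):
    """Each server keeps its flag, or arms it if ANY peer still advertises one.

    Toy rule (an assumption for illustration, not ntpd's algorithm).
    """
    return {
        name: flag or any(other for peer, other in flags.items() if peer != name)
        for name, flag in flags.items()
    }


# Four mutually configured servers, all stuck with the leap warning set.
servers = {"ntp-a": True, "ntp-b": True, "ntp-c": True, "ntp-d": True}

# Restarting a single server: it re-learns the warning on the next poll.
one_restarted = poll_round(restart(servers, {"ntp-a"}))

# Stopping all servers before restarting them: nobody is left to re-arm the flag.
all_restarted = poll_round(restart(servers, set(servers)))

print(one_restarted)
print(all_restarted)
```

In this model a single restart changes nothing, while stopping every server first breaks the loop, which matches the workaround reported above.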

I didn't expect such behaviour since usually the stratum should have been
used to avoid timing loops, which I would expect to include leap warning
loops, and especially I wouldn't expect that current versions of ntpd
(4.2.6 and newer) could behave this way since they need a majority of
upstream servers providing a leap warning to accept this warning. Hmm, on
the other hand, if all servers have the leap bits set then there *is* a
majority ...

Anyway, in my opinion it sounds plausible that NTP servers which have hit
such a state will cause the insertion of another leap second at the end of
every month, unless the feedback loop is interrupted, especially if this
happens with NTP versions where the plausibility check for a leap
second only checks for the end of a month (like in current versions) and
not for the end of June or December (like in earlier versions of ntpd).

Martin
--
Martin Burnicki

Meinberg Funkuhren
Bad Pyrmont
Germany

E-Mail Sent to this address will be added to the BlackLists

Aug 1, 2012, 6:11:51 PM

Martin Burnicki wrote:
> It turned out this happened with some older versions of
> ntpd when the customers had installed e.g. 3 or 4 servers
> for redundancy, and each NTP server had the other ones
> configured as upstream server (personally I know this is
> not a good configuration, but they did it anyway).

What kind of association were they, Peer, Server, AnyCast, ...?

Chris Adams

Aug 1, 2012, 9:17:37 PM

Once upon a time, Marco Marongiu <bront...@gmail.com> said:
>This is just to warn you that there are now some NTP servers around the
>globe spreading a leap second announcement for tomorrow 00:00:00 UTC
>(so, basically, in a few hours now).

I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
stratum-2 server in the US pool (a few of my systems have it in their
list).

--
Chris Adams <cma...@hiwaay.net>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

Dave Hart

Aug 2, 2012, 1:57:43 AM

On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
> I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
> stratum-2 server in the US pool (a few of my systems have it in their
> list).

That particular system seems to have corrected its leap indication,
but plenty of other pool participants are advertising leap. I have
this laptop set to associate with every IP in a list of all pool
servers as of late June. The following are showing leap=01 now:

srcadr=109.168.118.249, leap=01
srcadr=109.237.210.31, leap=01
srcadr=131.155.140.129, leap=01
srcadr=131.155.140.130, leap=01
srcadr=138.100.11.74, leap=01
srcadr=141.138.201.22, leap=01
srcadr=143.121.199.173, leap=01
srcadr=147.83.123.133, leap=01
srcadr=149.12.192.46, leap=01
srcadr=150.214.94.5, leap=01
srcadr=157.88.36.37, leap=01
srcadr=160.94.245.44, leap=01
srcadr=161.53.131.232, leap=01
srcadr=161.53.248.35, leap=01
srcadr=164.107.116.179, leap=01
srcadr=168.243.227.194, leap=01
srcadr=168.243.227.195, leap=01
srcadr=168.243.235.130, leap=01
srcadr=176.31.112.194, leap=01
srcadr=178.218.172.164, leap=01
srcadr=178.237.34.94, leap=01
srcadr=178.63.46.16, leap=01
srcadr=192.87.106.2, leap=01
srcadr=192.87.106.3, leap=01
srcadr=192.87.36.4, leap=01
srcadr=193.110.137.171, leap=01
srcadr=193.110.157.147, leap=01
srcadr=193.2.111.2, leap=01
srcadr=193.2.111.3, leap=01
srcadr=193.2.4.2, leap=01
srcadr=193.2.78.228, leap=01
srcadr=193.77.222.200, leap=01
srcadr=193.77.237.128, leap=01
srcadr=193.95.229.133, leap=01
srcadr=194.171.167.130, leap=01
srcadr=194.249.198.30, leap=01
srcadr=194.33.191.69, leap=01
srcadr=194.88.212.200, leap=01
srcadr=194.88.212.205, leap=01
srcadr=195.214.215.17, leap=01
srcadr=195.23.33.83, leap=01
srcadr=195.239.199.18, leap=01
srcadr=195.55.174.243, leap=01
srcadr=2001:15c0:65ff:612::2, leap=01
srcadr=2001:15c0:65ff:61e::2, leap=01
srcadr=2001:1af8:4400:a00e:1::1, leap=01
srcadr=2001:4168:1::1, leap=01
srcadr=2001:4168:3::2, leap=01
srcadr=2001:4178:2:1277::10, leap=01
srcadr=2001:4428:0:13::10, leap=01
srcadr=2001:470:5:3a9::1, leap=01
srcadr=2001:470:a:732::2, leap=01
srcadr=2001:470:ea33:194:215:17ff:fe29:1fb2, leap=01
srcadr=2001:4d88:1ffc:4c6::1, leap=01
srcadr=2001:610:1108:5010::129, leap=01
srcadr=2001:610:1108:5010::130, leap=01
srcadr=2001:858:6:201::82, leap=01
srcadr=2001:8b0:ff80:3::123:2, leap=01
srcadr=2001:8c8:0:100::3, leap=01
srcadr=202.21.137.10, leap=01
srcadr=204.45.7.82, leap=01
srcadr=207.230.199.131, leap=01
srcadr=208.69.56.110, leap=01
srcadr=208.83.212.8, leap=01
srcadr=212.158.160.166, leap=01
srcadr=212.244.36.227, leap=01
srcadr=212.244.36.228, leap=01
srcadr=212.244.36.232, leap=01
srcadr=213.129.242.82, leap=01
srcadr=213.194.159.3, leap=01
srcadr=213.206.85.20, leap=01
srcadr=213.98.169.180, leap=01
srcadr=216.46.5.9, leap=01
srcadr=217.130.246.182, leap=01
srcadr=217.147.208.1, leap=01
srcadr=217.147.223.78, leap=01
srcadr=217.153.128.243, leap=01
srcadr=217.26.64.149, leap=01
srcadr=217.75.72.153, leap=01
srcadr=219.117.206.46, leap=01
srcadr=2600:3c00::f03c:91ff:fe96:398d, leap=01
srcadr=2a00:1158:3::93, leap=01
srcadr=2a00:1188:5:2::8, leap=01
srcadr=2a01:4f8:131:5463::2, leap=01
srcadr=2a01:e0b:1:88:215:17ff:fe9c:a5f7, leap=01
srcadr=2a02:348:80:c916::1, leap=01
srcadr=2a02:770:100:100::108, leap=01
srcadr=2a02:a80:0:4088::123, leap=01
srcadr=31.222.163.13, leap=01
srcadr=46.166.148.103, leap=01
srcadr=46.18.10.10, leap=01
srcadr=46.18.11.10, leap=01
srcadr=46.19.33.2, leap=01
srcadr=46.252.27.138, leap=01
srcadr=46.4.38.21, leap=01
srcadr=50.116.55.161, leap=01
srcadr=62.212.76.57, leap=01
srcadr=62.244.82.30, leap=01
srcadr=63.224.25.253, leap=01
srcadr=64.22.125.197, leap=01
srcadr=67.209.225.216, leap=01
srcadr=69.65.33.188, leap=01
srcadr=72.14.178.210, leap=01
srcadr=74.93.97.68, leap=01
srcadr=74.93.97.69, leap=01
srcadr=77.226.252.14, leap=01
srcadr=77.245.91.218, leap=01
srcadr=77.94.135.133, leap=01
srcadr=78.107.251.222, leap=01
srcadr=80.121.153.134, leap=01
srcadr=80.121.153.135, leap=01
srcadr=80.121.153.136, leap=01
srcadr=80.239.2.130, leap=01
srcadr=80.250.160.134, leap=01
srcadr=80.28.229.87, leap=01
srcadr=81.167.109.120, leap=01
srcadr=81.174.128.183, leap=01
srcadr=81.187.35.170, leap=01
srcadr=81.93.163.20, leap=01
srcadr=81.93.163.23, leap=01
srcadr=82.197.80.125, leap=01
srcadr=82.207.122.111, leap=01
srcadr=82.240.45.223, leap=01
srcadr=83.161.136.26, leap=01
srcadr=83.229.137.50, leap=01
srcadr=83.98.201.133, leap=01
srcadr=83.98.201.134, leap=01
srcadr=84.55.229.6, leap=01
srcadr=84.88.69.32, leap=01
srcadr=85.12.35.12, leap=01
srcadr=85.130.119.200, leap=01
srcadr=85.158.249.144, leap=01
srcadr=85.17.71.101, leap=01
srcadr=85.219.231.220, leap=01
srcadr=85.236.42.140, leap=01
srcadr=85.252.162.7, leap=01
srcadr=85.89.165.69, leap=01
srcadr=86.61.66.23, leap=01
srcadr=87.194.136.216, leap=01
srcadr=87.99.63.100, leap=01
srcadr=88.191.88.195, leap=01
srcadr=88.191.90.195, leap=01
srcadr=90.155.74.40, leap=01
srcadr=91.198.87.118, leap=01
srcadr=91.217.142.1, leap=01
srcadr=94.126.19.139, leap=01
srcadr=94.26.2.134, leap=01
srcadr=94.72.251.177, leap=01
srcadr=95.130.12.88, leap=01
srcadr=95.211.11.181, leap=01
srcadr=95.211.7.153, leap=01
srcadr=98.191.213.7, leap=01

Cheers,
Dave Hart

Martin Burnicki

Aug 2, 2012, 5:04:59 AM

E-Mail Sent to this address will be added to the BlackLists wrote:
> Martin Burnicki wrote:
>> It turned out this happened with some older versions of
>> ntpd when the customers had installed e.g. 3 or 4 servers
>> for redundancy, and each NTP server had the other ones
>> configured as upstream server (personally I know this is
>> not a good configuration, but they did it anyway).
>
> What kind of association were they, Peer, Server, AnyCast, ...?

I remember at least one setup with 4 identical servers, where each
server had a local GPS refclock configured (which for sure didn't send
the leap warning anymore after the real leap second had passed) and in
addition had simple "server" entries for the other 3 servers.

Unfortunately this was mostly handled on the phone, so I don't have any
records with details.

Anyway, I'll see if we can install a similar setup here to see if we can
duplicate this behaviour.

Martin Burnicki

Aug 2, 2012, 10:20:03 AM

Jeffrey Lerman wrote:
> How are the leap-second flags meant to be cleared after a leap second?
> Is it supposed to be automatic? Is there a bug in some code (ntpd or
> elsewhere) that is failing to clear the flag in (some versions of) ntp
> server software?

I've just run some tests. On a test machine:

- configured ntpd to use the current leap second file
- configured the local clock as only ref time source
- set the system date/time to 2012-06-30 23:59:45 UTC
- started ntpd

On a different machine:

- ran a test program which sends 4 requests/s to the test machine
and prints the contents of the reply packets, including leap status
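
A test program of the kind described in this list could look roughly like the sketch below. It is not the program used for these tests. It sends minimal client-mode NTP requests over UDP and decodes the leap bits from the first byte of each reply. The function names, the IPv4-only socket and the 4 requests/s pacing are illustrative assumptions.

```python
import socket
import time


def build_request(version=4):
    """Build a minimal 48-byte client request (LI=0, given version, mode 3)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(packet):
    """Decode leap indicator, version, mode and stratum from a reply packet."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    first_byte = packet[0]
    return {
        "leap": first_byte >> 6,            # top two bits
        "version": (first_byte >> 3) & 0x7,  # next three bits
        "mode": first_byte & 0x7,            # low three bits
        "stratum": packet[1],
    }


def poll(host, count=4, interval=0.25, timeout=1.0):
    """Send `count` requests to `host` and print the decoded header of each reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(count):
            sock.sendto(build_request(), (host, 123))
            try:
                data, _addr = sock.recvfrom(512)
                print(time.strftime("%H:%M:%S", time.gmtime()), parse_reply(data))
            except socket.timeout:
                print("no reply")
            time.sleep(interval)
```

Calling `poll("ntp.example.net")` (a placeholder host name) across the leap instant would show when the `leap` value in the replies drops back to 0.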

Found that with both the current stable version (4.2.6p5) and a current
dev version (4.2.7p290) the leap second warning in the reply packets
already disappeared shortly *before* the leap second actually occurred.

This means if this server sends a reply to a client shortly before the
leap second the leap warning may have already been turned off, and thus
the client might *disarm* the leap second shortly before the leap second
occurs. This sounds like a bug to me, so I'm going to file a bug report
for this.

Anyway, this does *not* seem to be directly related to the actual
problem where the leap bit is not reset at all, or is set again if
there's a time source which still has the bit set immediately after the
leap second.

For completeness I've repeated the same test with the latest version of
the 4.2.4 branch, namely 4.2.4p8. This version of ntpd resets the leap
warning bit in the leap status sent to clients a few seconds *after* the
leap second, so this could be a possible issue for clients accepting a
new leap warning immediately after a leap second has occurred.

> I did check earlier this morning and I was unable to
> find a bug filed against ntpd regarding this issue - does anyone know if
> we should go ahead and file a bug? It'd be nice to have more
> information on whether this is really an ntpd issue.

I'm sure a bug will be filed, but perhaps we should first find out
more details so we can write an appropriate summary of the issue.

steven Sommars

Aug 2, 2012, 10:18:49 AM

One root cause involves a group of stratum-one servers peering with each
other. A leap indicator can continue to circulate until the peering changes,
or the entire group is simultaneously reinitialized. This affects multiple
commercial server brands.
Is this a problem with some/all versions of ntpd?

Jeffrey Lerman

Aug 2, 2012, 12:10:42 PM

Assuming that the current ntpd design spec is that:

a) Leap second flags can be cleared by EITHER the passage of the
actual leap second, OR the receipt (at any time) of a LI=0 from the
current upstream server
b) Leap second flags can be set by receipt (at any time) of LI != 00
from the current upstream server (or also from reading a leap-second file?)

Then it seems that it's important that the leap status in reply packets
be correct at all times. If there is even the slightest deviation in
the moment at which a server's reply packet LI value changes to 00, then
there will be trouble -- too soon and we risk clients missing the leap,
too late and we risk arming a bogus leap.

Bearing in mind that I don't know if the actual design spec matches what
I wrote in a) above, but assuming for argument's sake that it does: It
seems to me that that's a brittle system. One way to add robustness
would be to program clients to independently set LI=00 during, say the
first half of the month, and to ignore the LI value from servers during
that time. That provides a (very long!) buffer time during which the
server can get around to zeroing the LI field, and should prevent bogus
seconds from propagating easily.

Thoughts?

--Jeff
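
The guard window proposed above could be sketched as follows. This illustrates the suggestion only, not existing ntpd behaviour. The function name and the 15-day default are assumptions, and letting LI=3 (unsynchronized) pass through untouched is a choice made here because that value is not a leap warning.

```python
from datetime import datetime, timezone


def effective_leap_indicator(server_li, now, guard_days=15):
    """Return the LI value a client would act on under the proposed guard window.

    Leap warnings (1 = insert, 2 = delete) received during the first
    `guard_days` days of the month are treated as 0. Other values pass through.
    """
    if server_li in (1, 2) and now.day <= guard_days:
        return 0
    return server_li


early = datetime(2012, 8, 1, 14, 18, 51, tzinfo=timezone.utc)
late = datetime(2012, 8, 31, 12, 0, 0, tzinfo=timezone.utc)

print(effective_leap_indicator(1, early))  # stale warning early in the month
print(effective_leap_indicator(1, late))   # warning near the end of the month
```

The guard suppresses a stale warning on 1 August, but a server that still advertises LI=1 on 31 August would be believed again, so it limits propagation early in the month without curing a stuck upstream.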

Richard B. Gilbert

Aug 2, 2012, 4:42:14 PM

Does ANYONE use a stratum 10 server? If so, how good is the time?

Could a dripping faucet do as well or better?

E-Mail Sent to this address will be added to the BlackLists

Aug 2, 2012, 5:56:36 PM

Richard B. Gilbert wrote:
> Does ANYONE use a stratum 10 server?

Sure, all the way up to S15.
I rarely see them getting past S4 in practice,
except when fudged / orphaned to a higher stratum.

> If so, how good is the time?

That would mostly depend on the dispersion,
which increases with every server away from the root?
The worst of which would be network latency jitter?
Although the poll rate is going to add some?

Likely at least 10ms out,
in practice probably low 100s of ms out from the root,
if every server were a non-LAN internet server
to its neighbor.

If you have two remote internet locations,
with five free computers at each location,
you could find out for yourself.
GPS->A1->B2->A3->B4->A5->B6->A7->B8->A9->B10,
and have A1 noselect B10 for comparison.

I guess you could do it with four
at each remote internet location,
if you started with an S2 pool server,
and noselected the S0 & S10 on your S3.

--
E-Mail Sent to this address <Blac...@Anitech-Systems.com>
will be added to the BlackLists.

Martin Burnicki

Aug 3, 2012, 4:53:42 AM

steven Sommars wrote:
> One root cause involves a group of stratum one's peering each other. A
> leap indicator can continue to circulate until the peering changes, or the
> entire group is simultaneously reinitialized. This affects multiple
> commercial server brands.
> Is this a problem with some/all versions of ntpd?

Which versions of ntpd are affected is exactly what we need to find out.

As I've mentioned earlier, the reports I had heard were from some really
old versions of ntpd (4.2.0*). I don't think there are many pool servers
running such an old version of ntpd, so to me it sounds like this
problem still persists in relatively current versions of ntpd.

Martin Burnicki

Aug 3, 2012, 5:09:09 AM

Jeffrey Lerman wrote:
> Assuming that the current ntpd design spec is that:
>
> a) Leap second flags can be cleared by EITHER the passage of the
> actual leap second, OR the receipt (at any time) of a LI=0 from the
> current upstream server
> b) Leap second flags can be set by receipt (at any time) of LI != 00
> from the current upstream server (or also from reading a leap-second file?)
>
> Then it seems that it's important that the leap status in reply packets
> be correct at all times.

On Unix-like systems supporting kernel PLL the leap second warning
received from an upstream server, from a refclock, or from a leap second
file is simply passed down to the kernel, starting an hour or so before
the leap event time.

The kernel then takes care of handling the leap second, so ntpd has to
wait until the kernel has finished doing so, and then ntpd can disarm
its internal leap second warning which is passed to its clients.

> If there is even the slightest deviation in
> the moment at which a server's reply packet LI value changes to 00, then
> there will be trouble -- too soon and we risk clients missing the leap,
> too late and we risk arming a bogus leap.

In my opinion "too soon" would have more impact here since clients could
disarm their leap second handling if they receive a reply from a server
without leap warning shortly before the leap second.

It would be very hard to have ntpd disarm its leap status immediately
when the kernel has finished handling the leap second. It could
poll the kernel's status to see when the kernel has cleared its leap
flag, but the result would also depend on *when* the *kernel* clears the
leap flag.

> Bearing in mind that I don't know if the actual design spec matches what
> I wrote in a) above, but assuming for argument's sake that it does: It
> seems to me that that's a brittle system. One way to add robustness
> would be to program clients to independently set LI=00 during, say the
> first half of the month, and to ignore the LI value from servers during
> that time. That provides a (very long!) buffer time during which the
> server can get around to zeroing the LI field, and should prevent bogus
> seconds from propagating easily.

That sounds reasonable, and I could have sworn ntpd would refuse to
accept a new leap second warning a few seconds after a leap second had
just occurred. However, this does not seem to be the case.

Maybe a restriction in ntpd to accept incoming leap warnings only during
the last days of a month would help.

E-Mail Sent to this address will be added to the BlackLists

Aug 3, 2012, 2:48:23 PM

Martin Burnicki wrote:
>> clients to independently set LI=00 during, say the first
>> half of the month, and to ignore the LI value from
>> servers during that time.

I think you would have to be more exact than that.

LI is used for more than one thing.

<http://www.eecis.udel.edu/~mills/ntp/html/decode.html>

According to the doc, LI only applies to the current day?

Jeffrey Lerman

Aug 3, 2012, 5:04:08 PM

That page appears to be out-of-date. The current protocol, for NTP
version 4, is here: http://www.ietf.org/rfc/rfc5905.txt

Note that there was a change from the earlier version, which did say
"current day". Also, the LI ("Leap Indicator") field is only used to
indicate presence/absence of an impending leap second.

The current doc says in part:

The fields and associated packet variables (in parentheses) are
interpreted as follows:

LI Leap Indicator (leap): 2-bit integer warning of an impending leap
second to be inserted or deleted in the last minute of the _*current
month*_ with values defined in Figure 9.

+-------+----------------------------------------+
| Value | Meaning                                |
+-------+----------------------------------------+
| 0     | no warning                             |
| 1     | last minute of the day has 61 seconds  |
| 2     | last minute of the day has 59 seconds  |
| 3     | unknown (clock unsynchronized)         |
+-------+----------------------------------------+

Figure 9: Leap Indicator

Technically, there should be no need for the 2-week buffer I suggested.
However, it shouldn't hurt, and seems likely to add robustness. The
true correct solution would be to ensure that ntpd clients pay as much
attention to LI=00 from a server as to LI != 00 (and to fix the bug
Martin filed, in which the LI field goes to 00 in the last second BEFORE
the leap second - oops). Then they would be able to recover gracefully
from a brief persistence of the LI=01 value past the leap second -
assuming that no stratum 1 servers erroneously persisted the LI value.
We really need to understand why that is happening - do we have version
info from the servers that are still doing that?

Another suggestion... Should ntpd require that a stratum-1 server has a
non-expired leap-second file, and that that file should override any
upstream server for the LI data?

--Jeff

Jeffrey Lerman

Aug 3, 2012, 5:37:42 PM

Oh, my mistake: I quote RFC5905 below, which is for NTPv4, which is
technically in _draft_ status - though it does seem pretty far along and
I believe current ntpd adheres to NTPv4, not v3.

For what it's worth the most recently approved protocol is, technically,
NTPv3, documented in RFC1305 - and that one does say "current day" -
though again, ntpd respects the end-of-month rule.

David Mills' website includes this page:

http://www.eecis.udel.edu/~mills/ntp/html/leap.html

There one can see two things:
- The idea to let the leap-seconds file override all else when it's
present is already apparently implemented. Good.
- This text:

"When an update is received from a reference clock or downstratum
server, the leap bits are inspected for all survivors of the cluster
algorithm. If the number of survivors showing a leap bit is greater
than half the total number of survivors, a pending leap condition
exists until the end of the current month."

Hmm. No mention of clearing the bit if an update is received that does
NOT show a leap bit. I wonder if that's the problem in a nutshell. Can
anyone demonstrate whether ntpd clears the bit if it is set but an
upstream server is configured and sends an LI=00 update?
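
To make the question concrete, here is a toy model of the two behaviours being contrasted: a "sticky" client that only ever sets its pending-leap flag, and one that re-evaluates the flag on every update. It is not ntpd's code, the class and method names are invented, and the boolean passed to `update` stands for the outcome of the majority vote described in the quoted text.

```python
# Toy model only, not ntpd's implementation. ToyLeapState is a hypothetical name.


class ToyLeapState:
    """Tracks a pending-leap flag under one of two update rules."""

    def __init__(self, sticky):
        self.sticky = sticky
        self.pending = False

    def update(self, majority_shows_leap):
        """Feed one update; return the resulting pending-leap flag."""
        if self.sticky:
            # Set-only: a later "no leap" vote never clears the flag.
            self.pending = self.pending or majority_shows_leap
        else:
            # Re-evaluated: the flag always follows the latest vote.
            self.pending = majority_shows_leap
        return self.pending


if __name__ == "__main__":
    for sticky in (True, False):
        state = ToyLeapState(sticky)
        state.update(True)    # upstream briefly announces a leap second
        state.update(False)   # upstream goes back to LI=00
        print("sticky" if sticky else "re-evaluated", state.pending)
```

Under the sticky rule a single erroneous announcement survives until something else clears it, which is the failure mode the question above is probing for.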

--Jeff

E-Mail Sent to this address will be added to the BlackLists

Aug 3, 2012, 8:18:13 PM

Jeffrey Lerman wrote:
> Can anyone demonstrate whether ntpd clears the bit if it
> is set but an upstream server is configured and sends an
> LI=00 update?

See: ntp_proto.c & ntp_timer.c ?

... and platform dependent stuff ?
nt_clockstuff.c e.g. "Disarm leap second only if the leap second is not already in progress"

Dev ver of the moment:
<http://www.eecis.udel.edu/~ntp/ntp_spool/ntp4/ntp-dev/ntp-dev-4.2.7p292.tar.gz>

--
E-Mail Sent to this address <Blac...@Anitech-Systems.com>
will be added to the BlackLists.

Harlan Stenn

Aug 3, 2012, 8:08:26 PM

Jeff wrote:
> Oh, my mistake: I quote RFC5905 below, which is for NTPv4, which is
> technically in _draft_ status - though it does seem pretty far along and
> I believe current ntpd adheres to NTPv4, not v3.

The NTP code *defines* the spec, and there will be times when the
current spec and the code match, and there are times when the current
code is getting ready to define the *next* spec.

> For what it's worth the most recently approved protocol is, technically,
> NTPv3, documented in RFC1305 - and that one does say "current day" -
> though again, ntpd respects the end-of-month rule.

That depends on your definition of "Approved". NTPv3 (RFC 1305) never
made it out of DRAFT status.

RFC 5905 is a "Proposed Standard".

H

Jeffrey Lerman

Aug 3, 2012, 8:22:05 PM

Fair enough. Though with a definition like that, it's formally
impossible to distinguish bugs from intentional behavior ("features").

Anyway, I'm guessing you know the design intent, as well as the relevant
implementations, pertaining to the question I posed further down in that
email, namely:

Is the leap bit supposed to be cleared by a client if it gets LI=00 from
a server? Or is the bit only *set* based on information from a server,
and cleared only upon application of the leap second? If the latter is
the current implementation, it might well explain the bogus leap second
behavior many of us saw a few days ago. Unless you have a different
explanation/understanding?

Thanks,
--Jeff

Harlan Stenn

Aug 3, 2012, 8:42:30 PM

Jeff wrote:
> Is the leap bit supposed to be cleared by a client if it gets LI=00
> from a server? Or is the bit only *set* based on information from a
> server, and cleared only upon application of the leap second? If the
> latter is the current implementation, it might well explain the bogus
> leap second behavior many of us saw a few days ago. Unless you have a
> different explanation/understanding?

I'd have to look all that up, and I know different versions behave
differently.

This topic is something that's getting a lot of recent discussion and
scrutiny...

H

David Woolley

Aug 4, 2012, 4:07:52 AM

Harlan Stenn wrote:
>> Oh, my mistake: I quote RFC5905 below, which is for NTPv4, which is
>> technically in _draft_ status - though it does seem pretty far along and
>> I believe current ntpd adheres to NTPv4, not v3.
>
> The NTP code *defines* the spec, and there will be times when the

I think you mean the "ntpd reference implementation", e.g. Microsoft's
NTP code does not define the standard.

Also, I don't think this is the correct relationship between RFCs and
reference implementations. An RFC specifies the protocol for a specific
reference implementation. If you do more than fix bugs in the reference
implementation, you need a new RFC before it becomes the standard.

Harlan Stenn

Aug 4, 2012, 6:10:15 AM

David Woolley writes:
> Harlan Stenn wrote:
> >> Oh, my mistake: I quote RFC5905 below, which is for NTPv4, which is
> >> technically in _draft_ status - though it does seem pretty far along and
> >> I believe current ntpd adheres to NTPv4, not v3.
> >
> > The NTP code *defines* the spec, and there will be times when the
>
> I think you mean the "ntpd reference implementation", e.g. Microsoft's
> NTP code does not define the standard.

Yes, thanks...

> Also, I don't think this is the correct relationship between RFCs and
> reference implementations. An RFC specifies the protocol for a specific
> reference implementation. If you do more than fix bugs in the reference
> implementation, you need a new RFC before it becomes the standard.

Yes, and what you describe is the ordinary case.

The reference implementation for NTP is a bit different - any difference
between the specification and the reference implementation is grounds
for careful scrutiny and deliberation. Sometimes the spec is correct.
More often than not, the code is the preferred way to go.

There comes a time when ntp-stable is the "current" RFC code and
specification, and ntp-dev is what we use to start to drive the "next"
version of the RFC. Sometimes there are several releases of the -stable
code.

The push towards an NTP4 RFC began in late '96, with ntp-4.0 being
released in September of '97. The last release in the 4.0 branch was in
January of '00. The 4.1 branch (improvements to the V4 spec, etc.) ran
from August of '01 thru July of '03. The 4.2 branch has run from
October of '03 until now. We're expecting the last release of the 4.2
branch this summer, and after that we'll start on the next branch of the
code (which is still expected to drive the NTPv4 specification forward).

H

unruh

Aug 4, 2012, 12:28:15 PM

On 2012-08-04, David Woolley <da...@ex.djwhome.demon.invalid> wrote:
> Harlan Stenn wrote:
>>> Oh, my mistake: I quote RFC5905 below, which is for NTPv4, which is
>>> technically in _draft_ status - though it does seem pretty far along and
>>> I believe current ntpd adheres to NTPv4, not v3.
>>
>> The NTP code *defines* the spec, and there will be times when the
>
> I think you mean the "ntpd reference implementation", e.g. Microsoft's
> NTP code does not define the standard.

And it is a reference implementation, not the definition. I.e., it is an
implementation that is supposed to follow the standard. It does not
define the standard.
>
> Also, I don't think this is the correct relationship between RFCs and
> reference implementations. An RFC specifies the protocol for a specific

I think that the reference implementation implements a specific RFC. I.e.,
the RFC comes first.

> reference implementation. If you do more than fix bugs in the reference
> implementation, you need a new RFC before it becomes the standard.

An RFC is just a request for comments. It is NOT a standard. It may
become one (although I think none of the NTP RFCs have actually ever
become standards).

>
>

David Woolley

Aug 4, 2012, 3:39:21 PM

unruh wrote:

>
> I think that the reference implimentation impliments a specific rfc. Ie,
> the rfc comes first.
>

My understanding is the reverse. My understanding is that the RFC
system requires a reference implementation, to prove that the
specification is implementable, before the specification can be
published as an RFC.

Richard B. Gilbert

Aug 4, 2012, 3:47:39 PM

It's unlikely to become a standard until people stop tinkering with it!
It's pure hell trying to "standardize" a moving target.

The standard, when published, must meet the needs of the
community. It won't be easy. We've had something that works for most
of us for the last few years. With a bit of luck we can have "this is
how it works and these are the standards that a conforming
implementation must meet".

Blood will flow before we get a standard we can all agree on.
Hopefully, only people I don't like will be killed. ;-)

Jeffrey Lerman

Aug 4, 2012, 2:13:10 PM

Yes. The unfortunate combination of the bogus leap second and the
newly-discovered (on July 1) Linux kernel bug related to leap-second
handling means that bogus leap seconds have a much bigger-than-normal
impact.

It looks like this recently-filed (and cryptically-named) ntpd bug might
be related to the bogus leap seconds?
http://bugs.ntp.org/show_bug.cgi?id=2246 "sys_leap is stick"

If so, that bug possibly ought to be bumped up in priority.

Meantime if we can confirm that installing a current/valid "leap
seconds" file should block bogus leap seconds, perhaps that could be a
recommended workaround to the bogus leap-seconds issue, until the actual
issue can be patched. Could you comment?
> H

Thanks,
--Jeff

Brian Utterback

Aug 4, 2012, 2:32:53 PM

The relationship between a protocol, the RFC that defines it and the
reference implementation (if there is one) is often not straightforward.

The early RFCs almost all documented existing programs and the protocols
that they implemented. This tradition has continued to this day because
a working implementation is always preferable to a theoretical set of
standards. However, it is often the case that new standards are designed
by committee, with the (hopefully) best minds debating the features
required. You can get amazing protocols that way, but you can just as
easily end up with something that is never implemented.

In the case of NTP, the reference implementation came before the RFC,
with the RFC basically documenting the protocol that the ref-impl was
already using. Since the ref-impl continues to be an experimental
platform at its core, with most of the protocol controlled by Dr. Mills,
this is still the case.

So, an RFC defines the protocol and it is true that when others seek to
implement NTP, it is the RFC they follow. But the ref-impl is still the
gold standard that the RFC follows. As has been previously noted, this
makes it tricky to fix bugs in the protocol. One needs to be very
careful that the behavior is not intentional and just not documented
yet.

> _______________________________________________
> questions mailing list
> ques...@lists.ntp.org

> http://lists.ntp.org/listinfo/questions

--
blu

Always code as if the guy who ends up maintaining your code will be a
violent psychopath who knows where you live. - Martin Golding
-----------------------------------------------------------------------|
Brian Utterback - Solaris RPE, Oracle Corporation.
Ph:603-262-3916, Em:brian.u...@oracle.com

Harlan Stenn

Aug 4, 2012, 5:22:53 PM

unruh writes:
> On 2012-08-04, David Woolley <da...@ex.djwhome.demon.invalid> wrote:

>> Harlan Stenn almost wrote:
>>> The NTP reference implmentation *defines* the spec, and there will
>>> be times when the ...

>
> And it is a reference implimentation, not the definition. Ie, it is an
> implimentation that is supposed to follow the standard. It does not
> define the standard.

You can believe what you want.

In this case you are kinda wrong. But perhaps it's a matter of
perspective.

The reference implementation *in this case* is the target the RFC
intends to meet. The current RFC is developed and written based on what
the then-current ntp-dev implements.

There comes a time when the RFC is left as a marker and the code moves
on, in preparation for the next RFC.

> > Also, I don't think this is the correct relationship between RFCs and
> > reference implementations. An RFC specifies the protocol for a specific
>
> I think that the reference implimentation impliments a specific rfc. Ie,
> the rfc comes first.

In general you are right. And in this case most people are interested
in having correct time on their boxes, not a pedantically-correct
implementation of the RFC.

And RFCs can be updated. If there is a bug in them people can choose to
run strictly-compliant broken code or they can apply the fixes.

Other folks may choose to value "better timekeeping" and they can have
what they want, too.

> > reference implementation. If you do more than fix bugs in the reference
> > implementation, you need a new RFC before it becomes the standard.
>
> An rfc is just a request for comments. It is NOT a standard. It may
> become one ( although I think none of the ntp rfcs have actually ever
> become standards).

NTPv2 was a standard.

H

Nathan Stratton Treadway

Aug 4, 2012, 6:08:00 PM

On Thu, Aug 02, 2012 at 05:57:43 +0000, Dave Hart wrote:
> On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
> > I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
> > stratum-2 server in the US pool (a few of my systems have it in their
> > list).
>
> That particular system seems to have corrected its leap indication,

I noticed that name1.glorb.com is announcing a leap second insertion
again just now:
# date -u; ntpq -c "rv 0 leap,stratum,refid" name1.glorb.com
Sat Aug 4 21:55:10 UTC 2012
leap=01, stratum=2, refid=209.51.161.238

It seems that one (and only one) of its upstream servers is also showing
"leap=01":

# ntpq -c "lassoc" -c "mrv &1 &999 leap,srcadr,stratum" name1.glorb.com | grep -E "leap=([^0].|.[^0])"
srcadr=truechimer.cites.illinois.edu, leap=01, stratum=1
srcadr=time-b.nist.gov, leap=11, stratum=1

So it appears in this case "name1"'s leap flag does come and go over
time... but it's not as simple as "use the current system peer's leap flag
value" (since the currently-listed refid [209.51.161.238] is a server
that does NOT have the leap flag set...).
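
The grep filter above catches every peer whose leap field is not 00, which lumps together "leap second announced" (01) and "unsynchronized" (11). A small sketch of telling the two apart, assuming lines shaped like the ntpq output shown above; the regular expression and helper name are hypothetical:

```python
import re

# Matches lines such as "srcadr=host.example, leap=01, stratum=1".
# The pattern and the names below are assumptions for illustration.
LINE_RE = re.compile(
    r"srcadr=(?P<srcadr>[^,\s]+),\s*leap=(?P<leap>[01]{2}),\s*stratum=(?P<stratum>\d+)"
)

LEAP_MEANING = {
    "00": "no warning",
    "01": "insertion announced",
    "10": "deletion announced",
    "11": "unsynchronized",
}


def nonzero_leap_peers(lines):
    """Return {srcadr: meaning} for every peer whose leap field is not 00."""
    result = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("leap") != "00":
            result[match.group("srcadr")] = LEAP_MEANING[match.group("leap")]
    return result


# The two lines reported above.
SAMPLE = [
    "srcadr=truechimer.cites.illinois.edu, leap=01, stratum=1",
    "srcadr=time-b.nist.gov, leap=11, stratum=1",
]

if __name__ == "__main__":
    for peer, meaning in nonzero_leap_peers(SAMPLE).items():
        print(peer, "->", meaning)
```

Read this way, only truechimer is announcing an insertion; the leap=11 entry is an unsynchronized marker rather than a leap warning.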

Nathan

----------------------------------------------------------------------------
Nathan Stratton Treadway - nath...@ontko.com - Mid-Atlantic region
Ray Ontko & Co. - Software consulting services - http://www.ontko.com/
GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt ID: 1023D/ECFB6239
Key fingerprint = 6AD8 485E 20B9 5C71 231C 0C32 15F3 ADCD ECFB 6239

Martin Burnicki

Aug 5, 2012, 6:03:25 AM

Nathan Stratton Treadway wrote:

> date -u; ntpq -c "rv 0 leap,stratum,refid" name1.glorb.com

The command above doesn't report the version of the NTP daemon running on
that system.

date -u; /usr/sbin/ntpq -p -c rv name1.glorb.com
So 5. Aug 09:52:26 UTC 2012
     remote           refid      st t when poll reach    delay   offset  jitter
===============================================================================
-navobs1.oar.net .GPS.            1 u  108  256   377   11.545   -5.508   0.118
+clock.nyc.he.ne .CDMA.           1 u  196  256   377   18.289   -0.053   0.062
*truechimer.cite .PPS.            1 u  153  256   377   14.664    0.049   0.271
-time-b.nist.gov .ACTS.           1 u  719  256   254   27.975    4.316   0.089
-utcnist2.colora .ACTS.           1 u  125  256   377   36.595    0.137   0.039
-navobs1.wustl.e .GPS.            1 u  227  256   377   19.129   -0.140   0.075
-bonehed.lcs.mit .PPS.            1 u  205  256   377   26.380   -3.559   0.115
+navobs1.gatech. .GPS.            1 u  131  256   377   27.583   -0.031   0.128
-50-77-217-185-s .ACTS.           1 u   43  256   277   34.933   -3.199   1.403
-andromeda.cs.pu .CDMA.           1 u  122  256   377   12.729   -2.859   0.138
-nist.netservice .ACTS.           1 u  193  256   377    7.332    3.041   0.081
+ben.cs.wisc.edu .GPS.            1 u  235  256   377   19.632   -0.024   0.067
-rackety.udel.ed .PPS.            1 u  178  256   377   21.960   -2.131   0.035
x216.119.63.113  .ACTS.           1 u  181  256   377   48.526   14.320   0.088

associd=0 status=46f4 leap_add_sec, sync_ntp, 15 events, freq_mode,
version="ntpd 4.2...@1.1612-o Wed Nov 24 19:02:25 UTC 2010 (1)",
processor="i686", system="Linux/2.6.32-131.6.1.el6.i686", leap=01,
stratum=2, precision=-20, rootdelay=14.664, rootdispersion=20.099,
peer=23785, refid=128.174.38.133,
reftime=d3c8bd40.03e02317 Sun, Aug 5 2012 11:37:04.015, poll=8,
clock=d3c8c0dc.88fe46f6 Sun, Aug 5 2012 11:52:28.535, state=4,
offset=-0.034, frequency=-356.152, jitter=0.263, noise=0.133,
stability=0.005, tai=0
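
For readers wondering how `status=46f4` turns into "leap_add_sec, sync_ntp, 15 events, freq_mode": a short sketch that splits the 16-bit word into its fields. The field widths follow the system status word layout as I understand it from the ntpq documentation (2-bit leap, 6-bit source, 4-bit event count, 4-bit event code), so treat the layout as an assumption; the function name is invented, and the name tables contain only the values that appear in the output above.

```python
# Sketch: split an ntpq system status word into its four fields.
# Assumed layout: leap (2 bits) | source (6 bits) | count (4) | event (4).
# split_system_status is a hypothetical helper name.


def split_system_status(word):
    """Return (leap, source, event_count, event_code) for a 16-bit status word."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("status word must fit in 16 bits")
    leap = (word >> 14) & 0x3
    source = (word >> 8) & 0x3F
    count = (word >> 4) & 0xF
    event = word & 0xF
    return leap, source, count, event


# Only the codes seen in the output above; the full tables are longer.
LEAP_NAMES = {1: "leap_add_sec"}
SOURCE_NAMES = {6: "sync_ntp"}
EVENT_NAMES = {4: "freq_mode"}

if __name__ == "__main__":
    leap, source, count, event = split_system_status(0x46F4)
    print(LEAP_NAMES[leap], SOURCE_NAMES[source], count, EVENT_NAMES[event])
```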

So this is 4.2.4p8 which (if I remember correctly) accepts an incoming leap
warning even if only a single upstream server from the group of "survivors"
of the clustering algorithm provides it.

A quick inspection of the upstream servers shows indeed that only one of
them is providing the leap warning, namely truechimer.cites.illinois.edu.

A 4.2.6 daemon would require a majority of the survivors to provide the leap
warning before it accepted it, so at least this server should not have the
erroneous leap bit set if it were running 4.2.6.

If I remember correctly, cases like this were the reason why the
requirement of a majority was introduced after 4.2.4.
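
The difference between the two acceptance rules can be sketched in a few lines. This is an illustration of the rules as recalled above ("any survivor" for 4.2.4, "majority of survivors" for 4.2.6), not code from either release. The survivor list is read off the ntpq listing above on the assumption that the `*` and `+` tally codes mark the survivors; the function name is hypothetical.

```python
# Sketch of the two acceptance rules as recalled in the post above.
# accepts_leap_warning is a hypothetical name, not an ntpd function.


def accepts_leap_warning(survivor_flags, policy):
    """survivor_flags: one boolean per survivor, True if it announces a leap.

    policy: "any" (a single survivor suffices) or "majority"
    (more than half of the survivors must agree).
    """
    flagged = sum(1 for flag in survivor_flags if flag)
    if policy == "any":
        return flagged >= 1
    if policy == "majority":
        return flagged * 2 > len(survivor_flags)
    raise ValueError("unknown policy: %r" % (policy,))


# Assumed survivors ('*' and '+' rows); only truechimer announces a leap.
SURVIVORS = {
    "clock.nyc.he.ne": False,
    "truechimer.cite": True,
    "navobs1.gatech.": False,
    "ben.cs.wisc.edu": False,
}

if __name__ == "__main__":
    flags = list(SURVIVORS.values())
    for policy in ("any", "majority"):
        print(policy, accepts_leap_warning(flags, policy))
```

With one flagged survivor out of four, the "any" rule passes the warning on while the "majority" rule drops it.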

Martin Burnicki

Aug 5, 2012, 6:49:44 AM

E-Mail Sent to this address will be added to the BlackLists wrote:
> Martin Burnicki wrote:
>>> clients to independently set LI=00 during, say the first
>>> half of the month, and to ignore the LI value from
>>> servers during that time.
>
> I think you would have to be more exact than that.
>
> LI is used for more than one thing.
>
> <http://www.eecis.udel.edu/~mills/ntp/html/decode.html>
>
> According to the doc, LI only applies to the current day?

If we are going to discuss this in full detail then even more things
need to be taken into account, e.g. where the leap second warnings come
from, how long they persist, and how they are passed to NTP.

E.g. the GPS satellites usually start to send information about an upcoming
leap second about 6 months before the leap second event actually occurs,
shortly after the IERS has sent its bulletin C which announces this. The
sats don't simply send a warning flag but the exact UTC time *when* this is
going to happen. So GPS receivers know about the leap second long before it
becomes interesting for NTP.

It depends on the GPS receiver at which time it starts to output a leap
second warning so ntpd can become aware of it, which protocol is used to
pass GPS time to NTP, and if the selected protocol even supports leap
second warnings. Correct me if I'm wrong, but as far as I know there is
e.g. no NMEA sentence providing a leap second warning before the leap
second actually occurs. For operation with NTP it is not sufficient just to
send second 60 *during* the leap second, i.e. when the leap second is
already in progress.

On the other hand there are string formats supported by the parse driver
(driver 8) which do support this.

The German longwave transmitter DCF77 starts to send out a leap second
warning flag only 1 hour before the leap second occurs. This interval may
be long enough to provide the leap warning for stratum 1 and eventually
stratum 2 servers, but it is probably too short for clients at a lower
stratum level if large polling intervals are used, simply because the
worst-case sum of polling intervals exceeds the announcement interval.
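
A back-of-the-envelope sketch of that worst case, under the simplifying assumption that each hop learns of the warning at most one poll interval after its upstream does. The 1024 s poll interval and the chain lengths are values picked for illustration, and the function names are hypothetical.

```python
# Sketch: can a leap warning reach the end of a chain before the leap second?
# Simplifying assumption: each hop lags its upstream by at most one poll interval.


def worst_case_propagation(poll_intervals_s):
    """Upper bound (seconds) for a warning to travel down the whole chain."""
    return sum(poll_intervals_s)


def warning_arrives_in_time(lead_time_s, poll_intervals_s):
    """True if the worst-case delay still fits inside the announcement lead time."""
    return worst_case_propagation(poll_intervals_s) <= lead_time_s


if __name__ == "__main__":
    dcf77_lead = 3600  # DCF77 announces about one hour ahead
    for hops in (3, 4):
        chain = [1024] * hops  # every hop polling at 1024 s (assumed)
        print(hops, "hops:", warning_arrives_in_time(dcf77_lead, chain))
```

With these numbers three hops (3072 s) still fit inside the hour, while four hops (4096 s) do not.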

If IRIG signals are used for refclocks then things are even worse since most
IRIG frame formats don't even support leap second warnings. Only IRIG
formats with extension IEEE 1344 or its successor, IEEE C37.118, support
this, but the announcement interval is only 1 minute or so, which is
definitely too short to pass a leap second warning reliably from an IRIG
controlled stratum-1 NTP server down to a chain of secondaries and clients.

So in some cases like this a leap second file (which needs to be updated
regularly) should at least be a way to get the leap second handled
reliably.

On the other hand, if you are using a GPS receiver and a serial string
format providing ntpd with a leap second warning early enough, then you
don't have to care about leap seconds since the GPS receiver gets the
warnings from the satellites, and you don't have to worry if your leap
second file is updated regularly.

When it comes to NTP you should always mention the version of NTP you are
talking about since the behaviour of leap second handling has changed
across versions, e.g.

- when ntpd starts to accept a leap warning

- from which sources it accepts a leap warning under which conditions: from
a single upstream server (earlier NTP versions) or only from a majority
(current NTP versions), from refclocks, from a leap second file, and which
priorities come into effect if there are upstream servers *and* one or more
refclocks *and* a leap second file, or another combination of those.

- which plausibility checks ntpd does to avoid erroneous leap second
warnings: earlier NTP versions inserted a leap second only at the end of
June or the end of December, but also supported leap second deletions, which
could occur in theory and can also be warned about by GPS, while current
versions of ntpd accept leap second warnings for the end of *any* month, but
(as far as I know) don't support leap second deletion anymore at all.
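
The month-end plausibility check from the last item can be sketched as follows. The two policy names are labels invented here for the "June/December only" and "any month" behaviours described above, and the functions are a simplified illustration rather than ntpd's logic.

```python
from datetime import datetime, timezone

# Sketch of the two month-end plausibility rules described above.
# Policy labels and function names are hypothetical.


def month_end_allowed(month, policy):
    """May a leap second be inserted at the end of the given month?"""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    if policy == "june_december_only":
        return month in (6, 12)
    if policy == "any_month_end":
        return True
    raise ValueError("unknown policy: %r" % (policy,))


def would_insert(now, leap_warning, policy):
    """Would a warning seen at `now` lead to an insertion at the end of that month?"""
    return bool(leap_warning) and month_end_allowed(now.month, policy)


if __name__ == "__main__":
    # The bogus announcement discussed in this thread: 31 July 2012.
    when = datetime(2012, 7, 31, 22, 0, tzinfo=timezone.utc)
    for policy in ("june_december_only", "any_month_end"):
        print(policy, would_insert(when, True, policy))
```

Under the stricter rule the July announcement would have been ignored; under the "any month" rule it is acted upon.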

Of course, all of the above is valid only in the absence of firmware bugs or
NTP bugs which can mess up everything.

Martin Burnicki

Aug 5, 2012, 7:18:17 AM

Jeffrey Lerman wrote:
> The unfortunate combination of the bogus leap second and the
> newly-discovered (on July 1) Linux kernel bug related to leap-second
> handling means that bogus leap seconds have a much bigger-than-normal
> impact.

I think a main problem here is that there are many software developers who
are not aware at all that leap seconds exist, or they just don't care.

In the 1980's and 1990's there were leap seconds about once every 12 or 18
months, and most people working on stuff which could be affected by leap
seconds were aware of this and took care the leaps were handled correctly
since the next one was going to come "soon".

Then after Jan 1999 there was suddenly a period of 7 years without any leap
second, and people just forgot about them or didn't really care about leap
seconds anymore. Then for Dec 2005 a new leap second was scheduled which
already caused some trouble, and until the next leap seconds (Dec 2008 and
June 2012) there were intervals of 3 and 3 1/2 years. Here's a link with a
graph showing the UTC time steps due to leap seconds:
http://hpiers.obspm.fr/eop-pc/earthor/utc/leapsecond.html

unruh

Aug 5, 2012, 4:24:10 PM

On 2012-08-04, Harlan Stenn <st...@ntp.org> wrote:
> unruh writes:
>> On 2012-08-04, David Woolley <da...@ex.djwhome.demon.invalid> wrote:
>>> Harlan Stenn almost wrote:
>>>> The NTP reference implmentation *defines* the spec, and there will
>>>> be times when the ...
>>
>> And it is a reference implimentation, not the definition. Ie, it is an
>> implimentation that is supposed to follow the standard. It does not
>> define the standard.
>
> You can believe what you want.
>
> In this case you are kinda wrong. But perhaps it's a matter of
> perspective.
>
> The reference implementation *in this case* is the target the RFC
> intends to meet. The current RFC is developed and written based on what
> the then-current ntp-dev implements.

Except of course that the reference implementation contains a huge bunch
of stuff that is never intended to be part of the RFC (the exact order
of the statements, the exact implementation details, etc.). As it is, the
RFC already defines far too many things that are accidents of
the implementation rather than design specifications that any
implementation should meet.
For example, the code contains bugs. If the code is supposed to be the
defining structure, then of course there are no bugs, just features. The
leap second hangs the whole implementation? That is as it should be. The
implementation does a leap second round robin so that it never shuts off?
That is as it should be. After all, the code defines NTP. That would be an
absurd position to take. Now the RFC may be an abstraction from the
code, but again, one wants to make sure that it is a sufficiently
generic abstraction that many valid implementations are possible.

>
> There comes a time when the RFC is left as a marker and the code moves
> on, in preparation for the next RFC.

>
>> > Also, I don't think this is the correct relationship between RFCs and
>> > reference implementations. An RFC specifies the protocol for a specific
>>
>> I think that the reference implimentation impliments a specific rfc. Ie,
>> the rfc comes first.
>
> In general you are right. And in this case most people are interested
> in having correct time on their boxes, not a pedantically-correct
> implementation of the RFC.

They also do not care exactly how that is accomplished-- whether via ntp
or chrony say, as long as it is robust and correct.

>
> And RFCs can be updated. If there is a bug in them people can choose to
> run strictly-compliant broken code or they can apply the fixes.

But there should be a difference between a logical bug in the RFC and a
coding bug in the reference implementation.

Harlan Stenn

Aug 6, 2012, 2:19:10 AM

unruh writes:
> On 2012-08-04, Harlan Stenn <st...@ntp.org> wrote:
> > unruh writes:
> >> On 2012-08-04, David Woolley <da...@ex.djwhome.demon.invalid> wrote:
> >>> Harlan Stenn almost wrote:
> >>>> The NTP reference implmentation *defines* the spec, and there will
> >>>> be times when the ...
> >>
> >> And it is a reference implimentation, not the definition. Ie, it is an
> >> implimentation that is supposed to follow the standard. It does not
> >> define the standard.
> >
> > You can believe what you want.
> >
> > In this case you are kinda wrong. But perhaps it's a matter of
> > perspective.
> >
> > The reference implementation *in this case* is the target the RFC
> > intends to meet. The current RFC is developed and written based on what
> > the then-current ntp-dev implements.
>
> Except of course that the reference implimentation contains a huge bunch
> of stuff that is never intended to be part of the rfc ( the exact order
> fo the statements, the exact implimentation details, etc).

So what? Any implementation has such freedoms.

> As it is the rfc already defines far too many things that are
> accidents of theimplimentation rather than design specifications that
> any implimentation should meet.

I don't know. But have you submitted this alleged list to the RFC
editors?

> For example the code contains bugs. If the code is supposed to be the
> defining structure, then of course there are no bugs, just features.

Now you just are being silly.

The code has bugs.

The RFC probably has bugs.

Your email has typos (bugs).

Mine probably does too.

So what?

> The leap second hangs the whole implimentation? That is as it should
> be.

What are you talking about? And that was rhetorical, as you're being a
troll and I'm done with this particular interaction after I send this.

> The implimentation does a leapsecond round robin so that it never
> shuts off? Thatis as it should be.

What are you talking about? Ibid.

> After all the code defines ntp. That would be an
> absurd position to take. Now the RFC may be an abstraction from the
> code, but again, one wants to make sure that it is a sufficiently
> generic abstraction that may valid implimentations are possible.

That was better, thanks.

> > There comes a time when the RFC is left as a marker and the code moves
> > on, in preparation for the next RFC.
>
> >> > Also, I don't think this is the correct relationship between RFCs and
> >> > reference implementations. An RFC specifies the protocol for a specific
>
> >>
> >> I think that the reference implimentation impliments a specific rfc. Ie,
> >> the rfc comes first.
> >
> > In general you are right. And in this case most people are interested
> > in having correct time on their boxes, not a pedantically-correct
> > implementation of the RFC.
>
> They also do not care exactly how that is accomplished-- whether via ntp
> or chrony say, as long as it is robust and correct.

Different perspectives on the issue. From what I've seen Chrony has its
pros and cons. But none of this is related to the discussion above, as
best as I can tell. Unless you want to attempt to assert that chrony is
an NTP reference implementation, which I suspect you are not trying to do.

> > And RFCs can be updated. If there is a bug in them people can choose to
> > run strictly-compliant broken code or they can apply the fixes.
>
> But there should be difference between a logical bug in the rfc and a
> coding bug in the reference implimentation.

I don't think you read what I wrote about this. Please go back and look.
The phrase I used ended with "careful scrutiny and deliberation".

H

Dick Wesseling

Aug 6, 2012, 8:44:25 AM

In article <501D6636...@gmail.com>,

Jeffrey Lerman <jeffrey...@gmail.com> writes:
>
>
> On Fri, Aug 03 2012 at 5:42PM, Harlan Stenn <st...@ntp.org> wrote:
>
> It looks like this recently-filed (and cryptically-named) ntpd bug might
> be related to the bogus leap seconds?
> http://bugs.ntp.org/show_bug.cgi?id=2246 "sys_leap is stick"
>

I intended to type "sticky".

Richard B. Gilbert

Aug 6, 2012, 3:51:05 PM

Proof read before hitting "send"! Especially if you are a poor typist.
I find that a "spelling checker" is a good thing to use.
Of course it won't help much if you type the wrong word!

irte...@gmail.com

Aug 30, 2012, 1:05:38 PM

On Wednesday, August 1, 2012 12:08:40 PM UTC-5, steven Sommars wrote:
> The main standard says a leap second is allowed in any month. That's what
> the reference ntpd does.
> See ITU-R, TF460, STANDARD-FREQUENCY AND TIME-SIGNAL EMISSIONS.
> This link may work:
> http://www.itu.int/dms_pubrec/itu-r/rec/tf/R-REC-TF.460-6-200202-I!!PDF-E.pdf
>
> On the other hand Bulletin C
> (ftp://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat) says December or June.
>
> Take your pick.

Yes, for future notifications from the real authority on leap seconds, simply go to http://hpiers.obspm.fr/eop-pc/index.php?index=bulletin_registration&lang=en and add yourself to the C Bulletins. You will receive regular updates on when leap seconds will occur, or a notice that none will occur at the next possible date (December / June).

Miroslav Lichvar

Aug 31, 2012, 4:56:54 AM

On Thu, Aug 02, 2012 at 05:57:43AM +0000, Dave Hart wrote:
> On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
> > I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
> > stratum-2 server in the US pool (a few of my systems have it in their
> > list).
>
> That particular system seems to have corrected its leap indication,

> but plenty of other pool participants are advertising leap. I have
> this laptop set to associate with every IP in a list of all pool
> servers as of late June. The following are showing leap=01 now:
>
[...]

From that list the following IPv4 servers still seem to be announcing
a pending leap second:

131.155.140.129 Netherlands
131.155.140.130 Netherlands
143.121.199.173 Netherlands
161.53.248.35 Croatia
164.107.116.179 United States
178.237.34.94 Netherlands
192.87.106.2 Netherlands
192.87.106.3 Netherlands
192.87.36.4 Netherlands
193.2.111.2 Slovenia
193.2.111.3 Slovenia
193.2.4.2 Slovenia
193.2.78.228 Slovenia
193.77.222.200 Slovenia
193.77.237.128 Slovenia
193.95.229.133 Slovenia
194.171.167.130 Netherlands
194.249.198.30 Slovenia
213.129.242.82 Austria
213.206.85.20 Netherlands
217.75.72.153 Slovakia
219.117.206.46 Japan
64.22.125.197 United States
67.209.225.216 United States
69.65.33.188 United States
72.14.178.210 United States
77.245.91.218 Netherlands
77.94.135.133 Slovenia
80.239.2.130 Norway
81.167.109.120 Norway
81.187.35.170 United Kingdom
81.93.163.20 Norway
81.93.163.23 Norway
82.197.80.125 United Kingdom
83.98.201.133 Netherlands
83.98.201.134 Netherlands
85.158.249.144 Netherlands
85.17.71.101 Netherlands
85.252.162.7 Norway
86.61.66.23 Slovenia
90.155.74.40 United Kingdom
91.198.87.118 Netherlands
94.26.2.134 Bulgaria
95.211.7.153 Netherlands
98.191.213.7 United States

--
Miroslav Lichvar

Terje Mathisen

Aug 31, 2012, 6:37:38 AM

Miroslav Lichvar wrote:
> On Thu, Aug 02, 2012 at 05:57:43AM +0000, Dave Hart wrote:
>> On Thu, Aug 2, 2012 at 1:17 AM, Chris Adams wrote:
>>> I'm still seeing leap=01 from 204.235.61.9 (name1.glorb.com), a
>>> stratum-2 server in the US pool (a few of my systems have it in their
>>> list).
>>
>> That particular system seems to have corrected its leap indication,
>> but plenty of other pool participants are advertising leap. I have
>> this laptop set to associate with every IP in a list of all pool
>> servers as of late June. The following are showing leap=01 now:
>>
> [...]
>
> From that list the following IPv4 servers still seem to be announcing
> a pending leap second:

I think I know one cause of those leap bits:

Two days ago I was asked to visit a local power company who had ntp
problems, turned out that they had a bunch of routers acting as ntp
relays, all with the leap bit set.

The cause was that they had a pair of Symmetricom 350's as their root
servers, set up as mutual peers, and one of them did indeed claim that a
leap second was upcoming.

I suspect this was caused by the combination of sticky leap bits and a
possible race condition during the actual passing of the leap second at
midnight UTC June 30.

Restarting the faulty ntpd process fixed the server issue but they still
had to find time to restart all their distribution routers before
midnight tonight. :-(

Symmetricom has not released any firmware updates for the 300 series
since last fall.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"