server clock.psu.edu prefer
server ntp-2.vt.edu prefer
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
driftfile /etc/ntp/drift
broadcastdelay 0.008
authenticate no
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        LOCAL(0)        10 l   41   64  377    0.000    0.000   0.008
 otc2.psu.edu    otc1.psu.edu     2 u   59   64  377   20.372  454561. 1431.88
 proxy.cc.vt.edu gps1.tns.its.ps  2 u   44   64  377   13.786  454895. 1361.55
ntpq> as
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 55444  9614   yes   yes  none  sys.peer   reachable  1
  2 55445  9014   yes   yes  none    reject   reachable  1
  3 55446  9014   yes   yes  none    reject   reachable  1
It is obviously talking with the external servers. So why is it choosing the local fudge
clock?
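As an aside for readers decoding the `as` output: the status column is a hexadecimal word that packs the other columns together. The sketch below splits it into fields. The bit layout and the two small name tables are assumptions taken from the ntpq documentation and cover only the values seen in this thread; `decode_peer_status` is a hypothetical helper, not part of ntpq.

```python
# Sketch: split an ntpq peer status word (the "status" column of "as").
# Assumed layout: high byte = flag bits plus a 3-bit selection code,
# low byte = 4-bit event count followed by a 4-bit last-event code.

SELECT_NAMES = {0: "reject", 6: "sys.peer"}  # partial table (assumption)
EVENT_NAMES = {4: "reachable"}               # partial table (assumption)


def decode_peer_status(word_hex):
    """Return the fields of a peer status word given as a hex string."""
    word = int(word_hex, 16)
    high = (word >> 8) & 0xFF
    low = word & 0xFF
    select = high & 0x07
    event = low & 0x0F
    return {
        "configured": bool(high & 0x80),
        "auth_enabled": bool(high & 0x40),
        "reachable": bool(high & 0x10),
        "condition": SELECT_NAMES.get(select, "code %d" % select),
        "count": (low >> 4) & 0x0F,
        "last_event": EVENT_NAMES.get(event, "code %d" % event),
    }


if __name__ == "__main__":
    for status in ("9614", "9014"):
        print(status, decode_peer_status(status))
```

Under these assumptions 9614 and 9014 differ only in the selection code: both associations are configured and reachable, but only the first was selected as the system peer.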
Five minutes after restarting ntp (which calls ntpdate to set the time):
ntpq> peers
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*LOCAL(0)        LOCAL(0)        10 l    5   64   17    0.000    0.000   0.008
 otc2.psu.edu    ntp2.usno.navy.  2 u    6   64   17   22.184  4418.13 1383.85
 proxy.cc.vt.edu gps1.tns.its.ps  2 u    2   64   17   13.730  4523.10 1389.26
ntpq> as
ind assID status  conf reach auth condition  last_event cnt
===========================================================
  1 19332  9614   yes   yes  none  sys.peer   reachable  1
  2 19333  9014   yes   yes  none    reject   reachable  1
  3 19334  9014   yes   yes  none    reject   reachable  1
The restart put this in the syslog:
Jan 29 19:52:24 p2 ntpd[1282]: ntpd exiting on signal 15
Jan 29 19:52:24 p2 ntpd: ntpd shutdown succeeded
Jan 29 20:00:56 p2 ntpdate[2973]: step time server 128.118.25.3 offset 503.777623 sec
Jan 29 20:00:56 p2 ntpd: succeeded
Jan 29 20:00:56 p2 ntpd[2977]: ntpd 4.1...@1.791 Sat Aug 31 18:27:29 EDT 2002 (1)
Jan 29 20:00:56 p2 ntpd: ntpd startup succeeded
Jan 29 20:00:56 p2 ntpd[2977]: precision = 11 usec
Jan 29 20:00:56 p2 ntpd[2977]: kernel time discipline status 0040
Jan 29 20:00:56 p2 ntpd[2977]: frequency initialized 63.924 from /etc/ntp/drift
Jan 29 20:04:16 p2 ntpd[2977]: kernel time discipline status change 41
A server won't get a foot in the door if the root distance is greater than one
second. One component of root distance is the jitter, which in your
case is off the map. I haven't seen jitter that large since Norway came
up on the Internet and the Atlantic path was some old rickety cable
nibbled by fishes and Canadian trawlers. To confirm this conclusion,
note that there is no tattletale character as the first character in the
server line.
I have no idea why the jitter is so high, but you really would not want
to synchronize to those suckers. The daemon will become a fine imitation
of a pinball machine. Go fix the jitter first. Take a look at the filter
stages in the individual server billboards.
Wash my mouth. I said I don't do Linux.
Dave
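The one-second rule can be checked against the peers billboard quoted earlier. This is only a rough sketch: it looks at the jitter column alone (ntpq prints it in milliseconds), whereas the real root distance also includes delay and dispersion terms, so a peer can fail the rule even when this check passes. `jitter_exceeds_limit` and `MAX_DISTANCE_S` are hypothetical names.

```python
# Sketch: flag peers whose jitter alone already exceeds the one-second
# root distance limit mentioned above. Input lines are copied from the
# first "ntpq> peers" output in this thread.

MAX_DISTANCE_S = 1.0  # limit quoted in the post above


def jitter_exceeds_limit(peer_line):
    """True if the last column (jitter, in ms) is more than one second."""
    fields = peer_line.split()
    jitter_ms = float(fields[-1])
    return jitter_ms / 1000.0 > MAX_DISTANCE_S


PEERS = [
    "*LOCAL(0) LOCAL(0) 10 l 41 64 377 0.000 0.000 0.008",
    "otc2.psu.edu otc1.psu.edu 2 u 59 64 377 20.372 454561. 1431.88",
    "proxy.cc.vt.edu gps1.tns.its.ps 2 u 44 64 377 13.786 454895. 1361.55",
]

if __name__ == "__main__":
    for line in PEERS:
        name = line.split()[0].lstrip("*+-#x.o ")
        verdict = "too noisy" if jitter_exceeds_limit(line) else "ok"
        print(name, verdict)
```

Both external servers are over the limit on jitter alone, which is consistent with their being rejected while the local clock is selected.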
> I have no idea why the jitter is so high, but you really would not want
> to synchronize to those suckers. The daemon will become a fine imitation
> of a pinball machine. Go fix the jitter first. Take a look at the filter
> stages in the individual server billboards.
>
How do I do that?
It appears that this is a case of a shitty hardware clock causing ntp to think jitter is
high when it isn't. I added the server next to it on the network to ntp.conf and the
jitter for that is higher than the jitter for the external servers!
How can I increase the minimum jitter time? I looked in the docs and didn't find anything.
It would probably help more to replace the hardware clock. It looks
like your jitter exceeds *1 second*, which means that your hardware
clock is unusable for any accurate timekeeping. If you can't replace
the clock, have a cron job reset the clock from a server once an hour
and you will do as well as you can. Trying to have NTP track the
situation won't help much and may make things worse.
Dale
A couple of thoughts:
- On machines with a cycle counter, the kernel can be configured to use
that for the system clock. If it's broken (or if you have a weird
SMP machine - is it SMP?) you could be in for a lot of grief. If you
are not using the cycle counter and the clock you are using is broken,
ditto. Try experimenting with the various kernels available. If you
are used to rolling your own, try experimenting with the various clock
options.
- I think I read somewhere that (some of?) the kernels in RH 8.0 come
with HZ set to 500 (something other than 100, anyway). If NTP hasn't
been compiled knowing this, it can get confused. I don't remember this
ever showing up as excessive jitter, but you never know. Try building
your own version of NTP from the latest source and see if that fixes
the problem.
Of course you didn't find how to increase the "minimum jitter time". You
are out of the NTP design space. It was never intended or expected to
operate with jitter over one second. On some occasions in the past, a
sudden increase in jitter like this has been a reliable indicator of
broken hardware. The only credible source for jitter that high is the
hardware or operating system.
Dave
In principle, the NTP daemon knows nothing about Hz; it operates with
256-Hz Ultrix clocks and 1024-Hz Alpha clocks without advance warning
and has no Hz at all at any other frequency.
However, watch carefully what the Unix adjtime() system call does. It
adds/subtracts the tickadj kernel value at each timer interrupt. The NTP
daemon expects the kernel to be able to slew the clock at least 500 PPM
to correct for the intrinsic clock oscillator error. If tickadj is 5
microseconds, which is the recommended value, the effective rate for a
100-Hz clock is 500 PPM. Now, if you start fiddling the clock frequency,
the adjtime() slew rate scales as well. Depending on the value of
tickadj, the effective slew rate might or might not be sufficient at the
lower frequency and that might Hz a lot.
Dave
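The arithmetic in the post above is easy to reproduce: tickadj microseconds per tick times ticks per second gives microseconds per second, which is PPM. The sketch below assumes that simple model. Only the 5 microsecond / 100 Hz pair comes from the post; the other tickadj/HZ pairs are hypothetical values chosen to show how the result scales, and `slew_rate_ppm` is an illustrative helper, not a kernel or ntpd function.

```python
# Sketch: effective adjtime() slew rate for a given tickadj and clock rate.
# tickadj is in microseconds per timer interrupt, hz in interrupts/second,
# so the product is microseconds per second, i.e. PPM.

REQUIRED_PPM = 500  # slew capability the daemon expects, per the post above


def slew_rate_ppm(tickadj_us, hz):
    """Microseconds of adjustment applied per second of elapsed time."""
    return tickadj_us * hz


def is_sufficient(tickadj_us, hz):
    return slew_rate_ppm(tickadj_us, hz) >= REQUIRED_PPM


if __name__ == "__main__":
    # (5, 100) is the recommended case from the post; the rest are
    # hypothetical combinations for comparison.
    for tickadj_us, hz in [(5, 100), (5, 512), (1, 256), (1, 1024)]:
        print(tickadj_us, hz, slew_rate_ppm(tickadj_us, hz),
              is_sufficient(tickadj_us, hz))
```

Whether a given kernel keeps tickadj fixed or rescales it when HZ changes is exactly the open question here, so the sketch only shows that the same tickadj can be adequate at one clock rate and inadequate at another.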
This is from the RedHat site:
"Red Hat Linux 8.0 Release Notes" at
http://www.redhat.com/docs/manuals/linux/RHL-8.0-Manual/release-notes/x86/
under the section "Kernel Notes":
"HZ=512 on i686 and Athlon means that the system clock ticks 5 times
as fast as on other x86 platforms (i386 and i586); HZ=100 has been the
Linux default on x86 platforms for the entire history of the Linux
kernel. This change provides better interactive response, lower
latency response from some programs, and better response from the
scheduler. We have adjusted the /proc file system to report numbers
as if using the default HZ=100. "
I'm too new of a kernel hacker and ntp hacker to comment on whether
this breaks any of the kernel ntp code.
John DeDourek
I stand corrected. I was just reporting my observation that NTPD started
working on a system with HZ=1000 after it was recompiled. Maybe I was lucky.