
Can a clock drift be too big for ntpd?


Patrick Nolan

Oct 18, 2007, 8:59:05 PM
I'm having trouble with one linux client out of a group.
It seems as if ntpd can't keep its clock synchronized. It's
drifting about 6-10 minutes per day, well over the 500 ppm limit.

After reading some of the posts on this newsgroup, I have come
to realize that debugging ntp problems can be quite complex.
Before I launch into a detailed search, I would like to clarify
one question. Is it possible for the clock frequency to be so
far off that ntpd just can't get it under control?

I have stripped my ntp.conf to the minimum, just one server
line and a drift file. The log entries look like this:

Oct 18 17:23:32 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Oct 18 17:27:48 client ntpd[30920]: no servers reachable
Oct 18 17:29:06 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Oct 18 17:29:16 client ntpd[30920]: time reset +10.678548 s
Oct 18 17:29:58 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Oct 18 17:31:59 client ntpd[30920]: no servers reachable

ntpq -p says this:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 grandfather.Sta .GPS.            1 u   49   64  375    0.243  2833.49 1139.84

For a while there was a * in the first column, but it went away.
Did I mention that there are several other clients with no problems at all?
ntpd 4.2.2 is running with -g, burst, and iburst.

So, does this look like a hardware problem?
If not, I'll have to dig into networking and details of the configuration.

Hal Murray

Oct 18, 2007, 10:26:07 PM

>It seems as if ntpd can't keep its clock synchronized. It's
>drifting about 6-10 minutes per day, well over the 500 ppm limit.

>After reading some of the posts on this newsgroup, I have come
>to realize that debugging ntp problems can be quite complex.
>Before I launch into a detailed search, I would like to clarify
>one question. Is it possible for the clock frequency to be so
>far off that ntpd just can't get it under control?

6 min per day is roughly 4000 ppm (360 s out of 86400 s). NTP can't
handle more than 500 ppm.
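
The arithmetic behind that figure is easy to check. Here is a minimal sketch of the conversion, assuming the drift is constant over the day; the helper name `drift_ppm` is made up for this example, and the 500 ppm constant is simply the limit cited in this thread.

```python
SECONDS_PER_DAY = 86_400
NTPD_MAX_PPM = 500  # frequency correction limit cited in this thread


def drift_ppm(seconds_per_day: float) -> float:
    """Convert a clock drift in seconds per day to parts per million."""
    return seconds_per_day / SECONDS_PER_DAY * 1_000_000


# The reported drift of 6 to 10 minutes per day, expressed in ppm.
for minutes in (6, 10):
    ppm = drift_ppm(minutes * 60)
    print(f"{minutes} min/day = {ppm:.0f} ppm ({ppm / NTPD_MAX_PPM:.1f}x the limit)")
```

Read the other way, 500 ppm corresponds to only 43.2 seconds per day, so a drift measured in minutes per day is far outside that range.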


>So, does this look like a hardware problem?

It's a problem. It may not be hardware.

Do you have any other similar machines?

Are you missing interrupts? (say because some strange IO
device eats up a lot of CPU processing its interrupts)

--
These are my opinions, not necessarily my employer's. I hate spam.

Richard B. Gilbert

Oct 18, 2007, 10:35:08 PM
Patrick Nolan wrote:
> I'm having trouble with one linux client out of a group.
> It seems as if ntpd can't keep its clock synchronized. It's
> drifting about 6-10 minutes per day, well over the 500 ppm limit.
>
> After reading some of the posts on this newsgroup, I have come
> to realize that debugging ntp problems can be quite complex.
> Before I launch into a detailed search, I would like to clarify
> one question. Is it possible for the clock frequency to be so
> far off that ntpd just can't get it under control?
>
Yes it is possible. It's rare but it can happen.

Some Linux systems have known problems with losing timer interrupts!
During periods of heavy I/O load disk drivers may mask or disable
interrupts for a little too long a time. . . . Some Windows systems
have also been known to exhibit similar behavior.

> I have stripped my ntp.conf to the minimum, just one server
> line and a drift file. The log entries look like this:
>
> Oct 18 17:23:32 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
> Oct 18 17:27:48 client ntpd[30920]: no servers reachable
> Oct 18 17:29:06 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
> Oct 18 17:29:16 client ntpd[30920]: time reset +10.678548 s
> Oct 18 17:29:58 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
> Oct 18 17:31:59 client ntpd[30920]: no servers reachable
>
> ntpq -p says this:
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  grandfather.Sta .GPS.            1 u   49   64  375    0.243  2833.49 1139.84
>
> For a while there was a * in the first column, but it went away.
> Did I mention that there are several other clients with no problems at all?
> ntpd 4.2.2 is running with -g, burst, and iburst.
>
> So, does this look like a hardware problem?
> If not, I'll have to dig into networking and details of the configuration.

What's the value stored in your drift file?

DON'T use burst! The burst keyword was intended for situations where
ntpd has to make a phone call to NIST (or similar service) to get the
time. It is NOT suitable for general use over the internet.

Iburst is good. It gets you a fast startup and then lets your system
poll the server at normal intervals.

Check the value of a Kernel variable called "HERTZ". Some Linux systems
set it to 1000 which is not good for NTP. If yours is set to 1000 (or
250) try changing it to 100.

Using a single server is not usually a good idea. Two servers are the
worst possible configuration; ntpd has no good way to decide which one
to believe. Three are good but four are better. Try to select servers
that are close to you in network space (low values of Delay).

Patrick Nolan

Oct 19, 2007, 5:52:11 PM
On 2007-10-19, Hal Murray <hal-u...@ip-64-139-1-69.sjc.megapath.net> wrote:
>
> Do you have any other similar machines?

We bought a batch of 4 identical Dell Optiplex 745 machines. Three are running
Linux and one Windows. Only one has this problem.


>
> Are you missing interrupts? (say because some strange IO
> device eats up a lot of CPU processing its interrupts)
>

I'm not sure how to look for this.

I realized something since yesterday. This machine's unique feature is that I
compiled a new kernel to add the Reiser file system. I downloaded the
source RPM for the exact version that came with the distro, and the only
thing I changed was to enable ReiserFS.

Patrick Nolan

Oct 19, 2007, 6:22:44 PM
On 2007-10-19, Richard B. Gilbert <rgilb...@comcast.net> wrote:
>
> Some Linux systems have known problems with losing timer interrupts!
> During periods of heavy I/O load disk drivers may mask or disable
> interrupts for a little too long a time. . . . Some Windows systems
> have also been known to exhibit similar behavior.

I would like to know more about this. How can I monitor the interrupts?

After my original post, I remembered that this machine has a unique
feature. I compiled a new kernel to add the Reiser file system.
I don't think I changed anything else, but I don't have any previous
experience with custom kernels.

>
> What's the value stored in your drift file?

Currently it's 74.080. This morning it started out around 30.

>
> DON'T use burst! The burst keyword was intended for situations where
> ntpd has to make a phone call to NIST (or similar service) to get the
> time. It is NOT suitable for general use over the internet.

Without burst, it just drifts freely. The size of the drift is even
worse than I thought. With burst, here are some lines from the log
file:

Oct 19 13:37:23 client ntpd[12595]: time reset +13.151972 s
Oct 19 13:55:58 client ntpd[12595]: time reset +8.779090 s
Oct 19 14:08:09 client ntpd[12595]: time reset +8.712040 s
Oct 19 14:28:21 client ntpd[12595]: time reset +11.494533 s
Oct 19 14:44:53 client ntpd[12595]: time reset +9.450835 s

If I ever get this situation under control I'll turn off burst.

>
> Iburst is good. It gets you a fast startup and then lets your system
> poll the server at normal intervals.
>
> Check the value of a Kernel variable called "HERTZ". Some Linux systems
> set it to 1000 which is not good for NTP. If yours is set to 1000 (or
> 250) try changing it to 100.

More ignorance on my part. Where would I look for this? I searched
the kernel source code and didn't find it.

>
> Using a single server is not usually a good idea. Two servers are the
> worst possible configuration; ntpd has no good way to decide which one
> to believe. Three are good but four are better. Try to select servers
> that are close to you in network space (low values of Delay).
>
>

Again, I'll fix this if I ever get things running properly. The one
server I chose is our master campus server, which is quite close
network-wise.

Here's another issue. I just learned about the distinction between
the kernel clock and the hardware (TOY) clock. I have tried running
hwclock from time to time, comparing it to my WWV-controlled wall
clock. It never seems to be more than 1 or 2 seconds off. Is there
any way to exploit this?

Richard B. Gilbert

Oct 19, 2007, 8:35:29 PM
<snip>

Sorry, I've read about HERTZ but I'm not a Linux expert, have never
experienced the problem, or actually done the fix.

Harlan Stenn

Oct 19, 2007, 8:44:26 PM
>>> In article <slrnfhg0a...@glast2.Stanford.EDU>, Patrick Nolan <p...@glast2.Stanford.EDU> writes:

Patrick> I'm having trouble with one linux client out of a group. It seems
Patrick> as if ntpd can't keep its clock synchronized. It's drifting about
Patrick> 6-10 minutes per day, well over the 500 ppm limit.

As others have noted, ntpd cannot help in this situation, if it cannot be
corrected. See http://support.ntp.org/Support/TroubleshootingNTP for more
information.

Patrick> After reading some of the posts on this newsgroup, I have come to
Patrick> realize that debugging ntp problems can be quite complex. Before I
Patrick> launch into a detailed search, I would like to clarify one
Patrick> question. Is it possible for the clock frequency to be so far off
Patrick> that ntpd just can't get it under control?

Yes, but be sure to rule out hardware *and* OS issues first. See above.

Patrick> I have stripped my ntp.conf to the minimum, just one server line
Patrick> and a drift file. The log entries look like this:

Patrick> Oct 18 17:23:32 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Patrick> Oct 18 17:27:48 client ntpd[30920]: no servers reachable
Patrick> Oct 18 17:29:06 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Patrick> Oct 18 17:29:16 client ntpd[30920]: time reset +10.678548 s
Patrick> Oct 18 17:29:58 client ntpd[30920]: synchronized to 171.64.7.87, stratum 1
Patrick> Oct 18 17:31:59 client ntpd[30920]: no servers reachable

Patrick> ntpq -p says this:
Patrick>      remote           refid      st t when poll reach   delay   offset  jitter
Patrick> ==============================================================================
Patrick>  grandfather.Sta .GPS.            1 u   49   64  375    0.243  2833.49 1139.84

grandfather is not sync'd - you will not be able to sync to it while it is
like this.

Patrick> For a while there was a * in the first column, but it went away.

For a while, that was good. But then it was bad.

Patrick> Did I mention that there are several other clients with no problems
Patrick> at all? ntpd 4.2.2 is running with -g, burst, and iburst.

Burst is probably not your friend. There will also be pages on the support
web that will help with that, too.

Patrick> So, does this look like a hardware problem? If not, I'll have to
Patrick> dig into networking and details of the configuration.

I recommend digging in to both HW and OS issues first.

And having more machines to sync with would also be useful.

H

Dennis Hilberg, Jr.

Oct 19, 2007, 9:50:53 PM
Patrick Nolan wrote:
> More ignorance on my part. Where would I look for this? I searched
> the kernel source code and didn't find it.

I did a little searching for you. If you're using a 2.6 kernel, you're
looking for HZ in /usr/src/linux/include/asm-i386/param.h . In the current
version of the kernel, it's set to 100. I just checked it.

I think you can also set it in make menuconfig.

--
Dennis Hilberg, Jr. timekeeper(at)dennishilberg(dot)com
NTP Server Information: http://saturn.dennishilberg.com/ntp.php

Hal Murray

Oct 19, 2007, 10:21:49 PM
>> Check the value of a Kernel variable called "HERTZ". Some Linux systems
>> set it to 1000 which is not good for NTP. If yours is set to 1000 (or
>> 250) try changing it to 100.
>
>More ignorance on my part. Where would I look for this? I searched
>the kernel source code and didn't find it.

It's probably HZ rather than HERTZ.

If you use make menuconfig...
  Processor type and features  --->
    Timer frequency (250 HZ)  --->
That gets me 3 choices: 100, 250, and 1000.

That's on a 2.6 kernel.

The problem with interrupts is roughly this...
Each time the scheduler clock ticks, the system bumps
the time by x ms. If the CPU is busy doing something
else (like mucking with the file system) and doesn't get around
to updating the clock before the next tick then a tick's worth
of time gets lost. That seems to match your clock always jumping
forwards. (But maybe I have the sign bit backwards.)
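
To make the lost-tick mechanism concrete, here is a toy model. It is a sketch under stated assumptions, not a description of how any particular kernel behaves: ticks fire every 1/HZ seconds, and at most one pending tick is remembered while interrupts are blocked. The function names and the 2.5 ms blocking window are hypothetical.

```python
def ticks_lost(hz: int, block_start_us: int, block_len_us: int) -> int:
    """Count timer ticks dropped while interrupts are blocked.

    Assumption (illustration only): ticks fire every 1/hz seconds, and at
    most one pending tick is remembered while interrupts are blocked, so
    every further tick inside the window is lost.
    """
    period_us = 1_000_000 // hz
    block_end_us = block_start_us + block_len_us
    first = -(-block_start_us // period_us)    # first tick at or after the start
    last = -(-block_end_us // period_us) - 1   # last tick strictly before the end
    ticks_in_window = max(0, last - first + 1)
    return max(0, ticks_in_window - 1)


def slowdown_ppm(hz: int, lost_per_second: float) -> float:
    """Clock error in ppm when lost_per_second ticks are dropped each second."""
    return lost_per_second / hz * 1_000_000


# A hypothetical 2.5 ms stretch with interrupts blocked, starting 0.5 ms after a tick.
for hz in (100, 250, 1000):
    print(f"HZ={hz}: {ticks_lost(hz, block_start_us=500, block_len_us=2500)} tick(s) lost")
```

In this model the same blocking window loses a tick at HZ=1000 but none at HZ=250 or HZ=100, because the longer tick period leaves more slack. A clock that loses ticks runs slow, so ntpd has to step it forwards.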

I don't know why Reiser uses a lot of CPU at interrupt level,
or even if it does.

If changing HZ fixes things, we should be sure to update the wiki.

Steve Kostecke

Oct 19, 2007, 10:34:31 PM
On 2007-10-19, Patrick Nolan <p...@glast2.Stanford.EDU> wrote:

> On 2007-10-19, Hal Murray wrote:
>
>> Do you have any other similar machines?
>
> We bought a batch of 4 identical Dell Optiplex 745 machines. Three are
> running Linux and one Windows. Only one has this problem.

It is possible that you have one motherboard with a hardware problem.

>> Are you missing interrupts? (say because some strange IO device eats
>> up a lot of CPU processing its interrupts)
>
> I'm not sure how to look for this.

It is de rigueur to mention Linux and missing interrupts in the same
sentence. Even in the absence of evidence.

If you can establish a correlation between clock resets and periods of
heavy IDE disk activity, you may have an interrupt problem.

> I realized something since yesterday. This machine's unique feature
> is that I compiled a new kernel to add the Reiser file system. I
> downloaded the source RPM for the exact version that came with the
> distro, and the only thing I changed was to enable ReiserFS.

Can you try recompiling the kernel without Reiser FS and seeing if that
makes a difference?

On a 2.6.15 kernel the HZ setting may be found in "make menuconfig"
under "Processor type and features" as "Timer frequency".

FWIW, my 2.6.15 kernel is set to 250Hz and ntpd has no problem with it.

--
Steve Kostecke <kost...@ntp.org>
NTP Public Services Project - http://support.ntp.org/

Hal Murray

Oct 19, 2007, 11:17:55 PM

>It is de-rigueur to mention Linux and missing interrupts in the same
>sentence. Even in the absence of evidence.

RedHat 7.2 shipped with IDE disks running with DMA disabled.
(I assumed that was a workaround for buggy hardware or a buggy driver.)

That was ok with light loads which covered most usage patterns.
I had a couple of hacks for measuring disk throughput that would
generate lost interrupts every time. Lots of them.

Turning on DMA solved the problem.

Steve Kostecke

Oct 19, 2007, 10:57:47 PM
On 2007-10-19, Patrick Nolan <p...@glast2.Stanford.EDU> wrote:
> On 2007-10-19, Richard B. Gilbert <rgilb...@comcast.net> wrote:
>>
>> What's the value stored in your drift file?
>
> Currently it's 74.080. This morning it started out around 30.

That's an excessive change.

>> DON'T use burst! The burst keyword was intended for situations where
>> ntpd has to make a phone call to NIST (or similar service) to get the
>> time. It is NOT suitable for general use over the internet.

'burst' is an 8x multiplier; it causes ntpd to send a burst of 8 packets
at every poll interval instead of the usual single packet. That's all.

In addition to the previously mentioned use during modem connections,
'burst' is useful in situations where one must overcome packet loss.

The problem with using burst against someone else's time server is that
you are greatly increasing the load your ntpd presents to that server.

> Without burst, it just drifts freely.

This would suggest that there may be connectivity issues. Perhaps faulty
hardware or, perhaps, interrupt sharing between the NIC and another busy
portion of the system (e.g. a hard drive).

> Oct 19 13:37:23 client ntpd[12595]: time reset +13.151972 s
> Oct 19 13:55:58 client ntpd[12595]: time reset +8.779090 s
> Oct 19 14:08:09 client ntpd[12595]: time reset +8.712040 s
> Oct 19 14:28:21 client ntpd[12595]: time reset +11.494533 s
> Oct 19 14:44:53 client ntpd[12595]: time reset +9.450835 s

The rate of change is fairly consistent. It ranges from 7.8 to 11.9
ms/sec. This is actually a good sign.

Since the rate of change is fairly consistent (and is always in the same
direction), it's possible that adjusting the tick may be the solution.

You may want to review a larger log to be sure.
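
Reviewing a larger log by hand gets tedious, so here is a rough sketch of how those rates can be derived from the "time reset" lines. It assumes each step equals the drift accumulated since the previous step, which ignores ntpd's own slewing between steps. The helper names are made up, and the year is filled in from the dates of the posts because syslog lines omit it.

```python
from datetime import datetime

LOG = """\
Oct 19 13:37:23 client ntpd[12595]: time reset +13.151972 s
Oct 19 13:55:58 client ntpd[12595]: time reset +8.779090 s
Oct 19 14:08:09 client ntpd[12595]: time reset +8.712040 s
Oct 19 14:28:21 client ntpd[12595]: time reset +11.494533 s
Oct 19 14:44:53 client ntpd[12595]: time reset +9.450835 s
"""

YEAR = 2007  # syslog lines omit the year; taken from the dates of the posts


def parse_reset(line: str) -> tuple[datetime, float]:
    """Return (timestamp, step in seconds) for one 'time reset' log line."""
    fields = line.split()
    stamp = datetime.strptime(f"{YEAR} {' '.join(fields[:3])}", "%Y %b %d %H:%M:%S")
    return stamp, float(fields[-2])


def drift_rates_ms_per_s(log: str) -> list[float]:
    """Step size divided by the time since the previous step, in ms per second."""
    resets = [parse_reset(line) for line in log.strip().splitlines()]
    rates = []
    for (prev_time, _), (cur_time, step_s) in zip(resets, resets[1:]):
        elapsed_s = (cur_time - prev_time).total_seconds()
        rates.append(step_s / elapsed_s * 1000.0)
    return rates


rates = drift_rates_ms_per_s(LOG)
print([round(r, 1) for r in rates])
```

For these five lines the result is about 7.9 to 11.9 ms/s, in line with the range quoted above. Since 1 ms/s is 1000 ppm, that is roughly 7900 to 11900 ppm.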

>> Check the value of a Kernel variable called "HERTZ". Some Linux systems
>> set it to 1000 which is not good for NTP. If yours is set to 1000 (or
>> 250) try changing it to 100.
>
> More ignorance on my part. Where would I look for this? I searched
> the kernel source code and didn't find it.

Answered in another article.

> Here's another issue. I just learned about the distinction between
> the kernel clock and the hardware (TOY) clock. I have tried running
> hwclock from time to time, comparing it to my WWV-controlled wall
> clock. It never seems to be more than 1 or 2 seconds off. Is there
> any way to exploit this?

The hardware clock in a PC is made of exceedingly cheap components. A
common quartz wristwatch is a better clock.

Dennis Hilberg, Jr.

Oct 20, 2007, 12:19:20 AM
Steve Kostecke wrote:
> FWIW, my 2.6.15 kernel is set to 250Hz and ntpd has no problem with it.

My stock Mandriva 2.6.17-5mdv kernel is also set to 250Hz with no problems.

David Woolley

Oct 20, 2007, 6:25:28 AM
In article <ywn91wbq...@ntp1.isc.org>,
Harlan Stenn <st...@ntp.org> wrote:

>>> In article <slrnfhg0a...@glast2.Stanford.EDU>, Patrick Nolan <p...@glast2.Stanford.EDU> writes:

Patrick> 6-10 minutes per day, well over the 500 ppm limit.

> As others have noted, ntpd cannot help in this situation, if it cannot be

If it is systematic, it can be pre-corrected using tickadj. However, a
static error this large due to the clock hardware would suggest the need
for a new motherboard. An error due to lost interrupts will vary with
system loading; at best it will result in severe hunting of the time,
and the difference between quiet and busy periods may exceed the ~900 ppm
that you can cover by pre-compensating for the centre of the error band.
If the uncertainty in the daily, open-loop drift really is 10 - 6 = 4
minutes, you almost certainly do have a lost-interrupts problem and
pre-compensating will not be enough.
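
A quick check of that last claim, as a sketch only. It assumes the reported 6 and 10 minutes per day are the two edges of the error band, and it takes the ~900 ppm usable band from the paragraph above; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def to_ppm(seconds_per_day: float) -> float:
    """Convert a drift in seconds per day to parts per million."""
    return seconds_per_day / SECONDS_PER_DAY * 1_000_000


def precompensation_fits(low_ppm: float, high_ppm: float,
                         usable_band_ppm: float = 900.0):
    """Return (centre, spread, fits) for an error band.

    Pre-compensating (e.g. with tickadj) removes the centre of the band;
    ntpd then has to absorb the remaining spread around that centre.
    """
    centre = (low_ppm + high_ppm) / 2
    spread = high_ppm - low_ppm
    return centre, spread, spread <= usable_band_ppm


centre, spread, fits = precompensation_fits(to_ppm(6 * 60), to_ppm(10 * 60))
print(f"centre {centre:.0f} ppm, spread {spread:.0f} ppm, fits: {fits}")
```

Under these assumptions the spread is close to 2800 ppm, about three times what pre-compensation plus ntpd could cover.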

> Yes, but be sure to rule out hardware *and* OS issues first. See above.

A bad clock frequency is a hardware issue!

> Patrick> ntpq -p says this:
> Patrick>      remote           refid      st t when poll reach   delay   offset  jitter
> Patrick> ==============================================================================
> Patrick>  grandfather.Sta .GPS.            1 u   49   64  375    0.243  2833.49 1139.84

> grandfather is not sync'd - you will not be able to sync to it while it is
> like this.

grandfather is synced (stratum 1). It has lost its * because the
offset is more than 128ms and ntpd is in the reconfirming offset state.
It does seem to have lost one query, suggesting that, especially if
it is on the LAN, the network is overloaded (wide area networks are
designed to lose the occasional packet). However the delay is very low,
so it is unlikely that even severe network congestion will result in
clock steps. (Alternatively, you just lost a large number of consecutive
clock interrupts, and the real delay is much larger - but then the lost
interrupts would be the primary issue.)
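
The "lost one query" reading comes from the reach column. ntpq prints reach as an octal 8-bit shift register, one bit per recent poll, with the newest poll in the lowest bit. A minimal sketch of the decoding (the helper name is made up for this example):

```python
def decode_reach(reach_octal: str) -> list[bool]:
    """Decode ntpq's octal reach value into 8 poll results, oldest first.

    True means a reply was received for that poll, False means it was missed.
    """
    value = int(reach_octal, 8)
    return [bool((value >> bit) & 1) for bit in range(7, -1, -1)]


polls = decode_reach("375")
print(polls)
print("missed polls:", polls.count(False))
```

A reach of 375 is binary 11111101: one of the last eight polls got no reply, and it was the one before the most recent. A value of 377 would mean all eight polls were answered.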

Uwe Klein

Oct 20, 2007, 7:44:30 AM
Patrick Nolan wrote:
> On 2007-10-19, Richard B. Gilbert <rgilb...@comcast.net> wrote:

>>Check the value of a Kernel variable called "HERTZ". Some Linux systems
>>set it to 1000 which is not good for NTP. If yours is set to 1000 (or
>>250) try changing it to 100.
>
>
> More ignorance on my part. Where would I look for this? I searched
> the kernel source code and didn't find it.

you can do ( the .config file used is compiled into the kernel )
zcat /proc/config.gz | grep HZ

Looks like this:
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
( 250 Hz for my box )

usually it is also copied into /boot/ when installing the new
kernel as config-<$kernelversion>

Change it in the /usr/src/linux/.config file or use
make <UI of choice>config in /usr/src/linux/ before building
a new kernel.
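
The same check can be scripted. This is a small sketch, assuming the kernel exposes its configuration at /proc/config.gz (which requires a kernel built with that feature); the function names are hypothetical.

```python
import gzip
import re
from pathlib import Path
from typing import Optional


def parse_config_hz(config_text: str) -> Optional[int]:
    """Return the CONFIG_HZ value from kernel config text, or None if absent."""
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def running_kernel_hz(path: str = "/proc/config.gz") -> Optional[int]:
    """Read CONFIG_HZ of the running kernel, or None if the file is missing."""
    if not Path(path).exists():
        return None
    with gzip.open(path, "rt") as handle:
        return parse_config_hz(handle.read())


SAMPLE = """\
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
"""

print(parse_config_hz(SAMPLE))
print(running_kernel_hz())
```

If /proc/config.gz is not there, the copy under /boot/ mentioned above can be read as plain text and passed to the same parser.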

uwe

Jos van de Ven

Oct 20, 2007, 9:48:28 AM
After OP posted the start of this thread, I wanted to know more about this,
but as I dig deeper it becomes more confusing to me.

My server runs fine (I think) on a 2.6.9- kernel. I read that 2.6 kernels
have 1000 Hz as default and you can not change that in config files. At
least I don't see anything related to HZ or HERTZ in my config files.

My relative frequency is stable at about -148 ppm, so my system clock runs
too fast due to 1000 Hz setting as I understand. I read that -148 ppm is not
a good clock; should be no more than absolute 50 ppm. Is it the kernel
that's making my clock bad or just also bad hardware or a combination of
both?
What if I do a tickadj, so the frequency is stable at about 0 ppm, do I have
a better clock then? I don't think so, but other people querying my server,
may think it is very reliable.

Jos van de Ven


"David Woolley" <da...@ex.djwhome.demon.co.uk.invalid> wrote in message
news:4719d8c0$0$511$5a6a...@news.aaisp.net.uk...

Maarten Wiltink

Oct 20, 2007, 11:24:41 AM
"Jos van de Ven" <j...@notavailable.nl> wrote in message
news:53889$471a0733$5351af52$24...@cache100.multikabel.net...
[...]

> My relative frequency is stable at about -148 ppm, so my system clock
> runs too fast due to 1000 Hz setting as I understand.

It would run fast if not corrected, but I would hesitate to ascribe it
to the kernel's HZ setting.


> I read that -148 ppm is not a good clock; should be no more than
> absolute 50 ppm.

Why not? It's well within the limit of what NTP can correct for.


> Is it the kernel that's making my clock bad or just also bad hardware
> or a combination of both?

You don't know. I don't, either.


> What if I do a tickadj, so the frequency is stable at about 0 ppm,
> do I have a better clock then? I don't think so, but other people
> querying my server, may think it is very reliable.

It could correct for a wider range of transients. If -150 PPM is your
starting point, an extra disturbance of +350 PPM pushes you over the
limit. If you pre-correct, it could deal with disturbances up to
+500 PPM. On the other hand (side), though, you could stand a -650 PPM
disturbance in the first, and 'only' -500 PPM in the second case.
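
Those numbers follow from simple headroom arithmetic. Here is a sketch, assuming the +/-500 ppm correction range discussed in this thread and that a negative correction means the uncorrected clock runs fast; the function name is made up.

```python
def disturbance_headroom(correction_ppm: float, limit_ppm: float = 500.0):
    """Return (faster, slower): how much extra clock error ntpd can still absorb.

    'faster' is the largest disturbance that speeds the clock up (needing a
    more negative correction); 'slower' is the largest one that slows it down.
    """
    faster = limit_ppm + correction_ppm
    slower = limit_ppm - correction_ppm
    return faster, slower


print(disturbance_headroom(-150.0))  # correction as found
print(disturbance_headroom(0.0))     # after pre-correcting, e.g. with tickadj
```

With a -150 ppm correction the margins are 350 ppm one way and 650 ppm the other. Pre-correcting to 0 ppm makes them 500 ppm each way: the total range is unchanged, it is only recentred.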

However, even 350 PPM is absurd. If you saw variations like that, I'd
recommend not leaving your computer out in the desert at night.

And those other people should *never* see your uncorrected frequency
offset. Even loading this value from the drift file would correct
for it.

Groetjes,
Maarten Wiltink


Steve Kostecke

Oct 20, 2007, 11:06:39 AM
On 2007-10-20, Jos van de Ven <j...@notavailable.nl> wrote:

> My relative frequency is stable at about -148 ppm,

That number is the frequency CORRECTION applied by ntpd.

The fact that this value is stable is much more important than the value
itself.

> so my system clock runs too fast due to 1000 Hz setting as I
> understand. I read that -148 ppm is not a good clock; should be no
> more than absolute 50 ppm.

That's simply not true.

ntpd is able to apply a frequency correction of between -500 ppm and
+500 ppm.

As long as you are not operating at the margins (i.e. close to +/- 500)
ntpd will be able to discipline your clock.

Hal Murray

Oct 20, 2007, 1:43:13 PM
>A bad clock frequency is a hardware issue!

Or a software bug. There have been troubles with roundoff
on the value of the tick size.

Uwe Klein

Oct 20, 2007, 2:51:42 PM
Steve Kostecke wrote:
> On 2007-10-20, Jos van de Ven <j...@notavailable.nl> wrote:
>>so my system clock runs too fast due to 1000 Hz setting as I
>>understand. I read that -148 ppm is not a good clock; should be no
>>more than absolute 50 ppm.
>
>
> That's simply not true.
>
> ntpd is able to apply a frequency correction of between -500 ppm and
> +500 ppm.
>
> As long as you are not operating at the margins (i.e. close to +/- 500)
> ntpd will be able to discipline your clock.
>
Just to give an idea about quality of std commercial xtals
el cheapo as found on MOBOs:

tolerance (25°C):                      ±50 ppm
stability over operating temp range:   ±50 ppm
aging per year:                        ±5 ppm/a

Uwe Klein

Oct 20, 2007, 3:36:32 PM
Patrick Nolan wrote:
> I'm having trouble with one linux client out of a group.
> It seems as if ntpd can't keep its clock synchronized. It's
> drifting about 6-10 minutes per day, well over the 500 ppm limit.

Could it be that "clock spread spectrum" is
enabled (in bios) on that box and
not on any of the others?

Depending on the chip used, the effective clock rate may be lower
than without it.

uwe

Patrick Nolan

Oct 23, 2007, 1:57:22 PM
On 2007-10-19, Patrick Nolan <p...@glast2.Stanford.EDU> wrote:
>
> I realized something since yesterday. This machine's unique feature is that I
> compiled a new kernel to add the Reiser file system. I downloaded the
> source RPM for the exact version that came with the distro, and the only
> thing I changed was to enable ReiserFS.

I got a chance to reboot the machine with the original distro kernel,
without Reiser FS. The clock problem goes away. ntpd syncs up and stays
synced.

My next step will be to make a kernel with Reiser but with HZ reduced from
1000 to 250. Stay tuned for further developments.

This is kernel 2.6.18 running on an Intel Core Duo, 2.4 GHz.

Uwe Klein

Oct 23, 2007, 2:20:02 PM

start with the config from the original kernel
(/boot/config* or zcat /proc/config.gz >/usr/src/linux/.config)

build once and check that you get the same (correct) behaviour.
activate features you want to use preferably as modules.

this gives easier testing without reboot.

uwe

Patrick Nolan

Oct 24, 2007, 3:06:10 PM

It works. With kernel HZ reduced to 250 and a Reiser disk mounted,
ntp is able to sync. I have gone back to a conventional ntp.conf with
multiple servers.

Patrick Nolan

Oct 24, 2007, 3:24:00 PM
On 2007-10-20, Steve Kostecke <kost...@ntp.org> wrote:
> The hardware clock in a PC is made of exceedingly cheap components. A
> common quartz wristwatch is a better clock.
>
I have noticed this. Until WWV-controlled clocks came out, my most
accurate timepiece was a $20 Casio wristwatch. When the first one
broke I got another which was just as accurate. It really bugs me
that clocks in computers and cars are not as good. I'm amazed that
clock radios, plugged into the 60 Hz supply, aren't as good.
Sometimes I wish I had an old-fashioned motorized electric clock
powered by AC.

Maarten Wiltink

Oct 24, 2007, 3:35:58 PM
"Patrick Nolan" <p...@glast2.Stanford.EDU> wrote in message
news:slrnfhv6u...@glast2.Stanford.EDU...

> [...] I'm amazed that clock radios, plugged into the 60 Hz supply,
> aren't as good [as wrist watches].

So am I. Because here at least, they keep a close eye on the number
of zero crossings in the power plants, and make very sure that it
comes out right at the end of the day.

You'd almost think the clocks depend on 220 V really being 220 V,
rather than 50 Hz really being 50 Hz.

Groetjes,
Maarten Wiltink


Hal Murray

Oct 24, 2007, 3:38:32 PM
In article <slrnfhv6u...@glast2.Stanford.EDU>,

Wrist watches work in a nice, temperature controlled environment.

Motors are expensive. Transistors are almost free if you need
the package for something else.

Spoon

Oct 25, 2007, 5:24:53 AM
Patrick Nolan wrote:
> Patrick Nolan wrote:

>> Patrick Nolan wrote:
>>> I realized something since yesterday. This machine's unique feature is that I
>>> compiled a new kernel to add the Reiser file system. I downloaded the
>>> source RPM for the exact version that came with the distro, and the only
>>> thing I changed was to enable ReiserFS.
>> I got a chance to reboot the machine with the original distro kernel,
>> without Reiser FS. The clock problem goes away. ntpd syncs up and stays
>> synced.
>>
>> My next step will be to make a kernel with Reiser but with HZ reduced from
>> 1000 to 250. Stay tuned for further developments.
>>
>> This is kernel 2.6.18 running on an Intel Core Duo, 2.4 GHz.
>
> It works. With kernel HZ reduced to 250 and a Reiser disk mounted,
> ntp is able to sync. I have gone back to a conventional ntp.conf with
> multiple servers.

If I understand correctly, the problem was HZ=1000?

Richard B. Gilbert

Oct 25, 2007, 7:37:01 AM

Learn to suffer!! Manufacturers will do almost anything to avoid adding
$20 to the cost of their product! People will not buy a watch that does
not keep time. The same cannot be said of computers (or cars)!

The most important considerations, to most people, are "How many GHz?"
and "How much does it cost?" Possibly, some people consider how much
disk space, or the choice of CDRW vs. DVD+/-RW. The clock is seldom
considered at all.

There are something like 300,000,000 people in the US. I doubt if there
are 300 people who contribute to this newsgroup regularly. Consider
also the large number of people who only want to synchronize the clocks
of their computers and simply do not care what time it is.


Richard B. Gilbert

Oct 25, 2007, 7:47:34 AM
Maarten Wiltink wrote:
> "Patrick Nolan" <p...@glast2.Stanford.EDU> wrote in message
> news:slrnfhv6u...@glast2.Stanford.EDU...
>
>
>>[...] I'm amazed that clock radios, plugged into the 60 Hz supply,
>>aren't as good [as wrist watches].
>
>
> So am I. Because here at least, they keep a close eye on the number
> of zero crossings in the power plants, and make very sure that it
> comes out right at the end of the day.

I think that power companies almost everywhere count the cycles in the
day. Since most are interconnected, staying in phase is mandatory! So
there are 60*86400 or 50*86400 cycles per day but the difference between
any two consecutive crossings is not guaranteed to be exactly 1/100 or
1/120 of a second. The generators slow down when the load increases or
speed up when the load decreases!


Patrick Nolan

Oct 25, 2007, 1:12:59 PM

It was the combination of Reiser FS and HZ=1000.
HZ=250 with Reiser works.
HZ=1000 without Reiser works.
HZ=1000 with Reiser doesn't work.

Evandro Menezes

Oct 25, 2007, 3:40:59 PM
Having had my share of bad HW and poorly configured SW leading to
haywire in NTP, I'd add a couple of suggestions to what others have
already said:

- power management with dynamic CPU clock control may confuse some
kernels on some systems, when using the kernel boot option helps.

- some patches tackle buggy chipsets, when the best is to copy the
original config file, as others suggested, and just changing the
option you're interested in.

HTH

Evandro Menezes

Oct 25, 2007, 3:50:56 PM
On Oct 25, 2:40 pm, Evandro Menezes <evan...@mailinator.com> wrote:

> - power management with dynamic CPU clock control may confuse some
> kernels on some systems, when using the kernel boot option helps.

Ahem, "when using the kernel boot option notsc helps."

David L. Mills

Oct 26, 2007, 10:06:06 AM
Dennis,

As long as you do not use the kernel at other than 100 Hz, things should
go well. If you do use the kernel at 250 Hz, you should check the
overshoot of the impulse response; it will probably be in the 10-15
percent range, which is not good.

It's easy to fix the problem with the kernel, and on previous occasions
folks have patched the Linux kernel to work even at 1000 Hz. As
proof, note the Digital RISC has Hz 256 and the Alpha has Hz 1024. As
the Linux kernelmongers seem to have little interest in fixing this and
related problems with ntpd, my suggested workaround is to use FreeBSD.

Dave
