NTP 4.1.0 not replying from correct source address

David Schwartz

Sep 5, 2001, 6:30:47 PM

I thought NTP bound individually to each interface address so it could
respond with the correct source address, but this is not true:

15:27:06.064550 eth0 < 206.x.y.140.33084 > 216.a.b.143.ntp: v2 res2
    strat 0 poll 2 prec 4 (DF)
15:27:06.064748 eth0 > 216.a.b.137.ntp > 206.x.y.140.33084: [len=8] v2
    -1s res2 strat 0 poll 2 prec 4 (DF) [tos 0x10]

This causes tools like 'ntpdc' to ignore the reply. Is there a
configuration option that fixes this by making NTP bind individually to
each IP address?

Is this intentional to prevent the same NTP server from appearing as
two different servers? If so, wouldn't a 'server ID' (which could also
be the 'preferred' IP on which to contact the server) be a better
solution?

DS
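
The per-address binding described above can be sketched in C as follows. This is a minimal illustration, not code from ntpd: the function names `bind_udp_to_address` and `socket_is_bound_to` are hypothetical, and only IPv4 is handled. A reply sent from a socket bound this way carries that socket's address as its source.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a UDP socket bound to one specific local IPv4 address.
 * Returns the descriptor, or -1 on failure. (Hypothetical helper.) */
int bind_udp_to_address(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                  /* not a valid dotted-quad address */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return 1 if fd is bound to the given IPv4 address, else 0. */
int socket_is_bound_to(int fd, const char *ip)
{
    struct sockaddr_in local;
    struct in_addr want;
    socklen_t len = sizeof(local);

    if (inet_pton(AF_INET, ip, &want) != 1)
        return 0;
    if (getsockname(fd, (struct sockaddr *)&local, &len) < 0)
        return 0;
    return local.sin_addr.s_addr == want.s_addr;
}
```

A server would call the first helper once per configured address (port 123 for NTP, which normally needs root) and answer each query on the socket it arrived on.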

David L. Mills

Sep 5, 2001, 9:59:07 PM
to David Schwartz
David,

NTPv4 stateless servers should respond with the same interface address
the client packet came in on. Before sending the first packet, a
stateful client will attempt to determine the outgoing interface to
use. This is necessary for autokey to work;
however, at least one system (Linux) doesn't cooperate with the same
code other systems use. I read your client version as NTPv2; I'm not
sure I believe that ten year old dinosaur is still around. Also, what's
with that port number? If I understand the other numbers, the poll value
and precision values are bogus and the server is not synchronized. Are
you sure you are running NTP?

Dave
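
For readers following the question about the version, poll and precision fields: in a standard NTP packet the first byte packs the leap indicator, version and mode, and the next three bytes hold stratum, poll and precision. A minimal decoding sketch in C is below; the struct and function names are illustrative and not taken from the ntpd sources.

```c
#include <stdint.h>

/* Fields carried in the first four bytes of a standard NTP packet. */
struct ntp_header_fields {
    unsigned leap;      /* leap indicator, top 2 bits of byte 0 */
    unsigned version;   /* protocol version, next 3 bits */
    unsigned mode;      /* low 3 bits: 3 = client, 4 = server,
                           6 = control, 7 = private */
    unsigned stratum;   /* byte 1 */
    int poll;           /* byte 2, signed log2 seconds */
    int precision;      /* byte 3, signed log2 seconds */
};

struct ntp_header_fields decode_ntp_header(const uint8_t pkt[4])
{
    struct ntp_header_fields f;

    f.leap      = (pkt[0] >> 6) & 0x3;
    f.version   = (pkt[0] >> 3) & 0x7;
    f.mode      = pkt[0] & 0x7;
    f.stratum   = pkt[1];
    f.poll      = (int8_t)pkt[2];
    f.precision = (int8_t)pkt[3];
    return f;
}
```

Mode 6 (control) and mode 7 (private) packets share only the first byte of this layout, so printing their later bytes as stratum, poll and precision gives meaningless numbers. That may be what the trace above shows, though nothing in the thread confirms it.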

David Schwartz

Sep 5, 2001, 11:31:19 PM
"David L. Mills" wrote:

> NTPv4 stateless servers should respond with the same interface address
> the client packet came in on. Before sending the first packet, a
> stateful client will attempt to determine the outgoing interface to
> use. This is necessary for autokey to work;
> however, at least one system (Linux) doesn't cooperate with the same
> code other systems use. I read your client version as NTPv2; I'm not
> sure I believe that ten year old dinosaur is still around. Also, what's
> with that port number? If I understand the other numbers, the poll value
> and precision values are bogus and the server is not synchronized. Are
> you sure you are running NTP?

The server is running 4.1.0 and is Linux. I simply queried it with
'ntpq'. The same thing happens if I query it with 'ntpdate'.

DS

David L. Mills

Sep 6, 2001, 6:11:05 PM
to David Schwartz
David Schwartz wrote:
>
> "David L. Mills" wrote:
> > ...
> > I read your client version as NTPv2; I'm not
> > sure I believe that ten year old dinosaur is still around. Also, what's
> > with that port number? If I understand the other numbers, the poll value
> > and precision values are bogus and the server is not synchronized. Are
> > you sure you are running NTP?
>
> The server is running 4.1.0 and is Linux. I simply queried it with
> 'ntpq'. The same thing happens if I query it with 'ntpdate'.
>
> DS

Please answer the questions in my message. Why do those fields read as
they do? Obviously, something is seriously broken.

Dave

Ulrich Windl

Sep 7, 2001, 1:55:25 AM
David Schwartz <dav...@webmaster.com> writes:

You could try uninterpreted raw data. Maybe your tcpdump is old or broken.

>
> DS

David Schwartz

Sep 7, 2001, 3:32:53 PM
Ulrich Windl wrote:

> You could try uninterpreted raw data. Maybe your tcpdump is old or broken.

It's the one included with RedHat 7.1. Even if it is old or broken I
doubt it got the source address wrong.

DS

David Schwartz

Sep 7, 2001, 11:05:05 PM

Okay, here's the symptom:

$ uname -a
Linux machine.querying.from 2.4.10-pre4 #14 SMP Tue Sep 4 16:39:23 PDT 2001 i686 unknown
$ ntpdate -v
7 Sep 19:48:31 ntpdate[22872]: ntpdate 4.1.0 Fri Aug 3 16:42:46 PDT 2001 (1)
7 Sep 19:48:31 ntpdate[22872]: no servers can be used, exiting

$ ntpdate -q some.second.name
server some.second.name, stratum 0, offset 0.000000, delay 0.00000
7 Sep 19:48:45 ntpdate[22874]: no server suitable for synchronization found

$ ntpdate -q some.first.name
server some.first.name, stratum 3, offset 0.002279, delay 0.08676
7 Sep 19:48:53 ntpdate[22875]: adjust time server 1.1.1.1 offset 0.002279 sec

$ ntpq -c "rv 0 daemon_version" some.first.name
status=06f4 leap_none, sync_ntp, 15 events, event_peer/strat_chg,
daemon_version="ntpd 4.1.0 Wed Sep 5 15:20:10 PDT 2001 (1)"

Now 'some.first.name' and 'some.second.name' are the same physical
machine. 'some.first.name' resolves to the primary IP,
'some.second.name' is a secondary. The machine runs Linux 2.4.

Now here are the queries that didn't work:

machine.querying.from.33189 > some.second.name.webchat.org.ntp:
v4 client strat 0 poll 4 prec -6 (DF)

some.first.name.some.second.name.com.ntp > machine.querying.from.33189:
v4 server strat 3 poll 4 prec -17 (DF) [tos 0x10]

This repeats four times. I keep querying 'some.second.name', but
'some.first.name' keeps responding. Here's what it looks like when I
query 'some.first.name':

machine.querying.from.33190 > some.first.name.some.second.name.com.ntp:
v4 client strat 0 poll 4 prec -6 (DF)

some.first.name.some.second.name.com.ntp > machine.querying.from.33190:
v4 server strat 3 poll 4 prec -17 (DF) [tos 0x10]

In this case, 'some.first.name' responds and all is well. Otherwise,
the two queries are identical.

DS
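
The client-side symptom (a reply that arrives but is discarded) follows from a check along these lines. This is a sketch of the general technique, not the actual ntpdate or ntpq code; `reply_matches_query` and `make_addr` are hypothetical names, and the addresses in the usage note are documentation placeholders.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Return 1 if a reply's source address and port match the server that
 * was queried, else 0. A client applying this test drops replies that
 * come back from a different address of the same host. */
int reply_matches_query(const struct sockaddr_in *queried,
                        const struct sockaddr_in *reply_src)
{
    return queried->sin_addr.s_addr == reply_src->sin_addr.s_addr
        && queried->sin_port == reply_src->sin_port;
}

/* Fill in an IPv4 socket address from text; returns 0 on success. */
int make_addr(struct sockaddr_in *out, const char *ip, unsigned short port)
{
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

With a query sent to 192.0.2.143 port 123 and a reply arriving from 192.0.2.137 port 123, the check fails, which matches the "no server suitable for synchronization found" outcome reported above.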

David L. Mills

Sep 8, 2001, 12:22:24 PM
to David Schwartz
David,

Please, I'm not getting through. I need the trace from the <server>
itself. The data you sent are from the client and are insufficient to
pin down the cause. Remember, the server is stateless, so only the debug
trace will reveal the actual addresses.

Dave

David L. Mills

Sep 8, 2001, 2:21:39 PM
David,

A FreeBSD router deep in the tangle of things runs latest NTPv4 with
nine interfaces (!) reachable from here. Both ntpd and ntpdate use the
same NTP client mode packet format and the servers can't tell which
program sent them. I cranked up an NTPv4 ntpd client and aimed it at two
different interfaces on the server with the debug trace enabled. From
the trace I extracted the relevant transmit and receive packet events
which show the addresses in the transmit packet together with the
corresponding addresses in the receive packet.

transmit: at 3 128.4.2.9->140.173.4.66 mode 3
receive: at 3 128.4.2.9<-140.173.4.66 mode 4 code 1
transmit: at 4 128.4.2.9->140.173.6.81 mode 3
receive: at 4 128.4.2.9<-140.173.6.81 mode 4 code 1

I have to believe this particular experiment shows the server is doing
as I expect and returning the addresses given. I don't know what is
different in your experience, but you could do the same thing I did and
report.

Dave

Per Hedeland

Sep 8, 2001, 3:00:44 PM
In article <3B998AE1...@webmaster.com> David Schwartz

<dav...@webmaster.com> writes:
> Now here are the queries that didn't work:
>
> machine.querying.from.33189 > some.second.name.webchat.org.ntp:
> v4 client strat 0 poll 4 prec -6 (DF)
>
> some.first.name.some.second.name.com.ntp > machine.querying.from.33189:
> v4 server strat 3 poll 4 prec -17 (DF) [tos 0x10]
>
> This repeats four times. I keep querying 'some.second.name', but
>'some.first.name' keeps responding.

I don't know if this has changed in recent versions, but in the past
(x)ntpd wouldn't find out about interfaces configured after it had
started (and consequently never use their address as source address on
outgoing packets). Could this be the case for your setup? Is ntpd
listening on the some.second.name IP address?

--Per Hedeland
p...@bluetail.com

David Schwartz

Sep 8, 2001, 4:30:32 PM
Per Hedeland wrote:

> I don't know if this has changed in recent versions, but in the past
> (x)ntpd wouldn't find out about interfaces configured after it had
> started (and consequently never use their address as source address on
> outgoing packets). Could this be the case for your setup? Is ntpd
> listening on the some.second.name IP address?
>
> --Per Hedeland
> p...@bluetail.com

Another clue! All the interfaces were set up when the server started,
but the NTP server only bound to the primary!

udp        0      0 prim.ip.address:123     0.0.0.0:*
udp        0      0 127.0.0.1:123           0.0.0.0:*
udp        0      0 0.0.0.0:123             0.0.0.0:*

So there's probably some problem with the code that traverses the list
of interfaces for Linux.

DS

David Schwartz

Sep 8, 2001, 4:28:48 PM
"David L. Mills" wrote:

> transmit: at 3 128.4.2.9->140.173.4.66 mode 3
> receive: at 3 128.4.2.9<-140.173.4.66 mode 4 code 1
> transmit: at 4 128.4.2.9->140.173.6.81 mode 3
> receive: at 4 128.4.2.9<-140.173.6.81 mode 4 code 1

receive: at 17 0.0.0.0<-ip.of.querying.machine mode 3 code 2
transmit: at 17 primary.ip.of.server->ip.of.querying.machine mode 4

For some reason, NTP is not getting/reporting the address on which it
received the packet.

DS

David Schwartz

Sep 9, 2001, 8:23:43 PM

> So there's probably some problem with the code that traverses the list
> of interfaces for Linux.

    /* Exclude logical interfaces (indicated by ':' in the interface name) */
    if (debug)
        printf("interface <%s> ", ifr->ifr_name);
    if ((listen_to_virtual_ips == 0)
        && (strchr(ifr->ifr_name, (int)':') != NULL)) {
        if (debug)
            printf("ignored\n");
        continue;
    }
    if (debug)
        printf("OK\n");

Hmm, why are logical interfaces excluded?!

interface <lo> OK
interface <eth0> OK
interface <eth0:0> ignored
interface <eth0:1> ignored
interface <eth0:2> ignored

That's the problem. If there's a reason to exclude logical interfaces,
it sure doesn't apply on Linux.

DS
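
The filter in the quoted loop reduces to a single test on the interface name. A standalone sketch in C, with hypothetical function names (only `listen_to_virtual_ips` and the `strchr` test come from the quoted code):

```c
#include <string.h>

/* Linux names IP aliases "eth0:0", "eth0:1", ...; the colon marks them. */
int is_logical_interface(const char *ifname)
{
    return strchr(ifname, ':') != NULL;
}

/* Same decision as the quoted loop: skip aliases unless the
 * listen_to_virtual_ips flag is set. */
int should_bind(const char *ifname, int listen_to_virtual_ips)
{
    return listen_to_virtual_ips || !is_logical_interface(ifname);
}
```

With the flag at 0, "lo" and "eth0" are accepted and "eth0:0" through "eth0:2" are skipped, which is the debug output shown above.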

David Schwartz

Sep 9, 2001, 8:25:58 PM

Well I'll be, 'ntpd -L' fixed my problem. Why isn't that the default?

DS

David L. Mills

Sep 9, 2001, 10:26:38 PM
David,

Absolutely; the 0.0.0.0 indicates the packet landed on the wildcard
interface. The ntpd tries real hard to find the real interface the
packet landed on and this works gangbusters for every system but Linux.
So, the only thing ntpd can do is interchange the source address and
destination address from the wildcard interface and toss the packet back
over the fence. What you see is what you get. As I said, it all works as
intended in every other system we can test here, except it doesn't work
in Linux. See the findinterface() routine in ntp_io.c. If somebody can
fix that for Linux, we would all get well.

Dave
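
One way a Linux program can learn which local address a datagram was sent to, even when it arrives on the wildcard socket, is the Linux-specific IP_PKTINFO socket option together with recvmsg(). The sketch below shows that technique in C. It is an assumption about one possible approach, not a description of what findinterface() in ntp_io.c does, and the function names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct in_pktinfo on glibc */
#endif
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Ask the kernel to attach destination-address info to each datagram. */
int enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Receive one datagram and report the local address it was sent to.
 * Returns the payload length, or -1 on error or if no pktinfo arrived. */
ssize_t recv_with_dest(int fd, void *buf, size_t len,
                       struct sockaddr_in *from, struct in_addr *dest)
{
    union {                     /* union keeps the buffer aligned */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } control;
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = from;
    msg.msg_namelen = sizeof(*from);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control.buf;
    msg.msg_controllen = sizeof(control.buf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return -1;

    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_IP && c->cmsg_type == IP_PKTINFO) {
            struct in_pktinfo info;
            memcpy(&info, CMSG_DATA(c), sizeof(info));
            *dest = info.ipi_addr;  /* destination from the IP header */
            return n;
        }
    }
    return -1;
}
```

A server using this would still need to send the reply from that address, for example through a socket bound to it, so this covers only the receive half of the problem.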

David L. Mills

Sep 9, 2001, 10:38:47 PM
David,

Approximately 4,529 complaints have been filed about the miserable state
of the I/O code in ntp_io.c, some of which is well over a decade old.
About a third say the only way to fix it is to write a separate I/O
module for each operating system and each version and patchlevel of that
system; about a third say the only remedy is a complete rewrite; and the
remaining third say toss the entire system and rewrite in Java (don't
laugh).

Seriously, I would strike a medal for the first guy that wades in and
really fixes the I/O stuff. I've held it together with string and glue
for almost its entire life, but I'm no expert on the intricate dances
the various systems play. The real problem as I see it is dealing
properly with virtual interfaces and proper multicast behavior.
Volunteers always profoundly welcomed.

Dave

David L. Mills

Sep 9, 2001, 10:42:20 PM
to David Schwartz
David,

As I recall, this was a relatively recent patch contribution. I don't
know why the author excluded logical interfaces.

Dave

David L. Mills

Sep 9, 2001, 10:44:04 PM
to David Schwartz
David,

I don't know why this is the default. Maybe somebody actually reading
these exchanges might know.

Dave

Kees Hendrikse

Sep 10, 2001, 3:02:26 PM
On Mon, 10 Sep 2001 02:38:47, "David L. Mills" <mi...@udel.edu> wrote:

> Seriously, I would strike a medal for the first guy that wades in and
> really fixes the I/O stuff. I've held it together with string and glue
> for almost its entire life, but I'm no expert on the intricate dances
> the various systems play.

I think the problem is you'll find no single person out there that knows
enough about the various TCP/IP implementations.
About 2 years ago I examined the io stuff seriously, to see how easy/hard
it would be to make the IPs ntpd listens on configurable. That's already
quite hard to do. I probably could rewrite the IO routines for *BSD TCP/IP,
but I know zilch about the innards of Linux, Win, HP/UX, VMS etc. TCP/IP
implementations. Well, I do know they look dangerously alike.

I think you'll need to prioritize OS-support, have someone write a new
ntp_io.c from scratch for the "most important" OS, test and debug it
vigorously, then start adding support for other implementations. Until
support for a particular OS is added, it has to run with the old ntp_io.c.

> The real problem as I see it is dealing
> properly with virtual interfaces and proper multicast behavior.

You'll be needing IPv6 support as well, to complicate matters even more.

--
Kees Hendrikse                               | email: ke...@echelon.nl
                                             | web:   www.echelon.nl
ECHELON consultancy and software development | phone: +31 (0)53 48 36 585
PO Box 545, 7500AM Enschede, The Netherlands | fax:   +31 (0)53 43 36 222

David Schwartz

Sep 10, 2001, 11:28:51 PM

I suggest the following changes to NTP, which I'm willing to code.
Comments wanted:

1) NTP would be changed to, by default, bind to up to 16 virtual
interfaces.

2) The '-L' option would keep its current meaning, causing NTP to bind
to all virtual interfaces (raising the limit from 16 to infinity).

3) A '-B' option would be added that takes an integer parameter
specifying the number of virtual interfaces to bind to. '-B 0' would be
equivalent to the current default behavior.

This would solve the problem I reported. And I believe the default
configuration would handle every reasonable case I can think of.

DS
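
The three proposed rules amount to one selection pass over the interface list. A sketch in C under stated assumptions: `select_interfaces` and `BIND_ALL_VIRTUAL` are hypothetical names, '-B' exists only as a proposal in this thread, and an interface counts as virtual when its name contains ':' (the Linux alias convention).

```c
#include <stddef.h>
#include <string.h>

#define BIND_ALL_VIRTUAL (-1)   /* hypothetical: what '-L' would select */

/* Mark which interfaces would be bound. Physical interfaces are always
 * bound; at most max_virtual aliases are bound, in the order given.
 * selected[i] is set to 1 or 0; returns the number selected. */
size_t select_interfaces(const char *const names[], size_t count,
                         int max_virtual, int selected[])
{
    size_t chosen = 0;
    int virtual_seen = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        int is_virtual = strchr(names[i], ':') != NULL;

        if (!is_virtual) {
            selected[i] = 1;
        } else if (max_virtual == BIND_ALL_VIRTUAL
                   || virtual_seen < max_virtual) {
            selected[i] = 1;
            virtual_seen++;
        } else {
            selected[i] = 0;    /* over the limit: silently unbound */
        }
        chosen += (size_t)selected[i];
    }
    return chosen;
}
```

Passing 0 reproduces the 4.1.0 default on Linux, 16 the proposed default, and BIND_ALL_VIRTUAL the '-L' behavior. Note that which aliases fall over the limit depends on enumeration order.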

Matthias Andree

Sep 11, 2001, 12:16:51 AM
"David L. Mills" <mi...@udel.edu> writes:

> Seriously, I would strike a medal for the first guy that wades in and
> really fixes the I/O stuff. I've held it together with string and glue
> for almost its entire life, but I'm no expert on the intricate dances
> the various systems play. The real problem as I see it is dealing
> properly with virtual interfaces and proper multicast behavior.
> Volunteers always profoundly welcomed.

Well, be careful with virtual interfaces on Linux for now. What works on
FreeBSD 4.4-RC breaks on Linux (2.2 and 2.4). Workaround: On Linux, make
absolutely sure they use distinct names like eth0:0, eth0:1. SIOCGIFCONF
will happily return alias addresses for eth0 f. ex. as well, but
SIOCGIF* cannot look up the proper broadcast addresses and netmasks
unless the name of the interface is unique.

I filed compatibility patches against Linux 2.4.9, not sure if and when
they'll appear, however I don't think that these would solve e. g. the
broadcast receive problem on Linux that ntp 4.1.0 shows and that I
reported some time ago.

--
Matthias Andree
Outlook (Express) users: press Ctrl+F3 for the full source code of this post.
begin dont_click_this_virus.exe
end
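
For reference, the SIOCGIFCONF enumeration mentioned above looks roughly like this on Linux. The sketch is in C, assumes the fixed-size `struct ifreq` array that Linux returns (BSD systems use variable-length entries), and uses a hypothetical function name and an arbitrary cap of 64 entries.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct ifreq/ifconf on glibc */
#endif
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_IFS 64              /* arbitrary cap for this sketch */

/* Count the IPv4 entries reported by SIOCGIFCONF and, through
 * *with_colon, how many of the reported names contain ':'.
 * Returns the total, or -1 on error (with *with_colon untouched). */
int count_ipv4_entries(int *with_colon)
{
    struct ifreq reqs[MAX_IFS];
    struct ifconf conf;
    int fd, n, i;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(reqs, 0, sizeof(reqs));
    conf.ifc_len = sizeof(reqs);
    conf.ifc_req = reqs;
    if (ioctl(fd, SIOCGIFCONF, &conf) < 0) {
        close(fd);
        return -1;
    }
    close(fd);

    n = conf.ifc_len / (int)sizeof(struct ifreq);
    *with_colon = 0;
    for (i = 0; i < n; i++) {
        if (memchr(reqs[i].ifr_name, ':', IFNAMSIZ) != NULL)
            (*with_colon)++;
    }
    return n;
}
```

If an alias address is reported under the bare name "eth0", a follow-up SIOCGIF* lookup by name cannot tell it apart from the primary address, which is the ambiguity described above.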

Ulrich Windl

Sep 11, 2001, 7:31:08 AM
David Schwartz <dav...@webmaster.com> writes:

> I suggest the following changes to NTP, which I'm willing to code.
> Comments wanted:
>
> 1) NTP would be changed to, by default, bind to up to 16 virtual
> interfaces.
>
> 2) The '-L' option would keep its current meaning, causing NTP to bind
> to all virtual interfaces (raising the limit from 16 to infinity).
>
> 3) A '-B' option would be added that takes an integer parameter
> specifying the number of virtual interfaces to bind to. '-B 0' would be
> equivalent to the current default behavior.

Instead of "-B #" I'd explicitly list the addresses to bind to; either each on command line, or in the configuration. Maybe with a

restrict _ip_ _mask_ nobind

Shouldn't be that hard to implement.

>
> This would solve the problem I reported. And I believe the default
> configuration would handle every reasonable case I can think of.
>
> DS

Ulrich
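
The suggested address/mask form could be evaluated as in the C sketch below. Everything here is hypothetical: 'nobind' is only a proposal in this thread, not an existing ntpd keyword, and the struct and function names are made up for illustration. Only IPv4 is handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* One hypothetical "restrict <ip> <mask> nobind" entry. */
struct nobind_rule {
    struct in_addr net;
    struct in_addr mask;
};

/* Parse the textual address and mask; returns 0 on success. */
int parse_nobind_rule(struct nobind_rule *r, const char *ip, const char *mask)
{
    if (inet_pton(AF_INET, ip, &r->net) != 1)
        return -1;
    if (inet_pton(AF_INET, mask, &r->mask) != 1)
        return -1;
    r->net.s_addr &= r->mask.s_addr;    /* normalise to the network part */
    return 0;
}

/* Return 1 if local_ip falls under any nobind rule, else 0. With no
 * rules at all, every address is bound. */
int address_is_nobind(const struct nobind_rule *rules, int nrules,
                      const char *local_ip)
{
    struct in_addr a;
    int i;

    if (inet_pton(AF_INET, local_ip, &a) != 1)
        return 0;
    for (i = 0; i < nrules; i++) {
        if ((a.s_addr & rules[i].mask.s_addr) == rules[i].net.s_addr)
            return 1;
    }
    return 0;
}
```

The bitwise AND works directly on network-byte-order values because masking is applied byte by byte.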

David Schwartz

Sep 11, 2001, 4:59:27 PM
Ulrich Windl wrote:

> Instead of "-B #" I'd explicitly list the addresses to bind to; either each on command line, or in the configuration. Maybe with a
>
> restrict _ip_ _mask_ nobind
>
> Shouldn't be that hard to implement.

But what should the default behavior be? The thing I like about my
proposed solution is that the default behavior should be reasonable in
every case I can imagine and safe in all cases.

DS

David L. Mills

Sep 11, 2001, 8:31:18 PM
David,

I love this idea, since it solves lots more cases than just the command
line switches. I propose a default of no bind virtual and restrict * for
bind everything. If you default on, then it gets a little hard to disable
it if you need to.

Dave

Ulrich Windl

Sep 12, 2001, 4:13:44 AM
David Schwartz <dav...@webmaster.com> writes:

> Ulrich Windl wrote:
>
> > Instead of "-B #" I'd explicitly list the addresses to bind to; either each on command line, or in the configuration. Maybe with a
> >
> > restrict _ip_ _mask_ nobind
> >
> > Shouldn't be that hard to implement.
>
> But what should the default behavior be? The thing I like about my

The default would be (assuming restrict nobind) to bind each and
every interface.

> proposed solution is that the default behavior should be reasonable in
> every case I can imagine and safe in all cases.

Most people have very few interfaces and they don't care if ntpd
listens on them. Just as sendmail or named does...

Regards,
Ulrich

Per Hedeland

Sep 12, 2001, 5:23:33 AM
In article <snfd74w...@rrzc2.rz.uni-regensburg.de> Ulrich Windl

<wiu0...@rrzc2.rz.uni-regensburg.de> writes:
>David Schwartz <dav...@webmaster.com> writes:
>
>> Ulrich Windl wrote:
>>
>> > Instead of "-B #" I'd explicitly list the addresses to bind to;
>either each on command line, or in the configuration. Maybe with a
>> >
>> > restrict _ip_ _mask_ nobind

FWIW, I agree - being able to specify a number seems mostly useless, if
I'd want ntpd to bind to some but not all of my virtual interfaces, I'd
certainly want to specify *which* interfaces it should bind to. Plus of
course this functionality could be useful in scenarios with multiple
physical interfaces too (plus of course being able to specify the
interfaces/addresses is basically de facto standard for anything that has
the ability to bind to specific ones).

>> > Shouldn't be that hard to implement.

I won't comment on that.:-)

>> But what should the default behavior be? The thing I like about my
>
>The default would be (assuming restrict nobind) to bind each and
>every interface.

Yes, regardless of the mechanism, the default must certainly be to bind
to none or all (actually I'd probably extend that to cover non-virtual
interfaces, making "all" the only reasonable default). A hardcoded magic
number seems like a Really Bad Idea, programmers' value judgements about
what is "reasonable" tend to come back and haunt them ("but it would be
unreasonable for the user to enter more than 255 characters in response
to this question"... - not implying any similar consequences in this case
of course, the point is that you can never guess what people will try to
do with your program if it's wide-spread and long-lived enough - and
ntpd certainly is).

David, you just went through some pain to figure out that your brand-new
ntpd version doesn't bind to any virtual interfaces on Linux. Imagine if
you'd had the 16 limit and had been running that version happily for
years (well "a long time" anyway:-), and suddenly it doesn't work as
expected anymore - because you just added your 17th virtual interface
(and perhaps even restarted the system with the result that the 17th
interface worked fine, but one of the old ones didn't anymore...).

--Per Hedeland
p...@bluetail.com

David L. Mills

Sep 12, 2001, 10:19:14 AM
to Per Hedeland
Per,

I've had several requests from folks who want to bind only to specific
physical interfaces, such as those on one side of the firewall or the
other. I've used a virtual interface to tunnel a different network over
a local wire and did in fact need that capability myself. I would not
want the default behavior to bind all virtual interfaces.

Dave

Per Hedeland

Sep 12, 2001, 1:57:16 PM
In article <3B9F6EE2...@udel.edu> "David L. Mills"

<mi...@udel.edu> writes:
>
>I've had several requests from folks who want to bind only to specific
>physical interfaces, such as those on one side of the firewall or the
>other. I've used a virtual interface to tunnel a different network over
>a local wire and did in fact need that capability myself.

Yes... I don't think I wrote anything that suggested that this wouldn't
be the case, quite the opposite in fact. And functionality along the
lines of what Ulrich suggested seems to be a good fit for that (as I
wrote).

> I would not
>want the default behavior to bind all virtual interfaces.

Well, the question I think is what exactly is the logic of a default
behaviour of "bind to all physical interfaces and no virtual
interfaces". We've already seen one quite knowledgeable user being
confused by it (and you yourself unable to diagnose the cause of his
confusion), I suspect there will be more. And I'm not aware of any other
major software that has such a behaviour as default.

Also, in the case where you for some reason find yourself needing to
have a host with multiple IP addresses on a single network segment, the
choice of which one of them is "physical" and which are "virtual" can be
rather arbitrary - and you won't generally expect the "virtual" ones to
be somehow "inferior".

Finally, the case that I assume (at least ISTR reading about it here
quite a while ago) motivated this change, people having virtually tons
of virtual interfaces that turns into a problem with ntpd needing to
have one socket for each and every one of them is, relatively speaking,
rare. I'd say that expecting those people to do special configuration to
deal with it is quite reasonable (they presumably have to do it for at
least BIND too).

But my main point was that a default behaviour of "bind to all physical
interfaces and at most 16 virtual interfaces" is worse.:-)

--Per Hedeland
p...@bluetail.com


David L. Mills

Sep 12, 2001, 4:32:20 PM
to Per Hedeland
Per,

You assume too much. I don't have the understanding and experience of
others such as you in these matters. Why don't you (and others here)
craft a work statement how the thing should work and I'll try to find a
volunteer to implement it. The ground rules are that it be implemented
using the restrict mechanism, presumably by defining new types, as many
as needed.

Dave

David Schwartz

Sep 12, 2001, 4:43:32 PM
Per Hedeland wrote:

> FWIW, I agree - being able to specify a number seems mostly useless, if
> I'd want ntpd to bind to some but not all of my virtual interfaces, I'd
> certainly want to specify *which* interfaces it should bind to.

Odds are, you'd want it to bind only to the primary interface(s). The
idea is to come up with a default configuration that provides sensible
behavior on all reasonable configurations. Binding to a limited number
of virtual interfaces does this.

> Plus of
> course this functionality could be useful in scenarios with multiple
> physical interfaces too (plus of course being able to specify the
> interfaces/addresses is basically de facto standard for anything that has
> the ability to bind to specific ones).

The issue of how to obtain fine control is different from the issue of
how to create a sensible default that works on most configurations and
fails spectacularly on no reasonable configurations.



> >The default would be (assuming restrict nobind) to bind each and
> >every interface.

That will fail on configurations where you have more than 256 virtual
interfaces and you have a per-process limit of 256 fds. Now that's not
common, but IMO it's common enough that it shouldn't cause NTP to fail
in its default configuration.



> Yes, regardless of the mechanism, the default must certainly be to bind
> to none or all (actually I'd probably extend that to cover non-virtual
> interfaces, making "all" the only reasonable default).

None breaks on any machine with one virtual interface. All breaks on
any machine with many. There are possible defaults that break on no
configurations.

> A hardcoded magic
> number seems like a Really Bad Idea, programmers' value judgements about
> what is "reasonable" tend to come back and haunt them ("but it would be
> unreasonable for the user to enter more than 255 characters in response
> to this question"... - not implying any similar consequences in this case
> of course, the point is that you can never guess what people will try to
> do with your program if it's wide-spread and long-lived enough - and
> ntpd certainly is).

Well that's what being able to change the default is for! Obviously the
default won't work right in every bizarre case you can possibly imagine.
How can you do better than to pick a default that you believe works
reasonably in all cases?!

> David, you just went through some pain to figure out that your brand-new
> ntpd version doesn't bind to any virtual interfaces on Linux. Imagine if
> you'd had the 16 limit and had been running that version happily for
> years (well "a long time" anyway:-), and suddenly it doesn't work as
> expected anymore - because you just added your 17th virtual interface
> (and perhaps even restarted the system with the result that the 17th
> interface worked fine, but one of the old ones didn't anymore...).

I'd gladly trade that for it not horribly breaking in reasonable
configurations. I'd gladly trade that for consistent defaults across
platforms. I'd gladly trade that for documented behavior.

DS
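
The descriptor-limit concern (one socket per interface against a per-process limit of 256) can be checked before binding. A C sketch with hypothetical function names; the `reserved` parameter is an assumed allowance for descriptors the daemon needs for other purposes.

```c
#include <sys/resource.h>

/* Pure check: do `wanted` sockets fit under `limit` descriptors while
 * leaving `reserved` free for log files, the drift file and the like? */
int sockets_fit(unsigned long wanted, unsigned long limit,
                unsigned long reserved)
{
    if (limit <= reserved)
        return 0;
    return wanted <= limit - reserved;
}

/* Same check against this process's current soft RLIMIT_NOFILE. */
int sockets_fit_in_process(unsigned long wanted, unsigned long reserved)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;                   /* cannot tell: assume they do not */
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    return sockets_fit(wanted, (unsigned long)rl.rlim_cur, reserved);
}
```

With a limit of 256 and 8 descriptors reserved, 257 interface sockets do not fit while 16 do, which is the case being argued here.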

David Schwartz

Sep 12, 2001, 4:47:48 PM
Per Hedeland wrote:

> Well, the question I think is what exactly is the logic of a default
> behaviour of "bind to all physical interfaces and no virtual
> interfaces". We've already seen one quite knowledgeable user being
> confused by it (and you yourself unable to diagnose the cause of his
> confusion), I suspect there will be more. And I'm not aware of any other
> major software that has such a behaviour as default.

Well this isn't the default for NTP either. It's the *intended* default
for NTP. However the code that implements this default is broken and
only functions as designed (and it's designed to break things) on Linux.



> Finally, the case that I assume (at least ISTR reading about it here
> quite a while ago) motivated this change, people having virtually tons
> of virtual interfaces that turns into a problem with ntpd needing to
> have one socket for each and every one of them is, relatively speaking,
> rare. I'd say that expecting those people to do special configuration to
> deal with it is quite reasonable (they presumably have to do it for at
> least BIND too).
>
> But my main point was that a default behaviour of "bind to all physical
> interfaces and at most 16 virtual interfaces" is worse.:-)

The problem is, I'm hearing that only very minor changes would be
reasonable for 4.1. So any brilliant plan for massive configuration
improvements won't help in the code stream. That code stream is in
widespread use, and is broken on Linux. So there is a need for a
solution that breaks nothing and fixes the common cases.

How about this, on Linux, bind to the first 16 virtual interfaces by
default instead of none. Keep '-L' to allow binding to all. Add an
option to allow binding to none. Leave the defaults the same on all
other platforms.

Long term grand plans are great. But the problem is, the current code
is broken and needs to be fixed. A quick fix is possible. Yes, it won't
be perfect, but the current code isn't perfect, in fact, it's horribly
broken.

DS

Per Hedeland

Sep 13, 2001, 6:54:39 PM
In article <3B9FC654...@udel.edu> "David L. Mills"
<mi...@udel.edu> writes:
>
>You assume too much.

Well, hopefully it isn't too much to assume that there was *some* reason
for this (intended) departure from the traditional default behaviour of
(x)ntpd in this respect. My assumption (singular) seemed pretty
reasonable to me, but of course I could be wrong - the ChangeLog and
NEWS files in the distribution don't give any real clues in the matter.

> I don't have the understanding and experience of
>others such as you in these matters. Why don't you (and others here)
>craft a work statement on how the thing should work and I'll try to find a
>volunteer to implement it. The ground rules are that it be implemented
>using the restrict mechanism, presumably by defining new types, as many
>as needed.

I think Ulrich's suggestion is fine.

--Per Hedeland
p...@bluetail.com

Per Hedeland

Sep 13, 2001, 7:02:44 PM9/13/01
to
In article <3B9FC9F4...@webmaster.com> David Schwartz
<dav...@webmaster.com> writes:

> Well this isn't the default for NTP either. It's the *intended* default
>for NTP. However the code that implements this default is broken and
>only functions as designed (and it's designed to break things) on Linux.

> The problem is, I'm hearing that only very minor changes would be
>reasonable for 4.1. So any brilliant plan for massive configuration
>improvements won't help in the code stream. That code stream is in
>widespread use, and is broken on Linux. So there is a need for a
>solution that breaks nothing and fixes the common cases.

From this (assuming you are correct, and I have no reason to doubt
that), it seems obvious to me that the only reasonable "quick fix" is to
change the code to work the same on Linux as on all other platforms, the
same as (x)ntpd traditionally has worked, the same as other widespread
software does.
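
To make "the same as (x)ntpd traditionally has worked" concrete: the
traditional approach is one UDP socket per local address, with each reply
sent on the socket the request arrived on, so the source address always
matches. Below is a minimal Python sketch of the idea, not the ntpd
implementation; bind_per_address and reply_on_same_socket are hypothetical
names, and the port is a parameter because binding the real NTP port (123)
needs privileges.

```python
import socket


def bind_per_address(addresses, port):
    """Open one UDP socket per local IPv4 address (hypothetical helper).

    Returns a dict mapping each address to its bound socket. If any bind
    fails, the sockets opened so far are closed and the error is re-raised.
    """
    socks = {}
    try:
        for addr in addresses:
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            socks[addr] = s
            s.bind((addr, port))
    except OSError:
        for s in socks.values():
            s.close()
        raise
    return socks


def reply_on_same_socket(sock, bufsize=512):
    """Receive one datagram and answer it on the same socket.

    Because the socket is bound to a specific address, the reply's source
    address is the address the request was sent to.
    """
    data, peer = sock.recvfrom(bufsize)
    sock.sendto(data, peer)  # an echo stands in for a real NTP response
    return peer
```

The cost is the one discussed in this thread: one socket per address, which
adds up on hosts with very many virtual interfaces.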

--Per Hedeland
p...@bluetail.com

Per Hedeland

Sep 13, 2001, 7:16:25 PM9/13/01
to
In article <3B9FC8F4...@webmaster.com> David Schwartz
<dav...@webmaster.com> writes:
> Well that's what being able to change the default is for! Obviously the
>default won't work right in every bizarre case you can possibly imagine.
>How can you do better than to pick a default that you believe works
>reasonably in all cases?!

I think we should just agree to disagree on this. My definite opinion is
that a default that follows the "principle of least surprise" and is
logically sound - which, to me, definitely implies that it doesn't have
hidden "magic numbers" just because someone at some point in time
decided that precisely those numbers were the "reasonable" ones - is
well worth the price of total non-function *of the default* in certain
unusual cases. (A price which I consider far lower than that of partial
non-function occurring at incremental modification of the system
environment, btw.) And if it's good enough for BIND, I certainly think
it's good enough for ntpd.

--Per Hedeland
p...@bluetail.com

David Schwartz

Sep 13, 2001, 7:53:22 PM9/13/01
to
Per Hedeland wrote:

> I think we should just agree to disagree on this. My definite opinion is
> that a default that follows the "principle of least surprise" and is
> logically sound - which, to me, definitely implies that it doesn't have
> hidden "magic numbers" just because someone at some point in time
> decided that precisely those numbers were the "reasonable" ones - is
> well worth the price of total non-function *of the default* in certain
> unusual cases. (A price which I consider far lower than that of partial
> non-function occurring at incremental modification of the system
> environment, btw.) And if it's good enough for BIND, I certainly think
> it's good enough for ntpd.

My only issue with this is that it will cause configurations that
currently work to fail. I've been given the impression that this is not
acceptable in the 4.1 time frame.

DS

David L. Mills

Sep 14, 2001, 1:25:12 AM9/14/01
to Per Hedeland
Per,

The current, probably defective, code was the result of a patch
submitted by somebody else with more knowledge than me. Perhaps I was
flagrantly incompetent in agreeing to incorporate it, but the only
solution as I see it is for somebody such as you to submit a definitive
patch that gets everybody well.

Dave

David Schwartz

Sep 14, 2001, 7:42:09 PM9/14/01
to
David Schwartz wrote:

> Well this isn't the default for NTP either. It's the *intended* default
> for NTP. However the code that implements this default is broken and
> only functions as designed (and it's designed to break things) on Linux.

This statement was in error. This change affects Solaris as well, as
Solaris names its virtual interfaces with a ':' also.

DS

Per Hedeland

Sep 15, 2001, 2:33:33 PM9/15/01
to
In article <3BA194B8...@udel.edu> "David L. Mills"
<mi...@udel.edu> writes:
>
>The current, probably defective, code was the result of a patch
>submitted by somebody else with more knowledge than me. Perhaps I was
>flagrantly incompetent in agreeing to incorporate it, but the only
>solution as I see it is for somebody such as you to submit a definitive
>patch that gets everybody well.

Chances are that 'patch -R' will fix the problem.

--Per Hedeland
p...@bluetail.com
