
HELP! Resolving Problems

Jonathan de Boyne Pollard

Jun 30, 2003, 1:23:23 PM

SW> I think it may be failing because queries for
SW> ns1.nameserver.ch to the ascio.com nameservers
SW> return no authority records [...]

The "nameserver.ch." content DNS servers don't add the "NS" resource record
set for "nameserver.ch." to all of their responses, true, but this is
relatively benign. It merely means that the "ch." content DNS servers will
have to be queried afresh every 12 hours.

There is certainly a failure here (albeit that it's possibly not the one that
David Meier is experiencing - he hasn't given us enough information to
determine this). However, it is not related to the "nameserver.ch." content
DNS servers.

SW> klaeui.com has a complex delegation which might also
SW> be causing problems

Actually, what is happening with "klaeui.com." involves just two queries, one
to the root servers and one to the UltraDNS servers. The complex delegation
doesn't actually enter into it at all.

The failure here (with both "klaeui.com." and "pwr.ag.") is a combination of
an over-optimisation in some versions of Sendmail, the fact that BIND passes
through results with the AA bit set to 1 the first time around, and an
outright error in the way that the UltraDNS servers work. Its aetiology is as
follows:

1. To avoid a BIND version 4 bug in the resolution of "CNAME" queries, most
SMTP Client softwares perform an "*" query rather than a "CNAME" query when
canonicalising the domain part of an envelope recipient mailbox, and then
select from the result those resource records of the type that they actually
wanted. Effectively, a "CNAME" lookup is disguised as a "*" lookup.

However, Sendmail in particular (and, indeed, only certain versions of
Sendmail) _also_ does this in place of issuing explicit "MX" and "A" queries
(even though the BIND version 4 bug does not affect "MX" or "A" queries, only
"CNAME" queries) if it thinks that the response to the first "*" query wasn't
cached along the way. It assumes that "any" means "all" in such
circumstances.

(Not all versions of Sendmail try to optimise their DNS query traffic by doing
this. Also, other MTS softwares, such as "qmail", do not do this. And it's
certainly debatable whether this is a reasonable optimisation to be doing in
the first place, given that best practice is for an SMTP Client to have a
local caching proxy DNS server anyway.)

2. The process of query resolution for an "*" query for "pwr.ag." stops at
the "ag." content DNS servers. In response to other types of queries, the
"ag." content DNS servers return a partial answer comprising a referral for
"pwr.ag.", as expected.

[204.74.112.1:0035] -> [0.0.0.0:0000] 73
Header: 0002 1+0+2+0, R, , query, no_error
Question: pwr.ag. IN A
Authority: pwr.ag. IN NS 86400 ns2.namecenter.ch.
Authority: pwr.ag. IN NS 86400 ns1.namecenter.ch.

However, in response to an "*" query they instead return a complete answer
(i.e. one that doesn't end in a referral) - but one where the relevant
resource record sets ("MX" and "A") are erroneously empty.

[204.74.112.1:0035] -> [0.0.0.0:0000] 123
Header: 0002 1+2+2+0, R, AUTH, query, no_error
Question: pwr.ag. IN *
Answer: pwr.ag. IN NS 86400 ns2.namecenter.ch.
Answer: pwr.ag. IN NS 86400 ns1.namecenter.ch.
Authority: ag. IN NS 86400 TLD2.ULTRADNS.NET.
Authority: ag. IN NS 86400 TLD1.ULTRADNS.NET.

Given that the "*" query will be the first one made, it will be unlikely that
any "pwr.ag." delegation information is already cached. So query resolution
will not reach the "pwr.ag." content DNS servers at all, and will instead stop
at the "ag." content DNS servers when they return that complete answer.
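
For readers who want to check the two header lines in the dumps above against
the wire format, here is a minimal Python sketch that decodes the fixed
12-byte DNS header. It assumes that "1+2+2+0" in the dump is the four section
counts in question+answer+authority+additional order and that "AUTH" marks the
AA bit; the names `DnsHeader` and `parse_header` are made up for illustration,
and the two sample headers are rebuilt by hand from the dumps rather than
captured from the wire.

```python
import struct
from typing import NamedTuple


class DnsHeader(NamedTuple):
    ident: int
    is_response: bool    # QR bit
    authoritative: bool  # AA bit
    rcode: int
    qdcount: int         # questions
    ancount: int         # answer records
    nscount: int         # authority records
    arcount: int         # additional records


def parse_header(packet: bytes) -> DnsHeader:
    """Decode the 12-byte header at the start of a DNS message."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", packet[:12])
    return DnsHeader(
        ident=ident,
        is_response=bool(flags & 0x8000),
        authoritative=bool(flags & 0x0400),
        rcode=flags & 0x000F,
        qdcount=qd, ancount=an, nscount=ns, arcount=ar,
    )


# Hand-built headers matching the two dumps (ID 0002, no_error).
REFERRAL = struct.pack("!6H", 0x0002, 0x8000, 1, 0, 2, 0)  # reply to the "A" query
COMPLETE = struct.pack("!6H", 0x0002, 0x8400, 1, 2, 2, 0)  # reply to the "*" query

if __name__ == "__main__":
    print(parse_header(REFERRAL))
    print(parse_header(COMPLETE))
```

The only differences between the two are the AA bit and the answer count,
which is exactly what a resolving proxy uses to decide whether to keep
following delegations.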

There's a strong argument that the "ag." content DNS servers (run by UltraDNS)
are wrong here. Certainly, a complete answer is not the response that would
be generated by following the algorithm in RFC 1034 section 4.3.2.
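
As a rough illustration of that argument, the following Python sketch models a
content DNS server that checks for a zone cut before it looks at the query
type, so that a "*" query for a delegated name gets a referral just as an "A"
query does. It is a deliberately simplified reading of the algorithm (no
CNAMEs, wildcards, glue, or case folding), not UltraDNS's or anyone else's
implementation; `answer_query` and `at_or_below` are hypothetical names, and
the zone contents are just the records visible in the dumps above.

```python
from typing import List, NamedTuple, Tuple

Record = Tuple[str, str, str]  # (owner name, type, data); TTLs omitted


class Response(NamedTuple):
    aa: bool                 # authoritative-answer bit
    answer: List[Record]
    authority: List[Record]


# The "ag." zone, as far as the dumps reveal it.
AG_ZONE: List[Record] = [
    ("ag.", "NS", "TLD1.ULTRADNS.NET."),
    ("ag.", "NS", "TLD2.ULTRADNS.NET."),
    ("pwr.ag.", "NS", "ns1.namecenter.ch."),
    ("pwr.ag.", "NS", "ns2.namecenter.ch."),
]


def at_or_below(name: str, ancestor: str) -> bool:
    """True if name equals ancestor or is a subdomain of it."""
    return name == ancestor or name.endswith("." + ancestor)


def answer_query(apex: str, zone: List[Record], qname: str, qtype: str) -> Response:
    # Zone cuts are the names, other than the apex, that own NS records.
    cuts = {owner for owner, rtype, _ in zone if rtype == "NS" and owner != apex}
    for cut in cuts:
        if at_or_below(qname, cut):
            # Leaving authoritative data: hand out a referral whatever qtype is.
            delegation = [r for r in zone if r[0] == cut and r[1] == "NS"]
            return Response(aa=False, answer=[], authority=delegation)
    # Otherwise answer from authoritative data; "*" matches every type.
    matches = [r for r in zone if r[0] == qname and (qtype == "*" or r[1] == qtype)]
    return Response(aa=True, answer=matches, authority=[])


if __name__ == "__main__":
    print(answer_query("ag.", AG_ZONE, "pwr.ag.", "*"))
    print(answer_query("ag.", AG_ZONE, "pwr.ag.", "A"))
```

Under this model the "*" and "A" queries for "pwr.ag." produce the same
referral, with the AA bit clear and an empty answer section.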

3. If this is the first time that the "*" query was made, as it will be in
these particular circumstances, BIND passes through the response leaving the
AA bit set to 1. Sendmail takes this to mean that the response wasn't cached
along the way, and so re-uses the response when performing the "MX" and "A"
lookups, filtering it for resource records of the desired types, instead of
making further queries.

4. The assumptions that Sendmail is making thus break. It is assuming that if
the AA bit is set to 1 in the response, "any" will have really meant "all".
But that's not true in these circumstances.

When Sendmail filters the result of the "*" query looking for "MX" and "A"
resource record sets it finds no resource records of those types. Its
disguised "MX" and "A" lookups thus return empty resource record sets, and it
thus complains that it cannot transport mail addressed to "pwr.ag." and
"klaeui.com." mailboxes.

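The four steps above can be condensed into a small Python model. This is a
sketch of the behaviour as described in this post, not Sendmail's source code:
`lookup` and `fake_explicit_query` are hypothetical names, and the "MX" data
returned by the stand-in resolver is invented purely so that the two code
paths give visibly different results.

```python
from typing import Callable, List, NamedTuple, Tuple

Record = Tuple[str, str, str]  # (owner name, type, data)


class Response(NamedTuple):
    aa: bool                 # authoritative-answer bit
    answer: List[Record]


def lookup(qname: str, qtype: str, any_response: Response,
           explicit_query: Callable[[str, str], Response]) -> List[Record]:
    """Model of the optimisation: reuse the "*" response when AA is 1."""
    if any_response.aa:
        # AA=1 is taken to mean "not cached along the way, hence complete".
        return [r for r in any_response.answer if r[0] == qname and r[1] == qtype]
    # AA=0: no such assumption, so ask for the wanted type explicitly.
    return explicit_query(qname, qtype).answer


# The answer section of the "*" response from the "ag." servers, passed
# through by the proxy with the AA bit still set to 1.
ANY_FROM_AG = Response(aa=True, answer=[
    ("pwr.ag.", "NS", "ns2.namecenter.ch."),
    ("pwr.ag.", "NS", "ns1.namecenter.ch."),
])


def fake_explicit_query(qname: str, qtype: str) -> Response:
    # Stand-in for a real resolver; the MX record here is made up.
    if qtype == "MX":
        return Response(aa=False, answer=[(qname, "MX", "10 mail.example.")])
    return Response(aa=False, answer=[])


if __name__ == "__main__":
    # With AA=1 the "MX" lookup is answered from the "*" response: nothing found.
    print(lookup("pwr.ag.", "MX", ANY_FROM_AG, fake_explicit_query))
    # With AA=0 an explicit "MX" query is made instead.
    print(lookup("pwr.ag.", "MX", ANY_FROM_AG._replace(aa=False), fake_explicit_query))
```
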
Note that, as mentioned, this problem relies upon a set of subtle interactions
between a specific combination of softwares. Change any one of them and the
problem goes away:

* Change Sendmail to some other MTA (or change to an appropriate version of
Sendmail), and the assumption that a "passed through" response (i.e. with the
AA bit set to 1) to an "any" query actually contains "all" records goes away.
Other MTAs instead explicitly issue "MX" and "A" queries, which will be
properly resolved because the UltraDNS servers correctly return referrals in
response to "MX" and "A" queries.

* Change BIND to some other proxy DNS server software, and "passed through"
responses from the proxy DNS server go away. ("dnscache" always sets the AA
bit to 0, for example.) Sendmail will thus never assume that "any" has in
fact meant "all", and will thus explicitly issue "MX" and "A" queries rather
than using the result from an "*" query.

* Fix the broken UltraDNS servers so that they always hand out referrals when
appropriate, _even when_ the query type is "*", and query resolution always
ends by asking the "pwr.ag." content DNS servers themselves. They _do_ return
the "MX" and "A" resource record sets in the response to an "*" query, and so
Sendmail's optimisation will happen to work.
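
The second of those changes hinges on a single header bit. As a hedged
illustration only (this is not code from "dnscache" or BIND, and `clear_aa` is
a made-up name), clearing the AA bit in a response before handing it to the
client looks like this in Python:

```python
import struct

AA_BIT = 0x0400  # authoritative-answer flag in the 16-bit DNS flags word


def clear_aa(packet: bytes) -> bytes:
    """Return a copy of a DNS message with the AA bit set to 0."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    (flags,) = struct.unpack("!H", packet[2:4])
    # Only the flags word (bytes 2 and 3) changes; everything else is copied.
    return packet[:2] + struct.pack("!H", flags & ~AA_BIT & 0xFFFF) + packet[4:]


if __name__ == "__main__":
    # Header of the "*" response above: ID 0002, QR and AA set, counts 1+2+2+0.
    original = struct.pack("!6H", 0x0002, 0x8400, 1, 2, 2, 0)
    print(struct.unpack("!6H", clear_aa(original)))
```

A client that keys its "any means all" assumption on the AA bit will then
never make that assumption for responses relayed this way.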

Jonathan de Boyne Pollard

Jun 30, 2003, 1:34:05 PM

SW> I don't know of any tools that point out over complicated DNS
SW> delegation, but maybe the folks at menandmice.com, or Dan
SW> (http://cr.yp.to/), have something suitable?

Dan Bernstein's "dnstrace" will show all of the possible paths that may be
followed to resolve a query, but one has to sit down, read, and decode the
output, the format of which isn't actually documented.

<URL:http://cr.yp.to/djbdns/debugging.html>

Moreover, one has to come up with one's own definition of what "over
complicated" is. However, the actual _amount_ of output is, of course, a
rough guide to the amount of gluelessness involved.

Mark_A...@isc.org

Jun 30, 2003, 7:54:13 PM

To be precise, the "bug" was independent of query type. If
there was an error loading the zone, named would return
SERVFAIL for negative answers. "*" queries just returned
what was available, so unless you had a bad domain you wouldn't
get a negative answer.

Most mail domains have A and/or MX records. Very few had
CNAMEs. The problem was described as a problem with CNAMEs.

Note that there was never a need for sendmail to issue the CNAME
query. Standard DNS processing would have returned the
CNAMEs, if they existed, in response to MX/A queries.

Nor is that required by RFC 2308.

Upgrade to BIND 9 or the next BIND 8 release (aa was incorrectly
being preserved).

> * Fix the broken UltraDNS servers so that they always hand out referrals when
> appropriate, _even when_ the query type is "*", and query resolution always
> ends by asking the "pwr.ag." content DNS servers themselves. They _do_ return
> the "MX" and "A" resource record sets in the response to an "*" query, and so
> Sendmail's optimisation will happen to work.

--
Mark Andrews, Internet Software Consortium
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: Mark.A...@isc.org

Simon Waters

Jun 30, 2003, 8:02:22 PM

Jonathan de Boyne Pollard wrote:
> SW> I think it may be failing because queries for
> SW> ns1.nameserver.ch to the ascio.com nameservers
> SW> return no authority records [...]
>
> The "nameserver.ch." content DNS servers don't add the "NS" resource record
> set for "nameserver.ch." to all of their responses, true, but this is
> relatively benign. It merely means that the "ch." content DNS servers will
> have to be queried afresh every 12 hours.

I think it is more catastrophic for some versions of BIND 9, which
assume if a nameserver says "it is authoritative and no nameservers"
exist for a domain, then it is believed, despite the obvious contradiction.

> he hasn't given us enough information to
> determine this). However, it is not related to the "nameserver.ch." content
> DNS servers.

I'm sure he has supplied enough information: he gave us his recursive
server IP, which can be seen to know that pwr.ag is served by
ns[12].namecenter.ch, but if you ask it for the IP address of
ns1.namecenter.ch it gives SERVFAIL. I'm pretty sure that brings us back
to the answer I gave before: the question "what is the IP of
ns1.namecenter.ch" gets a corrupt answer.

> Given that the "*" query will be the first one made, it will be unlikely that
> any "pwr.ag." delegation information is already cached.

Urm, it is cached (just query the server), but it is incomplete.


Mark_A...@isc.org

Jun 30, 2003, 8:39:05 PM

> Jonathan de Boyne Pollard wrote:
> > SW> I think it may be failing because queries for
> > SW> ns1.nameserver.ch to the ascio.com nameservers
> > SW> return no authority records [...]
> >
> > The "nameserver.ch." content DNS servers don't add the "NS" resource record
> > set for "nameserver.ch." to all of their responses, true, but this is
> > relatively benign. It merely means that the "ch." content DNS servers will
> > have to be queried afresh every 12 hours.
>
> I think it is more catastrophic for some versions of BIND 9, which
> assume if a nameserver says "it is authoritative and no nameservers"
> exist for a domain, then it is believed, despite the obvious contradiction.

No version of named depends upon the authoritative servers for the
zone returning NS records for the zone in the authority section. They
will be cached, used and preferred if returned.

You may be confusing this with named not querying servers for which
it has received an NXDOMAIN. This happens when *only* glue
address records are added and not the real records the glue records
are supposed to be copies of.

Prior to adding IPv6 support this sort of error was not highly visible.
Named looks for missing glue, and as the parent has not returned glue
AAAA records (as they don't exist), it queries for them. The NXDOMAIN
response is cached and the nameserver is not tried until the cache expires.

Usually the first query to the zone succeeds and subsequent ones fail.

