
DNS over TCP - a fix for Kaminsky's hack


James Taylor

Aug 10, 2008, 12:27:13 AM
During Dan Kaminsky's presentation of the DNS cache poisoning problem at
Black Hat 2008 recently, someone asked him whether switching to DNS over
TCP would work as a mitigation strategy. Dan said that DNS servers
already run so close to maximum capacity on UDP alone that the system
simply wouldn't cope with the extra load of TCP.

However, I find it hard to believe that all DNS resolvers are so heavily
loaded, and I suspect that only the major ISPs would need to add
capacity, and even then not so much extra capacity, for this to work.
Surely, corporate internal DNS resolvers could be switched to only TCP
right now with considerable benefit?

I'm quite certain Dan, Paul Vixie and the others understand this a lot
better than me, so I wonder if someone here could explain to me why DNS
over TCP would be so impractical. Thanks.

--
James Taylor

Skybuck Flying

Aug 10, 2008, 3:28:04 AM
Could you explain DNS cache poisoning as simply as possible to somebody who
doesn't know much about DNS?

What is the problem ?

Bye,
Skybuck.


James Taylor

Aug 10, 2008, 8:29:17 AM
Skybuck Flying <Blood...@hotmail.com> wrote:

> Could you explain DNS cache poisoning as simply as possible to somebody
> who doesn't know much about DNS?

Well, I'm no expert either, but my understanding is as follows:

There are two types of DNS servers: authoritative and resolvers (oh and
forwarders that simply relay queries to a resolver). Authoritative DNS
servers actually have the answers that are looked up; resolvers do the
looking up and nearly always cache the results so they can answer you
quicker next time. Only caching resolvers are affected by cache
poisoning. The idea is that an attacker tricks the resolver into caching
a false answer, so that anyone relying on that resolver will be directed
to a host they didn't intend, probably one controlled by a phisher or
malware distributor.

If your DNS resolver is poisoned, you might go to a well known site such
as Hotmail, PayPal, or Amazon, but actually be taken to an evil twin
site or be connecting through a malicious proxy, and because the URL in
your browser is correct you'd be fooled into thinking you were safe when
you're not.

> What is the problem ?

It used to be relatively difficult to trick a resolver into caching
false information, but Kaminsky's discovery has shown that it can be
done reliably in a matter of seconds, making the problem suddenly rather
serious. And with exploit code readily available there are already
attacks going on out on the 'net.

The attack relies on guessing the transaction ID and source port of the
query the resolver makes. The transaction IDs are random, but in many
cases the source ports were not, which made the attack easy. As a first
mitigation step, the DNS software of many vendors has been patched to
randomise the source ports. This should hold off the worst attacks until
a more permanent fix for DNS can be dreamt up and implemented.
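
To make that concrete, here's a rough sketch of what the patched behaviour
amounts to (illustrative Python, not any vendor's actual code; the helper
name is made up):

import secrets
import socket

def open_query_socket():
    # Bind the query socket to an unpredictable ephemeral port so an
    # off-path attacker must guess the port as well as the 16-bit ID.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    while True:
        port = 1024 + secrets.randbelow(65536 - 1024)
        try:
            sock.bind(("0.0.0.0", port))
            return sock
        except OSError:
            pass  # port already taken; try another

txid = secrets.randbits(16)  # random transaction ID for the query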

Some people believe DNSSec is the way to go, but it is complex to set up
and hasn't really made any progress towards mass adoption in the last
ten years. (I'd be interested to hear what people think about DNSSec as
a solution.)

DNS servers already support TCP so using TCP in place of UDP would be a
quick and dirty solution. However, Dan Kaminsky ruled it out for
performance reasons, and that's what I'd like someone to help me
understand. This is the TCP/IP experts group after all. Anyone?

--
James Taylor

Vernon Schryver

Aug 10, 2008, 9:18:54 AM
In article <1ilgffc.nf1dvkpxtxkiN%use...@oakseed.demon.co.uk.invalid>,
James Taylor <use...@oakseed.demon.co.uk.invalid> wrote:

>During Dan Kaminsky's presentation of the DNS cache poisoning problem at

> ...

>I'm quite certain Dan, Paul Vixie and the others understand this a lot
>better than me, so I wonder if someone here could explain to me why DNS
>over TCP would be so impractical. Thanks.

That idea has come up frequently in NANOG and elsewhere. For a
recent example, see the thread starting at
http://www.merit.edu/mail.archives/nanog/threads.html#10298


See also http://ops.ietf.org/lists/namedroppers/namedroppers.2008/threads.html


Vernon Schryver v...@rhyolite.com

James Taylor

Aug 10, 2008, 12:02:52 PM
Vernon Schryver <v...@calcite.rhyolite.com> wrote:

> James Taylor <use...@oakseed.demon.co.uk.invalid> wrote:
>
> > I wonder if someone here could explain to me why DNS
> > over TCP would be so impractical. Thanks.
>
> That idea has come up frequently in NANOG and elsewhere. For a
> recent example, see the thread starting at
> http://www.merit.edu/mail.archives/nanog/threads.html#10298
> See also http://ops.ietf.org/lists/namedroppers/namedroppers.2008/threads.html

Thanks for those links, but there's a lot to wade through and it's hard
to find anything specifically relevant to the question of using TCP for
DNS queries. I did see Paul Vixie explaining that many authoritative
servers aren't set up to work that way, but that doesn't answer the
question of why it would not be possible to change them to do so. If
everyone switched to TCP, wouldn't that solve the DNS cache poisoning
problem? Can anyone help explain what the performance issue is?

--
James Taylor

Skybuck Flying

Aug 10, 2008, 12:04:51 PM
Ok,

So if I understand correctly the following happens during the attack:

1. Somehow the resolver is forced to contact the authoritative server...

(For example by looking up a dns name).

The resolver does so using a UDP packet to the authoritative server...

Apparently the resolver does this via a transaction ID, and a source port on
the resolver side.

2. The authoritative server has to reply with an answer probably containing
the same transaction ID and of course the dest port set to the source port of
the resolver.

3. Now the attacker pretends to be the authoritative server. The attacker
guesses the transaction ID and the source port of the resolver... and then
sends a spoofed packet with spoofed IP addresses (?) pretending to be from
the authoritative server.

If successful the resolver is poisoned.

Now let's look at some possible solutions:

***
Solution A, Encryption (complex, CPU intensive, not so good, possibly other
attack vectors).
***

I assume anybody could run a resolver.

One possible solution could be:

1. First the resolver contacts the authoritative servers to get a unique
encryption key via RSA. (Therefore only man-in-the-middle attacks could
intercept the keys... it's unlikely the attacker could intercept the keys...
it's also unlikely anyone could guess the keys or spoof them or whatever, so
this should work.)

2. The resolver has now acquired a key from the authoritative server... now
they both can exchange UDP packets encrypted via AES.

However, since these packets are small and the contents could be known up
front... this might not be too good a solution... since then cryptanalysis
experts might quickly discover the key...

However maybe a little bit of extra random data, or padding might make it a
bit more difficult for them.

However... AES requires lots of CPU and such... and so does RSA... and even
random stuff and even MAC codes.

So maybe:

3. Use a lighter encryption algorithm.
4. Use a lighter public key encryption algorithm.
5. Maybe always do these steps... to get new keys.

***
Solution B, Detect and Block spoofer (Impossible?)
***

Let's look at another possible solution (which doesn't work):

1. Detect the spoofers/fakers somehow... for example:

1.1 The transaction ID was not what it was supposed to be.

1.2 The source port was not what it was supposed to be.

However this does not make it possible to block the "attacker" because the
IP is spoofed...

***
Solution C, Make it harder to guess the transaction ID: use more bits and a
good pseudo-random generator! (Easy solution!?!)
***

Finally another possible solution:

1. Simply make a very large transaction ID, like 128 bits or so.

That should make it nearly impossible for the attacker to predict the
transaction ID simply because it's way too large to try out all
possibilities...

Unless the attacker knows how the transaction IDs are generated ?!?!?

So a good random transaction ID generator is necessary... and if the
resolver is used a lot by others then the random transaction IDs will be
generated more quickly, so that should make it difficult for an attacker to
guess/predict them accurately.

Finally, if one is really paranoid and wants to solve it for a long time:

Simply use a 1024-bit transaction ID... that's gonna be fun =D
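
Within the existing 16-bit field, the practical version of this idea is
simply drawing the ID from the operating system's CSPRNG. A sketch
(illustrative Python):

import secrets

def new_txid():
    # The field is fixed at 16 bits by the protocol (see replies below),
    # so unpredictability is the only thing this knob can buy.
    return secrets.randbits(16)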

Thanks for your explanation ;)

Bye,
Skybuck.


Vernon Schryver

Aug 10, 2008, 5:59:10 PM
In article <1ilhand.gxqw9o10zqgreN%use...@oakseed.demon.co.uk.invalid>,
James Taylor <use...@oakseed.demon.co.uk.invalid> wrote:

>> http://www.merit.edu/mail.archives/nanog/threads.html#10298

>>See also http://ops.ietf.org/lists/namedroppers/namedroppers.2008/threads.html
>
>Thanks for those links, but there's a lot to wade through and it's hard
>to find anything specifically relevant to the question of using TCP for
>DNS queries. I did see Paul Vixie explaining that many authoritative
>servers aren't set up to work that way, but that doesn't answer the
>question of why it would not be possible to change them to do so. If
>everyone switched to TCP, wouldn't that solve the DNS cache poisoning
>problem? Can anyone help explain what the performance issue is?

- a DNS/UDP/IP operation requires only 2 packets, one in each
direction. More important, the DNS server does not need to keep
any state or remember anything about the client.

- a DNS/TCP/IP operation generally requires 6 or 7 packets. Worse,
the DNS server must make arrangements to remember things about
the client from the time it receives the first packet until at
least the time it sends the last packet and perhaps longer.

- DNS/UDP/IP does not need to worry about various kinds of load balancing
that make a bunch of servers answer requests sent to a single
IP address. TCP requires that for the duration of the connection,
a single server receives all of the packets for a client's request.

- Many firewalls are configured to block TCP connections, including
to port 53. The fact that this violates the DNS standards does not
change reality.

- Many firewalls that are configured to allow port 53 or DNS/TCP/IP
keep state or remember things about each TCP connection. Firewalls
that now have no trouble dealing with the relatively small number
of DNS/TCP/IP transactions might have problems.
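
To make the packet counts concrete, here is a minimal sketch of the same
query done both ways (illustrative Python; 192.0.2.53 is a placeholder
resolver address and error handling is omitted):

import secrets
import socket
import struct

def build_query(name, qtype=1):
    # Minimal DNS query: 12-byte header plus one QTYPE=A, QCLASS=IN question.
    header = struct.pack(">HHHHHH", secrets.randbits(16), 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(l)]) + l.encode()
                     for l in name.rstrip(".").split(".")) + b"\x00"
    return header + qname + struct.pack(">HH", qtype, 1)

def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf

query = build_query("example.com")

# DNS/UDP/IP: one datagram each way, no connection state on the server.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
    s.sendto(query, ("192.0.2.53", 53))
    reply = s.recv(512)

# DNS/TCP/IP: three-way handshake, two-byte length-prefixed messages,
# then connection teardown -- the 6-7 packets and per-client state above.
with socket.create_connection(("192.0.2.53", 53)) as s:
    s.sendall(struct.pack(">H", len(query)) + query)
    (rlen,) = struct.unpack(">H", recv_exact(s, 2))
    reply = recv_exact(s, rlen)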


Vernon Schryver v...@rhyolite.com

Barry Margolin

Aug 10, 2008, 10:30:52 PM
In article <1ilgffc.nf1dvkpxtxkiN%use...@oakseed.demon.co.uk.invalid>,
use...@oakseed.demon.co.uk.invalid (James Taylor) wrote:

> During Dan Kaminsky's presentation of the DNS cache poisoning problem at
> Black Hat 2008 recently, someone asked him whether switching to DNS over
> TCP would work as a mitigation strategy. Dan said that DNS servers
> already run so close to maximum capacity on UDP alone that the system
> simply wouldn't cope with the extra load of TCP.
>
> However, I find it hard to believe that all DNS resolvers are so heavily
> loaded, and I suspect that only the major ISPs would need to add
> capacity, and even then not so much extra capacity, for this to work.
> Surely, corporate internal DNS resolvers could be switched to only TCP
> right now with considerable benefit?

The issue isn't the communication between the client and the ISP server,
it's the communication between the ISP servers and all the authoritative
servers they get the information from.

E.g. the root and GTLD servers probably couldn't cope with receiving a
majority of their queries over TCP rather than UDP.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE don't copy me on replies, I'll read them in the group ***

James Taylor

Aug 10, 2008, 11:35:07 PM
Vernon Schryver <v...@calcite.rhyolite.com> wrote:

> James Taylor <use...@oakseed.demon.co.uk.invalid> wrote:
>
> > Can anyone help explain what the performance issue is?
>
> - a DNS/UDP/IP operation requires only 2 packets, one in each
> direction. More important, the DNS server does not need to keep
> any state or remember anything about the client.
>
> - a DNS/TCP/IP operation generally requires 6 or 7 packets. Worse,
> the DNS server must make arrangements to remember things about
> the client from the time it receives the first packet until at
> least the time it sends the last packet and perhaps longer.
>
> - DNS/UDP/IP does not need to worry about various kinds of load balancing
> that make a bunch of servers answer requests sent to a single
> IP address. TCP requires that for the duration of the connection,
> a single server receives all of the packets for a client's request.
>
> - Many firewalls are configured to block TCP connections, including
> to port 53. The fact that this violates the DNS standards does not
> change reality.
>
> - Many firewalls that are configured to allow port 53 or DNS/TCP/IP
> keep state or remember things about each TCP connection. Firewalls
> that now have no trouble dealing with the relatively small number
> of DNS/TCP/IP transactions might have problems.


That's a fantastic, concise summary, laid out in an easy-to-digest way.
Thanks a lot.

I wish I could hang around to discuss this fascinating topic further,
but I have to leave home today, and I have to pack in a hurry. I'm going
on a two week advanced free-diving course in the tropical waters of
Thailand, so I can't grumble. ;-)

--
James Taylor

Alan Strassberg

Aug 11, 2008, 1:34:13 PM
In article <2c236$489e9883$54198363$26...@cache2.tilbu1.nb.home.nl>,

Skybuck Flying <Blood...@hotmail.com> wrote:
>Could you explain DNS cache poisoning as simply as possible to somebody who
>doesn't know much about DNS?

An Illustrated Guide to the Kaminsky DNS Vulnerability

http://www.unixwiz.net/techtips/iguide-kaminsky-dns-vuln.html

alan

Rick Jones

Aug 11, 2008, 1:42:54 PM
Modulo user-space code to track FD's, the irony is that the piggier
the user-space DNS server code, the less the effect of switching to
TCP. The TCP processing overhead will be mostly kernel time.

rick jones
--
The computing industry isn't as much a game of "Follow The Leader" as
it is one of "Ring Around the Rosy" or perhaps "Duck Duck Goose."
- Rick Jones
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

Rod Dorman

Aug 11, 2008, 1:55:40 PM
In article <7d0a2$489f11a2$54198363$7...@cache3.tilbu1.nb.home.nl>,
Skybuck Flying <Blood...@hotmail.com> wrote:
> ...

>Now let's look at some possible solutions:
> ...

Keep in mind that any solution that requires a flag day is going to
have quite a bit of resistance to it.

--
-- Rod --
rodd(at)polylogics(dot)com

robert...@yahoo.com

Aug 11, 2008, 3:56:18 PM
On Aug 10, 11:04 am, "Skybuck Flying" <BloodySh...@hotmail.com> wrote:
> Ok,
>
> So if I understand correctly the following happens during the attack:
>
> 1. Somehow the resolver is forced to contact the authoritative server...
>
> (For example by looking up a dns name).
>
> The resolver does so using a UDP packet to the authoritative server...
>
> Apparently the resolver does this via a transaction ID, and a source port on
> the resolver side.
>
> 2. The authoritative server has to reply with an answer probably containing
> the same transaction ID and of course the dest port set to the source port of
> the resolver.
>
> 3. Now the attacker pretends to be the authoritative server. The attacker
> guesses the transaction ID and the source port of the resolver... and then
> sends a spoofed packet with spoofed IP addresses (?) pretending to be from
> the authoritative server.
>
> If successful the resolver is poisoned.


That's basically correct - so long as the spoofed packet arrives
before the real one, the victim DNS server will be fooled. In fact,
you can send many fake packets before the real DNS server can send its
response if you can time the victim DNS server's query fairly well.

What's worse is that a fake message might contain a (false)
resolution for a domain higher up in the hierarchy.

The attacker doesn't really have to guess the transaction ID in any
active way: since it's only sixteen bits, trying a bunch at random is
good enough. The randomization of source ports increases the number
of combinations the attacker has to try, making it much less likely to
succeed.
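
Some back-of-the-envelope numbers (illustrative Python; the
forged-replies-per-race count is an assumption, not a measurement):

# An off-path attacker wins a race by matching the ID (and port) before
# the real answer lands. Assume ~100 forged replies fit in the window.
txids = 2**16          # fixed 16-bit transaction ID space
ports = 64512          # rough count of usable ephemeral source ports
forged = 100

print(forged / txids)            # ID only: ~1.5e-3 per race
print(forged / (txids * ports))  # ID + random port: ~2.4e-8 per race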


The need is for authentication, not encryption. DNSSEC has been
attempting to address that for some time, but breadth of deployment
and load on the servers and resolvers are an issue, as well as
managing the key distribution hierarchy for something the scale of the
DNS system.

DNS resolvers can, in fact, potentially be implemented on any device
attached to the internet, although most machines implement only stub
resolvers that ask a local (in some sense of the word) DNS server to
do all the heavy lifting for them. Even for stub resolvers the
issues are largely the same, but since they tend to connect to a
somewhat more limited set of upstream DNS servers, and are often
behind firewalls, they're a bit more sheltered.

> ***
> Solution B, Detect and Block spoofer (Impossible?)
> ***
>
> Let's look at another possible solution (which doesn't work):
>
> 1. Detect the spoofers/fakers somehow... for example:
>
> 1.1 The transaction ID was not what it was supposed to be.
>
> 1.2 The source port was not what it was supposed to be.
>
> However this does not make it possible to block the "attacker" because the
> IP is spoofed...


The problem is that once the packet with the spoofed source address is
on the Internet it's hard to detect. Proper deployment of ingress
filtering would help a bit, but there are still plenty of places where
you could not meaningfully apply ingress filtering.


> ***
> Solution C, Make it harder to guess transaction id, use more bits and good
> pseudo generator !. (Easy solution !?!)
> ***
>
> Finally another possible solution:
>
> 1. Simply make a very large transaction ID, like 128 bits or so.
>
> That should make it nearly impossible for the attacker to predict the
> transaction ID simply because it's way too large to try out all
> possibilities...
>
> Unless the attacker knows how the transaction IDs are generated ?!?!?
>
> So a good random transaction ID generator is necessary... and if the
> resolver is used a lot by others then the random transaction IDs will be
> generated more quickly, so that should make it difficult for an attacker to
> guess/predict them accurately.
>
> Finally, if one is really paranoid and wants to solve it for a long time:
>
> Simply use a 1024-bit transaction ID... that's gonna be fun =D


Unfortunately the transaction ID is a fixed sixteen bit field in the
standard DNS request/response packet. Changing it is going to be
very, very difficult. If you're going to insist on a bigger field,
you will need to update every single device attached to the Internet.
A transitional approach presents its own problems, namely that so long
as you're willing to deal with hosts and other DNS servers that don't
support the long transaction IDs, you're going to have holes that can
be exploited in the same way.

The ultimate solution to this is unclear at this point, but the
randomizing of source ports buys a bit of time.

Didi

Aug 11, 2008, 6:03:11 PM
Vernon Schryver wrote:
> ...

> - a DNS/UDP/IP operation requires only 2 packets, one in each
> direction. More important, the DNS server does not need to keep
> any state or remember anything about the client.
>
> - a DNS/TCP/IP operation generally requires 6 or 7 packets. Worse,
> the DNS server must make arrangements to remember things about
> the client from the time it receives the first packet until at
> least the time it sends the last packet and perhaps longer.

Well, this is true of course, but the impact of using TCP is not that
huge in reality; the server sends only a single segment in response, just
as it would send a single UDP packet. The overhead (and RTT times) for
opening/closing the connection remains, of course - so if the servers are
close to the limit using UDP then it could be enough to tip the balance,
but this would be very close; I doubt this is the case.

But my DPS system normally uses TCP and so far I have encountered
just one DNS server which will refuse a connection... usually it just
works over TCP. I have not done extensive tests, but when my ISP's
DNS servers are dead - which happens rarely - I go globally/nonrecursively
and I am just fine - unless I try to reach my own domain, LOL; my
hosting company's DNS server is the one which won't do TCP... :-).

Didi

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/

Original message: http://groups.google.com/group/comp.protocols.tcp-ip/msg/0cec8847a9bce384?dmode=source

Vernon Schryver

Aug 11, 2008, 6:44:40 PM
In article <e3b0fc39-dab8-4d28...@c58g2000hsc.googlegroups.com>,
Didi <d...@tgi-sci.com> wrote:

>> - a DNS/UDP/IP operation requires only 2 packets, one in each

>> - a DNS/TCP/IP operation generally requires 6 or 7 packets. Worse,


>> the DNS server must make arrangements to remember things about
>> the client from the time it receives the first packet until at
>> least the time it sends the last packet and perhaps longer.
>
>Well this is true of course but the impact of using TCP is not that
>huge in reality; the server sends only a single segment in response,
>just
>as it would send a single UDP packet. The overhead (and RTT times) for
>opening/closing the connection remains, of course - so if the servers
>are
>close to the limit using UDP then it could be enough to tip the
>balance,
>but this would be very close, I doubt this is the case.

It's not only the packets but the state, and not at your own or your
ISP's DNS servers but at the DNS servers to which your servers recurse.
When your ISP switches to TCP to prevent its cache from being polluted,
it won't pay much more, but other servers, such as the gTLD servers
that answer its requests will pay a lot more.

If most requests to your DNS server use UDP, then you can use a
single input socket listening on port 53. Your DNS server will not
need to keep any state except for requests that cannot be answered
from its cache and require asking some other server (recursing).
If your DNS server uses random port numbers to minimize poisoning
(i.e. it's "patched"), then you will need as many UDP sockets as
active requests needing recursion, which should be small except
perhaps when first starting or under attack.

If all of the clients that use your DNS server switch to TCP, how many
sockets and how many records of some sort of state will you need in
your DNS server? Some fraction will stall for TCP retransmissions and
even full timeouts as clients disappear without sending a FIN or RST,
and unless you play games you'll need to keep an open socket for the duration.

Or as someone said in this thread, the problem with DNS/TCP/IP is
not at ISPs or other DNS clients, but at very busy servers such as
the gTLD servers. No one has told me, but somehow I doubt that the
TLD servers have enough spare bandwidth to handle a switch to TCP
by most of the world, and never mind the extra CPU cycles and memory
required to handle the per-client state.

>But my DPS system uses normally TCP and so far I have encountered
>just one DNS server which will refuse a connection... usually it just
>works over TCP. I have not done extensive tests, but when my ISPs
>DNS servers are dead - which happens rarely - I go globally/
>nonrecursively
>and I am just fine - unless I try to reach my own domain, LOL, my
>hosting companies DNS server is the one which won't do TCP... :-).

Please don't be insulted, but for professionals as opposed to
hobbyists, that statement is equivalent shouting "DO NOT USE
DNS/TCP/IP! IT DOESN'T WORK!" Professionally run systems can't
afford to work only most of the time.


Vernon Schryver v...@rhyolite.com

Didi

Aug 11, 2008, 7:17:12 PM
Vernon Schryver wrote:
>
> It's not only the packets but the state, and not at your own or your
> ISP's DNS servers but at the DNS servers to which your servers recurse.
> When your ISP switches to TCP to prevent its cache from being polluted,
> it won't pay much more, but other servers, such as the gTLD servers
> that answer its requests will pay a lot more.

There is no state to save between requests if you use TCP
other than what there is if you use UDP. Both transports
are defined in RFC1034/1035, the only difference being a length word
at the beginning of the data.
You may want to take note that this comes from someone who has
implemented a TCP stack and a DNS with caches, etc.
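
That length word is the entire wire-format difference; a sketch
(illustrative Python):

import struct

def frame_for_tcp(msg):
    # RFC 1035, section 4.2.2: over TCP the DNS message is prefixed
    # with a two-byte big-endian length; the message itself is unchanged.
    return struct.pack(">H", len(msg)) + msg

def message_length(two_bytes):
    (length,) = struct.unpack(">H", two_bytes)
    return length  # read exactly this many bytes to get the message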

> the gTLD servers. No one has told me, but somehow I doubt that the
> TLD servers have enough spare bandwidth to handle a switch to TCP
> by most of the world, and never mind the extra CPU cycles and memory
> required to handle the per-client state.

Oh they do work over TCP, trust me. Have worked for me for years.

BTW, the extra CPU power TCP takes is negligible compared to the search
power it takes to locate the wanted resource.

> Please don't be insulted, but for professionals as opposed to
> hobbyists, that statement is equivalent shouting "DO NOT USE
> DNS/TCP/IP! IT DOESN'T WORK!" Professionally run systems can't
> afford to work only most of the time.

Actually TCP does work for DNS. Has worked for me for years.
The chances to hit a dead or bogus DNS server are much greater
than the chances to hit one which won't do TCP. Like I said, for
all these years I have only hit *one*.
With the obvious vulnerability of DNS over UDP gone public, we can now
expect TCP to become the standard for all clients before too long;
servers support it anyway.

Didi

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

Original message: http://groups.google.com/group/comp.protocols.tcp-ip/msg/600dc36812c5082e?dmode=source

Rick Jones

Aug 11, 2008, 7:29:38 PM
How does carrying queries over TCP connections interact with anycast
addressing?

If the vendors and stack providers had not gone through the
"SPECweb96" exercises a decade ago I might worry more about being
able to bring enough "oomph" to bear, but to my otherwise untrained gut
it certainly feels that unless anycast and TCP don't/can't get along,
it is "a small matter of deploying N times the hardware" in the
infrastructure for some still not completely defined value of N.
Compared to the hand wringing and angst about the effect of this hole,
even a non-trivial single-digit value for N doesn't seem like that
high a price.

rick jones
--
Process shall set you free from the need for rational thought.

Barry Margolin

Aug 11, 2008, 8:54:50 PM
In article <g7qi12$at0$3...@usenet01.boi.hp.com>,
Rick Jones <rick....@hp.com> wrote:

> How does carrying queries over TCP connections interact with anycast
> addressing?

Since DNS doesn't generally need to keep connections around for long
periods, it probably wouldn't be a problem. I think there have been
some experiments with anycast to HTTP servers and they worked OK.
Internet routes don't usually flap enough to cause problems for
short-lived connections.

Vernon Schryver

Aug 11, 2008, 9:12:48 PM
In article <barmar-5AFFB8....@newsgroups.comcast.net>,

Barry Margolin <bar...@alum.mit.edu> wrote:
>In article <g7qi12$at0$3...@usenet01.boi.hp.com>,
> Rick Jones <rick....@hp.com> wrote:
>
>> How does carrying queries over TCP connections interact with anycast
>> addressing?
>
>Since DNS doesn't generally need to keep connections around for long
>periods, it probably wouldn't be a problem. I think there have been
>some experiments with anycast to HTTP servers and they worked OK.
>Internet routes don't usually flap enough to cause problems for
>short-lived connections.

Some comments in the mailing lists say much the same thing. Other
comments seem to be saying that the anycast route flaps happen far more
often than one might expect. I think that we're not talking about route
flaps in the usual sense, which are damped and clamped and very rare
at time scales of DNS/TCP/IP connections.

Besides, anycast is not the only kind of load balancing in front of
big DNS servers. Consider a load balancer that evenly distributes
incoming packets to a bag of MAC addresses. You'd hope that a load
balancer would not do that to TCP segments, but doing the right thing
would require the load balancer to keep state for bazillions of
DNS/TCP/IP connections.

Someone in the mailing lists suggested hacking a special code for
very busy DNS servers that would
- answer all SYNs to port 53 with SYN-ACK without creating a TSP or saving
any other state,
- acknowledge everything that seems to want an acknowledgement,
- assume that the DNS/TCP/IP request would fit in the first and
only non-trivial segment after the SYN,
- and generally act as if DNS/TCP/IP were the same as DNS/UDP/IP.
It would do no retransmitting, which might be hard on DNS clients beyond
lossy paths.
It would naturally defend against the current excitement by the DNS
client using the additional entropy in its TCP initial sequence number.
It is such a completely nasty ugly kludge that I love it, if it really works.
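
Presumably the no-state trick needs something like a SYN cookie, so the
later segments can be validated by recomputation rather than lookup. A
sketch of that idea (illustrative Python, not the proposal's actual code):

import hashlib
import hmac
import os
import struct
import time

SECRET = os.urandom(16)

def cookie_isn(client_ip, client_port):
    # Derive our initial sequence number from the peer and a coarse
    # timestamp; a returning ACK can then be checked by recomputation,
    # with nothing stored per client in the meantime.
    minute = int(time.time()) // 60
    mac = hmac.new(SECRET,
                   client_ip.encode() + struct.pack(">HI", client_port, minute),
                   hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")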


Vernon Schryver v...@rhyolite.com

Vernon Schryver

Aug 11, 2008, 8:59:42 PM
In article <1391388c-34da-4bec...@d45g2000hsc.googlegroups.com>,
Didi <d...@tgi-sci.com> wrote:

>> It's not only the packets but the state, and not at your own or your
>> ISP's DNS servers but at the DNS servers to which your servers recurse.
>> When your ISP switches to TCP to prevent its cache from being polluted,
>> it won't pay much more, but other servers, such as the gTLD servers
>> that answer its requests will pay a lot more.
>
> There is no state to save between requests if you use TCP
>other than what there is if you use UDP.

That is wrong if you define "request" as broadly as required by the
situation. The socket and FD for a DNS/TCP request is created when DNS
server does a select() and accept() soon after the SYN from the client
arrives. That FD, the underlying TSP, and the client's IP address,
sequence number, TCP options, etc. are state that must be saved by the
DNS server application code and kernel and used in a poll(), select(),
or equivalent system call while waiting for the actual request from the
client. Only when the request arrives can the server send the answer
and close() the socket. (Yes, recursion adds complications, but the
gTLD servers don't do much recursing.) Thus, the DNS server as well
as the kernel must save state from the receipt of the SYN about 2 round
trip times or perhaps 0.100 to 0.500 seconds before the receipt of the
DNS/TCP request. Depending on how you butcher your TCP code to avoid
time-wait delay, your kernel might also save some state (the TSP) for a
dozen seconds after the DNS/TCP request has been answered and the socket
has been closed.

In the other situation, DNS/UDP involves no saved state, except for the
irrelevant recursion case.


>other than what there is if you use UDP. Both transports
>are defined in RFC1034/1035, the only difference being a length word
>at the beginning of the data.

That's irrelevant.


> You may want to take note that this comes from someone who has
>implemented a TCP, DNS with caches etc.

Google is your friend.


>> the gTLD servers. No one has told me, but somehow I doubt that the
>> TLD servers have enough spare bandwidth to handle a switch to TCP
>> by most of the world, and never mind the extra CPU cycles and memory
>> required to handle the per-client state.
>
>Oh they do work over TCP, trust me. Have worked for me for years.

Of course they work over TCP, but trust me, most requests do not come
via TCP.


>BTW, the extra CPU power TCP takes is negligible to the search power
>it takes to locate the wanted resource.

While in theory and in special implementations Van Jacobson's number of
~100 cycles per TCP segment exclusive of byte copies and checksums is
right, in typical practice it is orders of magnitude too small. On the
other hand, searching a DNS cache needs few cycles if you're a rabid
coder comfortable writing your own tree and hash searching code. Hacking
your DNS server to go fast incurs fewer long term maintenance costs
than hacking your TCP/IP code to hit Van Jacobson's cycle count.


>With the obvious vulnerability of DNS over UDP gone public now we
>can expect TCP to become the standard for all clients before too long,
>servers support it anyway.

That's not what people with serious DNS clues have been saying. Note
that I DO NOT have serious DNS clues. I'm referring to comments in the
NANOG, namedroppers, and DNS-ops mailing lists. People with serious,
non-Cliff-Claven (i.e., not authoritative pronouncements based on
ignorance) interests in the issue should see
http://www.merit.edu/mail.archives/nanog/threads.html
http://lists.oarci.net/pipermail/dns-operations/
http://ops.ietf.org/lists/namedroppers/namedroppers.2008/

It might be that for a while, perhaps until DNSSEC finally comes out,
some DNS servers will defend against the current excitement by switching
to TCP for particular queries when they see signs of attempted cache
poisoning. However, that's a whole other thing than generally switching
from UDP to TCP. Switching to TCP also has some possible DoS triggers.
More likely temporary (until DNSSEC) defenses include hacks such as the
0x20 entropy increase that Paul Vixie is advocating. That involves
the DNS client (actually the recursing server) randomly toggling the
case of characters in the domain name being queried.
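
For the record, the 0x20 hack itself is tiny (illustrative Python sketch):

import secrets

def encode_0x20(qname):
    # Randomly toggle the case of each letter. Servers echo the question
    # name back byte for byte, so the case pattern is extra entropy a
    # spoofer must match (roughly one bit per letter in the name).
    return "".join(c.upper() if c.isalpha() and secrets.randbits(1) else c.lower()
                   for c in qname)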


Vernon Schryver v...@rhyolite.com

Barry Margolin

Aug 11, 2008, 9:28:55 PM
In article <g7qo2g$13kt$1...@calcite.rhyolite.com>,
v...@calcite.rhyolite.com (Vernon Schryver) wrote:

> In article <barmar-5AFFB8....@newsgroups.comcast.net>,
> Barry Margolin <bar...@alum.mit.edu> wrote:
> >In article <g7qi12$at0$3...@usenet01.boi.hp.com>,
> > Rick Jones <rick....@hp.com> wrote:
> >
> >> How does carrying queries over TCP connections interact with anycast
> >> addressing?
> >
> >Since DNS doesn't generally need to keep connections around for long
> >periods, it probably wouldn't be a problem. I think there have been
> >some experiments with anycast to HTTP servers and they worked OK.
> >Internet routes don't usually flap enough to cause problems for
> >short-lived connections.
>
> Some comments in the mailing lists say much the same thing. Other
> comments seem to be saying that the anycast route flaps happen far more
> often than one might expect. I think that we're not talking about route
> flaps in the usual sense, which are damped and clamped and very rare
> at time scales of DNS/TCP/IP connections.

Maybe they're talking about packets traversing redundant connections
within the network. While it's possible that these can cause different
segments of a connection to go to different anycast instances, I think
this is usually considered a bug in the network and it should be fixed.
Many routers do flow caching, so all the segments of a connection
automatically take the same path (unless the cache overflows).

>
> Besides, anycast is not the only kind of load balancing in front of
> big DNS servers. Consider a load balancer that even distributes
> incoming packets to a bag of MAC addresses. You'd hope that a load
> balancer would not do that to TCP segments, but doing the right thing
> would require the load balancer to keep state for bazillions of
> DNS/TCP/IP connections.

Since one of the most common uses of load balancers is in front of
clusters of HTTP servers, I sure hope they'd be able to keep state for
bazillions of TCP connections.

Skybuck Flying

Aug 11, 2008, 9:43:28 PM
The DNS request/response seems to have a reserved flag called Z which is not
used yet?

This flag could be used to indicate a DNS transaction ID extension.

This extension could be added anywhere in the packet.

Be it as a new field... or as an extra fake query/answer.

Over time software needs to be updated anyway... so implement this extension
as soon as possible into new software and devices and all will be well soon
enough... at least for all those people that think it's important enough to
warrant an update ;)

Bye,
Skybuck.


David Schwartz

Aug 11, 2008, 11:03:27 PM
On Aug 11, 4:17 pm, Didi <d...@tgi-sci.com> wrote:

> There is no state to save between requests if you use TCP
> other than what there is if you use UDP. Both transports
> are defined in RFC1034/1035, the only difference being a length word
> at the beginning of the data.

Umm, what? There is *NO* state to save between requests for UDP. The
server can completely forget that the client exists and can treat each
new request packet as its own universe.

With TCP, there are two choices:

1) You could use a new TCP connection for each request. In that case,
instead of one packet for a UDP request and one packet for a UDP
reply, you have a three-way TCP handshake, then the request/reply,
then the FIN/ACK packets of a TCP shutdown. This increases the network
traffic to the DNS server by a factor of 3 or so.

2) You could keep each TCP connection up between requests. This
requires the server to store the full TCP connection state for every
client connected to it. At minimum, this is the client IP, client
port, window positions in both directions, and current state. If it doesn't
use a timer of some kind, it will be vulnerable to various DOS
attacks, so it will need to keep a timeout or a 'last heard' time.

TCP requires significantly greater state on the server than UDP, or it
requires about three times the network bandwidth. Most likely,
somewhere in-between these two, depending upon the ratio of requests
to connection setup/teardowns.
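
A sketch of the bookkeeping option 2 implies (illustrative Python; the
fields chosen are an assumption, not any real server's layout):

import time

class ConnState:
    # Roughly the minimum a server must remember per persistent client.
    def __init__(self, ip, port):
        self.ip, self.port = ip, port
        self.snd_nxt = 0        # our next sequence number
        self.rcv_nxt = 0        # next byte expected from the client
        self.last_heard = time.monotonic()

connections = {}  # keyed by (client_ip, client_port)

def reap_idle(timeout=30.0):
    # Without an idle timer, dead clients pin state forever (a DoS lever).
    now = time.monotonic()
    for key in [k for k, c in connections.items()
                if now - c.last_heard > timeout]:
        del connections[key]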

DS

Dick Wesseling

Aug 11, 2008, 11:09:28 PM
In article <g7qo2g$13kt$1...@calcite.rhyolite.com>,

v...@calcite.rhyolite.com (Vernon Schryver) writes:
> Someone in the mailing lists suggested hacking a special code for
> very busy DNS servers that would
> - answer all SYNs to port 53 with SYN-ACK without creating a TSP or saving
> any other state,
> - acknowledge everything that seems to want an acknowledgement,
> - assume that the DNS/TCP/IP request would fit in the first and
> only non-trivial segment after the SYN,
> - and generally act as if DNS/TCP/IP were the same as DNS/UDP/IP.
>
> It would do no retransmitting, which might be hard on DNS clients beyond
> lossy paths.

Retransmissions should not be a problem, provided that you replace
"acknowledge everything that seems to want an acknowledgement" with
something just a little bit more sophisticated.
The client's sequence numbers are used to verify that it is really
talking to the server, but the server can use its sequence numbers as a
cookie. For the sake of the explanation I will assume that the server's
ISS is 0.

There are 3 things that may want an acknowledgement:

- SYN.
Always acknowledge.

- Request, with or without FIN.
Always respond with reply+FIN+ACK

If the client is on a lossy path then it will retry its request,
which triggers another reply+FIN+ACK. In other words, the client
handles retransmissions.

- FIN in a segment by itself.
Be careful. If you acknowledge this just because it seems to
want an acknowledgement then the client will assume that you've
seen its request and no longer retransmit.
We therefore only acknowledge the FIN if this segment acknowledges
our response. This is where the cookie comes in. If segment.ack <= 1
then it only acknowledges our SYN. Otherwise chances are that it
acknowledges our response.

At first sight it may seem that this causes an extra packet to be sent,
but unless the client is really broken it can combine the retransmission
of its FIN with the acknowledgement of our FIN, which needs to be sent
anyway.
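
As a decision table the above might look like this (illustrative Python;
the segment field names are hypothetical, and the server's ISS is taken
as 0 as above):

def classify(seg):
    # seg.syn / seg.fin are flag booleans, seg.payload the request bytes,
    # seg.ack the acknowledgement number (hypothetical field names).
    if seg.syn:
        return "send SYN-ACK"            # always acknowledge the SYN
    if seg.payload:
        return "send reply + FIN + ACK"  # client retransmits on loss
    if seg.fin and seg.ack > 1:
        return "send ACK"  # bare FIN: ACK only if it acknowledges our reply
    return "stay silent"   # otherwise the client will retransmit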

Didi

Aug 12, 2008, 4:20:45 AM
Vernon Schryver wrote:
> ....

> > There is no state to save between requests if you use TCP
> >other than what there is if you use UDP.
>
> That is wrong if you define "request" as broadly as required by the
> situation. The socket and FD for a DNS/TCP request is created when DNS
> server does a select() and accept() soon after the SYN from the client
> arrives. That FD, the underlying TSP, and the client's IP address,
> sequence number, TCP options,...

I know how TCP works. Like I already said, I happen to have implemented
it, along with DNS and many other things.
All you mention has nothing to do with states to save related to DNS as
you claimed. This is the overhead TCP adds, to which we both referred
separately.

> >other than what there is if you use UDP. Both transports
> >are defined in RFC1034/1035, the only difference being a length word
> >at the beginning of the data.
>
> that's irrelevant.

Given that now you know there are no states to save between
DNS transactions it is irrelevant indeed.

>
> > You may want to take note that this comes from someone who has
> >implemented a TCP, DNS with caches etc.
>
> Google is your friend.

I tried that - spent the 2-3 minutes I was inclined to - and all I found
was similar to what I see here. If you have authored some real stuff
it must be harder than that to locate; please indicate what.

> While in theory and special implementations, Van Jacobson's number of
> ~100 cycles per TCP segment exclusive of byte copies and checksums is
> right, in typical practice it is orders of magnitude small. On the
> other hand, searching a DNS cache needs few cycles if you're a rabid
> coder comfortable writing your own tree and hash searching code.

So you want to search the database of the root servers in a few cycles
per request. How many entries per cycle do you actually want to go
through? Sorry, but this sounds like you don't have a clue what
you are talking about.

Didi

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

Original message: http://groups.google.com/group/comp.protocols.tcp-ip/msg/612a77c85c9f79ed?dmode=source

Rick Jones

Aug 12, 2008, 1:07:52 PM
Vernon Schryver <v...@calcite.rhyolite.com> wrote:
> Besides, anycast is not the only kind of load balancing in front of
> big DNS servers. Consider a load balancer that even distributes
> incoming packets to a bag of MAC addresses. You'd hope that a load
> balancer would not do that to TCP segments, but doing the right
> thing would require the load balancer to keep state for bazillions
> of DNS/TCP/IP connections.

Or simply make its balancing decisions based on the four-tuple. If we
are talking about bazillions of DNS/TCP/IP connections, one probably
gets good distribution from a hash on the four-tuple.
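
Something like this (illustrative Python; names are made up):

import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, backends):
    # Hash the four-tuple so every segment of a given connection lands on
    # the same backend without the balancer keeping per-connection state.
    key = ("%s:%d-%s:%d" % (src_ip, src_port, dst_ip, dst_port)).encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]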

> Someone in the mailing lists suggested hacking a special code for
> very busy DNS servers that would
> - answer all SYNs to port 53 with SYN-ACK without creating a TSP
> or saving any other state,
> - acknowledge everything that seems to want an acknowledgement,
> - assume that the DNS/TCP/IP request would fit in the first and
> only non-trivial segment after the SYN,
> - and generally act as if DNS/TCP/IP were the same as DNS/UDP/IP.

> It would do no retransmitting, which might be hard on DNS clients
> beyond lossy paths.

So long as the ACK of the request was only piggybacked on the segment
carrying the response you would be OK. Then in the event of loss of
ACK you would get a retransmit from the client.

> It would naturally defend against the current excitement by the DNS
> client using the additional entropy in its TCP initial sequence
> number. It is such a completely nasty ugly kludge that I love it,
> if it really works.

Is it really sufficient to protect against the spoofing if only the
clients are keeping state? I guess the 32-bit ISN of TCP is a nice
boost. It would be a nasty ugly kludge. How many more nasty ugly
kludges can dance on the Internet pin-head?

rick jones
--
firebug n, the idiot who tosses a lit cigarette out his car window

Martijn Lievaart

Aug 12, 2008, 1:27:17 PM
On Tue, 12 Aug 2008 01:20:45 -0700, Didi wrote:

> Vernon Schryver wrote:
>> ....
>> > There is no state to save between requests if you use TCP
>> >other than what there is if you use UDP.
>>
>> That is wrong if you define "request" as broadly as required by the
>> situation. The socket and FD for a DNS/TCP request is created when DNS
>> server does a select() and accept() soon after the SYN from the client
>> arrives. That FD, the underlying TSP, and the client's IP address,
>> sequence number, TCP options,...
>
> I know how TCP works. Like I already said, I happen to have implemented
> it, along with DNS and many other things.
> All you mention has nothing to do with states to save related to DNS
> as
> you claimed. This is the overhead TCP adds to which we both referred
> separately.

I don't see how you can implement DNS over TCP without saving the socket,
thus saving extra state. Which is what Vernon said btw.

M4

Vernon Schryver

Aug 12, 2008, 1:20:29 PM
In article <g7sg18$h8$1...@usenet01.boi.hp.com>,
Rick Jones <rick....@hp.com> wrote:

>> It would do no retransmitting, which might be hard on DNS clients
>> beyond lossy paths.
>
>So long as the ACK of the request was only piggybacked on the segment
>carrying the response you would be OK. Then in the event of loss of
>ACK you would get a retransmit from the client.

good point

>Is it really sufficient to protect against the spoofing if only the
>clients to be keeping state? I guess the 32 bit ISN of TCP is a nice
>boost.

The current kludges of less than partial fixes - varying the client's
UDP port number or toggling 0x20 bits - are all about state on the client.
The real DNSSEC fix is also about client state if you squint at it from
the right angle.
The server's state is useless, expensive overhead except when the server
recurses. Real DNS/TCP/IP protects against the current cache poisoning
attack only because predicting the client's DNS/TCP initial sequence
numbers to spoof the server's answers is hard in the modern era. For
example, if the client always used an initial sequence number of 1, the
bad guy could spew DNS/TCP segments with bogus answers much like the
bogus DNS/UDP answers. The main difference is that the DNS/TCP
bogus answers would need to be preceded by bogus SYN-ACKs.


> It would be a nasty ugly kludge. How many more nasty ugly
>kludges can dance on the Internet pin-head?

time-wait delays, MSS options, Nagle, congestion control, congestion
avoidance, header prediction, page flipping/zero-copy, timestamps,
extended windows, IPv6 and the rest of the list implies some job security.

Oh what a tangled web we weave when first we practice to be compatible.


Vernon Schryver v...@rhyolite.com
