BIND 9.3.5-P1 random UDP src ports: some DNS responses delivered to wrong process

Irwin Tillman

Jul 9, 2008, 8:57:00 PM

After upgrading to BIND 9.3.5-P1, I'm seeing some DNS responses
arriving at my host being misdirected to other processes (not named)
running on my host.

It appears to be because, when named needs to send a query and chooses
a random UDP source port, it is able to bind() that port even though
the port is already in use.

--

My platform is Solaris 10 on SPARC.

I have a RADIUS server already bound to IPv4 INADDR_ANY UDP port 1812,
specifying SO_REUSEADDR.

named is running without specifying any 'listen-on' or 'query-source
address'.
I see that when it sends a query and chooses a random UDP source port 'x',
it binds the socket (which is waiting for the DNS response) to IPv4
INADDR_ANY UDP port 'x', specifying SO_REUSEADDR.

Sometimes 'x' happens to be 1812.
Solaris 10 allows this second bind() to IPv4 INADDR_ANY UDP port 1812
to succeed;
I assume that's because of the SO_REUSEADDR.

When the DNS response to the query arrives, Solaris may deliver it
to the RADIUS server; I can confirm that because my RADIUS server
logs these packets as malformed in various ways.
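
The misdelivery described above can be probed with a minimal sketch like the following. It is not taken from named or from the RADIUS server; the function name who_receives is hypothetical, and it uses 127.0.0.1 rather than INADDR_ANY so the experiment stays on the local host. Which socket gets the datagram, and whether the second bind() is allowed at all, depends on the operating system.

```python
import select
import socket


def who_receives(payload: bytes = b"test", timeout: float = 1.0) -> list:
    """Bind two UDP sockets to one port with SO_REUSEADDR, send one
    datagram to that port, and report which socket(s) became readable."""
    bound = []  # list of (name, socket) pairs
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        port = 0  # the first bind lets the kernel pick; the second reuses it
        for name in ("first", "second"):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            bound.append((name, s))
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            s.bind(("127.0.0.1", port))
            port = s.getsockname()[1]
        sender.sendto(payload, ("127.0.0.1", port))
        readable, _, _ = select.select([s for _, s in bound], [], [], timeout)
        return sorted(name for name, s in bound if s in readable)
    except OSError:
        # The platform refused the second bind (or the send); nothing to report.
        return []
    finally:
        sender.close()
        for _, s in bound:
            s.close()


if __name__ == "__main__":
    print(who_receives())
```

If the result names only one socket, the other process bound to the same port never sees that packet, which matches the symptom of DNS responses showing up in the RADIUS server's log.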

(I imagine that the converse may also be true: some of the packets sent
by RADIUS clients to the RADIUS server may instead be delivered to named,
but I am not running named at a high enough logging level to confirm that.)

For a long-running UDP-based server running on a fixed UDP port,
I see I can work around this using named's new 'avoid-v4-udp-ports' option.
But I imagine that won't solve the problem in general; there may
be other UDP servers (say, RPC-based servers) that pick ephemeral UDP
ports each time they start, and I can't specify those ports in advance in
named's 'avoid-v4-udp-ports' option.
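
To make the limitation concrete, here is a rough sketch of random source-port selection with an avoid list. It is only an illustration of the idea, not named's actual algorithm; the function name pick_source_port and the 1024-65535 range are assumptions.

```python
import secrets


def pick_source_port(avoid: set, low: int = 1024, high: int = 65535) -> int:
    """Return a random port in [low, high] that is not in the avoid set."""
    span = high - low + 1
    excluded = {p for p in avoid if low <= p <= high}
    if len(excluded) >= span:
        raise ValueError("every port in the range is excluded")
    while True:
        port = low + secrets.randbelow(span)
        if port not in excluded:
            return port


if __name__ == "__main__":
    # 1812 is the RADIUS port from this report; 1813 is added as an example.
    print(pick_source_port({1812, 1813}))
```

A static list like this only protects ports known ahead of time; a server that picks its own ephemeral port at startup is not covered.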

Have I missed something here?
(Is it right for BIND to specify SO_REUSEADDR when it binds a socket
it will use for a UDP query with a random UDP source port?)

Mark Andrews

Jul 10, 2008, 7:13:10 PM

You will have the problem even without SO_REUSEADDR.

<explicit-address>.<port> and <0.0.0.0>.<port> don't collide.

Named doesn't just call bind(0.0.0.0#0) as many systems
don't do good random port selection. Lots of systems are
sequential. Linux keeps handing out the same port as long
as it is not in use then sequentially increments it.
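
A quick way to see what a given kernel hands out for bind(0.0.0.0#0) is a sketch like this one. The helper name kernel_assigned_ports is hypothetical, and the output pattern (sequential, repeated, or scattered) depends entirely on the operating system and its version.

```python
import socket


def kernel_assigned_ports(count: int = 5) -> list:
    """Bind `count` UDP sockets to port 0, one after another, and record
    the port number the kernel chose for each."""
    ports = []
    for _ in range(count):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.bind(("0.0.0.0", 0))  # port 0 asks the kernel to choose
            ports.append(s.getsockname()[1])
    return ports


if __name__ == "__main__":
    print(kernel_assigned_ports(10))
```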

If you can, give named its own address.

Explicitly binding the query source will help in some, but
not all cases. If you are running named on a NAT I would
bind to the internal address and have all the queries go
through the NAT process. Note this depends on how NAT is
implemented.

Very few applications use UDP ports as fast as named now
does and kernels really are not tuned to handle it.

For what it is worth named has code to deal with responses
to queries that are made on 0.0.0.0#53 but arrive on a
socket listening for queries. The kernel does not have
enough information to deliver the UDP message to the right
socket.
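
As an illustration of how an application can sort this out itself when the kernel cannot, the sketch below tells a query from a response by the QR bit in the DNS header (the top bit of the 16-bit flags field). Whether named uses this exact check is an assumption; the function name is_dns_response is hypothetical.

```python
import struct


def is_dns_response(packet: bytes) -> bool:
    """Return True if the DNS message has the QR bit set (a response),
    False if it is clear (a query)."""
    if len(packet) < 12:
        raise ValueError("shorter than a DNS header")
    _ident, flags = struct.unpack("!HH", packet[:4])
    return bool(flags & 0x8000)


if __name__ == "__main__":
    query = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    reply = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)
    print(is_dns_response(query), is_dns_response(reply))
```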

This can all be avoided if everyone signs their zones.

http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf

This could also have been avoided if everyone implemented
BCP38 to the best of their abilities.

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: Mark_A...@isc.org

Florian Weimer

Jul 11, 2008, 3:28:38 PM

* Mark Andrews:

> Named doesn't just call bind(0.0.0.0#0) as many systems
> don't do good random port selection. Lots of systems are
> sequential. Linux keeps handing out the same port as long
> as it is not in use then sequentially increments it.

Linux 2.6.24 assigns non-sequential ports, but not from a PRNG that I
would consider strong enough (IMHO).

> This can all be avoided if everyone signs their zones.
>
> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf

I think part of our problem is that a presentation titled "DNSSEC in 6
minutes" consists of 77 slides. 8-)

Alan Clegg

Jul 11, 2008, 4:17:33 PM

>> This can all be avoided if everyone signs their zones.
>>
>> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
>
> I think part of our problem is that a presentation titled "DNSSEC in 6
> minutes" consists of 77 slides. 8-)

A previous posting of mine:

As the author of the paper, the result is YOU being able to deploy a
DNSSEC signed zone within 6 minutes. No, you can't learn to do it in 6
minutes, but once you understand the process (and it's not really
difficult), you can easily go from unsigned (no keys, etc) to fully
signed within 6 minutes per zone (and that's doing it by hand!)

AlanC

Dan Mahoney, System Admin

Jul 11, 2008, 4:27:35 PM

What is the screengrab and reference at the end?

-Dan

--

"What's with the server farm down in the basement?"

-Spider, Three Skulls Commons at Selden House, 4/15/00

--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144 AIM: LarpGM
Site: http://www.gushi.org
---------------------------

Florian Weimer

Jul 11, 2008, 4:39:53 PM

* Dan Mahoney:

> On Fri, 11 Jul 2008, Alan Clegg wrote:
>
>>>> This can all be avoided if everyone signs their zones.
>>>>
>>>> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf

> What is the screengrab and reference at the end?

"There's Something About Mary", apparently.

Florian Weimer

Jul 11, 2008, 5:13:16 PM

* Alan Clegg:

>>> This can all be avoided if everyone signs their zones.
>>>
>>> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
>>

>> I think part of our problem is that a presentation titled "DNSSEC in 6
>> minutes" consists of 77 slides. 8-)
>
> A previous posting of mine:
>
> As the author of the paper, the result is YOU being able to deploy a
> DNSSEC signed zone within 6 minutes. No, you can't learn to do it in 6
> minutes, but once you understand the process (and it's not really
> difficult), you can easily go from unsigned (no keys, etc) to fully
> signed within 6 minutes per zone (and that's doing it by hand!)

It's still far too involved when "auto-sign yes;" could theoretically do
it (plus some tool to extract the data to be submitted upstream). I
hope something like this is in the pipeline. Most people don't need
offline keys, I think.

Stacey Jonathan Marshall

Jul 18, 2008, 7:58:01 AM

Ouch... Would it perhaps help if named tried <0.0.0.0>.<port> first,
and then, if that didn't collide, bound to
<explicit-address>.<port>?

Regards,
Stace

> Named doesn't just call bind(0.0.0.0#0) as many systems
> don't do good random port selection. Lots of systems are
> sequential. Linux keeps handing out the same port as long
> as it is not in use then sequentially increments it.
>

> If you can, give named its own address.
>
> Explicitly binding the query source will help in some, but
> not all cases. If you are running named on a NAT I would
> bind to the internal address and have all the queries go
> through the NAT process. Note this depends on how NAT is
> implemented.
>
> Very few applications use UDP ports as fast as named now
> does and kernels really are not tuned to handle it.
>
> For what it is worth named has code to deal with responses
> to queries that are made on 0.0.0.0#53 but arrive on a
> socket listening for queries. The kernel does not have
> enough information to deliver the UDP message to the right
> socket.
>

> This can all be avoided if everyone signs their zones.
>
> http://www.isc.org/sw/bind/docs/DNSSEC_in_6_minutes.pdf
>

Stacey Jonathan Marshall

Jul 18, 2008, 9:24:41 AM

Mark,

On Solaris it appears that they do not collide. A colleague ran this
test for me:

{blu}: sock
usage: sock [ options ] <host> <port>
sock [ options ] -s [ <IPaddr> ] <port> (for server)
-s operate as server instead of client
-u use UDP instead of TCP
-A SO_REUSEADDR option

{blu}: sock -s -u 23456 & (bind *.23456, no REUSE)
[1] 26891
{blu}: netstat -an | grep 23456
*.23456 Idle
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: ifconfig -a | grep 'inet 129'
inet 129.148.226.18 netmask ffffff00 broadcast 129.148.226.255
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}: kill %1
[1] Terminated sock -s -u 23456
{blu}: sock -s -u -A 23456 & (bind *.23456, REUSE)
[1] 32279
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}: kill %1
[1] Terminated sock -s -u -A 23456
{blu}: sock -s -u 129.148.226.18 23456 & (bind IP.23456, no REUSE)
[1] 33584
{blu}: sock -s -u 23456 (bind *.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u 129.148.226.18 23456 (bind IP.23456, no REUSE)
can't bind local address: Address already in use
{blu}: sock -s -u -A 23456 (bind *.23456, REUSE)
^C (succeeds)
{blu}: sock -s -u -A 129.148.226.18 23456 (bind IP.23456, REUSE)
^C (succeeds)
{blu}:

So without REUSE the second bind fails in every case, and with REUSE set
on the second socket it always succeeds, whether the address is the
wildcard or an explicit one.
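
For anyone who wants to repeat the matrix above on another platform without the sock tool, here is a rough Python equivalent. The names try_pair and bind_matrix are hypothetical, and 127.0.0.1 stands in for the explicit interface address as an assumption; results will differ between operating systems.

```python
import itertools
import socket


def try_pair(first, second) -> bool:
    """first and second are (address, reuse) tuples. Bind a UDP socket per
    `first`, then report whether a bind per `second` to the same port works."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        if first[1]:
            a.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        a.bind((first[0], 0))  # kernel picks a free port for the first socket
        port = a.getsockname()[1]
        if second[1]:
            b.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            b.bind((second[0], port))
            return True
        except OSError:
            return False
    finally:
        a.close()
        b.close()


def bind_matrix(specific: str = "127.0.0.1") -> dict:
    """Try every combination of wildcard/explicit address and REUSE on/off."""
    cases = list(itertools.product(("0.0.0.0", specific), (False, True)))
    return {(f, s): try_pair(f, s) for f in cases for s in cases}


if __name__ == "__main__":
    for (f, s), ok in bind_matrix().items():
        print(f"first={f} second={s} ->", "succeeds" if ok else "in use")
```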

Irwin Tillman

Aug 14, 2008, 11:24:59 AM

I've verified that in BIND 9.4.2-P2 on Solaris 10 the problem I
described has gone away.
I assume this is because of the change:

2396. [bug] Don't set SO_REUSEADDR for randomized ports.
[RT #18336]
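
The effect of that change can be sketched roughly as follows: pick a random port, bind without SO_REUSEADDR, and try another port if the kernel reports the first one as taken. This is an illustration only, not the code from the BIND change; the function name bind_random_port, the port range, and the retry count are assumptions.

```python
import errno
import secrets
import socket


def bind_random_port(addr: str = "0.0.0.0", low: int = 1024,
                     high: int = 65535, attempts: int = 20) -> socket.socket:
    """Return a UDP socket bound to a random port, never setting
    SO_REUSEADDR, so a port already held by another socket is skipped."""
    for _ in range(attempts):
        port = low + secrets.randbelow(high - low + 1)
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            s.bind((addr, port))  # fails if the port is already bound
            return s
        except OSError as exc:
            s.close()
            if exc.errno not in (errno.EADDRINUSE, errno.EACCES):
                raise
    raise OSError(errno.EADDRINUSE, "no free port found")


if __name__ == "__main__":
    sock = bind_random_port()
    print(sock.getsockname())
    sock.close()
```

With this approach a port such as 1812 that another server already holds is simply rejected by bind(), and the caller moves on to a different random port.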