Re: BIND 9.9.1-P4 is now available

Fr34k

Oct 25, 2012, 9:51:16 AM
to Bindlist
Hello,

We are finding several of our recursive BIND 9.9.1-P3 servers (on Solaris 10 OS) hung and I want to be able to qualify the symptoms in order to convince others that P4 (or 9.9.2?) will (or will not) address this.

Let me define what "hung" means in our experience: named is running but does not respond to queries. "rndc status" returns output, but that output shows that named is not processing any queries (see below). Other rndc commands appear to work as well (e.g., "rndc dumpdb").

From what I understand, P4 offers this known bug fix:

*  A deliberately constructed combination of records could cause named
  to hang while populating the additional section of a response.
  [RT #31090] -- CVE-2012-5166: Specially crafted DNS data can cause a lockup in named

Additional details are mentioned in https://kb.isc.org/article/AA-00801/74/CVE-2012-5166%3A-Specially-crafted-DNS-data-can-cause-a-lockup-in-named.html:  "A nameserver that has become locked-up due to the problem reported in this advisory will not respond to queries or control commands."

So, our hang issue matches the "...will not respond to queries" part; however, it does *not* seem to match the "...will not respond to... control commands" part, if the responses from "rndc" count as responses to control commands.

Thoughts?

Thank you.

$ rndc status
version: 9.9.1-P3 (version.bind/txt/ch disabled)
CPUs found: 2
worker threads: 2
UDP listeners per interface: 2
number of zones: 36
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/3900/4000
tcp clients: 0/100
server is up and running

$ time host www.google.com 127.0.0.1
;; connection timed out; no servers could be reached
 
real    0m10.035s
user    0m0.017s
sys     0m0.017s
$ time host localhost 127.0.0.1
;; connection timed out; no servers could be reached
 
real    0m10.034s
user    0m0.017s
sys     0m0.017s

$ truss -p 17657
/4:     lwp_park(0xFE9AFD48, 0)         (sleeping...)
/3:     lwp_park(0x00000000, 0)         (sleeping...)
/1:     sigtimedwait(0xFFBFFBE8, 0xFFBFFB68, 0x00000000) (sleeping...)
/2:     lwp_park(0x00000000, 0)         (sleeping...)
/5:     ioctl(8, DP_POLL, 0xFE98FF80)   (sleeping...)
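To compare these counters across several servers, the "rndc status" output above can be parsed mechanically. This is only a minimal sketch: the function names are made up for illustration, the sample is an excerpt of the output pasted above, and reading the three "recursive clients" numbers as current/soft limit/hard limit is an assumption to verify against the BIND documentation.

```python
# Excerpt of the `rndc status` output posted above.
SAMPLE = """\
version: 9.9.1-P3 (version.bind/txt/ch disabled)
CPUs found: 2
worker threads: 2
query logging is OFF
recursive clients: 0/3900/4000
tcp clients: 0/100
server is up and running
"""


def parse_rndc_status(text):
    """Collect the 'key: value' lines of `rndc status` output into a dict.

    Lines without a colon (e.g. 'server is up and running') are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def client_counts(fields, key):
    """Split a slash-separated counter such as '0/3900/4000' into ints."""
    return [int(part) for part in fields[key].split("/")]


status = parse_rndc_status(SAMPLE)
print(client_counts(status, "recursive clients"))  # [0, 3900, 4000]
print(client_counts(status, "tcp clients"))        # [0, 100]
```

A first value of 0 on a busy recursive server, combined with query timeouts, is the pattern described in this message.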

Fr34k

Oct 25, 2012, 12:40:23 PM
to Bindlist
Hello Again,

I could have made my question a bit more clear as I try to understand the details behind what P4 addresses.

Perhaps I am having an internal battle between logic vs. interpretation around "or".  Let me explain.

I'm wondering if a named process affected by CVE-2012-5166 has symptoms of both (1) "not respond to queries" and (2) "not respond to control commands" at the same time, all the time.  If that is the case, then P4 will not address my issue as I am only seeing (1) and so there may be another bug affecting BIND stability which I would like to report.
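The two readings of "or" can be written out explicitly. This is just an illustration of the question, not a statement of what the advisory means; the function name and the "both"/"either" labels are invented here.

```python
def matches_advisory(answers_queries, answers_control, reading):
    """Check observed symptoms against one reading of the advisory's "or".

    reading="both":   a lockup means neither queries nor control commands work.
    reading="either": a lockup means at least one of the two stops working.
    """
    if reading == "both":
        return not answers_queries and not answers_control
    if reading == "either":
        return not answers_queries or not answers_control
    raise ValueError(f"unknown reading: {reading!r}")


# Symptoms reported in this thread: queries time out, rndc still answers.
observed = dict(answers_queries=False, answers_control=True)

strict = matches_advisory(reading="both", **observed)
loose = matches_advisory(reading="either", **observed)
print(strict, loose)  # False True
```

Under the "both" reading our symptoms do not match CVE-2012-5166; under the "either" reading they do.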

Thank you.


From: Fr34k <freak...@yahoo.com>
To: Bindlist <bind-...@isc.org>
Sent: Thursday, October 25, 2012 9:51 AM
Subject: Re: BIND 9.9.1-P4 is now available
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-...@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users

Jeremy C. Reed

Oct 25, 2012, 3:29:19 PM
to Fr34k, Bindlist
> Let me define what "hung" means in our experience:  We find that named is
> running but will not respond to queries, "rndc status" will respond with
> output but that output shows that named is not processing any queries (see
> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").

Does it work if you restart named?

If not, can you confirm it is listening on your intended interfaces
(including 127.0.0.1) even if not working?

> $ time host www.google.com 127.0.0.1
> ;; connection timed out; no servers could be reached

Can you confirm that you can query for that without? (Such as dig
@216.239.34.10 www.google.com or dig @8.8.8.8 www.google.com)

> $ time host localhost 127.0.0.1
> ;; connection timed out; no servers could be reached

Do you have a localhost zone defined? (Sometimes the messages from host
like the one above are misleading and even the named may be working
correctly but it is slow.)
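To separate "named is slow" from "named never answers", a small probe with an explicit timeout can help. What follows is a minimal sketch using only the Python standard library; the function names, the fixed query ID, and the default timeout are assumptions made for illustration, and it checks only that some UDP reply with a matching ID arrives, not that the answer is correct.

```python
import socket
import struct


def build_query(name, query_id=0x1234, qtype=1):
    """Build a minimal DNS query packet (RD bit set, class IN, type A by default)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def server_answers(server, name, port=53, timeout=2.0):
    """Return True if a UDP reply with the matching query ID arrives in time."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable/refused errors
            return False
    return len(reply) >= 12 and reply[:2] == query[:2]


# Example use (performs network I/O, so it is left commented out):
# for server in ("127.0.0.1", "8.8.8.8"):
#     print(server, server_answers(server, "www.google.com", timeout=10.0))
```

If the external server answers and 127.0.0.1 does not, even with a generous timeout, the problem is in the local named rather than in the network path.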

Jeremy C. Reed
ISC

Fr34k

Oct 25, 2012, 4:57:46 PM
to Jeremy C. Reed, Bindlist
Hello Jeremy,

Thank you for your reply.


>> Let me define what "hung" means in our experience:  We find that named is
>> running but will not respond to queries, "rndc status" will respond with
>> output but that output shows that named is not processing any queries (see
>> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").
>
>Does it work if you restart named?

Yes.  That is, everything is up and running again after we restart named.

9.9.1-P3 has been running on several servers since 10/3 without any known issues... until today.


>
>If not, can you confirm it is listening on your intended interfaces
>(including 127.0.0.1) even if not working?
>
>> $ time host www.google.com 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
>Can you confirm that you can query for that without? (Such as  dig
>@216.239.34.10 www.google.com  or dig @8.8.8.8 www.google.com)
>
>> $ time host localhost 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
>Do you have a localhost zone defined? (Sometimes the messages from host
>like the one above are misleading and even the named may be working
>correctly but it is slow.)

Yes, we do have a localhost zone defined.
However, queries for 3rd party hostnames (e.g., www.google.com) were failing as well.


>
>  Jeremy C. Reed
>  ISC
>
>

Fr34k

Oct 26, 2012, 12:51:40 PM
to Jeremy C. Reed, Bindlist
Hello Jeremy,

Thank you for your reply.

I plan to send more information to ISC when I have it - FYI

Looks like my response didn't make it out yesterday, so here is another attempt.
Please see my responses within below:


----- Original Message -----
> From: Jeremy C. Reed <jr...@isc.org>
> To: Fr34k <freak...@yahoo.com>
> Cc: Bindlist <bind-...@isc.org>
> Sent: Thursday, October 25, 2012 3:29 PM
> Subject: Re: BIND 9.9.1-P4 is now available
>

>> Let me define what "hung" means in our experience:  We find that named is
>> running but will not respond to queries, "rndc status" will respond with
>> output but that output shows that named is not processing any queries (see
>> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").
>
> Does it work if you restart named?

Yes.  That is, when we restart named/9.9.1-P3 it works as well as it did since it was installed 10/3/2012

>
> If not, can you confirm it is listening on your intended interfaces
> (including 127.0.0.1) even if not working?
>
>> $ time host www.google.com 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Can you confirm that you can query for that without? (Such as  dig
> @216.239.34.10 www.google.com  or dig @8.8.8.8 www.google.com)
>

Yes; I just didn't provide any of those examples (sorry).
That is, any query to the hung server (localhost or 3rd party hostnames) results in the same outcome: "connection timed out; no servers could be reached".

>> $ time host localhost 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Do you have a localhost zone defined? (Sometimes the messages from host
> like the one above are misleading and even the named may be working
> correctly but it is slow.)

While we do have a localhost zone defined, all of our spot checks, for both local and off-network names, would fail.
Once we restart 9.9.1-P3, everything works again.

>
>   Jeremy C. Reed
>   ISC
>
