DNSSEC Bogus NXDOMAIN survives authenticating RR

Niobos

Dec 7, 2009, 6:53:26 AM

to bind-...@lists.isc.org

Hi all,

I'm having some problems implementing DNSSEC with NSEC3. I'm fairly new to DNSSEC, so it is certainly possible that a gap in my understanding is causing me to miss something. Also, I'm not entirely sure this is the correct mailing list; pointers to a more appropriate one are welcome.

The setup consists of two BIND name servers, both version 9.6.1-P1 on Linux (Ubuntu 9.10 and Gentoo). One is configured as the authoritative name server for a (test) zone; the other is configured as an authenticating recursive resolver (RR).

I created a zone with the following entries (besides the standard SOA and NS):
* normal A 127.0.0.1
* changed A 127.0.0.1
* removed A 127.0.0.1
I also have two DNSKEY records (one KSK and one ZSK).

After signing this zone with the keys, I intentionally modify the signed zone file to simulate a MITM attack:
* I change the "changed" A record to point to 127.0.0.2
* I remove the "removed" A record, along with its RRSIG
I would expect DNSSEC to catch these changes and reject the bogus responses.

When requesting a lookup of "normal", I get a NOERROR and the AuthenticatedData flag is set, along with the requested data.
When requesting a lookup of "changed", I get a SERVFAIL. I'm not sure if this is the expected behaviour, but it seems logical.
When requesting a lookup of "removed", I get a SERVFAIL as well. However, every subsequent request for "removed" gets an NXDOMAIN. (dig outputs below)
Flushing the caches on the RR with "rndc flush" causes the first request to be a SERVFAIL again.
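
For anyone repeating this test, the distinction that matters in each dig run is the response status together with the AD flag. A minimal sketch of a helper that pulls both out of dig's header lines follows; the function names are made up for illustration, and the sample header is copied from a dig run quoted later in this thread.

```python
import re


def parse_dig_header(text):
    """Return (status, flags) from dig's ';; ->>HEADER<<-' and ';; flags:' lines."""
    status = re.search(r"status:\s*([A-Z]+)", text)
    flags = re.search(r"flags:\s*([a-z ]*);", text)
    return (
        status.group(1) if status else None,
        set(flags.group(1).split()) if flags else set(),
    )


def classify(text):
    """Summarise what a validating resolver told the client."""
    status, flags = parse_dig_header(text)
    if status == "SERVFAIL":
        return "SERVFAIL, validation failed or upstream error"
    if "ad" in flags:
        return f"{status}, marked as authenticated"
    return f"{status}, not authenticated"


# Header lines taken from a dig run quoted later in this thread.
sample = """\
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 46781
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1
"""

print(classify(sample))
```

An NXDOMAIN that is "marked as authenticated" for a name whose denial cannot be proven is exactly the symptom described above.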

When I look at the debug output of the RR for channel dnssec, I see no additional entries after the initial request. The log is attached (sorry for the wrong MIME type; if anyone knows how to convince Mail.app to do this decently, let me know).

dnssec.log

Hauke Lampe

Dec 8, 2009, 9:18:49 AM

to Niobos, bind-...@lists.isc.org

Niobos wrote:

> When requesting a lookup of "removed", I get a SERVFAIL as well. However, every subsequent request for "removed" gets an NXDOMAIN. (dig outputs below)
> Flushing the caches on the RR with "rndc flush" causes the first request to be a SERVFAIL again.

I cannot reproduce this behaviour with BIND 9.7.0b3. I get a SERVFAIL
for all lookups to changed/removed records.

Maybe you can try these with 9.6.1-P1:

dig +dnssec normal.fnord.dnstest.hauke-lampe.de
should return 127.0.0.1 and the AD flag (if you use DLV with either
dlv.isc.org or dnssec.iks-jena.de).

dig +dnssec changed.fnord.dnstest.hauke-lampe.de
should return SERVFAIL and log "error (no valid RRSIG)" for the A record.

dig +dnssec removed.fnord.dnstest.hauke-lampe.de
should return SERVFAIL and log validation failures for the SOA as well
as the A record (because removing the record disrupted the NSEC3 chain).

Hauke.

Niobos

Dec 8, 2009, 9:52:54 AM

to bind-...@lists.isc.org, Hauke Lampe

On 08 Dec 2009, at 15:18, Hauke Lampe wrote:

> Niobos wrote:
>
>> When requesting a lookup of "removed", I get a SERVFAIL as well. However, every subsequent request for "removed" gets an NXDOMAIN. (dig outputs below)
>> Flushing the caches on the RR with "rndc flush" causes the first request to be a SERVFAIL again.
>
> I cannot reproduce this behaviour with BIND 9.7.0b3. I get a SERVFAIL
> for all lookups to changed/removed records.
>
> Maybe you can try these with 9.6.1-P1:
>
> dig +dnssec normal.fnord.dnstest.hauke-lampe.de
> should return 127.0.0.1 and the AD flag (if you use DLV with either
> dlv.isc.org or dnssec.iks-jena.de).

Correct

> dig +dnssec changed.fnord.dnstest.hauke-lampe.de
> should return SERVFAIL and log "error (no valid RRSIG)" for the A record.

Correct (I didn't check the log, but the end result is correct)

> dig +dnssec removed.fnord.dnstest.hauke-lampe.de
> should return SERVFAIL and log validation failures for the SOA as well
> as the A record (because removing the record disrupted the NSEC3 chain).

Correct (didn't check the log), and it keeps SERVFAIL-ing on subsequent tries as well.

While trying this, I noticed something that might give a hint as to where the problem lies:

As soon as I activate DLV (besides the manual SEP I entered), the "removed" behaviour changes:

* First lookup still returns SERVFAIL

* Subsequent lookups now return NXDOMAIN with the AD flag *set*! (log confirms that my domain is not in the DLV and hence is insecure)

Could you try this lookup?

dig +dnssec removed.dnssec.dest-unreach.be

My keys are not (yet) in any DLV database, so you'll just have to assume my DNSKEYs are correct.

Could the problem be that the authenticating RR somehow considers this domain to be insecure when looking up "removed"?

Thanks,

Niobos

Hauke Lampe

Dec 8, 2009, 2:25:29 PM

to Niobos, bind-...@lists.isc.org

Niobos wrote:

> As soon as I activate DLV (besides the manual SEP I entered), the "removed" behaviour changes:
> * First lookup still returns SERVFAIL
> * Subsequent lookups now return NXDOMAIN with the AD flag *set*! (log confirms that my domain is not in the DLV and hence is insecure)

That is weird. I haven't seen that before and have no good explanation
at hand.

> Could you try this lookup?
> dig +dnssec removed.dnssec.dest-unreach.be

I see now what you mean.

Even though I have added your DNSKEY as trusted key, I get SERVFAIL on
the first query and NXDOMAIN on the second, without BIND doing any
additional outgoing queries.

One of your name servers returns unsigned NXDOMAIN responses with a
higher serial number than the master server:

| $ dig +dnssec removed.dnssec.dest-unreach.be @sdns1.ovh.net.
|
| ;; Got answer:
| ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32510
| ;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
| ;; WARNING: recursion requested but not available
|
| ;; OPT PSEUDOSECTION:
| ; EDNS: version: 0, flags: do; udp: 4096
| ;; QUESTION SECTION:
| ;removed.dnssec.dest-unreach.be. IN A
|
| ;; AUTHORITY SECTION:
| dest-unreach.be. 3600 IN SOA serv02.imset.org. hostmaster.dest-unreach.be. 2009111619 3600 3600 604800 3600

serv02.imset.org returns a signed NXDOMAIN response with serial 2009081781.

That corresponds to BIND's error message:

| error (insecurity proof failed) resolving 'removed.dnssec.dest-unreach.be/A/IN': 213.251.188.140#53
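
As an aside on the serial numbers: SOA serials are compared with the wrap-around arithmetic of RFC 1982 and not as plain integers, although for the two values above both methods agree. A minimal sketch of that comparison, with a function name chosen here for illustration:

```python
def serial_gt(s1, s2, bits=32):
    """True if serial s1 is 'greater than' s2 under RFC 1982 serial arithmetic."""
    half = 1 << (bits - 1)
    return (s1 < s2 and s2 - s1 > half) or (s1 > s2 and s1 - s2 < half)


# Serials seen in this thread: the sdns1.ovh.net answer vs. serv02.imset.org.
ovh_serial = 2009111619
master_serial = 2009081781

print(serial_gt(ovh_serial, master_serial))
```

A secondary with a serial that compares greater than the master's will not pick up the master's zone by normal zone transfer until the master's serial moves past it.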

> Could the problem be that the authenticating RR somehow considers this domain to be insecure when looking up "removed"?

That might well be the case, although I would expect BIND not to return
unsigned queries for names below a manually configured trust anchor.

Maybe others have an idea what's happening here and why BIND returns
NXDOMAIN responses.

Hauke.

Niobos

Dec 9, 2009, 3:40:38 AM

to bind-...@lists.isc.org, Hauke Lampe

>> Could you try this lookup?
>> dig +dnssec removed.dnssec.dest-unreach.be
>
> I see now what you mean.
>
> Even though I have added your DNSKEY as trusted key, I get SERVFAIL on
> the first query and NXDOMAIN on the second, without BIND doing any
> additional outgoing queries.

This is the same behavior I'm observing.

> One of your name servers returns unsigned NXDOMAIN responses with a
> higher serial number than the master server:

I didn't configure the zone by the book; I have corrected that now, but the results remain the same.

> serv02.imset.org returns a signed NXDOMAIN response with serial 2009081781.
>
> That corresponds to BIND's error message:
>
> | error (insecurity proof failed) resolving 'removed.dnssec.dest-unreach.be/A/IN': 213.251.188.140#53

The response is indeed signed, but the signature should *fail* validation, since there is no covering NSEC3 for the looked-up record.
Do I understand the error correctly like this: BIND failed to prove the domain to be insecure, hence, the NXDOMAIN response should have a correct signature, hence, the response it got is bogus?

>> Could the problem be that the authenticating RR somehow considers this domain to be insecure when looking up "removed"?
>
> That might well be the case, although I would expect BIND not to return
> unsigned queries for names below a manually configured trust anchor.

I removed DLV-validation and manually added your KSK DNSKEY as a SEP, without change in behavior: removed.fnord.dnstest.hauke-lampe.de keeps returning SERVFAIL (as it should).
It seems that my resolver is configured identically for both my domain and yours, so it's possibly some difference in the served zone that causes this behaviour.
What did you change for the "removed" record? Did you remove only the A and RRSIG? Or also the corresponding NSEC3?
My full (signed) zone file is attached. It's a test zone anyway, so I don't think this is a security issue.

dnssec.dest-unreach.be.zone.signed

Hauke Lampe

Dec 9, 2009, 6:59:52 PM

to Niobos, bind-...@lists.isc.org

[I finally gave up on trying to get Thunderbird *not* to wrap long
lines. Prefixing them with ">" seems to be the only way, even if confusing]

Niobos wrote:

>>> dig +dnssec removed.dnssec.dest-unreach.be

>> Even though I have added your DNSKEY as trusted key, I get SERVFAIL on
>> the first query and NXDOMAIN on the second, without BIND doing any
>> additional outgoing queries.
> This is the same behavior I'm observing.

I think I see it more clearly now.

The inner workings of the NSEC/3 mechanisms are a bit of a mystery to
me, so the following is mostly based on guesswork.

Maybe I broke my test zone in a different way and that's why we don't
see the same results. Your SOA record validates, mine doesn't:

> validating @0xb91c7968: fnord.dnstest.hauke-lampe.de SOA: no valid signature found

And there lies the problem.
The signatures on your SOA and NSEC3 records in the NXDOMAIN response
are all valid. It's their meaning, the proof of nonexistence for the
removed record, that cannot be established:

> validating @0xb4e01470: removed.dnssec.dest-unreach.be A: attempting negative response validation
> validating @0xb4e01ee0: dnssec.dest-unreach.be SOA: verify rdataset (keyid=33827): success
> validating @0xb8e98b60: 67152CME7SOELFT0OOTFB03FQ968LOM1.dnssec.dest-unreach.be NSEC3: verify rdataset (keyid=33827): success
> validating @0xb8e98b60: OKIU30OTQ4ETK8K4VP0L3MM20HUNI5R2.dnssec.dest-unreach.be NSEC3: verify rdataset (keyid=33827): success
> validating @0xb4e01470: removed.dnssec.dest-unreach.be A: NSEC3 proves name exists (owner) data=1
> validating @0xb4e01470: removed.dnssec.dest-unreach.be A: nonexistence proof(s) not found

BIND seems to cache the validation state of the signatures, not the
failed nonexistence proof. At least it doesn't re-validate cached answers:

> client 127.0.0.1#47401: UDP request
> client 127.0.0.1#47401: using view '_default'
> client 127.0.0.1#47401: request is not signed
> client 127.0.0.1#47401: recursion available
> client 127.0.0.1#47401: query
> client 127.0.0.1#47401: query (cache) 'removed.dnssec.dest-unreach.be/A/IN' approved
> client 127.0.0.1#47401: send
> client 127.0.0.1#47401: sendto
> client 127.0.0.1#47401: senddone
> client 127.0.0.1#47401: next
> client 127.0.0.1#47401: endrequest

So, while the first query returns SERVFAIL as expected, subsequent
responses from the cache even have the AD flag set. This is the one
thing that *really* puzzled me (otherwise I probably wouldn't have begun
looking at long debug logs ;)

> hauke@pope:~$ dig +dnssec removed.dnssec.dest-unreach.be
[...]
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 46781
> ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1
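
To make the suspected mechanism concrete, here is a toy model of the symptom. It is guesswork in code form and not BIND's actual logic; every class and field name is invented. The "buggy" variant remembers only whether the signatures verified, while the other variant remembers the outcome of the whole validation, including the nonexistence proof.

```python
from dataclasses import dataclass


@dataclass
class NegativeAnswer:
    signatures_valid: bool  # RRSIGs on SOA/NSEC3 verify
    proof_valid: bool       # NSEC3 records actually prove nonexistence


class ToyResolver:
    """Hypothetical cache model; 'buggy' caches the signature check only."""

    def __init__(self, buggy):
        self.buggy = buggy
        self.cache = {}

    def resolve(self, qname, fetch):
        if qname in self.cache:
            secure = self.cache[qname]  # served from cache, no re-validation
        else:
            answer = fetch(qname)
            secure = answer.signatures_valid and answer.proof_valid
            self.cache[qname] = answer.signatures_valid if self.buggy else secure
        return "NXDOMAIN ad" if secure else "SERVFAIL"


def fetch(qname):
    # Mirrors the case in this thread: valid signatures, broken proof.
    return NegativeAnswer(signatures_valid=True, proof_valid=False)


for buggy in (True, False):
    resolver = ToyResolver(buggy)
    print(buggy, [resolver.resolve("removed.example", fetch) for _ in range(2)])
```

The buggy variant answers SERVFAIL first and then an authenticated NXDOMAIN from the cache, which matches what both of us observe; the other variant keeps answering SERVFAIL.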

The response doesn't validate:

> hauke@pope:~$ dig +sigchase +trusted-key=./dnskey-dnssec.dest-unreach.be +dnssec removed.dnssec.dest-unreach.be
[...]
> ;; Impossible to verify the Non-existence, the NSEC RRset can't be validated: FAILED

I think this is a bug in BIND's resolver part. You should forward a bug
report to bind9...@isc.org.

Unbound returns SERVFAIL to all queries for
removed.dnssec.dest-unreach.be and keeps logging the failed NSEC3 test:

> unbound: [968:0] debug: Validating a nxdomain response
> unbound: [968:0] debug: nsec3: keysize 1024 bits, max iterations 150
> unbound: [968:0] info: start nsec3 nameerror proof, zone <dnssec.dest-unreach.be. TYPE0 CLASS0>
> unbound: [968:0] info: ce candidate <removed.dnssec.dest-unreach.be. TYPE0 CLASS0>
> unbound: [968:0] debug: nsec3 proveClosestEncloser: proved that qname existed, bad
> unbound: [968:0] debug: nsec3 nameerror proof: failed to prove a closest encloser
> unbound: [968:0] debug: NameError response failed nsec, nsec3 proof was sec_status_bogus
> unbound: [968:0] info: validate(nxdomain): sec_status_bogus
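
Both validators reject the answer for the same reason: an NSEC3 record whose owner hash equals the hash of the query name proves that the name exists, which contradicts NXDOMAIN. A simplified sketch of that check follows. It only looks at the query name itself; a full RFC 5155 proof also involves the closest encloser and the wildcard. The two owner hashes are taken from the BIND log above, but the "next" values and the assumption about which one belongs to "removed" are invented for the example.

```python
def covers(qhash, owner, nxt):
    """True if qhash falls strictly between owner and next in hash order."""
    if owner < nxt:
        return owner < qhash < nxt
    # The last record in the chain wraps around to the first.
    return qhash > owner or qhash < nxt


def check_nxdomain(qhash, chain):
    """Classify an NXDOMAIN answer given (owner_hash, next_hash) pairs."""
    if any(qhash == owner for owner, _ in chain):
        return "bogus: NSEC3 proves name exists"
    if any(covers(qhash, owner, nxt) for owner, nxt in chain):
        return "qname covered"
    return "bogus: no covering NSEC3"


# Owner hashes from the log; treating them as a closed two-record chain is an
# assumption made only to keep the example small.
A = "67152CME7SOELFT0OOTFB03FQ968LOM1"
B = "OKIU30OTQ4ETK8K4VP0L3MM20HUNI5R2"
chain = [(A, B), (B, A)]

print(check_nxdomain(B, chain))         # hash equals an owner name
print(check_nxdomain("A" * 32, chain))  # hypothetical hash between A and B
```

Uppercase base32hex strings sort in the same order as the hashes they encode, which is why plain string comparison is enough here.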

> Do I understand the error correctly like this: BIND failed to prove
> the domain to be insecure, hence, the NXDOMAIN response should have a
> correct signature, hence, the response it got is bogus?

Yes, domains below a trust anchor (configured manually or through DLV)
must either be signed or proven to be insecure at the delegation point.

> What did you change for the "removed" record? Did you remove only the
> A and RRSIG? Or also the corresponding NSEC3?

I removed A and RRSIG only.

Here's what I did, using 9.7 defaults and the smart-signing feature:

dnssec-keygen -r /dev/urandom -3 -f ksk $zone;
dnssec-keygen -r /dev/urandom -3 $zone;
dnssec-signzone -x -S -3 - -o $zone db.test

(/dev/urandom because it's faster and this was only a test zone)

Then I edited db.test.signed, changed the "changed" record and removed
"removed" and its RRSIG.

Why we see different kinds of failures, I don't know. It's probably got
to do with some of the signey-wimey DNSSEC voodoo stuff I hope I never
have to understand in all its details.

Hauke.

Niobos

Dec 10, 2009, 2:49:37 AM

to Hauke Lampe, bind-...@lists.isc.org

Thank you very much for your help; I'll forward the conversation to the bug-tracking list.

Since these are my first DNSSEC experiments, I just wanted to make sure that it wasn't a problem with my understanding of the concept.

Niobos

Jan 25, 2010, 1:12:58 PM

to bind-...@lists.isc.org, Hauke Lampe

On 2009-12-10 08:49, Niobos wrote:

> Thank you very much for your help; I'll forward the conversation to the bug-tracking list.
>
> Since these are my first DNSSEC experiments, I just wanted to make sure that it wasn't a problem with my understanding of the concept.
>
> Niobos

ISC confirmed this as a security bug a while back. Because of the potential for exploitation, they asked me not to release this information until the fix was released.

BIND 9.6.1-P3 now contains the fix:

827. [security] Bogus NXDOMAIN could be cached as if valid. [RT #20712]

I can confirm that this version behaves as expected: it keeps returning SERVFAIL for a bogus NXDOMAIN response.

Niobos