[cabfpub] CAA look up failures and retry logic

Doug Beattie

Oct 3, 2017, 12:01:43 PM

to CA/Browser Forum Public Discussion List

The BR requirement for retrying failed lookups is ambiguous. We’d like some clarification, and eventually a ballot to fix the wording in the BRs.

The BRs say this:

CAs are permitted to treat a record lookup failure as permission to issue if:

- the failure is outside the CA's infrastructure;

- the lookup has been retried at least once; and

- the domain's zone does not have a DNSSEC validation chain to the ICANN root.
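As a rough sketch only, the three conditions can be read as a simple conjunction. The class and field names below are made up for illustration and do not come from the BRs, and the sketch deliberately leaves open what "the lookup" refers to, which is the question raised in this thread.

```python
from dataclasses import dataclass


@dataclass
class LookupFailure:
    # All field names are hypothetical, chosen for this sketch.
    outside_ca_infrastructure: bool  # failure did not originate in the CA's own systems
    retries: int                     # how many times the failed lookup was retried
    dnssec_chain_to_root: bool       # zone has a DNSSEC validation chain to the ICANN root


def failure_permits_issuance(failure: LookupFailure) -> bool:
    """Return True only if all three quoted conditions hold."""
    return (
        failure.outside_ca_infrastructure
        and failure.retries >= 1
        and not failure.dnssec_chain_to_root
    )
```

Note that a single missing retry, or a DNSSEC chain to the root, is enough to make the function return False.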

RFC 6844 Errata 5065 says this:

- If CAA(X) is not empty, R(X) = CAA(X), otherwise

- If A(X) is not null, and CAA(A(X)) is not empty, then R(X) = CAA(A(X)), otherwise

- If X is not a top-level domain, then R(X) = R(P(X)), otherwise

- R(X) is empty.
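For readers who prefer code, the four rules above can be sketched as a recursive function. This is only an illustration: `caa` and `alias` are hypothetical callables standing in for real DNS queries (CAA(X) and the alias target A(X)), and no failure handling is shown, since that is exactly what the BRs leave unclear.

```python
from typing import Callable, List, Optional


def parent(name: str) -> Optional[str]:
    """P(X): the name one label up, or None if X is a top-level domain."""
    labels = name.rstrip(".").split(".")
    return ".".join(labels[1:]) if len(labels) > 1 else None


def relevant_caa(
    name: str,
    caa: Callable[[str], List[str]],
    alias: Callable[[str], Optional[str]],
) -> List[str]:
    """R(X), following the four rules quoted above in order."""
    records = caa(name)                      # rule 1: CAA(X)
    if records:
        return records
    target = alias(name)                     # rule 2: CAA(A(X))
    if target is not None:
        records = caa(target)
        if records:
            return records
    up = parent(name)                        # rule 3: R(P(X))
    if up is not None:
        return relevant_caa(up, caa, alias)
    return []                                # rule 4: R(X) is empty


# Toy data: only example.com publishes a CAA record, and nothing is aliased.
zone = {"example.com": ['0 issue "ca.example.net"']}
found = relevant_caa("shop.example.com", lambda n: zone.get(n, []), lambda n: None)
print(found)
```

In the toy data the lookup for shop.example.com finds nothing, so the function climbs to example.com and returns the record published there.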

The BRs say if a lookup has been retried at least once that is permission to issue. Does this mean doing

- a full CAA lookup, or

- re-doing one failed CAA(X) look-up, or

- redoing every CAA(X) lookup that failed in the course of doing a full CAA validation?

If we follow the RFC processing logic and we encounter one failed lookup (e.g., SERVFAIL on shop.example.com), then we retry and it fails again, do we exit the CAA checking and issue because the BRs say we may issue if we retry the lookup, which we just did? Reading the specs this seems to be permitted (we did “a” retry for a failed lookup), but common sense says no.

Another interpretation is that we do the full RFC CAA validation series of “look ups”, and if it fails anywhere along the lines, we do another full CAA validation set of “look ups”, and if that fails we issue. Probably not realistic.

The most likely interpretation is that we retry each failed CAA(X) lookup, then proceed with the RFC processing logic to completion. In this model, even if one or more specific DNS lookups fail (and their retries fail too), the CA has permission to issue. In fact, every DNS lookup could fail and that would be permission to issue as well (assuming DNSSEC didn’t block it).

Can we agree that the BR statement “lookup has been retried at least once” means retrying each CAA(X) lookup that failed while performing the CAA validation algorithm specified in RFC 6844 Errata 5065?

Lookup failure here means a timeout (with an arbitrarily short timeout period, since none is specified), SERVFAIL, REFUSED, or NXDOMAIN (and maybe more DNS RCODEs, but these are the obvious ones).
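A minimal sketch of the per-lookup retry reading proposed above, assuming a hypothetical `query` callable that returns an outcome string and a list of records. The set of failure outcomes mirrors the list in this message; it is not taken from the BRs or the RFC, and neither is the single-retry default.

```python
from typing import Callable, List, Tuple

# Outcomes counted as failures in this sketch (an assumption, per the message above).
FAILURE_OUTCOMES = {"TIMEOUT", "SERVFAIL", "REFUSED", "NXDOMAIN"}


def caa_with_retry(
    name: str,
    query: Callable[[str], Tuple[str, List[str]]],
    retries: int = 1,
) -> Tuple[List[str], bool]:
    """Query CAA(name), retrying a failed lookup up to `retries` times.

    Returns (records, failed). `failed` is True only if every attempt failed,
    in which case the records list is empty.
    """
    for _ in range(1 + retries):
        outcome, records = query(name)
        if outcome not in FAILURE_OUTCOMES:
            return records, False
    return [], True


# Toy resolver: fails once with SERVFAIL, then answers.
attempts = {"count": 0}

def flaky_query(name: str) -> Tuple[str, List[str]]:
    attempts["count"] += 1
    if attempts["count"] == 1:
        return "SERVFAIL", []
    return "NOERROR", ['0 issue "ca.example.net"']

records, failed = caa_with_retry("shop.example.com", flaky_query)
```

The caller would then feed `records` into the RFC processing logic for that step, using the `failed` flag to decide how a persistent failure is treated.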

Geoff Keating

Oct 3, 2017, 10:06:44 PM

to Doug Beattie, CA/Browser Forum Public Discussion List

On Oct 4, 2017, at 12:01 AM, Doug Beattie via Public <pub...@cabforum.org> wrote:

The BRs say if a lookup has been retried at least once that is permission to issue. Does this mean doing
- a full CAA lookup, or
- re-doing one failed CAA(X) look-up, or
- redoing every CAA(X) lookup that failed in the course of doing a full CAA validation?

If we follow the RFC processing logic and we encounter one failed lookup (e.g., SERVFAIL on shop.example.com), then we retry and it fails again, do we exit the CAA checking and issue because the BRs say we may issue if we retry the lookup, which we just did? Reading the specs this seems to be permitted (we did “a” retry for a failed lookup), but common sense says no.

That’s an interesting point. We could treat a (second) failure as meaning:

- Assume there is no CAA record here, continue with the algorithm, and maybe find a lower CAA record which denies issuance

- Assume there is a CAA record here which specifically allows issuance.

I believe the current wording is the second, not the first. I think considering we’re just getting started with mandatory CAA, it’s OK to have this rule at the moment. Switching to the first rule might be a way to tighten things once we’ve gotten some experience.
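To make the difference between the two readings concrete, here is a small sketch. Everything in it is hypothetical: `answers` stands in for DNS results that have already been retried, with `None` marking a lookup that still failed, and alias handling is left out to keep it short.

```python
from typing import Dict, List, Optional


def parent(name: str) -> Optional[str]:
    """The name one label up, or None at a top-level domain."""
    labels = name.rstrip(".").split(".")
    return ".".join(labels[1:]) if len(labels) > 1 else None


def find_caa(
    name: str,
    answers: Dict[str, Optional[List[str]]],
    failure_stops_search: bool,
) -> List[str]:
    """Climb from `name` toward the root and return the first CAA set found.

    An empty result means nothing was found that restricts issuance.
    """
    current: Optional[str] = name
    while current is not None:
        result = answers.get(current, [])
        if result is None:                 # lookup failed, retry failed too
            if failure_stops_search:
                return []                  # second reading: failure permits issuance
            result = []                    # first reading: treat as empty, keep climbing
        if result:
            return result
        current = parent(current)
    return []


answers = {
    "shop.example.com": None,                         # persistent SERVFAIL
    "example.com": ['0 issue "other-ca.example"'],    # restrictive record at the parent
}
first_reading = find_caa("shop.example.com", answers, failure_stops_search=False)
second_reading = find_caa("shop.example.com", answers, failure_stops_search=True)
```

Under the first reading the record at example.com is found and would be evaluated; under the second reading the search ends at the failed lookup and that record is never seen.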

Doug Beattie

Oct 4, 2017, 5:52:44 AM

to geo...@apple.com, CA/Browser Forum Public Discussion List

We run into a lot of failures when looking up the full host name for CAA records, but then we find CAA records at the Top Level Domain (where most domain administrators put them). If I’m understanding your comments, I’m not sure your second option is good in these cases because the failure would permit issuance without looking harder for a CAA record, but this is a good discussion point. Anyone else have a comment?

Jacob Hoffman-Andrews

Oct 4, 2017, 4:16:31 PM

to Doug Beattie, CA/Browser Forum Public Discussion List

You make a good point. To reiterate the language from the BRs:

> CAs are permitted to treat a record lookup failure as permission to issue if:

> • the failure is outside the CA's infrastructure;

> • the lookup has been retried at least once; and
> • the domain's zone does not have a DNSSEC validation chain to the ICANN root.

Specifically, this talks about a single record lookup failure, but allows treating that as permission to issue. I think the behavior we'd really like here is to treat a record lookup failure as equivalent to a successful, empty response if those conditions are met. That way, for instance if a CAA lookup for "nonexistent.example.com" returns NXDOMAIN, the CA is still required to attempt looking up a CAA record for "example.com".

So I agree that your "most likely" option is the ideal, and is what CAs should be implementing to be conservative, but the BRs do not currently say that. I would support a ballot to amend it.

Doug Beattie

Oct 4, 2017, 4:28:41 PM

to Jacob Hoffman-Andrews, CA/Browser Forum Public Discussion List

From: Jacob Hoffman-Andrews [mailto:js...@letsencrypt.org]
Sent: Wednesday, October 4, 2017 4:17 PM
To: Doug Beattie <doug.b...@globalsign.com>; CA/Browser Forum Public Discussion List <pub...@cabforum.org>
Cc: geo...@apple.com
Subject: Re: [cabfpub] CAA look up failures and retry logic

You make a good point. To reiterate the language from the BRs:

CAs might be motivated to run into fewer failures and to issue more certificates, so being conservative might not be what’s happening. I know we’re blocking many more requests than some other CAs (mutual customers have informed us), which is driving this line of questioning for clarity.

- What constitutes a failure?

- How are failures retried and processed?

- What’s an acceptable timeout period?

None of this is called out in the BRs or RFC.
