Updated Revocation Best Practices

Wayne Thayer

Feb 12, 2019, 6:43:17 PM
to mozilla-dev-security-policy
Mozilla's guidance for incident response lives at
https://wiki.mozilla.org/CA/Responding_To_An_Incident

I just made some significant changes to the Revocation section that reflect
the approach we took with the recent underscore sunset.

Most notably, the following paragraph:

> However, it is not our intent to introduce additional problems by forcing
> the immediate revocation of certificates that are not BR-compliant when
> they do not pose an urgent security concern. Therefore, we request that
> your CA perform careful analysis of the situation. If there is
> justification to not revoke the problematic certificates, then your report
> will need to explain those reasons and provide a timeline for when the bulk
> of the certificates will expire or be revoked/replaced.
>

Has been replaced with:

> Mozilla recognizes that in some exceptional circumstances, revoking
> misissued certificates within the prescribed deadline may cause significant
> harm, such as when the certificate is used in critical infrastructure and
> cannot be safely replaced prior to the revocation deadline. However,
> Mozilla does not grant exceptions to the BR revocation requirements. It is
> our position that your CA is ultimately responsible for deciding if the
> harm caused by following the requirements of BR section 4.9.1.1 outweighs
> the risks created by choosing not to meet this requirement.
>

Additions have also been made to our expectations when a CA doesn't revoke
on time, along with a number of minor updates.

You can view a comparison of all the changes at
https://wiki.mozilla.org/index.php?title=CA%2FResponding_To_An_Incident&type=revision&diff=1207675&oldid=1185707

I will greatly appreciate everyone's feedback on these changes.

- Wayne

Wayne Thayer

Mar 15, 2019, 7:48:20 PM
to mozilla-dev-security-policy
As I mentioned last week [1], the "serial number entropy" issue has
identified some improvements that could be made to Mozilla's guidance for
CAs on revocation when responding to an incident. These are relatively
minor clarifications and in no way do they represent a fundamental change
in our guidance. I have updated a portion of the Revocation section on the
wiki page [2] as follows:

> Mozilla recognizes that in some exceptional circumstances, revoking
> misissued certificates within the prescribed deadline may cause significant
> harm, such as when the certificate is used in critical infrastructure and
> cannot be safely replaced prior to the revocation deadline, or when a
> defect affects a massive number of Subscribers and certificates. However,
> Mozilla does not grant exceptions to the BR revocation requirements. It is
> our position that your CA is ultimately responsible for deciding if the
> harm caused by following the requirements of BR section 4.9.1 outweighs the
> risks that are passed on to individuals who rely on the web PKI by choosing
> not to meet this requirement.
>
> If your CA will not be revoking the certificates within the time period
> required by the BRs, our expectations are that:
>
> - The decision and rationale for delaying revocation will be disclosed
> to Mozilla in the form of a preliminary incident report immediately;
> preferably before the BR mandated revocation deadline. The rationale must
> include an explanation for why the situation is exceptional. Responses
> similar to “we deem this misissuance not to be a security risk” are
> generally not acceptable, and must be discussed on the
> mozilla.dev.security.policy list. When revocation is delayed at the request
> of specific Subscribers, the rationale should be provided on a
> per-Subscriber basis.
> - Any decision to not comply with the timeline specified in the
> Baseline Requirements must also be accompanied by a clear timeline
> describing if and when the problematic certificates will be revoked and
> supported by the rationale to delay revocation.
> - The issue will need to be listed as a finding in your CA’s next BR
> audit statement.
> - Your CA will work with your auditor (and supervisory body, as
> appropriate) and the Root Store(s) that your CA participates in to ensure
> your analysis of the risk and plan of remediation is acceptable.
> - That you will perform an analysis to determine the factors that
> prevented timely revocation of the certificates, and include a set of
> remediation actions in the final incident report that aim to prevent future
> revocation delays.
>
> If your CA will not be revoking the problematic certificates as required
> by the BRs, then we recommend that you also contact the other root programs
> that your CA participates in to acknowledge this non-compliance and discuss
> what expectations their Root Programs have with respect to these
> certificates.
>
I will once again appreciate everyone's constructive feedback on these
changes.

- Wayne

[1]
https://groups.google.com/d/msg/mozilla.dev.security.policy/S2KNbJSJ-hs/HNDX5LaZCAAJ
[2] https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation

Ryan Sleevi

Mar 15, 2019, 9:14:52 PM
to Wayne Thayer, mozilla-dev-security-policy
While I realize the thinking is with regards to the recent serial number
issue, a few questions emerge:

1) Based on the software vendor reporting, they don’t view this as a
software defect, but a CA misconfiguration. Do you believe the current
policy, as worded, addresses that ambiguity?

2) We’ve seen CAs fail to do things like validate the well-formedness of
domain names or ensure consistent validation of their certificates. Given
the current (new) policy allows a CA to make a determination as to whether
a “massive” number of certificates / Subscribers are affected by a given
defect, and given that many CAs have historically viewed material and
substantial, dangerous non-compliance as “minor defects,” are you concerned
that this may place Mozilla directly in a position of requiring revocation
when CAs otherwise decline to?

3) With the rephrasing about acceptability to be “general” regarding the
severity of the issue, is there any concern that this may introduce
liability to Mozilla in assessing whether or not a given issue is a
security risk? It would seem that the previous intent is for the CA to
demonstrate their careful and thoughtful analysis as to the severity of
things, while this new policy would seem to permit CAs to make blanket
statements, without any expectations of them showing their analysis. While
it includes discussion on this forum, it’s unclear what acceptable
expectations there are.

4) This new policy seems to explicitly allow a CA to never revoke a
non-compliant Certificate. Is that your intent? If so, is there any concern
that this introduces the risk of CAs presenting revocation as being
“required by Mozilla” as opposed to the factually correct and accurate
“required by the Baseline Requirements” if Mozilla or this community should
disagree with such a decision?

5) If multiple CAs are affected by a common incident, this seems to
encourage delaying revocation as long as possible. It’s unclear whether a
CA that can and does revoke their certificates will be more or less
favorably considered, both by the ecosystem and by this community. Given
the economic incentives, it seems to strongly discourage revocation, as a
way of competitive differentiation.

In general, this seems to significantly weaken the assurances that Relying
Parties have as to whether or not CAs will follow the BRs, and to place
Mozilla specifically, and this Forum generally, into a role of determining
whether or not revocation is required and whether the timelines are
reasonable. Given that the vast majority (all?) of the non-compliance
incidents we’ve seen have been argued as defects (in ACLs, in policy, in
procedures), I do worry that this encourages CAs to not revoke, whether
it’s a major matter - such as malformed DNS - or a “minor” matter (if such
a thing exists).

This seems to create some of the wrong incentives, although I do understand
and appreciate where it is coming from, in that it seems to
actively discourage revocation, unless and until Mozilla explicitly
requires it. This is certainly a position Mozilla could take, but it does
seem to be significantly different than the past conversations.

I’m not yet sure how to best suggest clarifications, but I did want to
highlight how the relatively small changes seem to do more to significantly
alter policy, rather than to clarify existing policy.

Wayne Thayer

Mar 16, 2019, 12:49:20 PM
to Ryan Sleevi, mozilla-dev-security-policy
Ryan - Thank you for the feedback.

On Fri, Mar 15, 2019 at 6:14 PM Ryan Sleevi <ry...@sleevi.com> wrote:

> While I realize the thinking is with regards to the recent serial number
> issue, a few questions emerge:
>
> 1) Based on the software vendor reporting, they don’t view this as a
> software defect, but a CA misconfiguration. Do you believe the current
> policy, as worded, addresses that ambiguity?
>
>
As the language is an example, I don't believe it needs to address this
distinction. I intended "defect" to mean a defect in the certificate, so
perhaps it would help to specify that - i.e. "certificate defect"?

> 2) We’ve seen CAs fail to do things like validate the well-formedness of
> domain names or ensure consistent validation of their certificates. Given
> the current (new) policy allows a CA to make a determination as to whether
> a “massive” number of certificates / Subscribers are affected by a given
> defect, and given that many CAs have historically viewed material and
> substantial, dangerous non-compliance as “minor defects,” are you concerned
> that this may place Mozilla directly in a position of requiring revocation
> when CAs otherwise decline to?
>
>
Are you asking if I'm concerned that CAs will abuse this guidance to avoid
revocation of misissued certificates? If so, the answer is yes, both with
the current/proposed and former wording. I don't feel that this additional
example changes the situation.

> 3) With the rephrasing about acceptability to be “general” regarding the
> severity of the issue, is there any concern that this may introduce
> liability to Mozilla in assessing whether or not a given issue is a
> security risk? It would seem that the previous intent is for the CA to
> demonstrate their careful and thoughtful analysis as to the severity of
> things, while this new policy would seem to permit CAs to make blanket
> statements, without any expectations of them showing their analysis. While
> it includes discussion on this forum, it’s unclear what acceptable
> expectations there are.
>
>
I can see how the term "generally" could be abused to mean "except in
whatever current mess we find ourselves in", and on that basis I would
support taking it back out.

> 4) This new policy seems to explicitly allow a CA to never revoke a
> non-compliant Certificate. Is that your intent? If so, is there any concern
> that this introduces the risk of CAs presenting revocation as being
> “required by Mozilla” as opposed to the factually correct and accurate
> “required by the Baseline Requirements” if Mozilla or this community should
> disagree with such a decision?
>
>
Is there any difference between delaying revocation until a certificate
expires and not revoking at all? Is there any difference between CAs
misrepresenting revocation as "being required by Mozilla to happen by X
date" and "being required by Mozilla"?

> 5) If multiple CAs are affected by a common incident, this seems to
> encourage delaying revocation as long as possible. It’s unclear whether a
> CA that can and does revoke their certificates will be more or less
> favorably considered, both by the ecosystem and by this community. Given
> the economic incentives, it seems to strongly discourage revocation, as a
> way of competitive differentiation.
>
>
I'm not following how these changes have the effect of encouraging multiple
CAs to delay revocation as long as possible, but I do think it would be
useful to state that CAs who violate the BRs will always be looked upon
less favorably than those who do not.

> In general, this seems to significantly weaken the assurances that Relying
> Parties have as to whether or not CAs will follow the BRs, and to place
> Mozilla specifically, and this Forum generally, into a role of determining
> whether or not revocation is required and whether the timelines are
> reasonable. Given that the vast majority (all?) of the non-compliance
> incidents we’ve seen have been argued as defects (in ACLs, in policy, in
> procedures), I do worry that this encourages CAs to not revoke, whether
> it’s a major matter - such as malformed DNS - or a “minor” matter (if such
> a thing exists).
>
>
I agree with this. The additional language stating that this
forum must discuss any decision not to revoke based on lack of risk is
intended to strengthen the requirement by forbidding CAs from unilaterally
declaring that a particular issue is not a security risk, but the actual
effect could be that it encourages CAs to try to punt every revocation
decision to this forum. This language will need to change or be removed.

> This seems to create some of the wrong incentives, although I do understand
> and appreciate where it is coming from, in that it seems to
> actively discourage revocation, unless and until Mozilla explicitly
> requires it. This is certainly a position Mozilla could take, but it does
> seem to be significantly different than the past conversations.
>
> I’m not yet sure how to best suggest clarifications, but I did want to
> highlight how the relatively small changes seem to do more to significantly
> alter policy, rather than to clarify existing policy.
>

I welcome suggestions for how to improve the language in this guidance
with the aim of adding clarity without introducing more risk.

- Wayne

Ryan Sleevi

Mar 18, 2019, 3:00:59 PM
to Wayne Thayer, Ryan Sleevi, mozilla-dev-security-policy
On Sat, Mar 16, 2019 at 12:49 PM Wayne Thayer <wth...@mozilla.com> wrote:

> Ryan - Thank you for the feedback.
>
> On Fri, Mar 15, 2019 at 6:14 PM Ryan Sleevi <ry...@sleevi.com> wrote:
>
>> While I realize the thinking is with regards to the recent serial number
>> issue, a few questions emerge:
>>
>> 1) Based on the software vendor reporting, they don’t view this as a
>> software defect, but a CA misconfiguration. Do you believe the current
>> policy, as worded, addresses that ambiguity?
>>
>>
> As the language is an example, I don't believe it needs to address this
> distinction. I intended "defect" to mean a defect in the certificate, so
> perhaps it would help to specify that - i.e. "certificate defect"?
>

I guess the challenge is it introduces the ontology that some folks have
advocated, but no one actually knows where the lines should be drawn, as
every example has had flaws. That is, a "certificate defect" could be
everything from granting basicConstraints:CA=true (e.g. as we saw with
Turktrust [1]) due to a misconfigured certificate profile (which, like
this, was an "off by one" error) to something like misencoded sequences [2].
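
As a rough illustration of how an "off by one" in serial number entropy can
arise, here is a minimal Python sketch. It assumes the failure mode is a
64-bit random value whose top bit is cleared to keep the integer positive,
which leaves only 63 random bits against a requirement of at least 64 bits of
CSPRNG output. The thread itself does not spell out the mechanism, and both
function names are hypothetical.

```python
import secrets


def serial_cleared_sign_bit(num_bytes: int = 8) -> int:
    """Hypothetical flawed generator: random bytes with the top bit cleared.

    Clearing the top bit keeps the DER INTEGER positive, but it also fixes
    one of the 64 bits, so at most 63 bits come from the CSPRNG.
    """
    raw = bytearray(secrets.token_bytes(num_bytes))
    raw[0] &= 0x7F  # force the sign bit to zero
    return int.from_bytes(raw, "big")


def serial_full_entropy(entropy_bits: int = 64) -> int:
    """Hypothetical fix: keep all random bits and prepend a fixed leading 1.

    The value is always positive and non-zero, and every one of the
    entropy_bits low-order bits comes from the CSPRNG.
    """
    return (1 << entropy_bits) | secrets.randbits(entropy_bits)


if __name__ == "__main__":
    print(serial_cleared_sign_bit().bit_length())  # at most 63
    print(serial_full_entropy().bit_length())      # always 65
```

The second variant produces a 65-bit value, which encodes in 9 octets and so
stays well inside the 20-octet limit that RFC 5280 places on serial numbers.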

My biggest worry with the proposal is that it seems to actually favor not
revoking/responding to systemic issues (those which can affect a
significant portion of the CA's issued certificates), whereas I think the
intent is that non-revocation should be exceptional and that the CA should
be moving to systemically address things.

I think the end-goal, for both cases, remains the same: that the CA take
holistic steps to make revocation easier and painless, whether they're
dealing with systemic issues (such as serial numbers or validation methods)
or exceptional situations (such as a rogue RA or validation agent). Looking
at Heartbleed as the example, we know that a massive number of Subscribers
and certificates were affected - it seems like this example would have
encouraged non-revocation, by choosing the size of impact as the
illustrative example.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=825022
[2]
https://wiki.mozilla.org/SecurityEngineering/mozpkix-testing#Things_for_CAs_to_Fix


>> 4) This new policy seems to explicitly allow a CA to never revoke a
>> non-compliant Certificate. Is that your intent? If so, is there any concern
>> that this introduces the risk of CAs presenting revocation as being
>> “required by Mozilla” as opposed to the factually correct and accurate
>> “required by the Baseline Requirements” if Mozilla or this community should
>> disagree with such a decision?
>>
>>
> Is there any difference between delaying revocation until a certificate
> expires and not revoking at all? Is there any difference between CAs
> misrepresenting revocation as "being required by Mozilla to happen by X
> date" and "being required by Mozilla"?
>

Fair points. I think the previous policy encouraged a more concrete plan of
action ("when"), and did not leave the CA decision making capability ("if")
which could create a conflict between the CA's decisions and the community
expectations. That said, you make a good point - if their "when" is "when
the certificate expires", then it's implicitly an "if" as well, and that
remains unless/until "when" is more prescriptive.


>> 5) If multiple CAs are affected by a common incident, this seems to
>> encourage delaying revocation as long as possible. It’s unclear whether a
>> CA that can and does revoke their certificates will be more or less
>> favorably considered, both by the ecosystem and by this community. Given
>> the economic incentives, it seems to strongly discourage revocation, as a
>> way of competitive differentiation.
>>
>>
> I'm not following how these changes have the effect of encouraging
> multiple CAs to delay revocation as long as possible, but I do think it
> would be useful to state that CAs who violate the BRs will always be looked
> upon less favorably than those who do not.
>

If a given CA is faced with a systemic issue - such as serial numbers -
then they have a decision whether to replace a majority of certificates or
not. Independent of any analysis, there will naturally be a preference to
not revoke "if we don't have to". Because of the encouragement to post on the
Forum, and because these discussions show that people's opinions about the
seriousness/reasonableness of the matter are, in some way, impacted by how
many other CAs are impacted, there's a natural incentive to delay
revocation as much as possible (and to draw out discussions as much as
possible), in the hopes that a decision to not revoke will end up being
more favorable. If the determination is that revocation is not necessary,
the CAs that reported and revoked effectively went through more "pain" than
was needed.

I think this ties back up to the first remarks, about understanding what
CAs are systemically doing to prevent further issues. I would think that
the end goal is that, regardless of severity, CAs should be moving to
systems where it's easier to mass-revoke. If large CAs are finding the
serial number issue problematic, I think it's reasonable to expect systemic
changes, such that the next time a "massive" number of subscribers are
affected, they make a determination to revoke, and sooner, and that such a
decision is less impactful.

Wayne Thayer

Apr 15, 2019, 7:45:33 PM
to Ryan Sleevi, mozilla-dev-security-policy
Ryan - Again, thank you for the feedback, and please forgive me for the
delayed response. I've attempted to address your concerns on the wiki page
(since this isn't official policy, I'm editing the live document):

https://wiki.mozilla.org/index.php?title=CA%2FResponding_To_An_Incident&type=revision&diff=1210671&oldid=1207675

- Wayne