On Sat, Mar 16, 2019 at 12:49 PM Wayne Thayer <wth...@mozilla.com> wrote:
> Ryan - Thank you for the feedback.
>
> On Fri, Mar 15, 2019 at 6:14 PM Ryan Sleevi <ry...@sleevi.com> wrote:
>
>> While I realize the thinking is with regards to the recent serial number
>> issue, a few questions emerge:
>>
>> 1) Based on the software vendor reporting, they don’t view this as a
>> software defect, but a CA misconfiguration. Do you believe the current
>> policy, as worded, addresses that ambiguity?
>>
>>
> As the language is an example, I don't believe it needs to address this
> distinction. I intended "defect" to mean a defect in the certificate, so
> perhaps it would help to specify that - i.e. "certificate defect"?
>
I guess the challenge is it introduces the ontology that some folks have
advocated, but no one actually knows where the lines should be drawn, as
every example has had flaws. That is, a "certificate defect" could be
everything from granting basicConstraints:CA=true (e.g. as we saw with
Turktrust [1]) due to a misconfigured certificate profile (which, like
this, was an "off by one" error) to something like misencoded sequences [2].

My biggest worry with the proposal is that it seems to actually favor not
revoking/responding to systemic issues (those which can affect a
significant portion of the CA's issued certificates), whereas I think the
intent is that non-revocation should be exceptional and that the CA should
be moving to systemically address things.

I think the end-goal, for both cases, remains the same: that the CA take
holistic steps to make revocation easier and painless, whether they're
dealing with systemic issues (such as serial numbers or validation methods)
or exceptional situations (such as a rogue RA or validation agent). Looking
at Heartbleed as the example, we know that a massive number of Subscribers
and certificates were affected - it seems like this example would have
encouraged non-revocation, by choosing the size of impact as the
illustrative example.
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=825022
[2] https://wiki.mozilla.org/SecurityEngineering/mozpkix-testing#Things_for_CAs_to_Fix
>> 4) This new policy seems to explicitly allow a CA never revoking a
>> non-compliant Certificate. Is that your intent? If so, is there any concern
>> that this introduces the risk of CAs presenting revocation as being
>> “required by Mozilla” as opposed to the factually correct and accurate
>> “required by the Baseline Requirements” if Mozilla or this community should
>> disagree with such a decision?
>>
>>
> Is there any difference between delaying revocation until a certificate
> expires and not revoking at all? Is there any difference between CAs
> misrepresenting revocation as "being required by Mozilla to happen by X
> date" and "being required by Mozilla"?
>
Fair points. I think the previous policy encouraged a more concrete plan of
action ("when"), and did not leave the CA a decision-making capability ("if")
which could create a conflict between the CA's decisions and the community's
expectations. That said, you make a good point - if their "when" is "when
the certificate expires", then it's implicitly an "if" as well, and that
remains unless/until "when" is more prescriptive.
>> 5) If multiple CAs are affected by a common incident, this seems to
>> encourage delaying revocation as long as possible. It’s unclear whether a
>> CA that can and does revoke their certificates will be more or less
>> favorably considered, both by the ecosystem and by this community. Given
>> the economic incentives, it seems to strongly discourage revocation, as a
>> way of competitive differentiation.
>>
>>
> I'm not following how these changes have the effect of encouraging
> multiple CAs to delay revocation as long as possible, but I do think it
> would be useful to state that CAs who violate the BRs will always be looked
> upon less favorably than those who do not.
>
If a given CA is faced with a systemic issue - such as serial numbers -
then they have a decision whether to replace a majority of certificates or
not. Independent of any analysis, there will naturally be a preference to
not revoke "if we don't have to". Because of the encouragement to post on the
Forum, and because these discussions show that people's opinions about the
seriousness/reasonableness of the matter are, in some way, influenced by how
many other CAs are affected, there's a natural incentive to delay
revocation as much as possible (and to draw out discussions as much as
possible), in the hopes that a decision to not revoke will end up being
more favorable. If the determination is that revocation is not necessary,
the CAs that reported and revoked effectively went through more "pain" than
was needed.

I think this ties back to the first remarks, about understanding what
CAs are systemically doing to prevent further issues. I would think that
the end goal is that, regardless of severity, CAs should be moving to
systems where it's easier to mass-revoke. If large CAs are finding the
serial number issue problematic, I think it's reasonable to expect systemic
changes, such that the next time a "massive" number of subscribers are
affected, they make a determination to revoke, and sooner, and that such a
decision is less impactful.