Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Results of 2025 Roundtable Discussion

3,291 views
Skip to first unread message

Ben Wilson

unread,
Jun 2, 2025, 6:58:31 PMJun 2
to dev-secur...@mozilla.org
All,

Here are: (1) an executive summary of the roundtable discussion (with sections of "Highlights" and "More Details"), (2) a list of action items / potential improvements, and (3) the full transcript of the Roundtable Discussion held May 16, 2025. 

Thanks to those who participated.

Ben


Executive Summary: 

Mozilla CA Program Roundtable Discussion – May 16, 2025

The roundtable discussion of May 16, 2025, convened 43 diverse members of the Mozilla community to identify opportunities for improving CA compliance, policy clarity, incident management, revocation practices, and automation adoption. Held under the Chatham House Rule, the session encouraged candid input and focused on collaborative improvement. Below is a summary of the key topics and themes discussed.


1. Revocation Policy and CPS Alignment


HIGHLIGHTS:

  • Participants raised concerns that minor documentation errors in a CPS—despite full technical compliance with the Baseline Requirements—still require mass revocation.

  • There is a "perverse incentive" to write vague CPS language to avoid the risk of revocation.

  • Suggested solution: introduce a mechanism to allow documented CPS corrections (e.g., footnotes, versioning) that maintain transparency but avoid unnecessary revocation.

  • There was broad support for reviewing and potentially updating BR section 4.9.1.1.

MORE DETAILS:

There was broad concern that minor misalignments between a CA's Certification Practice Statement (CPS) and its actual practices—when those practices still comply with the Baseline Requirements (BRs)—can trigger unnecessary and disruptive mass revocations.

"A good faith, trivial error in somebody's CPS can require that 100% of their certificates need to be revoked and replaced with certificates that are identical in every way except for the not-before date."

"We want to encourage transparency and process improvement, not discourage accurate documentation by threatening revocation."

Participants emphasized that this creates a disincentive to document narrow or enhanced security practices. It was proposed that alternative remedies, such as timely correction of documentation with annotations about the discrepancy, could maintain transparency without causing ecosystem-wide disruptions.

There was support for revisiting BR section 4.9.1.1 to clarify expectations for revocation when CPS discrepancies arise, and for exploring mechanisms that allow for process improvement without excessive punitive consequences.

"It's not a punitive measure to have to revoke. It is a process failure. So we need a way to make sure that this is fixed."

  • Issue Raised: There is currently a rigid requirement for revocation when CA documentation (CP/CPS) is misaligned with actual practices, even if certificate issuance was technically compliant with the Baseline Requirements (BRs).

  • Concerns:

    • This leads to “pointless mass revocations” when the only discrepancy is outdated or incomplete documentation.

    • Participants noted the perverse incentive to write vague CPS documents to avoid being held accountable to overly specific details.

"There's this kind of perverse incentive to never specify anything in your documentation that's not in the requirements itself."

  • Suggestions:

    • Consider creating a mechanism for corrective documentation updates instead of mandatory revocation in such cases.

    • Possibly update section 4.9.1.1 of the BRs to allow for exceptions where the error is trivial and does not affect certificate validity.

    • Include historical footnotes or explanatory notes in the CPS identifying the gap and its relevance period.

2. Clarifying Compliance Expectations and the Wiki


HIGHLIGHTS:

  • Mozilla’s CA Wiki pages (especially "Forbidden and Problematic Practices" and "Recommended Practices") were seen as important but in need of frequent maintenance.

  • New problematic practices should be added.

MORE DETAILS:

Mozilla’s CA guidance (e.g., the "forbidden and problematic practices" page and recommended practices) was recognized as useful but in need of more frequent updates. Participants recommended a community-driven, iterative approach.

"This is another live page that should update regularly, especially when new incidents are being treated."

"The wiki needs more proactive maintenance. It's been useful, but parts of it are probably outdated."

"Linting used to be a recommended practice—now it's a requirement. That's how the guidance should evolve."

The wiki could also better reflect common pitfalls drawn from the "lessons learned" page, which was recently expanded with categories derived from recent incidents.

3. Incident Reporting and Bugzilla Process


HIGHLIGHTS:

  • Several calls were made to improve clarity around the timing and completeness of responses:

    • Responses should be posted within 7 days.

    • Questions from community members should be clearly worded.

  • All CAs should respond via an official account.

  • Guidance is needed for:

    • Handling follow-up questions after a closure summary.

    • Deciding when a bug is “closed” or re-opened.

    • Differentiating between clarifying dialogue and “fishing expeditions.”

  • Mozilla/CCADB Steering Committee should publish criteria for how they evaluate incident reports.

MORE DETAILS:

Timing and Quality of Responses and Updates


Participants discussed the need for CAs to consistently respond within 7 days and to clearly answer all questions raised. However, ambiguity in questions—especially from anonymous accounts—can create uncertainty about what requires a formal response.

"Sometimes it's difficult to know when there are questions in a comment... if there is a question, please isolate it and put a question mark at the end."

"Sometimes people think they've answered the question, but they haven't, or the question wasn't clearly phrased as a question."

"We need to think why these processes exist—do they actually provide value to anyone?"

  • Challenge: Inconsistent expectations about update frequency and when/if incident reports can be considered closed.

  • Suggested Improvements:

    • Define a standard response timeline for root programs (e.g., respond to new reports and closure summaries within X days).

    • Clarify when weekly updates are still required after a closure summary has been posted.

Best Practices for Asking and Responding to Questions

There was consensus that using official CA accounts for incident response would increase clarity and promote blameless, process-oriented discussion. Suggestions included documenting best practices for asking and answering questions in incident threads.

"When it's from this account, it's from the CA, and if it's not from the CA's account, it's not from the CA."

  • Problems Identified:

    • Not all questions are clearly marked as such.

    • Unclear if questions are rhetorical, hypothetical, or require a formal CA response.

    • Difficulty in determining which questions must be answered (especially when raised by anonymous commenters).

  • Proposals:

    • Encourage clear formatting, e.g., explicit question marks, quoting the question before responding, or numbering responses.

    • Consider publishing a template or FAQ for best practices in incident responses and community questions.

    • Require CAs to post from official accounts to distinguish authoritative responses from personal opinions.

Use of Incident Reports for Policy Development: Incident discussions can be valuable for identifying insecure practices that are not explicitly covered by existing rules. However, participants warned against letting these discussions become unstructured fishing expeditions.

"It does help to probe for potentially weak or unadvisable practices... as long as it's relevant within the scope of what's being discussed."

  • Insightful Use: Incident discussions can expose underlying operational weaknesses or highlight emerging security concerns.

  • Concern: Some feel discussions veer into speculative or unfocused territory, creating unnecessary burdens on CAs.

  • Recommendation: Create guidance that differentiates probing for systemic risk from inappropriate fishing expeditions.

Bug Management and Closure Expectations: Frustration was expressed over inconsistent attention from root program representatives and the lack of clear procedures for what happens after a closure summary is posted. 

"There’s no written procedures, guidance, or list of expectations for what CAs are expected to do when that happens."

"We should have root program commitments to review new bugs and closure summaries within a set time frame."

Calls were made for more transparency and consistency in how root programs evaluate and respond to incident reports. Participants noted that some CAs face intense scrutiny, while others receive little engagement.

"I rarely see you or other people from Mozilla participate actively in the bug. It's mostly like you're expecting others to do that."

"Some bugs get closed even when the CA didn’t really give a good incident report."

"You take two incident reports for the same issue, and they’re treated differently depending on the CA."

"The worst thing that you can do to a good employee is tolerate a bad employee."

Participants also supported defining root program commitments, such as:

  • Response within 3–7 days for new bugs.

  • Timely review of closure summaries.

  • Clearly communicating when a comment or question does not imply a rule violation.



  • Problem: Lack of clarity on how root programs assess and process incident reports.

  • Suggestions and proposals:

    • Publish clear root program commitments for response times.

    • Clarify whether follow-up questions reset the closure timeline.

    • Document criteria for bug closure and reopening.

    • Document and publish evaluation procedures used by root programs (Mozilla, Chrome, etc.).

    • Clarify who reviews closure summaries and under what criteria bugs are kept open or closed.

    • Define what constitutes a “complete” response or action plan.

    • Possibly rotate root program reviews via a common Bugzilla account (as Mozilla has done with incident-...@ccadb.org).

4. Cross-Signing and Subordinate CA Oversight


HIGHLIGHTS:

  • The group discussed increased cross-signing activity and the inconsistent oversight it brings.

  • There was consensus that minimum oversight expectations for externally operated subordinate CAs should be documented.

  • Mozilla’s policies should be aligned or contrasted with Chrome’s publicly, with clearer procedures for:

    • Approving externally operated subordinate CAs.

    • Pre-conditions for cross-signing.

MORE DETAILS:

None - see transcript and list of potential improvements for more details.

5. Root Program Transparency and Evaluation


HIGHLIGHTS:

  • Community members asked for more visibility into how Mozilla/CCADB evaluates incidents.

  • Some noted uneven treatment of CA bugs.

  • Requests included:

    • Publishing root program “commitments” (e.g., review deadlines).

    • Clarifying when CAs must post additional closure comments.

    • Making the role of the CCADB Steering Committee more visible in incident review.

MORE DETAILS:

  • Proposed Commitments:

    • Review new bugs and closure summaries within a defined timeframe (e.g., 5–7 business days).

    • Provide clear closure signals or required follow-ups when a closure summary is posted.

    • Acknowledge when a question is not a compliance issue to reduce unnecessary CA responses.

    • Commit to blameless analysis, focusing on systemic improvements rather than individual accountability.

6. Automation, ACME, and End-User Support

HIGHLIGHTS:

  • CAs reported doing extensive work to promote automation but noted challenges with subscriber uptake.

  • Shortening certificate lifetimes (e.g., to 45 days) was seen as a forcing function.

  • Concerns were raised about legacy systems, firewall compatibility, and increased attack surfaces.

  • ACME isn’t suitable for all users; broader definitions of automation should be supported.

  • Suggested actions:

    • Catalog real-world barriers to automation.

    • Broaden guidance to include ACME-alternatives that would be considered automation.

MORE DETAILS:

There was strong agreement that most CAs are already promoting automation as much as they can, and that end-user barriers—not lack of effort from CAs—are the primary challenge.

"We’ve poured ludicrous amounts of effort into promoting automation over the last 5 years."

"People who say they have automation sometimes haven’t actually set it up right—it gets revealed later."

"We're expending hundreds of thousands of dollars to get fully automated, but it's not easy for the last 0.4%."

"Some of the remaining systems just don't support ACME, or are blocked by firewalls that need custom solutions."

"There is an overemphasis on ACME. It’s not a magic wand. We need to broaden the conversation."

Some end users echoed that while they're supportive of automation and are mostly automated, the remaining edge cases involve legacy systems or security-sensitive environments where automation introduces risks.

"Automation requires installation of software... and that increases the attack surface."

Key takeaways:

  • ACME isn’t suitable for all environments.

  • Shorter certificate lifetimes (e.g., 45 days) may help drive adoption.

  • The industry needs a broader definition and framework for automation.

  • There is a need to catalog and address real-world blockers to adoption.

"If we want more automation, we need to stop talking about ACME and start talking about other things."

  • Debate Points:

    • CAs report investing heavily in promoting ACME and automation, but many subscribers still lag.

    • Shortening certificate lifetimes (e.g., 45 days) may be the strongest lever to drive adoption.

  • End User Challenges:

    • Some subscribers, particularly in high-security or regulated environments, report technical or organizational barriers to automation.

    • Common blockers include:

      • Incompatible devices or firewalls.

      • Fear of increasing the attack surface by installing new ACME clients.

      • Lack of support from vendors or IT teams.

  • Suggestions:

    • Root programs and CAs could:

      • Publish a clear definition of "automation" (e.g., key management + DCV + renewal).

      • Maintain a public matrix of tools and client compatibility for different use cases.

      • Shift the conversation beyond ACME, recognizing that not all environments are suitable for it.

      • Encourage subscribers to treat automation as lifecycle management, not just certificate renewal.

7. Improving Community Engagement and Policy Development

HIGHLIGHTS:

  • Some participants expressed frustration that discussions in Bugzilla sometimes veer off-topic or become unproductive.

  • There was support for escalating appropriate issues to the Mozilla dev-security-policy list or the CCADB public list.

  • Clarification was requested about when incident discussions should shift to broader policy forums.

MORE DETAILS:

  • Challenge: Some incident reports touch on broader policy implications, which are not easily resolved within Bugzilla.

  • Recommendations:

    • If a Bugzilla discussion raises questions of precedent or future policy, transition the conversation to the Mozilla dev-security-policy list or CCADB Public.

    • Maintain a list of potential policy questions for future ballots or community consensus.

8. Closing Thoughts and Next Steps


HIGHLIGHTS:

  • Mozilla reiterated its commitment to transparency and continuous improvement.

  • Future discussions may explore aligning Mozilla and CA/B Forum policies, improving the user experience, and promoting sustainable automation.

MORE DETAILS:

The discussion revealed multiple areas where greater clarity, consistency, and structure would benefit both CAs and root programs. Specific ideas include:

  • Better guidance and policy development around documentation discrepancies and revocation.

  • Improved documentation of incident handling expectations and timing.

  • Updating and maintaining CA guidance and Wiki pages.

  • Creating a formal set of root program commitments.

  • Expanding guidance and tooling around automation.

The meeting ended with appreciation for the broad participation and an invitation to continue the discussion on mailing lists or via future roundtables.

"Let’s keep on securing the free web."

----------------------------------------------------- 

Action Items: Potential Improvements based on 2025 Roundtable Discussion

1. Revocation Policy Improvements

Propose or support a ballot to clarify BR section 4.9.1.1 to address minor CPS discrepancies, prepare guidance for annotating CPS updates without triggering revocation, and adopt policy on when revocation is not required due to CPS misalignment, especially when BR compliance is maintained.

2. Incident Reporting Improvements 

Update Mozilla or CCADB Guidance to: emphasize clear timing expectations (e.g., 7-day rule), provide best practices for responding (e.g., quoting questions, structuring answers), and clarify who is expected to respond (e.g. CAs via official accounts); create a Q&A guidance page on how to frame questions and address community input that is considered helpful vs. rhetorical or speculative; and discuss with CCADB Steering Committee formal root program procedures, including the review of new incident reports within X days, providing responses to closure summaries within a specified timeframe, documenting incident closure workflows (e.g. what happens when follow-up questions come in after closure summaries, whether new closing summaries are needed, and when is an incident report considered complete) and criteria for evaluating incident report responses, deciding to close vs. follow up, handling reports when there has been no community feedback.  

Also, move policy-level discussions that arise from incident reports to the Mozilla dev-security-policy list or CCADB Public list. Work to develop criteria for when an issue in Bugzilla should be elevated to a broader policy discussion. Propose ballots in the CA/Browser Forum to address Mozilla policy issues (e.g., mass revocation rules, revocation reason codes).

3. CA Guidance, Wiki Maintenance and Problematic Practices

Create a structured review and update process to maintain the “Forbidden and Problematic Practices”, “Recommended CA Practices”, and “Lessons Learned” wiki pages. Gather community suggestions on how to keep these resources up to date.

Also, align root store policies and clarify them for cross-signing and providing minimum expectations for overseeing the operations of external CAs (e.g., audits, sample checking, joint incident reviews). Add this issue and track it using Mozilla’s GitHub repository for PKI Policy.

4. Automation Support & Strategy

Create guidance on automation that goes beyond ACME. Define what constitutes “automation” (e.g., key management + validation + renewal) and offer guidance for high-security/legacy environments. Document known blockers and “real-world constraints” to automation (e.g., firewall incompatibility, risk concerns). Highlight examples of any creative or secure ACME-equivalent deployments that are discovered.

-----------------------------------------------------
 

Transcript - Mozilla CA Program Roundtable Discussion - May 16, 2025

Moderator: Ben Wilson

Attendees: Aaron Gable, Adrian Mueller, Andrew Ayer, Alison Wang, Atsushi Inaba, Andy Warner, Boryana Uri, Brian Holland, Bruce Morton, Ben Wilson, Chris Marget, David Adrian, Antonios Chariton, Dimitris Zacharopoulos, Enrico Entschew, Eric Kramer, Fatima Khalifali, Iñigo Barreira, Israr Ahmed, J.C. Jones, James Renken, Jurger Uka, Larry Seltzer, LV McCoy, Martijn Katerbarg, Matthew McPherrin, Joe DeBlasio, Mrugesh Chandarana, Matthias Wiedenhorst, Nicol So, Nuno Ponte, Rollin Yu, Jeremy Rowley, Sandy Balzer, Michael Slaughter, Stephen Davidson, Tim Callan, Tobias Josefowitz, Trevoli Ponds-White, Wayne Thayer

Moderator: Welcome everyone, and thanks for joining. We have a great group gathered here today of stakeholders who are interested in this topic and in this format. And it's the first time we've ever had, to my knowledge, this type of roundtable discussion.

Moderator: Our aim today is to bring together all perspectives and have an open, constructive dialogue, and I want to hear from everyone that's willing to speak. If you don't feel like speaking, you're very welcome to just sit and listen. I'm going to try to make sure that everyone has an opportunity to speak and facilitate the discussion. I'll ask questions, or answer them, and we'll try to keep things moving because we have a short amount of time, and we want to use to cover as much ground as possible. I appreciate your patience as we move forward in this sort of open format. There are a quick few notes as we begin. I don't think we should go around the room for introductions. That would take too much time. I'm hopeful that everyone can see who the attendees are, and that you'll see when people are talking. You'll see their names, so there shouldn't be any need to identify yourself or affiliation unless you want to. Please allow others to finish speaking before jumping in. Talking over one another makes it difficult for everyone else to appreciate the content. And if it gets a little bit busy and if you've got great ideas, or there's a lot of quick dialogue, then what we should do is use the raise hand feature or the chat, and we'll call people in order as much as we can. When you speak, try to be concise and to the point. This dialogue is going to be conducted in accordance with the Mozilla Community Participation Guidelines. So please speak respectfully and constructively. We're here to share ideas, not to win arguments.

Moderator: We'll be recording this conversation, but that's to keep accurate minutes, and we’ll use the Chatham House Rule, which means that in the minutes I won't attribute anything that anyone says to that person or that organization, but if you want for some reason for something to be attributed to you, then let me know.

Moderator: Just to repeat, this will be conducted under the Chatham House Rule. That's to encourage open and candid discussion. And if you have any concerns about how the recording will be used or the notes will be prepared, then just let me know.

Moderator: My hope is that everyone leaves today's meeting feeling that they've received some positive and valuable information. And thanks again for participating. The goal here is to improve the Mozilla Root Store program. So that's why we're conducting this roundtable discussion.

Moderator: I want to make sure that everyone around the table can feel like they have a say, and some involvement in what we are doing. For the most part, the main resource that we'll look to is the Mozilla CA wiki, and I'll put the link in chat. I'm going to put a link here to the Mozilla Community Participation Guidelines in case anyone wants to review that.

Moderator: Are there any questions about anything on the agenda? Is there anything off the bat that I should address, or any concerns?

Q: Is there a final agenda?

A: There's the draft agenda, which is the final agenda. I haven't modified it, although there might be some of the bullet items under some of the main categories that we won't have time to get to.

Moderator: The first part of today's call that we’ll talk about is Mozilla's expectations regarding CA compliance, and we’ll also brainstorm. We’ll see if there is a forbidden or problematic practice that we should put into the CA Wiki page. The second part of the agenda is root store improvements to bring clarity or positive things that CAs can do. That's a 20-minute section for that. We'll try to look at anything that people have as suggestions where things can be clarified. During the third segment of our roundtable discussion, we'll talk about trying to improve the customer experience or that of the end user, concerns about automation or shorter certificate lifetimes, and any frustrations about incident reporting or anything that we can do to address some of those things. Then we'll have another 10 minutes for wrap-up.

Moderator: Okay, everyone should be able to see my screen, which is the homepage for the Mozilla CA wiki. And I’ll go down to the section “Information for CAs”. Note that we have a section on forbidden or problematic CA practices, and rather than go back over those things, because the whole page is probably very outdated, we’ll talk about issues that CAs have encountered more recently, or that we, as a community, feel are forbidden, or should be forbidden or that are problematic.  We should mainly focus on things that are probably more problematic, because some of the forbidden things are now either in the Baseline Requirements or the Mozilla Root Store Policy.

Moderator: I don’t want to dominate the whole call, because I want to hear from you, but there is this section in the wiki titled, “Maintenance and Enforcement”. We should look at Mozilla's compliance expectations, and the “Maintenance and Enforcement” wiki page goes over that. So, we won't have time to get into this today, but offline, if you have any suggestions on improving this, or after the call, once we've gone over a lot of these things, maybe we can talk about that.

Moderator: So, let's see here. Basically, our expectations are that CAs report incidents as promptly as possible, that they follow the CCADB's incident reporting guidelines, and that CAs demonstrate accountability, urgency, and transparency when they fill out or complete their incident reporting obligations. Later down in this page we emphasize things that would cause us to distrust a CA, such as patterns of neglect, vague responses, and repeated issues. Overall, this page talks about the goal of protecting our users.

Moderator: One other thing before we launch into this is the “Lessons Learned” page, which I've revised recently. I ran a report of compliance incidents since June of last year, starting in July, and we have 150 incidents since then. I have been looking at those and then editing and adding different categories for the “Lessons Learned” wiki page. While I haven't been able to get through the list totally, it should be something that everyone should be aware of, especially CAs, and at some point in the next several weeks I will remind everyone that this resource is available to look at.

Moderator: So, I'm going to open it up to the floor now. And let's just have a discussion about things we can do with regard to compliance or to clarify what our compliance expectations are, or to help CAs do a better job with their compliance posture. I'll make some notes here on the side as we discuss this, but then also we'll include it in the notes from the meeting. So, if you want me to open up a particular page or to go somewhere on the Wiki, just let me know.

Q: Just to clarify, are you looking for input on the information that's already here? Or are you looking for other things that we should be adding?

A: Mainly things that we should be adding. I don't know if it'd be an efficient use of our time for me to just go through some of the incidents, or I could through some of the things that I've added recently to the “Lessons Learned” page, which might help prime the pump, but if anyone has any things that they've been thinking about, then let's start with that.

Comment: Here is one of the big ones. Suppose there is documentation where your CPS doesn't match your practices, but your actual practices match the Baseline Requirements and what those expectations are. Right now, there is a kind of perverse incentive to never specify anything in your documentation that's not in the Baseline Requirements themselves. If you restrict your practices at all, and then you screw it up somehow but comply with the Baseline Requirements, then you end up revoking a bunch of certificates, and you also end up going through the Bugzilla process and having a bug filed. The bug filing is not that big of a deal-it’s good and gives transparency. But it would be good to see more CAs describing things that they do in their CPSes, or their other documentation, that are more narrow than the BRs, without necessarily having to risk mass revocation or something like that. We have seen quite a few times lately, where people have posted their CPS with wrong information. They still issued certificates compliant with the BRs, but they have to replace those certificates, and they look identical to what they just issued. It's just the validity period that’s different. Because it's now after the CPS update, and what do we do about that?

Comment: This issue would benefit from some clarity, because every time it comes up people say, “Oh, I don't have to revoke, because all I have to do is fix the documentation.” That's been proven not true in past bugs. That expectation on exactly what you do there is not clear for people who don't follow all the other CAs’ bugs, and I know they should follow the CAs’ bugs, but sometimes people miss that stuff.

Comment: One of the things people rely on is section 4.9.1.1 of the Baseline Requirements, and that subsection says it must be revoked if it does not comply with the CA’s own CP or CPS. That is the thing that people hold on to.  Maybe there is a way to handle that scenario.

Comment: We talk internally about this. We are very troubled by this idea that a good faith, trivial error in somebody's CPS can require that 100% of their certificates need to be revoked and replaced with certificates that are identical in every way except for the not-before date, and that feels out of whack. We understand and appreciate the idea that you need to be able to look at a certificate and look at the CPS of that time to understand what is going on, but we wonder if there's a way to correct the record so that the useful value of the CPS is still there without requiring what does seem like a senseless revocation. And we agree that the rules as written today do require that. We just think the rules as written today should be rewritten to give another remedy that still solves the transparency problem without requiring this pointless mass revocation. We'd like to have the community driving that. And we're probably going to put this on the agenda for the next face-to-face.

Comment: Well said. That's why I like the Bugzilla process, it gives transparency that something went wrong.

Comment: Let's fix it, but we can't just turn around and declare the rules ad hoc not to apply. What we need to do is adjust the rules. And I'd like to see us adjusting the rules on this. The rules we have now are not serving the Web PKI. They're not serving relying parties. They're not serving subscribers, they're not serving CAs, and they're not serving browsers. They're not serving anybody. And let's fix them so they are. It is something we'd really like to see, and we'd like to help be part of the effort, even though we don't know what the answer is.

Comment: I don't think that's really serving anybody any good in terms of having a minor issue in a CP or CPS that forces revocation of all certificates. I don't think that's doing any good to the overall Web PKI community at all.

Comment: It's not a punitive measure to have to revoke. It is a process failure--you did everything, but you changed something, and you forgot to update the CPS. So something internally did not work as it should, and we need a way to make sure that this is fixed. If you have to do a lot of work, you can justify the resources, so it can indirectly drive the management commitment to get that work done. If you don't have anything, if you just file a report in 15 minutes, then maybe there is not so much of an incentive to change things. So I would like to see if there is any change in the rules in a way to make sure that this has been given adequate importance and that people can get the commitment they require.

Comment: No one is against filing an incident report or making it visible, for an error in a CP or CPS, but there shouldn't be a mandated need to revoke all certificates because of that. Instead, the incident report should be filed to make it visible so that everyone can learn from it.

Comment: And that incident report has to have an action plan for how you're going to fix the process failure. So, the bigger question is whether the action plan is sufficient to remind people that they need proper documentation. It's a balancing act, but we have shifted too much towards revocation on that balancing act right now, which discourages transparency rather than encourages it.

Comment: Maybe a suggestion would be to describe that glitch in the CPS in an updated CPS, or to somehow explain the difference between the policy documented and the practice.  To avoid cluttering of the document--because it can be patched with too many glitches--would be to keep that description available until the last certificate that was falling under that difference is expired or revoked. And then the CA could be clean with that description about the CPS.

Comment: That makes sense, and I'm not saying that we should revoke every time.

Comment: Yes, sure, we all agree on that.

Comment: What I was saying is that the CPS needs to have some value. So why is there a CPS? It's used in audits by the auditors. They make sure that what you write there is what you're doing. So if we add the ability to retroactively change this document, then it loses its value as well. That's what I was saying -- I'm not saying to retroactively change it.

Comment: I'm just suggesting that we have a kind of note saying that until that date certificates issued were issued under that acceptable condition, and keep that note until those certificates expire. Once they expire, they are out of the scope of the CPS.

Comment: One thing I like about that, or keeping a note in your CPS that there was this mistake, is that it encourages shorter lifetime certificates. The shorter validity periods then mean you can update your CPS sooner to say, “Hey, these are our current practices. We don't have any issue with this. This is a non-issue.” So that's a pretty clever solution.

Moderator: Okay, should we go into another topic? I want to cover as many different topics as we can.

Comment: Problematic practices, which are more important because at least for the forbidden practices you can remove the whole thing because everything is accounted for in the BRs or the Mozilla Root Store Policy.

Comment: All 8 of them should be okay. And I believe in the potential problematic practices. Section 2.5 is also something that is part of the Baseline Requirements. So what other problematic practices have people witnessed that are not currently listed?

Comment: Let's say you're talking about external entities wanting to operate subordinate CAs. We are seeing a lot of legitimate questions from the community and the browsers. When a CA decides to do a cross-signing agreement or allow an externally operated CA, maybe the community should describe the minimum expectations for the signing CA to oversee the activities of the cross-signed entity.

Comment: I've talked to many experts, and from many CAs around the world, and they all have their own checklists—from “I only check the audit report and nothing else” to “I am doing regular meetings, doing internal audits, doing independent quarterly certificate checks.” I have heard everything. Maybe it is time to establish better standards and the minimum expectations before a cross-signing agreement is signed?

Comment: And are there different expectations where you're cross-signing somebody who's already in the root program for ubiquity versus signing somebody who isn't in the root program and giving them trust? In the first place, I think that the latter doesn't exist according to the practices of the community. And then on-premises operated sub CAs do exist, which, although those aren't effectively a cross sign, they may as well be.

Comment: We already have a precedence of a new CA coming to play asking for a cross-signing agreement, and they first had to apply to Mozilla. They had to independently be approved before being allowed to get cross-signed by another CA.

Moderator: We can obviously improve this and triple the size of what we say or explain here. It wouldn't be that hard to come up with more detailed requirements. We should probably put this issue into the GitHub issues list, and maybe even an issue is still open regarding externally operated subordinate CAs. There is also a section in the wiki for the process for adding an externally-operated CA. It's a very good point, and you're right, we have seen an increase in these, and the issues haven't been totally addressed. The Mozilla process provides more leeway for existing CAs that are in the program when you compare it to the Google Chrome process. There’s an advance notice requirement in the Google Chrome root program. We could also take a look at that and try to align the two programs, and I could speak with the people at Chrome about their approach how we could use it, or how they could use some of our approaches.

Comment: For what it's worth. I don't believe that the Chrome Root Program has any special requirements for external CAs other than you need pre-approval.

Comment: You need to get approval, but it doesn't say what you need to do to prepare yourself. And what are the expectations during the cross-signing period.

Comment: There is a carve out in the policy that if it is a signing of a CA whose operator is already in the trust store that the requirements are lower for pre-approval.

Moderator: Part of the oversight is that the signing CA needs to be more detailed. It can't be just that the CA has a Webtrust or an ETSI audit for their operation. There are things like CPSes that should be looked at, sampling of certificates that are issued, those kinds of things. They should be doing pre-issuance linting if they're not. These are things that the whole CA industry is working on.

Moderator: Okay, we've got about four more minutes on this topic of forbidden practices. Does anyone have any other things that are behaviors, patterns, or trends that have been observed in incident reports, or otherwise, that need to be or should be discussed? Back to these forbidden versus problematic CA practices. It seems maybe we should focus on the problematic practices, and I don't want to rename the page. Maybe we could move backdating. I don't know if that is, well, it can be problematic, but not necessarily outright, forbidden. I mean, in certain situations it should be listed as forbidden. Maybe it already is in one of the Baseline Requirements. There's a limit on what you can do. Maybe someone can think of something that should be in the forbidden list.

Comment: In general, this list should be maintained because the threat models change, the needs change, and if there is a practice that was needed 10 years ago, and we don't need it anymore, perhaps it can be added here, just continuously have this evolving document of forbidden and problematic practices. Because the not-before may not be needed as much as it was 10 years ago or 20. So maybe we don't need to allow this additional risk from someone doing it. I'm not speaking specifically about the notes before, but this list has to evolve.

Comment: And another comment would be it's a good venue to develop these ideas here, but as an implementer of these requirements, we're all generally happier when they bubble up and gravitate towards the TLS BRs themselves, where appropriate, so that there's universality where it becomes difficult. Sometimes, as an implementer, it is difficult when different root programs have policies that are intending the same thing but maybe worded slightly differently. And there's a lot of debate that often happens whether there is actually a difference in implementation required. And so, if those ideas can be documented in the TLS BRs, then those points of confusion don't exist.

Moderator: That could be something that we could talk about during today's call, probably in our last 20 minutes, to the extent that we have this mass revocation requirement only in Mozilla, and can that be moved into the BRs and one of the other instances where we did something within Mozilla, which we then had to port over to the BRs, were the revocation reason codes. We should focus on getting things into the CA Browser Forum first, and get those discussed to the extent that we can also make sure that the Mozilla community has a voice and an opportunity to comment or be involved in it. Many people feel that the CA/Browser Forum is isolated, but there's that dichotomy that we need to work out.

Moderator: So, in the next 20 minutes we'll talk about root store improvements, Mozilla guidance, and things that we can do to make it more clear. Is there any place in the Mozilla Root Store Policy, or in the recommended practices, or in our GitHub issues, where we can make improvements that you see or where you see that there's an opportunity for confusion?

Moderator: With regard to recommended CA practices, these are the kinds of things that bubble up--the recommended practices bubble up from things that we feel are important but aren't quite ready to go into the Baseline Requirements or the Mozilla Root Store Policy. I’ll take a look at this list, and then when I'm editing the template for the CCADB Annual Compliance Self-Assessment, I see whether I need to say anything about any of these things. In the self-assessment, there is a Mozilla tab. We have the Baseline Requirements tab, and then we have a Mozilla tab, and anything that jumps out from the Mozilla Root Store Policy that isn't in the Baseline Requirements gets added to that Mozilla tab.

Moderator: Again, this list needs to be maintained more proactively and needs continuous updating. So, is there anything that anyone wants to talk about here under this category, or to help clarify anything else that is a requirement?

Comment: Yes, this is another live page that should be updated regularly, especially when new incidents are being treated. It should definitely include some good practices based on the remediations and the prevention controls that CAs recommend, or the community recommends. It does require attention and maintenance. Maintaining these wiki pages is a collective effort to propose improved the language or removal of some things that are pretty trivial. I see linting listed, for example, as a recommended practice, but now it's a requirement.

Comment: In the bugs active right now, there are a lot of issues with people not filing responses within 7 days, or not answering all the questions. The CCADB requirements are pretty clear on that, but maybe there should be something in the Mozilla wiki to emphasize that as well, or even dictate how those responses to questions should look. The format required for incident reports has helped that get organized, but maybe a format for answering questions might be useful as well under recommended practices, or to cite the questions and post a response. Moreover, if you look through all the current bugs, there are so many that either missed the 7 days because maybe they thought they answered the questions, and they didn't, or they just missed a question that looks like a statement, and they couldn’t tell? So that might be helpful to clarify.

Moderator: There are two good points you're making--timing for responding is within 7 days; and then they need to answer all the questions. We should have additional guidance and clearer requirements that go beyond what's in the CCADB, or just a reiteration of it. We have a Wiki page where we can address that--it's the “Responding to an Incident” wiki page.

Comment: Sometimes there are rhetorical questions in bugs or questions that stray far from the subject of the bug, and it is difficult to know when there are questions in a comment. There should be advice on how to write a question--if there is a question, please isolate it and put a question mark at the end, etc. Also, sometimes it's difficult to determine if you need to answer a question when it's not clear that there's actually a rule that you're violating. It would be nice if one of the root store representatives would weigh in and say that actually it is not a rule violation. Bugs would be closed sooner, but some of the comments are nitpicky, and they're not sufficiently clear. Sometimes there is a rule violation, and then some things are just not rule violations at all, so it's hard to answer a question or comment when it appears to come from some random, anonymous account on the Internet, e.g. a generic name or initials without an indicated affiliation, interest, or background, or why you're commenting on the bug.

Moderator: Or, you can't find the person by searching.

Moderator: We could prepare guidance to address the types of questions and to guide people towards asking the right kinds of questions.

Comment: A good improvement to incident reporting would be to require all CAs to have an official account that they post from. This will focus discussion on the process and help keep it blameless. When we have responses from individuals, then sometimes people get caught up in that. I mean, the browsers could do it, too. I think that would do a lot to improve and also clarify communications when it's from this account, it's from the CA, and if it's not from the CA's account, it's not from the CA. When people who work at CAs want to comment on it in bugs, then it'd be more clear because it didn't come from the company's account.

Comment: It might be easier than having that wiki page that lists people and their affiliations. There's that page that says I'm not posting on my account, or I am posting as this person, but having an official account per CA would make it so that wiki page isn't needed anymore. The Chrome root program does that with their root program. They post as the official Chrome root program account.

Comment: I was just going to suggest that is an interesting thing, people do use that list on the wiki, although it might be obviated if we had this other process. But I didn't actually know we had that list until recently, and I've never put myself on it.

Moderator: I think it's something that Gerv either started or that he emphasized when he was running the Mozilla root program. See https://wiki.mozilla.org/CA/Policy_Participants

Comment: I just wanted to say that this also has to be balanced. What are the reasons for an incident report? Why does the CA file it? And one thing is for Ben to see that, and decide whether they should still be trusted and whether they should do something. Another is for risk assessment and policy development. Maybe someone misinterpreted the rule. So through clarification questions we might be able to figure that out or set precedents, or maybe create a new rule to make it clear. But another thing is that it can help us determine insecure practices. And I view this whole thing from a security engineering point of view that maybe someone does something today that's not actually technically secure. They don't violate any rules. Everything is fine. All of the compliance stuff is fine. But maybe we shouldn't be doing that anymore. Maybe someone allows you to issue a certificate via faxed documents. And this was needed 20 years ago, but we don't consider it equally secure today. So, in line with this conversation and these discussions, I don't think you can limit the scope of them very easily without harming the long-term effects and the future goals of the root programs.

Comment: That's a really great example of why we should get clarity, because if someone opens up an incident report like that, then discussion should be moved to the CCADB Public or Mozilla dev-security-policy list. So if someone says that's not OK, that's not actually against the rules. Because what? Because an incident report is actually the wrong mechanism to achieve an improvement to rules where one doesn't exist because a CA is obligated to explain how it will resolve the incident report, and only the CA has the responsibility to show action on what they're doing to close it. Whereas if we want to have a community discussion on what we think this new rule should be, then that's exactly the kind of thing that we should move to the list so that we can say, this is a new rule here, and we should get clarity on it and what it should be, because otherwise I don't actually think an incident report will become a new rule. It will just become a cautionary tale about a time that a CA had to respond to a thing that was not an actual incident.

Comment: No, I agree with that, and we should be having these discussions on the list. But sometimes I read reports, and it's difficult to understand what actually happened. Maybe some details are not included, and it's difficult to understand exactly what the issue is and if it's a violation or not. I can say that I think this might violate this rule, depending on how you implemented it, but some other times, there are things you didn't even think about.

Moderator: I was thinking that it does help to probe for potentially bad practices that we should start to consider, or we should consider as weak or unadvisable, or things like that. But if it's just on the email list, you can't ask more probing questions about something that is specific to the CA that they're doing. There's a balance between just engaging in what is referred to as a fishing expedition, which wouldn't be good, and looking into what is relevant and within the scope of what's being discussed.

Comment: I just wanted to raise a cautionary tale, having been involved in some bugs in the past that have sprawled on for 100 or 200 comments. In precisely this case, where it's not really clear in the course of reporting an incident, it turns into a lot of interpretation dialogue between root program representatives, the community, and so forth. That changes the bar a little bit on the outskirts of that bug, leading to the CA needing to restate its responses. But then sometimes the bug can degrade into recriminations where someone says you're shifty. You changed your story. And so I would just like to state that bugs provide an important feedback loop for the development, and new policy that can sometimes happen within bugs, but there needs to be a recognition somewhere that it does change the interpretation of the circumstances that the certificate issuer was facing in making the incident report.

Moderator: Okay, we've run out of time on this topic, but we can come back to this topic, probably at the end of the call. We've got the next 30 minutes, but we didn't get to looking at any of the GitHub issues, and I didn't expect that we would. The next area of discussion is community feedback and concerns. The thing that drew my attention was the request that we discuss things like end user automation and certificate lifetime changes, and any frustrations about them. We talked a little bit about incident reporting just now, and there are things that we can do to improve. And we talked about efforts to have the Mozilla Root Store Policy match the Baseline Requirements, and to go through the Baseline Requirement adoption process so that there isn't a divergence. We talked a little bit about how the recommended practices can be used to move standards towards becoming requirements. But let's go back to this idea of things that we can do better as outreach to consumers or to end users. It seems to me that more of an industry-wide effort needs to be done to help move things to more automation. That is sort of the topic for this last half hour, if there is a lack of any other topics to talk about. Does anyone have suggestions, recommendations, insight, opinions, or views on how this should be done, or whether it should be done?

Comment: I have heard it said a lot that CAs should do a better job of promoting automation, but we as CAs have poured ludicrous amounts of effort into promoting automation over the last 5 years. We all want automation. It makes everybody's life better. And I see that CAs are communicating with the public. So I feel like being told to put something more into place to promote automation more is completely empty and won't change anything. There's a more basic situation with subscribers, which is for whatever reason they're not motivated, or they don't care. They're not listening. And maybe shortening lifespans is going to change that. But CAs have been working hard on this.

Comment: CAs are marketing all they can on this stuff and getting people to move to automation or take the time to set up the automation is actually the barrier, and then sometimes there are people who say they set up automation, and they actually haven't.

Comment: Yeah. I think the 45 days reduction is actually the thing that's going to move the needle the most toward automation.

Comment: As an end user that manages internal PKI for a Fortune 100 company, it will move us toward using private PKI. We're automated to 99.58%, but the final .42% is where we have a challenge. And that's where we have our outages. But there is a complexity that others don't see. We are expending hundreds of thousands of dollars to try and get to the point to where we're fully automated. But you're pushing us to 45 days.  While I am 100% supportive, you need to understand that the speed by which you move to automate isn't the speed by which we move. And I've done this 36 years.

Comment: You believe that more time is necessary than 2029, then? The CA/B Forum is waiting for useful feedback on that.

Comment: Which is why I'm here for this meeting.

Comment: There are concerns with the removal of client authentication, at least from the root store, and with another big push on several ecosystems moving to privacy, and the amount of use cases where Web PKI is currently being used which are now coming forward, and we're moving to shorter lifetimes, etc.

Comment: So what I would add to that is that this has been happening with automation as well. For example, when Let's Encrypt launched, they created ACME first, and they created the clients and the tools that would help most people automate it. And since I have worked at a company that deprecated the existing solution before the new one was ready. I need to say that there is some need for pressure to eventually get there. If you keep postponing the deadline, these things will not be prioritized ever, and this makes sense as a business. I would prioritize it only if I had to do it.  If not, and it could wait like IPv6, there's no reason. I can wait 30 years, but then the U.S. Government requests it, and suddenly every vendor runs to support IPv6 everywhere. So I would say, it's similar here. What we can do is provide the tools, because now a lot of solutions support ACME. For example, when Let's Encrypt launched, it was just a single implementation that someone had to download, and it only worked on Linux, but now a lot of things support ACME. So we just have to do that to get there, and I would see it as an opportunity, as well. Certificate lifecycle management depends on how you phrase it. It's not just punishment. It can be a benefit for companies here, for potential lost revenue from private PKI, which is not necessarily a bad thing.

Comment: And I think with the description of automation, one of the issues that comes up is that there is not a clear definition of what we mean by that. And there's a lot of automation in use in different places. But it seems to me that often from a browser perspective, you're thinking of a kind of united trinity of key management, domain control, and ARI, or an ability for early renewal with those 3 things together. There's a lot of automation that may be doing one of those things with other ways of accomplishing one or two of them. But I have a pull request out there for something in the TLS Baseline Requirements that would require CAs to disclose more about what they do with either ACME or ACME-equivalent automation. But it just seems that we need a better definition of what really is the expectation.

Comment: I think my experience has been different. I've been implementing ACME throughout our systems now, and I can tell you there are still a lot of devices that make it super hard to use ACME, and it is not easy to set up, and when you have your firewall that won't support ACME, and it's in front of your server, even if your server supports ACME, you still have to figure out some custom coding to get the firewall to work with it. So I do think we probably need to put on more pressure. I think the 45 days helps with this again. But there needs to be more pressure on people who need to use certificates to make it easier to get these certificates installed via automation. My personal experience has been it's not easy to set up for devices that don't natively support it.

Comment: We should list the reasons why automation is not being adopted at the pace we want. One reason that has not been discussed is the increase on the attack surface. So usually, automation requires installation of software, additional protocols, and additional services running with the special accesses and privileges. The administrators of high security domains fear installing software that has to be maintained and increases the surface threat and can lead to escalation. So that is also a deterrent.

Comment: I was just going to add to what has been said. I do think that there's a general overemphasis on ACME, and it's shown that people can automate, but everyone is not automating.  However, everyone is not lazy or choosing not to do it or choosing not to prioritize it. ACME is not a magic wand. It does not fit for everyone's solutions. And also for certain types of workloads--it's less secure than other options. So I hope that as a community when we're talking about automation, we need it defined. Maybe some more people will put in ACME, but if we want more automation, then we need to stop talking about ACME and start talking about other things.

Comment: When we see incident reports and interactions between the community and the browser representatives, we don't see the same attention in different bugs. We rarely see Ben or other people from Mozilla participate actively in the bug. We don't see the Mozilla positions, or trying to improve or help the CA, or to identify problems in the incident report. It's mostly like you're expecting from other people. Like do that, and if this is guidance or this is somewhere, and it would be nice to clarify your expectations on this.

Moderator: Well, over the past couple of days, I've been going through some of these bugs. And looking at also the ones that we’ve closed and noticed that we've closed them, even though the CA didn't really give good responses, or the incident report wasn't really a well-written incident report, or I've looked at it and said to myself, I should have asked this question, but I'm hesitant to do that because I don't want to be too nitpicky on things, but maybe I need to be more so when they don’t get into enough of the detail. I will attempt to dig down more into their incident responses and ask more questions, and that's the kind of thing that I can engage in more.

Comment: It’s been said, the worst thing that you can do to a good employee is tolerate a bad employee. So sometimes you need to step in.

Comment: Others that are trying and show some effort, but they're being hammered with questions and nitpicking, and all of that.

Comment: It's interesting, because you take two incident reports, and let's say it's the same type of incident. But it might be a different CA, they might get treated differently. One slides through, and one doesn't   and might have no comments from the community. And it might sit there for with nothing, and everyone's supposed to at least post things weekly. But let's say it gets to the point where they've submitted a closure summary, and no one has said anything.

Moderator: And on another point, the CCADB Steering Committee is now taking turns looking at incident closure summaries and processing those during our 2-week, on-duty assignments.

Comment: Do you have the resources as a collection of root programs, as CCADB, to do these reviews because a lot of these more detailed incident reports are because someone found the free time to contribute to that and dig deeper?

Comment: And we cannot depend on someone having free time this afternoon. And what if they don't have next week?

Comment: If nobody's viewing these incident reports, maybe they are less valuable. And if someone is just posting every week, yeah, we're looking into it. We're monitoring the thread, or whatever it is, without giving any updates. Is there also any value on that?

Comment: It's all over the place in terms of the different practices and the different approaches and the different treatment.

Comment: It would be hugely beneficial, as far as transparency goes, to know the process that the CCADB community uses or that members use to evaluate the closing summary. I've seen some now where Chrome comes in and posts additional questions and others that get closed, or you get a closing summary, and there's a date to be closed. And then there are additional questions on the bug after the closing summary. And it's unclear what happens with that expected closing date, or whether you have to post a new closing summary or something like that. So, additional process around what happens after you post a closing summary and expectations after the closing summary is posted in a closing data set would be extremely useful for the community in knowing whether new issues could be opened, whether you can revisit past issues, and what the expectation for the community is.

Moderator: Right. That is something that the CCADB hasn’t documented yet. There was one that came up recently where there were comments after the closing summary. And after the question was answered, we didn't indicate whether they had to do a new closing summary, but we did indicate that it would get closed on such and such date. But there's no written procedures, guidance, or instructions, or list of expectations for what CAs are expected to do when that happens. So that's a good point.

Comment: I think it would also be valuable to have this sort of idea written down. This is how we expect it to go. This is how the root programs evaluate bug reports. This is what we're looking at, I think would be valuable to have that also. For the sake of the CA's understanding, when a CA posts a closure summary, and then no one comments on it for 7 days, are they still supposed to post another comment like “We're still monitoring this bug” even after they've posted the closure summary? It seems obvious that the intent is no. And this is a thing that's getting incorporated into the next version of the CCADB requirements. And, if you say, “Here's our set of action items. The first action item has a due date a month from now. The next action item has a due date a week after that. Please set our next update to a month from now.” But then no one actually comes through and updates the whiteboard to point at that date a month from now. The status is unclear. Was the intent “no, actually, we do still want updates from you” or was it “Sorry I was on vacation. I didn't check it.”? It would be good to know whether in the absence in the comments saying otherwise, “you're in the clear,” or in the absence of saying otherwise, “no, you're not in the clear, you need to provide updates.” I don't really care which way it is, but clarity all around would be nice.

Comment: It's similar to the comment that was said earlier about the need for clarity on questions. It's just not clear when you need to update or respond to bugs.

Comment: I think we need to think why these processes exist, and is there any value? We should not do things just so we do things. We should do things because they matter, and they provide value to someone, maybe to the CA, maybe to the program, maybe to relying parties. For example, if someone posts a closing summary because there has been no other comment. And then someone adds a question--maybe nobody else had the time to even look at it. And just because a month passed doesn't mean that everything's fine here. Otherwise, we should open all the incidents during the summer months, maybe August, so that we can close them quickly.

Comment: The problem with leaving bugs open-ended for a really long time is that we're supposed to regularly review bugs for value. And if we just have a bunch of random bugs open, and it's not clear what the closure is, there's no closure.

Comment: One problem is there is not a really good mechanism to identify when there has been a substantial update versus not.  You have to go open every single bug once a week, or whatever cadence you review them. Sometimes the bugs just update because their tags change. I do not agree that it is good to just leave bugs open just in case someone had a question, and they happened to be out on vacation for a month. When there has been a closing summary, and no one has chosen to comment, well, if there are multiple people in the community that are commenting on bugs, literally, everyone on the Internet can't be on vacation all at once.

Comment: My core thesis is that despite the fact that CAs have no leverage in this regard, I think it would be really nice to have commitments from root programs around how they interact with incident reports and a few other things, such as respond to newly filed bug reports within X days. Take for example when a CA like Let’s Encrypt files in Bugzilla saying that it is 99% sure that it was not an incident but that it only wants to share its evidence and reasoning, yet someone shows up on the thread and says that actually, they think it is an incident. By that time, some of their 5-day revocation timeline budget had been spent. A preliminary report would need to be filed within 24 hours, and a final report filed within a certain timeframe, and those timelines retroactively kick in. When a bug report like that is filed, the CA needs feedback within 24 hours so that it knows whether it is an incident, but the CA has no leverage to demand that. I would like to politely request commitments from root programs that they will review new incident reports within X days, and root programs will respond to closure summaries within X days, and things like that, so that CAs can plan their own timelines appropriately.

Moderator: We have had very good comments. We’re going to try and wrap up here because we're running out of time now. We’d like to thank everyone for participating today. We probably could talk about these topics a little bit more, but we've heard lots of things that we need to work on, or that we need to follow up with further discussion on, or take offline, or discuss on the dev-security-policy list. Hopefully, we can create some minutes, and again those will be under the Chatham House Rule. We might send a short survey out asking your opinion on whether this was helpful and whether you think we should do this in the future, and if so, what cadence we should do it in. We don’t have time for more comments or questions. So, if you have other things you want to discuss, and you didn't get to say, put it in an email, or message me somehow. We really appreciate it, and we cannot express enough how thankful we are for all of you appearing here today, participating, and giving suggestions. 

So with that, let’s keep on securing the free web.


Ryan Hurst

unread,
Jun 4, 2025, 7:52:02 PMJun 4
to dev-secur...@mozilla.org, Ben Wilson

Thanks for sharing this comprehensive summary, Ben.

I'm deeply concerned about the direction of the CPS discussion in this roundtable. The framing that documentation discrepancies create "perverse incentives" fundamentally misses the point of what these documents are for.

CPs and CPSs are binding public commitments, not bureaucratic paperwork. When a CA issues millions of certificates under policies that contradict their documented promises, the accountability mechanism isn't broken, it's working exactly as intended. The suggestion that we should make it easier for CAs to violate their commitments without consequences would gut the very foundation of ecosystem trust.

The real problem revealed by incidents like Microsoft's isn't overly strict enforcement; it's that CAs lack proper automation between their documented policies and actual certificate issuance. This wasn't just a "typo." It exposed the absence of systems that would automatically catch such discrepancies before millions of certificates were issued under incorrect policies.

Too many CAs want the easy way out: patching documents after problems surface rather than investing in the automation and processes needed to prevent mismatches in the first place. Root programs that tolerate retroactive fixes inadvertently encourage CAs to cut corners on the systems and processes that would prevent these problems entirely.

The solution isn't to weaken accountability. It's to demand that CAs invest in proper compliance infrastructure. Good change control practices and automation makes policy violations nearly impossible; without it, even simple documentation errors can lead to massive compliance failures.

I've written more about why these policy documents matter more than most people think: https://unmitigatedrisk.com/?p=1038

Ryan Hurst

Amir Omidi (aaomidi)

unread,
Jun 4, 2025, 9:31:58 PMJun 4
to dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Thank you for this summary! Super useful for folks who weren't able to attend.

I concur with what Ryan Hurst said about the importance of CP & CPS documents. Beyond that, I'm very curious to hear from CAs about what the issues they've faced in adopting ACME and the issues their customers have faced with the automation it provides?

E.g. more specifically: What can we do at the IETF level to help improve this?

Mike Shaver

unread,
Jun 5, 2025, 10:43:20 AMJun 5
to Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
(Not speaking for my employer.)

The CPS conversation is very confusing to me. The contents of the CPS are incorporated by (messy and imprecise) reference into every certificate issued, so that a relying party can...rely on the practices that are documented in the CPS. If it weren't for the ever-present size concerns, the CPS would be fields in the certificate directly. If the element in question isn't relevant to the trust decision by the relying party, then take it out of the certificate, which means taking it out of the CPS. You can document non-trust-relevant practices in your TOS or some other doc that isn't bound to every certificate you issue.

We aren't talking about a typo in Microsoft's case, or the similar cases cited in Entrust's history of misbehaviour less than a year ago. A typo is an inconsequential error in form, like a misspelled word or two sections with the same number. Describing the omission of a relevant clause as a "typo" is an attempt to diminish the significance of it and portray the consequences of misissuance as excessive; it borders on operating in bad faith in my opinion. We're talking about material differences between versions, which is *why they made the correction at all*. The issuer *knows* that this thing is relevant to security, which is why it's in the critical, fragile CPS document at all.

The idea that requiring CPS correctness will be a "race to the bottom" is similarly difficult for me to understand. The entire point of exceeding the BRs is so that relying parties can depend on the things that a CA does that exceed the BR minimum. Relying parties can only depend on those things if they are reliably represented (by reference) in the certificate involved in the trust decision. It's a race to the bottom if the industry *doesn't* take material CPS error seriously, because then relying parties actually *can't* depend on anything but the minimum of the BRs, regardless of what a CA might want to claim in the certificates they issue.

I understand that to a layperson on r/sysadmin who has to roll a couple thousand Azure certs by hand (for some reason), this may seem like a "minor documentation error", because it is something that happened in a document on a web site instead of being part of the cert bytes. I do *not* understand active participants in the industry, who have been able to see Entrust and others attempt the exact same arguments and seen why they are not accepted, can genuinely hold the same misconception.

Mike

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.
To view this discussion visit https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/2ad07871-a862-4aef-96c4-7e180245be39n%40mozilla.org.

Jeremy Rowley

unread,
Jun 5, 2025, 3:54:54 PMJun 5
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson

Hi Mike,

I didn't hear any disagreement on the call that the  current policy mandates revocation for a CPS misalignment. I'm also not sure that the conversation was about Microsoft as that was a cert profile question, not just a typo. I think the primary question (in my view) was whether there was a way to encourage better transparency while still making sure the CP/CPS is a binding commitment (as Ryan said).

Personally, I think how do you balance encouraging transparency with the need for accuracy is an interesting question. On the one hand, as you and Ryan both mentioned, relying parties depend on the CPS to know how the certificate was issued. On the other hand, one major revocation can be enough to convince any CA to copy and paste the BRs as much as possible. I commented on the Apple bug recently that the industry would benefit from encouraging better transparency in CPS docs while still expecting them to accurately reflect the CAs practices. Although lots of CAs put additional controls on their CA above and beyond the BRs, I would not put those into a CP. Instead, I would offer them as an SLA to the agreement or similar practice. If you violate one of those, the customer gets a credit instead of a revoked cert. The CA still shows that they are doing more than the minimum but they don't risk revocation if a control fails.

I think we could foster a more transparent and secure ecosystem if there was a way to allow for timely corrections of documentation discrepancies that are not trust related without necessitating mass revocations. I do not know what the best approach is for this though. However, I do like the suggestion that the CA specify all non-trust related items in a TOS or other document that is incorporated into contracts, but doesn't that end up making the CPS an inconsequential document as there's even more incentive to copy and paste the BRs into your own document? 

I think Ryan hit the real issue right on the head: 

"The real problem revealed by incidents like Microsoft's isn't overly strict enforcement; it's that CAs lack proper automation between their documented policies and actual certificate issuance."

My biggest issue with CPS docs is that they are written by a person and usually someone who is working in the compliance org. The CPS doc is expected to be a combination of several different departments that one or two people are putting together. The document can also be 100 pages long. I would like to see the industry move towards a more automated creation process for CPS docs. Something where humans aren't writing the document - maybe AI? 

"Too many CAs want the easy way out." I disagree with Ryan on this one. I think most CAs want the CPS to be accurate but want a better way to do it - something automatable and repeatable. 

I also disagree here: "The solution isn't to weaken accountability. It's to demand that CAs invest in proper compliance infrastructure. Good change control practices and automation makes policy violations nearly impossible; without it, even simple documentation errors can lead to massive compliance failures."  Human processes with human reviews writing a human-readable document are going to have mistakes. 

Jeremy


On Wed, Jun 4, 2025 at 9:32 PM 'Amir Omidi (aaomidi)' via dev-secur...@mozilla.org <dev-secur...@mozilla.org> wrote:
Thank you for this summary! Super useful for folks who weren't able to attend.

I concur with what Ryan Hurst said about the importance of CP & CPS documents. Beyond that, I'm very curious to hear from CAs about what the issues they've faced in adopting ACME and the issues their customers have faced with the automation it provides?

E.g. more specifically: What can we do at the IETF level to help improve this?

On Wednesday, June 4, 2025 at 4:52:02 PM UTC-7 Ryan Hurst wrote:

Thanks for sharing this comprehensive summary, Ben.

I'm deeply concerned about the direction of the CPS discussion in this roundtable. The framing that documentation discrepancies create "perverse incentives" fundamentally misses the point of what these documents are for.

CPs and CPSs are binding public commitments, not bureaucratic paperwork. When a CA issues millions of certificates under policies that contradict their documented promises, the accountability mechanism isn't broken, it's working exactly as intended. The suggestion that we should make it easier for CAs to violate their commitments without consequences would gut the very foundation of ecosystem trust.

The real problem revealed by incidents like Microsoft's isn't overly strict enforcement; it's that CAs lack proper automation between their documented policies and actual certificate issuance. This wasn't just a "typo." It exposed the absence of systems that would automatically catch such discrepancies before millions of certificates were issued under incorrect policies.

Too many CAs want the easy way out: patching documents after problems surface rather than investing in the automation and processes needed to prevent mismatches in the first place. Root programs that tolerate retroactive fixes inadvertently encourage CAs to cut corners on the systems and processes that would prevent these problems entirely.

The solution isn't to weaken accountability. It's to demand that CAs invest in proper compliance infrastructure. Good change control practices and automation makes policy violations nearly impossible; without it, even simple documentation errors can lead to massive compliance failures.

I've written more about why these policy documents matter more than most people think: https://unmitigatedrisk.com/?p=1038



Jeremy Rowley

unread,
Jun 5, 2025, 4:05:43 PMJun 5
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Actually - the more I think about it, the more I like Mike's idea. You could split the document into 3 components: 
1) What does the CA do to meet its compliance requirements, 
2) What are the cert profiles the CA is issuing
3) What are the items the CA is doing that are compliance requirements but are there for more description on how the CA operates

Mistakes in any of the 3 require an incident report (to ensure transparency) but mistakes in 1 or 2 definitely require revocation. 

Requiring an incident report still encourages accuracy on part 3 but it also warns relying parties that parts of this can be fixed without revocation. 

Thoughts? 

Mike Shaver

unread,
Jun 5, 2025, 4:12:59 PMJun 5
to Jeremy Rowley, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
On Thu, Jun 5, 2025 at 3:54 PM Jeremy Rowley <rowl...@gmail.com> wrote:

Hi Mike,

I didn't hear any disagreement on the call that the  current policy mandates revocation for a CPS misalignment. I'm also not sure that the conversation was about Microsoft as that was a cert profile question, not just a typo.

I am mixing my venues, apologies. Microsoft's CPS error has been described (minimized) as "a typo" elsewhere, and I inappropriately carried that over to this discussion. My apologies to the readers and to Microsoft.

Instead, I would offer them as an SLA to the agreement or similar practice. If you violate one of those, the customer gets a credit instead of a revoked cert. The CA still shows that they are doing more than the minimum but they don't risk revocation if a control fails.

This doesn't make sense to me. Many of the items in CPSes (whether they exceed the BRs or not) are commitments to the relying party about how the certs are generated or protected. And it's for things that aid the relying parties that I want to see CAs exceed the BRs in the first place: tighter controls, shorter validity, etc. How does a relying party "get a credit" if they rely on a certificate property specified in the CPS that turns out to not hold, because the CPS was incorrect? That certificate will continue to carry the misleading characteristic for the duration of its validity, which is why we want to see its validity terminated.

The CA *isn't* doing more than the BRs if relying parties can't expect that those extra things apply to every certificate claiming so. They're *trying* to do more, and *maybe* you can trust that a given certificate has the properties that its linked CPS claims. If the CPS isn't a reliable reference for the practices under which the certificate was issued, then let's take that link out of the certificate entirely and replace it with inline fields for whatever "important" (read: revocation-worthy if mismanaged) attributes are needed.

The right path isn't to make CPS errors into "diet incidents" distinct from other errors related to attributes of certificate issuance. It's to make revocation simple and painless so that we don't have CAs "forced" to delay the revocation of millions of inaccurate certificates, due to a failure to implement best practices advocated by their own organization (like CRL sharding). I'm referencing Microsoft here again because they are a fresh example, but they are definitely not alone in having cases of delayed revocation that were preventable through diligent application of the practices the community has learned through painful lessons.

We heard the same "it is a wafer-thin error, don't make us do the thing for which we have ill-prepared ourselves and our Subscribers" complaint about country codes and OIDs and fields being lowercase or uppercase--basically anything else that isn't a straight-up key material leak. The underlying principle remains the same: if it's not important, don't put it (including by reference) in the certificate in the first place. But the *entire point* of the dance we do with CAs and CT and validity restrictions and revocation and paid-for-by-the-auditee-but-that's-another-thread WebTrust audits and *even having BRs* is this: relying parties can rely on the assertions made by the certificate if it is valid and the issuance chain can be verified. If that's not going to hold, then there really is no point and we can let ssh-style cert continuity suffice for the web instead.

Mike

Jeremy Rowley

unread,
Jun 5, 2025, 4:25:08 PMJun 5
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
> This doesn't make sense to me. Many of the items in CPSes (whether they exceed the BRs or not) are commitments to the relying party about how the certs are generated or protected. And it's for things that aid the relying parties that I want to see CAs exceed the BRs in the first place: tighter controls, shorter validity, etc. How does a relying party "get a credit" if they rely on a certificate property specified in the CPS that turns out to not hold, because the CPS was incorrect? That certificate will continue to carry the misleading characteristic for the duration of its validity, which is why we want to see its validity terminated.

They don't, but what is the incentive of the CA to give the relying party more protection while risking revocation if someone writes the information incorrectly. We are in violent agreement that the certificate would be mis-issued if the CPS was incorrect though and that revocation is the correct way to address that.

> The CA *isn't* doing more than the BRs if relying parties can't expect that those extra things apply to every certificate claiming so. They're *trying* to do more, and *maybe* you can trust that a given certificate has the properties that its linked CPS claims. If the CPS isn't a reliable reference for the practices under which the certificate was issued, then let's take that link out of the certificate entirely and replace it with inline fields for whatever "important" (read: revocation-worthy if mismanaged) attributes are needed.

Yeah - definitely agree with you here. That's the problem with coming up with a solution. Any error in the CPS means teh promise was completely empty nor can the relying party trust any other part of the CPS in that case. The CA could be lying about those whole CPS document if you allow the CA to determine what constitutes a typo vs. any other error. 

> The right path isn't to make CPS errors into "diet incidents" distinct from other errors related to attributes of certificate issuance. It's to make revocation simple and painless so that we don't have CAs "forced" to delay the revocation of millions of inaccurate certificates, due to a failure to implement best practices advocated by their own organization (like CRL sharding). 

Sure. I agree with that as well, but I also don't think it addresses the issue of diminishing transparency in CPS docs. Not all, but a lot of them read like they are the BRs. 

> We heard the same "it is a wafer-thin error, don't make us do the thing for which we have ill-prepared ourselves and our Subscribers" complaint about country codes and OIDs and fields being lowercase or uppercase--basically anything else that isn't a straight-up key material leak. The underlying principle remains the same: if it's not important, don't put it (including by reference) in the certificate in the first place. But the *entire point* of the dance we do with CAs and CT and validity restrictions and revocation and paid-for-by-the-auditee-but-that's-another-thread WebTrust audits and *even having BRs* is this: relying parties can rely on the assertions made by the certificate if it is valid and the issuance chain can be verified. If that's not going to hold, then there really is no point and we can let ssh-style cert continuity suffice for the web instead.

Yeah - no disagreement here. It's also a long established rule that CAs aren't allowed to make "its not a security issue" arguments with bugs. Similar item applies here. My dislike of the current CPS process is two fold: 1) it encourages copying and pasting the BRs instead of giving the community extra transparency on what is going on. The extra transparency always happens around bugs instead of before them. I think it would be far better if CAs could include architecture diagrams, process flows, and similar information for the browser/RP review that avoids the marketing gloss that gets put on many public documents. 2) Some human is expected to write this document and get it right. We should encourage more automated CPS document creation where practices are pulled from systems rather than having a person write what the system is doing. 

Mike Shaver

unread,
Jun 5, 2025, 4:28:53 PMJun 5
to Jeremy Rowley, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
On Thu, Jun 5, 2025 at 4:25 PM Jeremy Rowley <rowl...@gmail.com> wrote:
They don't, but what is the incentive of the CA to give the relying party more protection while risking revocation if someone writes the information incorrectly. 

There's a small part of me, even after all these years, that believes that the whole point of being a CA is to help secure the web for its users. If that's not a shared motivation, then our only option is the force of the BRs and root programs, and we should stop negotiating entirely with misaligned members of the ecosystem.

Mike

Jeremy Rowley

unread,
Jun 5, 2025, 4:29:53 PMJun 5
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Hi Amir - I'm one of the people who mentioned ACME on the call. I've been doing a lot of ACME related setups lately. It works wonderfully for server devices but non-servers (like firewalls) are a pain. They require my team to write scripts into the API to get the system working correctly. I don't think this is an IETF problem but a device manufacturer problem where we need to encourage better ACME adoption for non-traditional servers. 

For example, I set up a website with ACME that worked wonderfully. However, when I went to get my firewall set up with the same cert, I couldn't automate the thing. Because I couldn't get the firewall to work with ACME, I only got have my system automated.

Jeremy Rowley

unread,
Jun 5, 2025, 4:30:58 PMJun 5
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Sorry - that was a cynical view. I do think CAs are trying to secure the web for users for sure, but I think a lot of CAs would argue that their particular mass revocation didn't help that cause :)

Mike Shaver

unread,
Jun 5, 2025, 4:31:50 PMJun 5
to Jeremy Rowley, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
(Accidental send.)

Like when Taher was originally designing SSL and needed to anchor trust in something, Netscape reached out to companies who (it was believed) could do a good job anchoring that trust such that, wait for it, relying parties could trust the identity of the site they were connecting to. The ability to extract rent from having one's company's random number embedded in the browser is very much a secondary outcome, and clearly not an entirely benign one.

Mike

Wayne

unread,
Jun 5, 2025, 4:34:34 PMJun 5
to dev-secur...@mozilla.org
Also concurring with Ryan, excellent summary of the issues.

I'd like to emphasize the seemingly forgotten detail whenever a CP/S a discussed: it is a legal document. It discusses the technical controls, certificate profiles, and contractual bindings between subscriber and CA. The kneejerk reaction to attempt to rewrite this is showing an inherent misunderstanding of what the CP/S documents are, and very much seems like people trying to rush a solution without following considering the problems.

If we take a step back is the issue really where the boundaries of contract law and technical documents crosses over, or is it down to mass revocation plans?

I've been thinking of this during the ongoing Microsoft incident, but is there a particular reason we lack an arbitrary maximum number of live certificates per intermediary? We lack actual hard figures on client limitations for CRL processing, CRP were pointing out active CRLs far exceeding the 10MB figure. A carve-out for short-lived certs, and planning from the worst-cast of a full revocation event what would be the ideal threshold for maximum number of certs? I'm not proposing this for BRs, or as a Root Program requirement - but certainly an option to minimize the blast radius for higher-level key compromise scenarios.

- Wayne

Amir Omidi

unread,
Jun 5, 2025, 5:13:07 PMJun 5
to Jeremy Rowley, Mike Shaver, dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Hi there!

This seems like it's a certificate distribution problem, rather than an automation problem. I don't think there is any solution that can be made at the CCADB or root program level which would solve this problem? Maybe the issue is short lived certs, rather than requirement of automation?

This is an issue folks have to deal with today, and realistically the option here is that you either work with the vendor to build in support for a protocol like ACME or write a few hundred lines of shell script to do the issuance on a separate machine and then load it onto the firewall.

Am I missing something here?

Jeremy Rowley

unread,
Jun 5, 2025, 5:31:02 PMJun 5
to Amir Omidi, Mike Shaver, dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Yeah - that's correct and you aren't missing anything. This is a distribution problem only. The call to action I was hoping to achieve by bringing it up was to encourage more communication (for everyone) about the impact of moving to 45 day certs and the need for automation. For example, CAs should be demanding that HSM providers support ACME and short lived certs. Despite CAs being a big customer,  most HSMs don't support great automation. 

Jeremy Rowley

unread,
Jun 5, 2025, 5:34:50 PMJun 5
to Wayne, dev-secur...@mozilla.org
> I've been thinking of this during the ongoing Microsoft incident, but is there a particular reason we lack an arbitrary maximum number of live certificates per intermediary? We lack actual hard figures on client limitations for CRL processing, CRP were pointing out active CRLs far exceeding the 10MB figure. A carve-out for short-lived certs, and planning from the worst-cast of a full revocation event what would be the ideal threshold for maximum number of certs? I'm not proposing this for BRs, or as a Root Program requirement - but certainly an option to minimize the blast radius for higher-level key compromise scenarios.

This has been proposed in the past but never adopted. IIRC it was because of the offline nature of key ceremonies so mass issuers would need to do a lot more signing. I still support this proposal though. You can batch up key ceremonies pretty easily.  

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Suchan Seo

unread,
Jun 5, 2025, 11:03:50 PM (14 days ago) Jun 5
to dev-secur...@mozilla.org, Jeremy Rowley, dev-secur...@mozilla.org, Wayne
In ideal world CA would want too put CSP doc in some utility that convert it to a linter or other way around: but not sure if it's something reasonable to make.

2025년 6월 6일 금요일 오전 6시 34분 50초 UTC+9에 Jeremy Rowley님이 작성:

Roman Fischer

unread,
Jun 6, 2025, 2:51:20 AM (14 days ago) Jun 6
to dev-secur...@mozilla.org

Hi,

 

For me that sounds like the TSPS (Trust Service Practice Statement = all the stuff that is common to the whole CA), CPS (Certificate Practice Statement = How does the CA do the validations, …) and CPR (Certificate Profiles) structure that some CAs moved to but will have to revert back to combined CP/CPS documents with loads of duplicate content. :-\

 

Rgds
Roman

Aaron Gable

unread,
Jun 6, 2025, 7:28:18 PM (13 days ago) Jun 6
to Mike Shaver, Amir Omidi (aaomidi), dev-secur...@mozilla.org, Ryan Hurst, Ben Wilson
Just my personal 2c on the CPS conversation:

On Thu, Jun 5, 2025 at 7:43 AM Mike Shaver <mike....@gmail.com> wrote:
The idea that requiring CPS correctness will be a "race to the bottom" is similarly difficult for me to understand. The entire point of exceeding the BRs is so that relying parties can depend on the things that a CA does that exceed the BR minimum. Relying parties can only depend on those things if they are reliably represented (by reference) in the certificate involved in the trust decision. It's a race to the bottom if the industry *doesn't* take material CPS error seriously, because then relying parties actually *can't* depend on anything but the minimum of the BRs, regardless of what a CA might want to claim in the certificates they issue.

 I'll give a concrete example of how the current system means that CPSes have to be more general than we'd like: the Let's Encrypt 90 Days + 1 Second incident.

Let's Encrypt thought that we were doing the right thing, by saying in our CPS that our certificates were valid for 90 days. That's an accurate, human-readable description of how the CA's certificates are compliant with the BRs' requirement that they be valid for 398 days or less. But then it turned out that the certificates were actually valid for 1 second more than 90 days. To be clear, this was a real error, and it exposed a real misunderstanding of how x509 validity periods work (inclusive of their end time). But the fix for that mistake was twofold:
1) Fix the issuance code to reduce the validity period of all certificates by one second; and
2) Change the CPS to say "less than 100 days".

Are LE certs valid for less than 100 days? Yes, it's a true statement. But it's not an optimally useful statement -- the thing a human wants to read in that document is "90 days"! But we can't ever say "90 days" in our CPS ever again, just in case there's some other tiny error. Does some definition buried three RFCs deep mean that we're actually off every time IERS decides to insert a leap second? I strongly believe the answer is no, but the example is still illustrative.

There are very strong incentives for CAs to write CPSes that are still "looser" than their actual practices: don't give a second-precision validity period, don't say exactly how many bits of entropy are in your serial, don't over-specify your OCSP or CRL URLs in case your CDN setup changes, etc. The cost to a CA of having an overly-specific CPS is mass revocations (which are not a punishment, but are undoubtedly a cost). The cost to a CA of having an under-specified CPS is, currently, nothing.

I don't love this situation. I'd much prefer for LE's CPS to be precise as well as accurate. But the risk of tiny errors creeping into a human-maintained, non-machine-readable document is simply too high.
Aaron

Matt Palmer

unread,
Jun 6, 2025, 9:30:40 PM (13 days ago) Jun 6
to dev-secur...@mozilla.org
So very, very many things to respond to...

On Mon, Jun 02, 2025 at 04:58:14PM -0600, 'Ben Wilson' via dev-secur...@mozilla.org wrote:
> There was broad concern that minor misalignments between a CA's
> Certification Practice Statement (CPS) and its actual practices—when those
> practices still comply with the Baseline Requirements (BRs)—can trigger
> unnecessary and disruptive mass revocations.

Arguing that CAs shouldn't need to revoke misissued certificates (and a
certificate that was issued in contravention of the CPS _is_ misissued)
because mass revocations are "disruptive" is, in the _proper_ sense of
the word, begging the question. Specifically, mass revocations should
_not_ be disruptive, and in fact I would argue that it is _required_ by
Mozilla policy to not be disruptive.

> "A good faith, trivial error in somebody's CPS can require that 100% of
> their certificates need to be revoked and replaced with certificates that
> are identical in every way except for the not-before date."

That is a gross mischaracterisation. Certificates are more than the
bytes-on-the-wire. They're a representation of a specific set of
validation practices, as of a particular point-in-time. Two
certificates with identical representations, but issued under different
validation practices, are _not identical_.

> "We want to encourage transparency and process improvement, not discourage
> accurate documentation by threatening revocation."

Revocation shouldn't be a threat, and the fact that CAs appear to think
it is a threat says a lot about how they're (un)able to handle the
necessity of revocation.

> Concerns:
>
> - This leads to “pointless mass revocations” when the only discrepancy
> is outdated or incomplete documentation.

Suggesting that outdated or incomplete documentation is somehow trivial,
when a large part of a CA's entire purpose is to maintain accurate and
up-to-date documentation, is concerning.

> - Participants noted the perverse incentive to write vague CPS
> documents to avoid being held accountable to overly specific details.
>
> "There's this kind of perverse incentive to never specify anything in your
> documentation that's not in the requirements itself."

Hopelessly vague documentation is a different problem, and speaks to the
inadequacy of CPS vetting practices, and to a lesser extent auditing
deficiencies.

If a CPS does not contain sufficient information to allow a
reasonably-competent third party to gain an accurate picture of a CA's
practices (because this is a Certification _Practice_ Statement, after
all), then it is not truly a CPS, and should be rejected until such time
as it is brought "up to code". Similarly, an auditor that waves through
a CPS that does not have sufficient detail to be able to be evaluated
against actual practices is not performing their duties diligently.

> Suggestions:
>
> - Consider creating a mechanism for corrective documentation updates
> instead of mandatory revocation in such cases.

Alternate suggestion: consider implementing change control practices
that ensure that a CA is doing what it says it is doing.

> - Possibly update section 4.9.1.1 of the BRs to allow for exceptions
> where the error is trivial and does not affect certificate validity.

Alternate suggestion: don't put things in your CPS that do not affect
certificate validity.

> 2. Clarifying Compliance Expectations and the Wiki
>
> HIGHLIGHTS:
>
> - Mozilla’s CA Wiki pages (especially "Forbidden and Problematic
> Practices" and "Recommended Practices") were seen as important but in need
> of frequent maintenance.
> - New problematic practices should be added.

As the name suggests, it's a wiki. Every participant of this call is,
presumably, able to obtain an account and edit it. That the edit
history of these pages does not show a wave of contributions from the
non-Mozilla call participants, suggests that this is not a serious point
of discussion, but is instead some species of red herring.

> 3. Incident Reporting and Bugzilla Process
> Participants discussed the need for CAs to consistently respond within 7
> days and to clearly answer all questions raised. However, ambiguity in
> questions—especially from anonymous accounts—can create uncertainty about
> what requires a formal response.

There are no "anonymous accounts" in Bugzilla. There may be
*pseudonymous* accounts, but that is not the same thing, and it is also
irrelevant -- a good question is a good question, regardless of whether
the CA can identify an entity to which they can send legal process.

The impression I get from this sort of comment is that CAs want to know
who is "important enough" to have to respond to, and who they can safely
ignore. This impression is reinforced by the stark difference in
responsiveness to, say, questions and comments from the
chrome-root-program account, as opposed to individual members of the
Mozilla community. This is disappointing, as the entire point of the
Mozilla root program is that it is supposed to be _community driven_,
which suggests that all members of that community should be equally
valued, until such time as their behaviour indicates otherwise.

> "Sometimes it's difficult to know when there are questions in a comment...
> if there is a question, please isolate it and put a question mark at the
> end."

This would be a more plausible observation if there weren't CAs that
ignored questions that were prefaced with "My question for <CA> is
thus:". In reality, it's laughable that the excuse for not answering
questions is "we didn't know it was a question!", when there are no
shortage of unanswered clearly-a-question questions.

> "Sometimes people think they've answered the question, but they haven't, or
> the question wasn't clearly phrased as a question."

And sometimes, it's absolutely, blatantly obvious to everyone that CAs
are failing to answer the question, and attempts to claim otherwise are
the most egregious form of gaslighting.

> Best Practices for Asking and Responding to Questions
>
> There was consensus that using official CA accounts for incident response
> would increase clarity and promote blameless, process-oriented discussion.

I have seen no evidence that using official CA accounts is "increasing
clarity". Mostly it seems to be a way to encourage bland platitudes and
non-answers.

I'd also like to highlight the hypocrisy of encouraging pseudonymity
from CA representatives through the use of role accounts, whilst
simultaneously decrying the use of pseudonymous accounts by the Mozilla
community at large. The logical conclusion is that Mozilla community
members should create a "friends-of-the-webpki" Bugzilla account, and
ask all questions of CAs through that account.

> - Not all questions are clearly marked as such.

Questions that _are_ clearly marked as such are still ignored. Do we
need to send a bevy of dancing clowns, holding signs saying "this is a
question!" and gesturing to the relevant sentence?

> - Unclear if questions are rhetorical, hypothetical, or require a
> formal CA response.

How about erring on the side of caution, and going with more
communication rather than less?

(Determining whether this is a rhetorical question is left as an
exercise for the reader)

> - Difficulty in determining which questions must be answered
> (especially when raised by anonymous commenters).

Everyone who asks a question in Bugzilla is a member of the Mozilla
community, and as such is prima facie entitled to an answer.

> Use of Incident Reports for Policy Development: Incident discussions can be
> valuable for identifying insecure practices that are not explicitly covered
> by existing rules. However, participants warned against letting these
> discussions become unstructured fishing expeditions.

>From the transcript, the only use of the phrase "fishing expedition"
appears to come from the moderator. No participant appears to have used
that phrase, and it's not obvious to me, at least, what prior comment
by a participant was being characterised as a warning against such
practices.

Thus, I am extremely curious to hear more about these supposed
activities that are so common as to apparently have multiple
participants warning against them in deeply coded language. I don't
recall seeing any particular rash of anything I'd describe as a "fishing
expeditioy" in any of the issues I've looked into.

> There was strong agreement that most CAs are already promoting automation
> as much as they can, and that end-user barriers—not lack of effort from
> CAs—are the primary challenge.

Strong agreement from CAs, perhaps. I don't see how anyone outside a CA
can have sufficient evidence to be able to support such a statement.

> "We’ve poured ludicrous amounts of effort into promoting automation over
> the last 5 years."

That's a statement that could be read in at least two very different ways.

> "People who say they have automation sometimes haven’t actually set it up
> right—it gets revealed later."

... and? Stuff breaks all the time. You fix it and move on.

> "We're expending hundreds of thousands of dollars to get fully automated,
> but it's not easy for the last 0.4%."

OK, what are the barriers -- and please, be specific -- to the adoption
of automation of that last 0.4%? What approaches have been considered,
and why were they rejected? What is the assessment of the costs and
benefits to the various approaches?

Absent such analysis, this reads as hyperbole.

> "There is an overemphasis on ACME. It’s not a magic wand. We need to
> broaden the conversation."

I feel like this "it's not all ACME!" is either a red herring or
evidence of some sort of delusional thinking. I don't recall seeing a
mass of people advocating specifically and exclusively for ACME as the
only solution to the entire certificate lifecycle. It is promoted
widely, yes, because it is the first protocol I'm aware of to get such
widespread acceptance and adoption, and to be able to be a "lingua
franca" for certificate issuance. But just as the existence of a
common language in the meatspace world hasn't led to the immediate
elimination of all other languages, the existence of ACME does not imply
the elimination of all other approaches to automating certificate
issuance.

> Some end users echoed that while they're supportive of automation and are
> mostly automated, the remaining edge cases involve legacy systems or
> security-sensitive environments where automation introduces risks.

This reminds me of the old Yes, Prime Minister scene where all the
ministers were broadly in support of hiring quotas for women, but there
were reasons why their own department wasn't able to adopt such a quota,
but in principle, certainly, it should be adopted as policy.

> "Automation requires installation of software... and that increases the
> attack surface."

If a small shell script that makes HTTPS requests to a specific endpoint
is a meaningful increase in attack surface, you've got much bigger
problems.

Anyone who has spent more than 10 minutes in infosec knows how the
"ermahgerd securiteh!" argument is wielded regularly to stop anything
that someone doesn't really want to do. I'll bet that a sizeable
proportion of organisations that argue they can't automate certificate
installation "because attack surface" has done all manner of insecure
stuff, and management has accepted the risk, because it was deemed
necessary for some executive's pet project.

> - ACME isn’t suitable for all environments.

That's a strawman big enough to be seen from space.

> - The industry needs a broader definition and framework for automation.

The problem isn't definitions and frameworks, it's willingness to
actually invest in the changes required.

> - There is a need to catalog and address real-world blockers to adoption.

Since CAs are the only parties on the call that have access to the
people who have knowledge of the real-world blockers to adoption, this
work is entirely on them. I look forward to the detailed case-studies
that are sure to be published Real Soon Now.

> "If we want more automation, we need to stop talking about ACME and start
> talking about other things."

No, if you want more automation, you need to build more automation.
The existence of ACME is not the reason that work isn't being done; the
lack of willingness to do that work is the reason it isn't being done.

> Root programs and CAs could:
>
> - Encourage subscribers to treat automation as lifecycle management,
> not just certificate renewal.

Waitaminute... CAs claim they "are already promoting automation as much
as they can", but until this moment, it didn't occur to any of them to
mention that automation might be more than just certificate renewal?

> 7. Improving Community Engagement and Policy Development
>
> HIGHLIGHTS:
>
> - There was support for escalating appropriate issues to the Mozilla
> dev-security-policy list or the CCADB public list.

Given that the CCADB public list is invite-only, I do not support the
use of that list for anything that is properly within the remit of the
Mozilla trust store.

> Recommendations:
>
> - If a Bugzilla discussion raises questions of precedent or future
> policy, transition the conversation to the Mozilla
> dev-security-policy list or CCADB Public.

What has been stopping CAs from starting threads on mdsp that spin out
of Bugzilla issues before now?

- Matt

Matt Palmer

unread,
Jun 6, 2025, 9:36:01 PM (13 days ago) Jun 6
to dev-secur...@mozilla.org
On Thu, Jun 05, 2025 at 01:54:34PM -0600, Jeremy Rowley wrote:
> My biggest issue with CPS docs is that they are written by a person and
> usually someone who is working in the compliance org. The CPS doc is
> expected to be a combination of several different departments that one or
> two people are putting together. The document can also be 100 pages long. I
> would like to see the industry move towards a more automated creation
> process for CPS docs. Something where humans aren't writing the document -
> maybe AI?

I look forward to a future where a CA has to mass-revoke all their
certificates because a stochastic parrot confabulated some complete
nonsense into the CPS.

- Matt

Matt Palmer

unread,
Jun 6, 2025, 9:41:38 PM (13 days ago) Jun 6
to dev-secur...@mozilla.org
On Thu, Jun 05, 2025 at 02:24:51PM -0600, Jeremy Rowley wrote:
> Some human is expected to
> write this document and get it right. We should encourage more automated
> CPS document creation where practices are pulled from systems rather than
> having a person write what the system is doing.

What forms would this encouragement take, in your opinion? I don't see
any overt *barriers* to anyone doing this at the moment, so if the
benefits were to outweigh the costs, at least one CA would be doing this
already. Hence, presumably the benefits do not outweigh the costs, so
to make this happen, either the benefits need to increase, or the costs
need to decrease. How do you envisage either of those things could be
made to happen?

- Matt

Matt Palmer

unread,
Jun 6, 2025, 9:47:00 PM (13 days ago) Jun 6
to dev-secur...@mozilla.org
On Thu, Jun 05, 2025 at 02:29:36PM -0600, Jeremy Rowley wrote:
> Hi Amir - I'm one of the people who mentioned ACME on the call. I've been
> doing a lot of ACME related setups lately. It works wonderfully for server
> devices but non-servers (like firewalls) are a pain. They require my team
> to write scripts into the API to get the system working correctly. I don't
> think this is an IETF problem but a device manufacturer problem where we
> need to encourage better ACME adoption for non-traditional servers.

Yeah, that's not an ACME problem, that's a market problem. Mentioning
the protocol as being involved at all, when the problem is some
proprietary device and its lack of feature development, is bordering on
disingenuous -- although it does have a long and storied history in the
"stymying progress" movement (just look at IPv6 and the "my routers
don't support it!" dance).

- Matt

Matt Palmer

unread,
Jun 6, 2025, 9:51:23 PM (13 days ago) Jun 6
to dev-secur...@mozilla.org
On Thu, Jun 05, 2025 at 02:04:36PM -0600, Jeremy Rowley wrote:
> Actually - the more I think about it, the more I like Mike's idea. You
> could split the document into 3 components:
> 1) What does the CA do to meet its compliance requirements,
> 2) What are the cert profiles the CA is issuing
> 3) What are the items the CA is doing that are compliance requirements but
> are there for more description on how the CA operates
>
> Mistakes in any of the 3 require an incident report (to ensure
> transparency) but mistakes in 1 or 2 definitely require revocation.
>
> Requiring an incident report still encourages accuracy on part 3 but it
> also warns relying parties that parts of this can be fixed without
> revocation.

Revocation might be a blunt instrument of limited effectiveness, but as
a means of notifying relying parties of a misissued certificate it is
almost infinitely better than an incident report posted to Bugzilla.

- Matt

Matt Palmer

unread,
Jun 7, 2025, 12:49:17 AM (13 days ago) Jun 7
to dev-secur...@mozilla.org
On Fri, Jun 06, 2025 at 04:28:02PM -0700, 'Aaron Gable' via dev-secur...@mozilla.org wrote:
> Just my personal 2c on the CPS conversation:
>
> On Thu, Jun 5, 2025 at 7:43 AM Mike Shaver <mike....@gmail.com> wrote:
>
> > The idea that requiring CPS correctness will be a "race to the bottom" is
> > similarly difficult for me to understand. The entire point of exceeding the
> > BRs is so that relying parties can depend on the things that a CA does that
> > exceed the BR minimum. Relying parties can only depend on those things if
> > they are reliably represented (by reference) in the certificate involved in
> > the trust decision. It's a race to the bottom if the industry *doesn't*
> > take material CPS error seriously, because then relying parties actually
> > *can't* depend on anything but the minimum of the BRs, regardless of what a
> > CA might want to claim in the certificates they issue.
>
> I'll give a concrete example of how the current system means that CPSes
> have to be more general than we'd like: the Let's Encrypt 90 Days + 1
> Second incident.

[...]

> But the fix for that mistake was twofold:
> 1) Fix the issuance code to reduce the validity period of all certificates
> by one second; *and*
> 2) Change the CPS to say "less than 100 days".

To be clear, either of these fixes would have been sufficient, correct?
(A useful adjunct to (1) would have been a lint to ensure that the
issued certificate did, indeed, match the requirements of the CPS with
regards to validity period).

> Are LE certs valid for less than 100 days? Yes, it's a true statement. But
> it's not an *optimally* *useful* statement -- the thing a human wants to
> read in that document is "90 days"!

Is it, though? Marketing materials, end-user documentation, sure, go
with 90 days, it's close enough for government work. In a CPS, though,
correctness matters. If a CPS says "we will not issue a certificate for
more than 90 days", then the sort of weirdo that actually reads CPSes
might come to rely on that, for some reason, and failing to abide by
that might potentially cause some sort of problem. By being "looser"
with the CPS language, you're ensuring that (absent Hyrum's Law,
violation of which is not listed in 4.9.1.1) nobody comes to rely on a
behaviour which turns out to not be valid.

> But we can't ever say "90 days" in our
> CPS ever again, just in case there's some other tiny error. Does some
> definition buried three RFCs deep mean that we're actually off every time
> IERS decides to insert a leap second? I strongly believe the answer is no,
> but the example is still illustrative.
>
> There are very strong incentives for CAs to write CPSes that are still
> "looser" than their actual practices: don't give a second-precision
> validity period,

I think there's a case to be made, in the leap second example
particularly, that the cause of that issue would be a *lack* of
second-precision validity period. "90 days" is ambiguous in its mapping
to the actual certificate contents. If the CPS said "we will not issue
a certificate with a validity period of greater than 7,776,000 seconds",
then it would be easier to map that to a lint, which would prevent
misissuance. Being so precise might also have triggered the thought
"hey, does 'not before' and 'not after' mean that it's valid *at* those
times?", catching the actual problem LE saw, in advance.

> don't say exactly how many bits of entropy are in your
> serial, don't over-specify your OCSP or CRL URLs in case your CDN setup
> changes, etc. The cost to a CA of having an overly-specific CPS is mass
> revocations (which are not a punishment, but are undoubtedly a cost). The
> cost to a CA of having an under-specified CPS is, currently, nothing.

The problem of woefully under-specified CPSes is a big one, and it's
something that does need attention. But I'd prefer a CPS that was
looser over one which was specific but wrong, because at least then I
know when I'm entering mysterious, unexplored lands. Like Majikthise
and Vroomfondel, I demand rigidly defined areas of doubt and
uncertainty.

Your examples are nicely illustrative of the value of this principle.

If a CPS specifies that cert serials have (say) 64 bits of entropy, I
might decide that's not enough to stop some attack I'm worried about,
and I'll decide to implement additional mitigations. If you *actually*
have, say, 100 bits, then no harm done -- I'm doubly protected. But if
you claim to have 100 bits, I might decide to not implement that
additional mitigation, figuring the 100 bits protects me. If you then
fail to have 100 bits, I'm toast.

Similarly, for OCSP/CRL URLs, if your CPS says "here they are", I might
forego the extra coding effort required to, say, pull them out of each
individual certificate I'm looking at, because hey, they're in the CPS,
they can't change without notice. If the CPS didn't specify them, then
yeah, I'd have to take the extra time to do it properly, but at least
when you go and change the URLs, everything still works.

> I don't love this situation. I'd much prefer for LE's CPS to be *precise* as
> well as accurate.

There's a concept in physics (which I will admit I have not seriously
studied in some years, so my explanation might be a little fuzzy) of
"excessive precision". The idea is that, when making measurements,
estimates, and such like, that it is possible to be *too* precise, and
that this is problematic for various reasons -- not least of which is
people see the unnecessary precision and think it's more meaningful than
it is.

I make this observation to note that, while precision in a CPS is a good
property to have, being overly precise has its own downsides (beyond the
need for occasional mass-revocations).

> But the risk of tiny errors creeping into a
> human-maintained, non-machine-readable document is simply too high.

I don't concede that a CPS is any more vulnerable to errors than
anything else we fallible meat sacks touch. Errors creep into anything
that humans have anything at all to do with. We deal with it by having
things like change control and review, redundancy, and "closed-loop"
feedback systems.

Without such controls, it would be entirely possible for someone to make
the "tiny error" of changing the validity period value in a cert profile
to change the leading '7' in '7,776,000' to a '9', making for some
CPS-violating certificates. I'm guessing that LE probably has about 17
different ways in which that that error would be identified before a
"live" certificate went out, even though it's a tiny, single character
error.

- Matt

Seo Suchan

unread,
Jun 7, 2025, 2:37:35 AM (13 days ago) Jun 7
to dev-secur...@mozilla.org
x509 writes time in stringish way (see asn.1 UTCtime /generalizedtime) so leaf notafter theoretically can injected between notbefore and notafter after certificate is signed.

A day for certificate perpose is defined as 86,400 seconds (BR 6.3.2)


On 2025년 6월 7일 오후 1시 49분 12초 GMT+09:00, Matt Palmer <mpa...@hezmatt.org> 작성함:

Zacharias Björngren

unread,
Jun 7, 2025, 5:14:43 PM (12 days ago) Jun 7
to dev-secur...@mozilla.org
I must object to the attitude CAs seem to have against ”anonymous” accounts on Bugzilla. I think it must be hard to avoid noticing ”Wayne” on Bugzilla, who has been very active and asked many (understatement) questions. I understand that from the perspective of having to answer those questions it can sometimes be uncomfortable because experience has shown us that they often highlight shortcomings. But this would only be problematic for a CA if that CA were not committed to the principles of the webPKI community, and instead had other priorities.

I find it quite distasteful that some seem to have such low regard for community participants just because they choose to operate under a pseudonym.

Best regards
Zacharias


--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Ryan Hurst

unread,
Jun 7, 2025, 5:33:03 PM (12 days ago) Jun 7
to Seo Suchan, dev-secur...@mozilla.org
Aaron's "90 days + 1 second" example perfectly illustrates the point I was making originally. This wasn't a documentation typo - it revealed a fundamental gap between intended practice and actual implementation. The response of changing the CPS to "less than 100 days" is exactly the race to the bottom I'm concerned about.

When Aaron says "we can't ever say 90 days in our CPS ever again," that's the perverse incentive in action. We're pushing CAs to make their public commitments vaguer rather than pushing them to invest in systems that ensure those commitments are reliable. This is the problem we need to fix with better processes and automation, not with less enforceable and less useful governance.
The thread also reveals a troubling pattern.

 We hear about "good faith errors" and inevitable human mistakes in this space constantly, yet this is an industry that has automated domain validation, linting of issued certificates, logging all issuance on the web via certificate transparency, and manages very large-scale cryptographic operations for the world. The claim that policy compliance checking can't be similarly automated doesn't hold up to scrutiny.

What we need is to stop treating CPSs as compliance artifacts written after the fact and start making them operational documents that sit at the center of how CAs work. A properly designed CPS should be machine-readable on one side - directly governing issuance systems and preventing the very mismatches we're debating - while remaining human-readable on the other for auditors and relying parties. This is actually possible today; we just need to care enough to do it.

After 30 years in this space, I can't look at most CPSs and understand what a CA actually does. But instead of accepting this as inevitable, we should be demanding that these documents serve their intended purpose: clearly communicating operational reality to everyone who needs to understand it.

There are 8 billion people depending on this system. Are we really going to allow fewer than 50 root CAs to keep treating their public commitments as legal paperwork instead of operational specifications?

The solution isn't weaker enforcement - it's making CPSs the living center of CA operations, where policy drives practice instead of scrambling to document it afterward.

Ryan

Jeremy Rowley

unread,
Jun 7, 2025, 5:56:02 PM (12 days ago) Jun 7
to Ryan Hurst, Seo Suchan, dev-secur...@mozilla.org
+1. - especially on how CPS docs need to evolve.

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Watson Ladd

unread,
Jun 8, 2025, 12:59:59 PM (11 days ago) Jun 8
to Ryan Hurst, Seo Suchan, MDSP
On Sat, Jun 7, 2025 at 2:33 PM Ryan Hurst <ryan....@gmail.com> wrote:
>
> Aaron's "90 days + 1 second" example perfectly illustrates the point I was making originally. This wasn't a documentation typo - it revealed a fundamental gap between intended practice and actual implementation. The response of changing the CPS to "less than 100 days" is exactly the race to the bottom I'm concerned about.
>
> When Aaron says "we can't ever say 90 days in our CPS ever again," that's the perverse incentive in action. We're pushing CAs to make their public commitments vaguer rather than pushing them to invest in systems that ensure those commitments are reliable. This is the problem we need to fix with better processes and automation, not with less enforceable and less useful governance.
> The thread also reveals a troubling pattern.
>
> We hear about "good faith errors" and inevitable human mistakes in this space constantly, yet this is an industry that has automated domain validation, linting of issued certificates, logging all issuance on the web via certificate transparency, and manages very large-scale cryptographic operations for the world. The claim that policy compliance checking can't be similarly automated doesn't hold up to scrutiny.

Let's start with disclosing which checks in the certificate issuance
process are done by humans.

What's become clear over a number of incidents is that CAs can have
very low degrees of automation, creating the kind of situation where
humans will have to be on the lookout for a slight abnormality among
hundreds or thousands of other things, over 40 hours a week. That's
not something humans can do. We really need to fix this as a
community.

These security relevant differences aren't in the CPS, audit reports
or discussed in root program addition bugs. It does seem that we're
barely able to get commitments to the bare minimum of BR requirements,
rather than CAs working to improve their processes and policies.

>
> What we need is to stop treating CPSs as compliance artifacts written after the fact and start making them operational documents that sit at the center of how CAs work. A properly designed CPS should be machine-readable on one side - directly governing issuance systems and preventing the very mismatches we're debating - while remaining human-readable on the other for auditors and relying parties. This is actually possible today; we just need to care enough to do it.

For DV yes, maybe. For OV and up it's going to be a harder slog. I do
however think that there's a very real human element here in
assessment.

>
> After 30 years in this space, I can't look at most CPSs and understand what a CA actually does. But instead of accepting this as inevitable, we should be demanding that these documents serve their intended purpose: clearly communicating operational reality to everyone who needs to understand it.
>
> There are 8 billion people depending on this system. Are we really going to allow fewer than 50 root CAs to keep treating their public commitments as legal paperwork instead of operational specifications?
>
> The solution isn't weaker enforcement - it's making CPSs the living center of CA operations, where policy drives practice instead of scrambling to document it afterward.
>
> Ryan
>
> --
> You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.
> To view this discussion visit https://groups.google.com/a/mozilla.org/d/msgid/dev-security-policy/CALVZKwbvoJQ%2BBSMVEsx4YJm-T3uyggu7YAY_z79aoXf_e3pXoA%40mail.gmail.com.



Astra mortemque praestare gradatim

Jeremy Rowley

unread,
Jun 8, 2025, 1:44:49 PM (11 days ago) Jun 8
to Watson Ladd, Ryan Hurst, Seo Suchan, MDSP
I don't think OV vs. DV makes much of a difference from a CPS perspective as most of the CPS document is about the processes and security involved in running a CA, not validating the entity behind a cert. There's only two sections involved in identity (3.2.3 and 3.2.5) compared to domain validation. 

I definitely would support seeing more description in CPS docs around what's automated vs. human performed. It's a good place to start.

Watson Ladd

unread,
Jun 11, 2025, 11:07:55 AM (8 days ago) Jun 11
to Jeremy Rowley, Ryan Hurst, Seo Suchan, MDSP


Astra mortemque praestare gradatim

On Sun, Jun 8, 2025, 10:44 AM Jeremy Rowley <rowl...@gmail.com> wrote:
I don't think OV vs. DV makes much of a difference from a CPS perspective as most of the CPS document is about the processes and security involved in running a CA, not validating the entity behind a cert. There's only two sections involved in identity (3.2.3 and 3.2.5) compared to domain validation. 

It's definitely been a source of incidents such as the Bremerhaven is not in Saxony one.

I definitely would support seeing more description in CPS docs around what's automated vs. human performed. It's a good place to start.

As some CAs has noted this would mean process improvements have to be rolled out with CPS changes, with a big revocation as the "prize" for messing up. Now obviously there's a degree to which the process is the product here, and proper identification of what process was in place at a time is essential.

But at the same time the incentives towards vaguness are bad.

Tobias S. Josefowitz

unread,
Jun 12, 2025, 8:50:02 AM (8 days ago) Jun 12
to dev-secur...@mozilla.org, Jeremy Rowley
Hi Jeremy,

On Thu, 5 Jun 2025, Jeremy Rowley wrote:

> Although lots of CAs put additional controls on their CA above and
> beyond the BRs, I would not put those into a CP. Instead, I would offer
> them as an SLA to the agreement or similar practice. If you violate one
> of those, the customer gets a credit instead of a revoked cert. The CA
> still shows that they are doing more than the minimum but they don't
> risk revocation if a control fails.

Could such additional controls not be put into CP/CPS as a conditional
commitment on either fulfilling them or offering the subscriber credit?
Such that it would then only be a CP or CPS violation if the control was
violated AND the subscriber did not get credited? This would offer
transparency to Relying Parties, as well as a commitment to either uphold
the controls backed by a mechanism incentivising the CA to follow through?
I imagine that might be useful for trust decisions by a Relying Party...

Tobi

Jeremy Rowley

unread,
Jun 12, 2025, 9:05:24 AM (8 days ago) Jun 12
to Tobias S. Josefowitz, dev-secur...@mozilla.org
Would that be allowed and also avoid revocation of your missed it? Could you put something in the CPS that said “We endeavor perform everything as stated in this CPS but there
May be slight deviations. If there’s any deviation from the CPS, we don’t revoke but offer a credit of X. We always revoke for noncompliance with the BRs. ” 

If this would actually allow you to disclose more info without revoking if something goes sideways with that additional disclosure, then it’s a good solution and worth trying.

Tobias S. Josefowitz

unread,
Jun 12, 2025, 9:25:10 AM (8 days ago) Jun 12
to Jeremy Rowley, dev-secur...@mozilla.org
On Thu, 12 Jun 2025, Jeremy Rowley wrote:

> Would that be allowed and also avoid revocation of your missed it? Could
> you put something in the CPS that said ?We endeavor perform everything as
> stated in this CPS but there
> May be slight deviations. If there?s any deviation from the CPS, we don?t
> revoke but offer a credit of X. We always revoke for noncompliance with the
> BRs. ?

I cannot make any definitive statements on whether that is allowed or not
at this point, but I do think this might be worth exploring. If it is not
allowed - CAs are seeking a rule change anyway, we could just allow this
instead if it is not a currently allowed approach. Theoretically speaking.

And also it would probably not hurt to see if there will be any relevant
discussion on the benefits and problems of such an approach. :)

Tobi

Ryan Hurst

unread,
Jun 12, 2025, 11:02:49 AM (7 days ago) Jun 12
to Jeremy Rowley, Tobias S. Josefowitz, dev-secur...@mozilla.org
A document that says "We do X, we do Y" but also says "YOLO" isn't much of a promise in my opinion, and CPSs are intended to be a promise. 

Again, the problem here isn't that we have too much transparency, or that revoking certificates that break the few promises CAs make to relying parties is a problem. The problem is that these documents are treated as a compliance artifact, rather than a governing artifact and are therefore loosely created as "as built" specifications. 

In other words, they are something that is created to describe what the CA thinks they do, rather than what they hold themselves accountable to. To address the revocation problem, ironically, the CPSs need to be more explicit and used as governing documents as part of how operations and engineering are done. They need to be written with accountability and testability in mind to be useful in that function, not reduced to platitudes. 

Ryan

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Jeremy Rowley

unread,
Jun 12, 2025, 11:06:30 AM (7 days ago) Jun 12
to Ryan Hurst, Tobias S. Josefowitz, dev-secur...@mozilla.org
Well yeah - because they’re often written by a compliance person who often has a very loose connection or understanding of anything engineering related. Even if they have engineering acumen, then they’re often distant enough from the process that they can’t capture what the CA is doing accurately. I still think the best fix is to transform the CPS doc into something machine readable and pulled straight from systems, but this could help with the diminishing transparency issue until we get to that state.

Tobias S. Josefowitz

unread,
Jun 12, 2025, 11:09:18 AM (7 days ago) Jun 12
to dev-secur...@mozilla.org, Ryan Hurst
On Thu, 12 Jun 2025, Ryan Hurst wrote:

> A document that says "We do X, we do Y" but also says "YOLO" isn't much of
> a promise in my opinion, and CPSs are intended to be a promise.

It isn't, and Relying Parties could interpret that as such. However when a
CA makes meaningful commitments and backs those up with "If we don't,
it'll actually have a price tag for us.", that's something else. I can
easily see how this can usefully inform trust decisions by Relying
Parties. Much better than not mentioning it.

Some mechanism to prevent this from straying into a PR opportunity "we
strive for excellence, but no promises!" might be adequate, but that
doesn't preclude the usefulness of commitments backed by something less
than revocation that otherwise wouldn't be made.

Tobi

Wayne

unread,
Jun 12, 2025, 11:24:51 AM (7 days ago) Jun 12
to dev-secur...@mozilla.org
It's not just 'a promise', it's a contractual agreement. I honestly find the resurgence of this CP/S discussion rather odd as I was under the impression it was re-discussed and agreed over a year ago with Entrust.

The decision to start talking about credits to subscribers is also rather narrow-minded on commercial CAs and their financial relationship with subscribers. That has no bearing on a CA's trust to relying parties, nor is it relevant to CAs that do not operate commercially such as Let's Encrypt.

With regards to a disconnection between actual practices and what the contractual document states: that is a CA's problem to fix, and incentives must exist to that end.

What is the actual outcome that people want from a discussion on this front ultimately? We're approaching a bizarre choice to lower all expectations for any statement by a CA in legally binding documents to mean anything, on the off-chance that they are held accountable and must face minor repercussions. How is this creating a more trustworthy and transparent environment for the WebPKI to operate in?

Frankly that the topic is also being brought up at CAB/F shows a lack of willingness by CAs to keep to their own agreements, and that reflects on the trust between parties. We can talk about automatically generating the certificate profile off of the actual configuration of issuance systems, but that seems to be a minor point of discussion and a bit irrelevant to the issue at-hand.

- Wayne

Tobias S. Josefowitz

unread,
Jun 12, 2025, 11:56:39 AM (7 days ago) Jun 12
to dev-secur...@mozilla.org, Wayne
Hi Wayne,

On Thu, 12 Jun 2025, Wayne wrote:

> The decision to start talking about credits to subscribers is also rather
> narrow-minded on commercial CAs and their financial relationship with
> subscribers. That has no bearing on a CA's trust to relying parties, nor is
> it relevant to CAs that do not operate commercially such as Let's Encrypt.

I am not sure I agree with that frame. While credits are a concept that
only applies to commercial CAs, there might be other concepts that achieve
the same for non-commercial CAs: A quantifiable incentive to adhere to the
stipulations in CP/CPS that go beyond the BRs.

Even if for-free CAs do not find a concept that allows them to commit to a
similar, quantifiable incentive, it might still be a win for the
ecosystem if at least commercial CAs could make binding, incentivised
commitments this way.

This would of course have to be testable and verifiable. In order to
achieve that, CAs making such commitments could also commit to disclosing
any violation of the "primary" commitment, including information allowing
to identify affected certificates. And in order for this to be an
effective incentive, the credit would also have to be credited if the
issue is reported by someone other than the subscriber.

I imagine a second set of CRLs could be offered, in principle, stating
such affected certificates as revoked. This would allow for ecosystem
participants to determine relevant state with current tooling, and it
would even allow Relying Parties to use this second set of CRLs for their
revocation checking purposes - in theory - if that is what they seek.

It is troubling that CAs do not have the confidence in their abilities to
adhere to the BRs, Root Store policies and even their own intentions to
the degree that they try to minimize "exposure" by minimizing additional
commitments in their CP/CPS, but ... that in itself is as valid as it is
regrettable. I sense we are stuck, and to move forward, we must think out
of the box a little bit. Especially if the alternative is publishing
errata, which is way less of an incentive not to screw up, if you ask me.

Tobi

Jeremy Rowley

unread,
Jun 12, 2025, 2:31:12 PM (7 days ago) Jun 12
to Tobias S. Josefowitz, dev-secur...@mozilla.org, Wayne
>> It's not just 'a promise', it's a contractual agreement. I honestly find the resurgence of this CP/S discussion rather odd as I was under the impression it was re-discussed and agreed over a year ago with Entrust.

It was well-discussed that you right now a CA must revoke where your CPS does not match your operations. This has a race to the bottom effect on CPS docs where every CA is incentivized to copy and paste the BRs as their CPS. More disclosure = more risk = less transparency, which I think is a bad outcome.  To date, all of these mis-issuances are primarily related to cert profiles and CAA records. I am awaiting the day where a CA must revoke all their certs because of a mistake in the choice of law provision or incorrect information in Section 8. Note that most contracts do not require you to terminate the services if wrong. You generally update the language or get a waiver from the party. The contract comparison doesn't really work because most breaches in contract do not result in turning off services - they result in the breaching party curing or obtaining a waiver.  

>> The decision to start talking about credits to subscribers is also rather narrow-minded on commercial CAs and their financial relationship with subscribers. That has no bearing on a CA's trust to relying parties, nor is it relevant to CAs that do not operate commercially such as Let's Encrypt.

This was just an example. The primary question is whether you can carve out sections from revocation where the CA exceeds the BRs but wants to offer more information about how they operate. 

>> With regards to a disconnection between actual practices and what the contractual document states: that is a CA's problem to fix, and incentives must exist to that end.

There's already an incentive to fix it as you'd still need to file a bug on any incorrect information. The question I'm asking is if there is a way to limit revocation to mistakes made in language that are not part of the BRs.  

>> What is the actual outcome that people want from a discussion on this front ultimately? We're approaching a bizarre choice to lower all expectations for any statement by a CA in legally binding documents to mean anything, on the off-chance that they are held accountable and must face minor repercussions. How is this creating a more trustworthy and transparent environment for the WebPKI to operate in?

A way to encourage transparency from CAs without interfering with the role of the CPS. I'm looking at a way to get more transparency and accuracy around CA operations, which is not by having someone write more documentation.

>> Frankly that the topic is also being brought up at CAB/F shows a lack of willingness by CAs to keep to their own agreements, and that reflects on the trust between parties. We can talk about automatically generating the certificate profile off of the actual configuration of issuance systems, but that seems to be a minor point of discussion and a bit irrelevant to the issue at-hand.

I actually think that's the major point of discussion as I see the fundamental problem with CPS docs as human created. So machine readable profiles and operational docs are a good way to address some of the issues. Most of these CPS errors aer human mistakes so eliminating humans from CPS docs does address the problem in a meaningful way.


--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Ryan Hurst

unread,
Jun 12, 2025, 4:32:06 PM (7 days ago) Jun 12
to Jeremy Rowley, Tobias S. Josefowitz, dev-secur...@mozilla.org, Wayne

In the WebPKI, the contract analogy collapses: the "other party" isn't a single customer who can waive a breach; it's billions of relying parties who get zero say and zero cure period. 

That's why revocation is tied to CPS alignment. Some folks claim that easing enforcement will somehow coax CAs into greater transparency, but when has relaxing the rules ever improved openness in the WebPKI? 

So far no one has presented an argument that credibly tells the story that the core problem is over-enforcement. Until that happens, I hope the community defaults to a more common-sense interpretation—maybe mine: that CPSs are still after-the-fact paperwork, hand-typed, not used by the organization that publishes them, and everyone just prays no one looks. That's a governance failure, not a transparency strategy.

Ryan

Tobias S. Josefowitz

unread,
Jun 12, 2025, 5:54:17 PM (7 days ago) Jun 12
to dev-secur...@mozilla.org, Ryan Hurst
Hi Ryan,

On Thu, 12 Jun 2025, Ryan Hurst wrote:

> In the WebPKI, the contract analogy collapses: the "other party" isn't a
> single customer who can waive a breach; it's billions of relying parties
> who get zero say and zero cure period.

That's precisely the point behind my thought experiment. Where CAs (well,
at least DigiCert, according to Jeremy) make contractual statements in
their TOS instead of in their CP/CPS, the benefit of such statements is
lost or at least diminished when it comes to Relying Parties. Making such
statements in the CP/CPS instead would make it accessible to Relying
Parties, to be factored into their trust decision.

> That's why revocation is tied to CPS alignment. Some folks claim that
> easing enforcement will somehow coax CAs into greater transparency, but
> when has relaxing the rules ever improved openness in the WebPKI?

A CPS stating "we will do X, if we don't do X we will relinquish the
revenue we made on the offending cert via credit to the subscriber", in
ways that are testable, verifiable and enforceable would actually
strengthen the ability and position of Relying Parties.

> So far no one has presented an argument that credibly tells the story that
> the core problem is over-enforcement. Until that happens, I hope the
> community defaults to a more common-sense interpretation?maybe mine: that
> CPSs are still after-the-fact paperwork, hand-typed, not used by the
> organization that publishes them, and everyone just prays no one looks.
> That's a governance failure, not a transparency strategy.

Obviously, if this sets off a trend where even the currently existing
guarantees made to Relying Parties as "absolute" (backed by revocation)
will be downgraded to a credit, or a donation to Mozilla or, say, the
local Zoo, that would be an undesirable consequence. But I would like to
explore of such a dynamic could be avoided, while the benefits of making
such commitments that are currently offered to Subscribers accessible to
Relying Parties, I think that is worth exploring.

This would not change the mechansim you're referring to, the guarantee
given by the CPS would still have to be upheld. Just the guarantee
wouldn't be "we don't do X", but "we don't do X, or else, we will refund
the subscriber, disclose the circumstances that lead to a certificate
issued where we did X, and make the certificate identifiable". What
happens if that guarantee is violated (alltogether), it would be all the
same as now. What changed would only be the shape of the commitment.

I absolutely do not want to encourage CAs - or anyone in the WebPKI
ecosystem - to exploit loopholes, or even to deviate drastically from
established practices, and I don't think it would be a good idea for
anyone to start doing this without reaching some form of consensus on this
suggestion I am discussing. But that said, which parts of RFC 3647 or the
BRs would you to interpret to actually disallow commitments conforming to
such a structure as proposed, assuming they'd include commitments ensuring
testability, verifiability and measurable consequences for scenarios the
CA "weakly" commits to avoiding?

Tobi

Mike Shaver

unread,
Jun 12, 2025, 7:22:58 PM (7 days ago) Jun 12
to Tobias S. Josefowitz, dev-secur...@mozilla.org, Ryan Hurst
(Not speaking for my employer.)

Is this actually a problem that needs attention? How many certs, pre-Microsoft(*), have been revoked due to CPS errors? Is it even six figures?

Maybe this energy would be better directed towards things like greater audit transparency, which might actually improve the security of the web.

(*) Microsoft is in the millions but they’re going to just let most of them expire instead of actually doing the revocation—shh! I don’t think we’re supposed to notice that!

Mike

Mike Shaver

unread,
Jun 12, 2025, 7:24:02 PM (7 days ago) Jun 12
to Tobias S. Josefowitz, dev-secur...@mozilla.org, Ryan Hurst
And to borrow an old turn of phrase, if the penalty for a violation is a fine, then the law only applies to poor CAs.

Mike

Roman Fischer

unread,
Jun 13, 2025, 4:20:18 AM (7 days ago) Jun 13
to dev-secur...@mozilla.org

Hi all,

 

> I actually think that's the major point of discussion as I see the fundamental problem with CPS docs as human created. So machine

>readable profiles and operational docs are a good way to address some of the issues. Most of these CPS errors are human mistakes

>so eliminating humans from CPS docs does address the problem in a meaningful way.

 

I only partially agree. Automation is also (at least for now) written by humans and does and will contain errors. You shift the problem from humans creating documents to humans creating automation. Yes, once it's running properly, it will continue to so until something changes and it doesn't run or produces wrong output.

 

Kind Regards

Roman

Tobias S. Josefowitz

unread,
Jun 13, 2025, 5:58:07 AM (7 days ago) Jun 13
to dev-secur...@mozilla.org, Mike Shaver
Hi Mike,

On Thu, 12 Jun 2025, Mike Shaver wrote:

> And to borrow an old turn of phrase, if the penalty for a violation is a
> fine, then the law only applies to poor CAs.

I clearly agree that would be an issue when it comes to BR violations, but
I am not suggesting for CAs to be able to opt out of them in such a
fashion. My curiosity only covers optional, additional commitments a CA
might award Relying Parties through their CP/CPS.

And in that scenario, I fail to see how transparently communicated
commitments alongside the incentive structure of the CA to follow them
would create a dynamic subject to your concern.

Is the concern that "rich" CAs would voluntarily commit to additional
limitations, only to then violate them as some part of a "weird flex"? If
not that, what is it?

Tobi

Mike Shaver

unread,
Jun 13, 2025, 8:35:23 AM (7 days ago) Jun 13
to Tobias S. Josefowitz, dev-secur...@mozilla.org
On Fri, Jun 13, 2025 at 5:58 AM Tobias S. Josefowitz <to...@opera.com> wrote:
And in that scenario, I fail to see how transparently communicated
commitments alongside the incentive structure of the CA to follow them
would create a dynamic subject to your concern.

Is the concern that "rich" CAs would voluntarily commit to additional
limitations, only to then violate them as some part of a "weird flex"? If
not that, what is it?

Right now there is nothing punitive about a CA’s responsibilities in the face of CPS-related misissuance: the things required of them are strictly remedial, being simply to correct the error they imposed on the WebPKI’s collection of valid certs. They don’t even require revocation, if the CA has chosen to structurally mitigate the risk of misissuance by issuing Short-Lived certificates. There is nothing inflicting harm as an attempt to balance the possible operational disincentive a poorly-organized CA might have against performing appropriate remediation. Even the prospect of distrust is remedial: if the CA can’t be trusted, they shouldn’t be trusted.

What you’re proposing is that we add something punitive, which means that, I assume, you believe that we would need something to motivate action that the CA might otherwise not take without the prospect of that punishment. That motivational effect would not be equal across all CAs, which means that the calculus of be-careful-or-pay would not be a reliable means to get back to the important state: the WebPKI’s corpus of valid certificates being trustable *in all their details* by relying parties. (I think that the focus on there being some commercial element to issuance and trust would not age well either, given that a large and quickly-growing portion of the web’s certs are not issued on a commercial basis. Who would get credits from Microsoft for their recent misissuance? Another part of Microsoft?)

And—I’m sorry this is so long, but whatever—the proposed punitive measure doesn’t even make the PKI whole! You would still have certs floating around with incorrect information, but the subscriber would have a trivial credit against their next webinar about automation.

If it is in the CA’s interest to provide additional voluntary constraints on their issuance, then it is because it is somehow in their interest to do so. That could be because they are chartered to improve the security of the web (would that they all were, tbh), or to distinguish themselves in a competitive marketplace. They should not pursue those things in ways that undermine the fundamental guarantee of the WebPKI: the attributes of the certificate are true. Either way, those constraints don’t matter unless they can be relied on by RPs. And if they are to be relied on then they need to hold for all valid certificates, so…

CAs can “maybe, we’ll try!” exceed BRs on their own initiative, without any interaction with the BRs or its remedial mechanisms; just don’t put it in the CPS. Maybe the CAB/F can give out ribbons for effort on the social event cruise, or provide badges for CAs to put on their web sites and in email signatures. “__321__ days since a commitment(*)-breaking issuance!”

But, again, I think this is a molehill and not a mountain. At best an amusing distraction from pursuits that might actually *improve* the reliability of the WebPKI, rather than make it even more complicated for relying parties to navigate.

Mike

Jeremy Rowley

unread,
Jun 13, 2025, 10:19:00 AM (7 days ago) Jun 13
to Mike Shaver, Tobias S. Josefowitz, dev-secur...@mozilla.org
For clarity - I am not representing a CA and do not have the authority to represent any CA. My comments are my own.

I think the credits are a red herring and just an example of the question. Can a CA avoid revocation for something wrong in their CPS by adding qualifying language? My thought is no. If this is a contract, then you would certainly be able to do so, which means the contract analogy doesn't really work as "reasonable efforts" is a pretty common phrase as is a limitation on remedies. I don't think we should pretend the CPS is a contract as the penalty for breach is set and doesn't appear movable. The CPS is a requirement document with a stated requirement if you fail to meet those expectations - revocation of certificates. If that's not true, then the example scenario of offering credits instead of revocation would be totally legit. 

If it is a requirements doc, I think it would be interesting to create a CPS that just has the useful information and cuts out everything that doesn't make a difference to relying parties (such as section 8 and 9). 


--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Jeremy Rowley

unread,
Jun 13, 2025, 10:22:57 AM (7 days ago) Jun 13
to Mike Shaver, Tobias S. Josefowitz, dev-secur...@mozilla.org
I guess actionable steps that I see that would improve the CPS situation:
1) Require the repo to be in github instead of a CPS document
2) Require the profile and CAA information to be extracted directly from the CA's systems
3) Require the CA to disclose more about the validation methods used (be interesting to see a CA list the percent they use each validation method).

On 3, a ledger that logs each domain's validation method would be a better solution if we can get there.
 

Tobias S. Josefowitz

unread,
Jun 13, 2025, 11:25:25 AM (6 days ago) Jun 13
to dev-secur...@mozilla.org, Mike Shaver
Hi Mike,

On Fri, 13 Jun 2025, Mike Shaver wrote:

> Right now there is nothing punitive about a CA?s responsibilities in the
> face of CPS-related misissuance: the things required of them are
> strictly remedial, being simply to correct the error they imposed on the
> WebPKI's collection of valid certs. They don't even require revocation,
> if the CA has chosen to structurally mitigate the risk of misissuance by
> issuing Short-Lived certificates. There is nothing inflicting harm as an
> attempt to balance the possible operational disincentive a
> poorly-organized CA might have against performing appropriate
> remediation. Even the prospect of distrust is remedial: if the CA can't
> be trusted, they shouldn't be trusted.
>
> What you're proposing is that we add something punitive, which means
> that, I assume, you believe that we would need something to motivate
> action that the CA might otherwise not take without the prospect of that
> punishment. That motivational effect would not be equal across all CAs,
> which means that the calculus of be-careful-or-pay would not be a
> reliable means to get back to the important state: the WebPKI's corpus
> of valid certificates being trustable *in all their details* by relying
> parties. (I think that the focus on there being some commercial element
> to issuance and trust would not age well either, given that a large and
> quickly-growing portion of the web's certs are not issued on a
> commercial basis. Who would get credits from Microsoft for their recent
> misissuance? Another part of Microsoft?)
>
> And - I'm sorry this is so long, but whatever - the proposed punitive
> measure doesn't even make the PKI whole! You would still have certs
> floating around with incorrect information, but the subscriber would
> have a trivial credit against their next webinar about automation.

I understand how you get to that perspective, as there is a punitive
effect, undoubtedly. But then, forced revocation for an extra second of
validity just because the CP/CPS state "90 days" is also a punitive
effect. As you rightly state, there is no punitive intention to this
mechanism currently, and through my lense, there isn't one on the idea I'm
suggesting to explore.

To phrase it in what is hopefully a clearer way of expressing it, what I
am suggesting to explore is whether there would be benefits for Relying
Parties and as such the WebPKI ecosystem as a whole if we allowed CP/CPS
statements with the following structure:

Certificates conform to the BRs AND (X or Y).

Where X is a rather concrete statement about the certificate, and Y
describes what happens in a situation where a cert not conforming to X was
issued against the communicated intention of the CA. The totality of the
statement would still be enforced and subject to scrutiny as any CP/CPS
statement is today. And X and Y would have to be chosen in such a way to
not diminish the possible scrutiny.

The current model suggests something very binary. Certs conform, and if
they don't, they'll be revoked if detected. This already introduces a
margin of error.

My thought is that having a statement structured as proposed above gives
those Relying Parties who consume the CP/CPS for their trust decisions
additional data and signal. Indeed, it would be even less binary, but a
Relying Party going through the trouble of consuming the CP/CPS might use
this to gauge if this gives them sufficient trust in the likelihood of a
certificate conforming to X, as backed by Y, to accept the remaining
possibility of the certificate not conforming to X.

I don't presume what the outcome of that judgement would be for Relying
Parties, but I would assume it actually depends on their trust needs.

This is no more meant to introduce a punitive component than the forced
revocation is, instead it is meant to give Relying Parties additional,
possibly useful signal. It might be a bad idea, and I am thankful for
being able to discuss it here. But in my mind, it doesn't have the shape
you prescribe to it.

In fact statements of such form might well introduce additional signal to
the WebPKI ecosystem as a whole, and Browser Root Store Programs. The X
would show us what CAs consider possibly important constraints for their
subscribers, and the Y tells us their level of confidence in being able to
reliably adhere to it. I can't promise that it will be, but I'm proposing
it might be extremely useful, and by allowing CAs to make such statements
even in cases where they're not entirely confident would thus allow for
more (suggestedly useful) signal for Relying Parties, Browser Root
Programs and the WebPKI ecosystem in general.

> If it is in the CA's interest to provide additional voluntary
> constraints on their issuance, then it is because it is somehow in their
> interest to do so. That could be because they are chartered to improve
> the security of the web (would that they all were, tbh), or to
> distinguish themselves in a competitive marketplace. They should not
> pursue those things in ways that undermine the fundamental guarantee of
> the WebPKI: the attributes of the certificate are true. Either way,
> those constraints don't matter unless they can be relied on by RPs. And
> if they are to be relied on then they need to hold for all valid
> certificates, so?

Well they could be relied upon, but only in their totality:
Cert conforms to the BR AND (X or Y). That's not different. What's
different would only be the shape of the constraint.

> CAs can - maybe, we'll try! - exceed BRs on their own initiative,
> without any interaction with the BRs or its remedial mechanisms; just
> don't put it in the CPS. Maybe the CAB/F can give out ribbons for effort
> on the social event cruise, or provide badges for CAs to put on their
> web sites and in email signatures. "__321__ days since a
> commitment(*)-breaking issuance!"

And that's the point. As we have heard in this discussion, some CAs do. I
think that's something to welcome and support, and to possibly allow this
to take a form that's possibly even more accessible and useful to Relying
Parties than warm words and a ribbon on a CABF social event cruise.

> But, again, I think this is a molehill and not a mountain. At best an
> amusing distraction from pursuits that might actually *improve* the
> reliability of the WebPKI, rather than make it even more complicated for
> relying parties to navigate.

The state of WebPKI is indeed frustrating, and at times, the dynamic
between Broswers, the WebPKI community at large, mdsp participants and CAs
can definitely be described as antagonistic. And often, Browsers and the
community must stand firm to protect what's left of its integrity.

However, I am exploring if this is a case where that might not be
necessary. Even though it might not be our biggest problem, it might be
something where we can create progress without further fueling into that
antagonism, even if it might not be the biggest problem.

That said, I can appreciate that subscribers might not just fail to
understand why their certificates have to be revoked and replaced with new
ones which - as far as they can tell - seem substantially identical, but
even get angered by it. I can understand that CAs, when faced with such
subscriber's lack of understanding, ultimately know nothing better to do
than to point at the CA/Browser Forum, or Browser Root Programs. I can
understand that that's an uncomfortable position to be in. I can
understand that CAs want to avoid that through minimizing the commitments
they make. Again, my question is just: Can Relying Parties, Browser Root
Programs and the ecosystem as a whole benefit from making it possible for
CAs to make reduced but maybe still useful commitments in a transparent,
enforceable, and "weighted" fashion. And if we think that would be the
case - how would it need to be structured to avoid this turning into a
regression, because indeed, I wouldn't want to see actually meaningless PR
statements in a CP/CPS either.

Tobi

Ryan Hurst

unread,
Jun 13, 2025, 12:29:45 PM (6 days ago) Jun 13
to Jeremy Rowley, Mike Shaver, Tobias S. Josefowitz, dev-secur...@mozilla.org
I agree with all of these points.

As for 3, there are some inexpensive ways we could get to this end-state state if any CA is interested in discussing this please reach out.

FWIW this thread inspired me to write up this post: https://unmitigatedrisk.com/?p=1041

Matt Palmer

unread,
Jun 14, 2025, 11:10:56 PM (5 days ago) Jun 14
to dev-secur...@mozilla.org
On Thu, Jun 12, 2025 at 11:06:10AM -0400, Jeremy Rowley wrote:
> Well yeah - because they’re often written by a compliance person who often
> has a very loose connection or understanding of anything engineering
> related. Even if they have engineering acumen, then they’re often distant
> enough from the process that they can’t capture what the CA is doing
> accurately.

This phrasing makes it sound like this state of affairs is a natural
law, rather than a choice (deliberate or otherwise) to operate that way.

- Matt

Jeremy Rowley

unread,
Jun 15, 2025, 12:13:15 AM (5 days ago) Jun 15
to Matt Palmer, dev-secur...@mozilla.org
Given the number of bugs related to CPS errors, I think it is the state of affairs. Changing the CPS to be more directly tied to the operations is a difficult feat. However, I don't think it needs to be the natural law which is why I'm proposing a change in how CPS docs are constructed and work. 

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Mike Shaver

unread,
Jun 15, 2025, 9:36:22 AM (5 days ago) Jun 15
to Jeremy Rowley, Matt Palmer, dev-secur...@mozilla.org
On Sun, Jun 15, 2025 at 12:13 AM Jeremy Rowley <rowl...@gmail.com> wrote:
Given the number of bugs related to CPS errors,

Perhaps you’re in a position to answer this question: how many bugs *have* there been in the last few years related to CPS errors, and how many certs have been subject to revocation for that reason, pre-Microsoft?

Mike

Suchan Seo

unread,
Jun 15, 2025, 10:32:55 AM (4 days ago) Jun 15
to dev-secur...@mozilla.org, Mike Shaver, Matt Palmer, dev-secur...@mozilla.org, Jeremy Rowley
about vague CSP obscure what a CA really does,  Let's Encrypt had signed some 6 day certificate after editing CSR to less than 100 days. https://crt.sh/?id=16774666176
but nobody in this thread realizes that, because nothing on CSR indicates they'd likely sign something much shorter than that.
P.S Would those 6 day certificate considered misissuance if LE kept 90 days wording in CSR?
2025년 6월 15일 일요일 오후 10시 36분 22초 UTC+9에 Mike Shaver님이 작성:

Aaron Gable

unread,
Jun 16, 2025, 12:39:37 AM (4 days ago) Jun 16
to Suchan Seo, dev-secur...@mozilla.org, Mike Shaver, Matt Palmer, Jeremy Rowley
Yes, issuing a six-day cert would have been a violation of our CPS had we not previously changed its phrasing. Before issuing our first short-lived cert, we carefully reviewed our CPS to ensure that the new profile would not violate any of the constraints in that document.

And I'm sure several people in this thread realize that we have done so, given that we publicly announced it back in February (https://letsencrypt.org/2025/02/20/first-short-lived-cert-issued/).

Aaron

--
You received this message because you are subscribed to the Google Groups "dev-secur...@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-po...@mozilla.org.

Jeremy Rowley

unread,
Jun 16, 2025, 8:00:13 PM (3 days ago) Jun 16
to Mike Shaver, Matt Palmer, dev-secur...@mozilla.org
Good question. I went through the last year of bugs and found the ones listed below. Determining what is a CPS violation vs. a BR violation is difficult because so many BR violations are also a CPS violation (as a lot of CPS documents mirror the BRs). I split it up between profile errors (at the bottom) and CPS related issues (at the top), both of which would be solved by automated CPS generation and a shift to treat the CPS document as a technical disclosure instead of a contract.

https://bugzilla.mozilla.org/show_bug.cgi?id=1970567 - Failed to list the full revocation reasons in its CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1969842 - This is about T&Cs but since the T&Cs generally incorporate the CPS I thought I'd count it? 
https://bugzilla.mozilla.org/show_bug.cgi?id=1969036 - violates the CPS and the BRs 
https://bugzilla.mozilla.org/show_bug.cgi?id=1965808 - Conflicting info in the CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1965806 - Missing OID on T&Cs (which would incorporate the CPS)
https://bugzilla.mozilla.org/show_bug.cgi?id=1965804 - CPS clarity issues
https://bugzilla.mozilla.org/show_bug.cgi?id=1963778 - CPS unavailability
https://bugzilla.mozilla.org/show_bug.cgi?id=1963629 - CPR in CPS not working
 https://bugzilla.mozilla.org/show_bug.cgi?id=1962829 - policy document mis-paste
https://bugzilla.mozilla.org/show_bug.cgi?id=1962830 - Cert change not compliant with CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1955365 - Reused keys in violation of CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1954580 - OCSP not published in time. This violated the BRs but would also violate the CPS if such items were actually dictated by the CPS instead of just the BRs.
https://bugzilla.mozilla.org/show_bug.cgi?id=1948600 - outdated CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1942241 - CPR in CPS not accepting attachments
https://bugzilla.mozilla.org/show_bug.cgi?id=1938236 - CAA issue
https://bugzilla.mozilla.org/show_bug.cgi?id=1939809 - This violated the ETSI requirement but not the BRs I think? Which would make it a CPS violation.
https://bugzilla.mozilla.org/show_bug.cgi?id=1935393 - Failed to update CPS docs (note that the proposal would help remediate this by requiring automatic updates to CPS docs as things change). 
https://bugzilla.mozilla.org/show_bug.cgi?id=1933353 - violation of CPS on OCSP responses
https://bugzilla.mozilla.org/show_bug.cgi?id=1932973 - violation of CAA checking
https://bugzilla.mozilla.org/show_bug.cgi?id=1931413 - violation of onboarding SOP
https://bugzilla.mozilla.org/show_bug.cgi?id=1925106 - incorrect CP provided
https://bugzilla.mozilla.org/show_bug.cgi?id=1921573 - CPS issue on DN
https://bugzilla.mozilla.org/show_bug.cgi?id=1918380 - Business entity not permitted in CPS
https://bugzilla.mozilla.org/show_bug.cgi?id=1914911 - CAA disclosure issue
https://bugzilla.mozilla.org/show_bug.cgi?id=1904749 - CAA record issue
https://bugzilla.mozilla.org/show_bug.cgi?id=1904257 - Incorrect CPR address


I'm listing the profiles issues as well as the proposal would address this issue, or at least make these issues more readily identifiable. If CAs are required to provide the profile directly from the CA, the profile can easily be compared to the BRs and issues identified. Right now the profile may not match the CPS so the CPS will be compliant but the profile will not match the requirements. 
Profiles mismatch:

Jeremy Rowley

unread,
Jun 16, 2025, 8:35:57 PM (3 days ago) Jun 16
to Mike Shaver, Matt Palmer, dev-secur...@mozilla.org
I was trying to use AI to analyze CPS docs to see where interesting information might be. I've only looked at three different CAs so far, but figured I'd share the results:

Sectigo – 70% of the CPS is nearly identical to the BRs. The primary variation is in section 9 and the use of a reseller network. 
Sectigo does not use:
- Method 3.2.2.4.12 (Validating Applicant as Domain Contact)
- Method 3.2.2.4.21 (DNS Labeled with Account ID - ACME)

DigiCert – about 80% overlap in language with the BRs. The primary differences are that the DigiCert CPS covers public trust (not just TLS) and the legal section. 
Digicert does not use:
1)  Method 3.2.2.4.20 - TLS Using ALPN

2) Method 3.2.2.4.21 - DNS Labeled with Account ID - ACME

 

Comparing the two CPS docs together, the AI found they were about 85% similar on TLS. Excluding the business sections (section 1 and section 9), the CPS docs are 95% similar.

Let's Encrypt has about a 77% overlap with both DigiCert and Sectigo.  The major differences in the LE CPS are:
1) Business terms and the lack of OV certificates
2) Automation requirements for issuing certificates
3) No language around the use of RAs (because LE doesn't use RAs)

82% of all documentation is about how the CA matches the BRs. 

This is, of course, subject to some interpretation by the AI used and I haven't reviewed it in full. All CPS docs provide value in that they list the CPR, the CAA records used, and the BR methods permitted for validation. Is there a CPS I can look at that provides substantial additional information beyond the BRs? 

Ryan Hurst

unread,
Jun 17, 2025, 7:28:28 PM (2 days ago) Jun 17
to dev-secur...@mozilla.org, Jeremy Rowley, Matt Palmer, dev-secur...@mozilla.org, Mike Shaver

Hi Jeremy,

Thanks for pulling the Bugzilla data and BR-overlap figures. I would be very cautious about relying on this LLM analysis. Studies show over 80% hallucination rates on legal queries (Dahl et al., 2024), and even specialized models like Westlaw produce 17-33% unsupported assertions with one in six "very confident" answers being wrong (Magesh et al., 2024).

A CPS is essentially a contract densely woven with BR, RFC 5280, ETSI, root program, and national law references. If a model can invent a Supreme Court case, it can just as easily invent a BR exception or declare similarity that does not exist.

When reviewing these documents, we must consider the context of what they reference. CPSs often say "not stipulated," "as per XYZ," or "when not stated the [referenced standard] govern." This requires topological analysis within the broader regulatory framework that domain experts inherently apply. For example, a CPS might claim RFC 5280 compliance, but you need to verify that against the actual RFC. It's like legal document analysis where federal law can preempt local law.

Beyond that, two CPSs can look 80% identical yet behave very differently if one CA blocks non-compliant certs pre-issuance and auto-updates its CPS, while another relies on quarterly manual checks.

This is why I doubt these similarity figures are accurate, but even if they are, the larger question is whether CPSs should clone the BRs or describe actual practices. Some BR sections make sense to duplicate when you implement the corresponding mechanisms, but edits are often needed to keep documents cohesive and understandable.

The operational differences that really matter for compliance are often invisible to text-based analysis, which is why we need CPSs that actually represent practices with enough detail that outsiders can understand what promises are being made.

That is not to say that LLMs are not useful in the context of CPS analysis but I can say with confidence to do it reasonably well requires more than just a LLM.

Best, Ryan

Mike Shaver

unread,
Jun 18, 2025, 9:07:35 PM (yesterday) Jun 18
to Jeremy Rowley, Matt Palmer, dev-secur...@mozilla.org
Thanks for this–I genuinely appreciate the effort–but I think it's not quite the right analysis.

For evaluating the impact of a change to CPS-error revocation policy, we want to consider the set of some CPS-related misissuances that were *not* also BR issues. (And separately were not so serious that they would still require revocation after such a loosening, but I don't exactly know where that line is proposed to be drawn.)

A little birdie tells me that analysis of such incidents over the last three years will reveal a total under 50,000 for the number of certificates that have been revoked due to a CPS-breaking-but-not-BR-breaking misissuance. (Prior to Microsoft's misissuance, of course, but that wouldn't have been an issue if it had happened after Microsoft completed the CRL sharding deployment because of very wide adoption of automation by Microsoft's subscriber base [also Microsoft, fair enough].)

I have not seen and certainly not performed the analysis in question, but I'm willing to trust it nonetheless.

Mike

Jeremy Rowley

unread,
Jun 18, 2025, 9:11:32 PM (yesterday) Jun 18
to Mike Shaver, Matt Palmer, dev-secur...@mozilla.org
Okay but I'm not proposing a change in CPS-error revocation policy. I am proposing a change in the way CPS docs are generated. I'd like them all to move to github and be pulled directly from the CA systems, turning them into a technical document instead something human created (and mostly filled with - IMO - less useful information). For example, does anyone read Section 9? What good is that? I have no concerns with revoking for CPS errors but I think the current way CPS docs are done is error prone and too human-dependant. 

Mike Shaver

unread,
Jun 18, 2025, 9:13:58 PM (yesterday) Jun 18
to Jeremy Rowley, Matt Palmer, dev-secur...@mozilla.org
Ah! I complected some subthreads beyond my ability to keep straight.

Pretend I replied to Aaron or Roman about treating CPS errors as less urgent-to-remedy than other directly-expressed-in-the-certificate errors, if you would be so kind.

Mike

Jeremy Rowley

unread,
Jun 18, 2025, 9:20:19 PM (yesterday) Jun 18
to Mike Shaver, Matt Palmer, dev-secur...@mozilla.org
Sounds good. The 50k number sounds about right based on those bugs too. Most CPS errors do not require revocation as they are issues in policy instead of cert profiles. Just anecdotally, I think most of the CPS bugs related to missed timelines, largely around CRL and OCSP availability. The ones I saw that required revocation were all profile or CAA violations. 
Reply all
Reply to author
Forward
0 new messages