Recent Entrust Compliance Incidents


Ben Wilson

May 7, 2024, 10:59:30 AM
to dev-secur...@mozilla.org

Dear Mozilla Community,

Over the past couple of months, a substantial number of compliance incidents have arisen in relation to Entrust. We have summarized these recent incidents in a dedicated wiki page: https://wiki.mozilla.org/CA/Entrust_Issues. In brief, these incidents arose out of certificate mis-issuance due to a misunderstanding of the EV Guidelines, followed by numerous mistakes in incident handling (including a deliberate decision to continue mis-issuance), which have been compounded by a failure to remediate the issues in a timely fashion in line with well-established norms and root store requirements.

Our preliminary assessment of these incidents is that while they were relatively minor initially, the poor incident response has substantially aggravated them and the progress towards full remediation remains unacceptably slow. This is particularly disappointing in light of previous incidents in 2020 (#1651481 and #1648472), which arose out of similar misunderstandings of the requirements, similar poor decision-making in the initial response, and lengthy remediation periods that fell well below expectations. Entrust gave commitments in those bugs to address the root problems through process improvements, and it is concerning to see so little improvement 4 years later.

In light of these recent incidents, we are requesting that Entrust produce a detailed report of them. This report should cover in detail: 

  • The factors and root causes that led to the initial incidents, highlighting commonalities among the incidents and any systemic failures;

  • Entrust’s initial incident handling and decision-making in response to these incidents, including any internal policies or protocols used by Entrust to guide their response and an evaluation of whether their decisions and overall response complied with Entrust’s policies, their practice statement, and the requirements of the Mozilla Root Program;

  • A detailed timeline of the remediation process and an apportionment of delays to root causes; and 

  • An evaluation of how these recent issues compare to the historical issues referenced above and Entrust’s compliance with its previously stated commitments. 

Finally, Entrust’s report should include a detailed proposal on how it plans to address the root causes of these issues. In light of previous guarantees given by Entrust in 2020 to ensure speedy remediation in future incidents, this proposal should include:

  • Clear and concrete steps that Entrust proposes to take to address the root causes of these incidents and delayed remediation;

  • Measurable and objective criteria for Mozilla and the community to evaluate Entrust’s progress in deploying these solutions; and

  • A timeline for which Entrust will commit to meeting these criteria.

We strongly recommend that Entrust go beyond their existing commitment to offer systematic, automated solutions for effective remediation, like ACME ARI, and that it also include clear and measurable targets for the adoption of these tools by new and existing subscribers.

This report should be submitted to the Mozilla dev-security-policy mailing list for evaluation by the community and Mozilla, who will weigh whether Entrust’s report presents a credible and effective path towards re-establishing trust in Entrust’s operation. Submission should be no later than June 7, 2024.

We thank community members for their engagement on these issues and look forward to their feedback on Entrust’s report and proposed commitments.

 Thanks,

Ben Wilson

Mozilla Root Program

Watson Ladd

May 9, 2024, 10:14:00 PM
to Ben Wilson, dev-secur...@mozilla.org
Could we add a section for geographical incidents? This is slightly
outside your time window, but I think the series here has some
uncanny echoes of the ones in your window.

https://bugzilla.mozilla.org/show_bug.cgi?id=1658792
https://bugzilla.mozilla.org/show_bug.cgi?id=1658794
https://bugzilla.mozilla.org/show_bug.cgi?id=1802916
https://bugzilla.mozilla.org/show_bug.cgi?id=1804753
https://bugzilla.mozilla.org/show_bug.cgi?id=1867130

--
Astra mortemque praestare gradatim

Ben Wilson

May 10, 2024, 12:28:00 PM
to Watson Ladd, dev-secur...@mozilla.org
Here are draft summaries of the additional historic incidents. I'll be adding these to the Entrust Issues page: https://wiki.mozilla.org/CA/Entrust_Issues

Invalid data in State/Province Field -

https://bugzilla.mozilla.org/show_bug.cgi?id=1658792

It was initially discovered that Entrust had issued 395 OV SSL certificates to a large international organization with “NA” for the state/province information. Entrust worked on a drop-down list to prevent the error. Certificate revocation would not occur within established timeframes, so Bug #1658794 for delayed revocation was opened. 

Late Revocation for Invalid State/Province Issue -
https://bugzilla.mozilla.org/show_bug.cgi?id=1658794

This is the delayed revocation bug related to Bug #1658792, above. Entrust said that when educating large institutions about rapid revocation, factors include who owns a certificate, where it is deployed, and the type of system or application that requires the certificate.  It also said that it was advocating automation with such institutions to help speed up certificate replacement and to minimize human error.

EV TLS Certificate incorrect jurisdiction -

https://bugzilla.mozilla.org/show_bug.cgi?id=1802916

Entrust mis-issued 322 EV certificates with the wrong state and locality jurisdiction fields due to complex data entry processes. Entrust implemented a different automated dropdown system for jurisdiction selection. Certificate revocation would not occur within established timeframes, so Bug #1804753 for delayed revocation was opened. 

Delayed Revocation for EV TLS Certificate incorrect jurisdiction -

https://bugzilla.mozilla.org/show_bug.cgi?id=1804753

This is the delayed revocation bug related to Bug #1802916, above. Entrust listed 8 Subscribers who were pushing back on immediate certificate revocation and the reasons given (e.g. extensions granted due to end-of-year freezes). Entrust committed to “continue to develop and extend methods for automatic certificate renewal.”

Jurisdiction Locality Wrong in EV Certificate -

https://bugzilla.mozilla.org/show_bug.cgi?id=1867130

Two EV TLS Certificates were mis-issued due to human error in the Jurisdiction Locality field. (The incident revealed 340 additional accounts needing similar updates.) Entrust said it would enhance its linting processes to include possibly using an external service to validate locality data against verified country data.

SHA-256 hash algorithm used with ECC P-384 key -

https://bugzilla.mozilla.org/show_bug.cgi?id=1648472

A Mozilla policy was adopted to require hashing with SHA-384 for an ECC P-384 key. Existing CAs using SHA-256 were not re-configured when Mozilla adopted this policy.  This incident revealed a serious gap in taking new requirements and implementing them. Ryan Sleevi noted that linting was just a safety net and not a systemic solution. Entrust was also criticized for the lack of detail in its incident report and its decision to not revoke the certificates.

Entrust committed to improving its monitoring and implementation of policy changes to prevent similar incidents. Ryan set forth a number of proactive systemic corrections that Entrust needed to take, rather than taking a reactive stance on matters of non-compliance.

Entrust committed to rigorous review of certificate profiles, browser policy revisions, and industry developments. As a final comment, Ryan said, “My big concern is, going forward, we see incident reports from Entrust take a more systemic, holistic response, like Comment #16, to try and cover the scenarios, and to provide sufficient detail about the situation and its failures to understand how those relate. The goal isn't to make CAs wear proverbial sackcloth, it's to try and make sure we're understanding how things go wrong, so that we can effectively collaborate on identifying solutions to avoid that going forward.”
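As an illustration of the kind of check at issue in this incident, here is a minimal pre-issuance lint sketch. The function name `check_signature_hash` and the lookup table are hypothetical; the P-384/SHA-384 pairing is the rule described above, and the P-256/SHA-256 row is an assumption added for completeness. In keeping with the point that linting is only a safety net, this shows the rule itself, not a systemic fix.

```python
# Hypothetical lint rule: the hash used in a signature must match the
# curve of the signing (CA) key. Only the P-384 row comes from the
# incident summary; the P-256 row is an assumption for illustration.
REQUIRED_HASH = {
    "P-256": "SHA-256",
    "P-384": "SHA-384",
}


def check_signature_hash(signer_curve, signature_hash):
    """Return None if the pairing is allowed, else an error string."""
    expected = REQUIRED_HASH.get(signer_curve)
    if expected is None:
        return f"{signer_curve}: no rule defined in this sketch"
    if signature_hash != expected:
        return f"{signer_curve} key must sign with {expected}, got {signature_hash}"
    return None


# The configuration described in the incident: P-384 key, SHA-256 hash.
print(check_signature_hash("P-384", "SHA-256"))
# The compliant configuration.
print(check_signature_hash("P-384", "SHA-384"))
```

A check like this only catches the error if the table is updated when policy changes, which is the gap the incident exposed.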

Late Revocation due to SHA-256 hash algorithm -

https://bugzilla.mozilla.org/show_bug.cgi?id=1651481

This bug is related to Bug #1648472. Entrust issued TLS certificates using ECC P-384 keys hashed with SHA-256, contrary to Mozilla policy requiring SHA-384 for hashing. Entrust’s initial decision was to allow certificates to expire naturally without revocation, but this was revised with a decision to revoke all affected certificates. Entrust committed to: filing an incident report within one business day for future incidents; filing late revocation incident reports within the required 24 hours or 5 days, as applicable; and advising Subscribers about revocation within 24 hours or 5 days, or providing an explanation if they are unable to meet such timeframes. Entrust was told it needed to align its revocation procedures more closely with the Baseline Requirements and Mozilla’s policy, especially in providing a detailed rationale for any delays in revocation on a per-subscriber basis and ensuring timely revocation in line with the Baseline Requirements.

 

George

May 10, 2024, 12:54:27 PM
to Ben Wilson, Watson Ladd, dev-secur...@mozilla.org
Although it was not mentioned in the original bug, it may be worth adding that the certificates in bug 1867130 were also not revoked within 5 days of discovery. Entrust might've based the start of the 5-day deadline on the time the "Director of compliance confirmed investigation conclusions to support team" at 2023-11-21 15:00 UTC, with all certificates being revoked by 2023-11-26 14:50 UTC, but if that was the case I don't think it's correct.
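To make the timing concrete, here is a minimal sketch of the deadline arithmetic using the two timestamps above. The helper names are hypothetical, and the one-hour-earlier start is purely illustrative, not a claim about when discovery actually happened.

```python
from datetime import datetime, timedelta, timezone

# The 5-day revocation window discussed in this thread.
REVOCATION_WINDOW = timedelta(days=5)


def revocation_deadline(clock_start):
    """Latest compliant revocation time for a given clock start."""
    return clock_start + REVOCATION_WINDOW


def is_late(clock_start, revoked_at):
    """True if revocation happened after the deadline."""
    return revoked_at > revocation_deadline(clock_start)


# Timestamps quoted above (UTC).
confirmed = datetime(2023, 11, 21, 15, 0, tzinfo=timezone.utc)
revoked = datetime(2023, 11, 26, 14, 50, tzinfo=timezone.utc)

elapsed = revoked - confirmed
print(elapsed)                      # 4 days, 23:50:00
print(is_late(confirmed, revoked))  # False

# Illustrative only: if the clock started even one hour earlier,
# the same revocation time would miss the deadline.
print(is_late(confirmed - timedelta(hours=1), revoked))  # True
```

Measured from the confirmation time the revocation lands ten minutes inside the window, so any earlier clock start by more than ten minutes makes it late.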

Ben Wilson

May 10, 2024, 1:08:50 PM
to George, Watson Ladd, dev-secur...@mozilla.org
Added "Although not expressed in the bug, it appears that certificate revocation was delayed as well."

Chris Bailey

May 11, 2024, 3:04:24 PM
to 'Ben Wilson' via dev-security-policy@mozilla.org

To Ben Wilson and the Mozilla Community:

 

I want to acknowledge your letter and the input from you and the community. We agree that we have go-forward opportunities to improve.

 

To that end, I want to confirm our intent to provide a full written response to you and the community prior to June 7. Until then, please contact me directly with additional questions or feedback.

 

Sincerely,

Chris Bailey

VP-Digital Certificates

Entrust


Wayne

May 11, 2024, 6:37:52 PM
to dev-secur...@mozilla.org
I can't speak for everyone, but in an issue of public trust, asking for private feedback and concerns is not helping matters.

One of the prevalent topics that has come up in these issues is shorter certificate lifespans, certificate automation, and how Entrust are working very hard with their customers. I'm very curious whether Entrust can quantify this in any way.

Taking a step back and just looking at their public statements regarding lifespans, automation and ACME should give us an idea of their internal viewpoint and how this topic is presented to customers. Outside of the first issue I won't be quoting bugzilla, it's there solely to provide context as I can't see an earlier point that automation was promised.

Let's dive in, sources all provided if there are any questions about the rough transcript or context.

1: 2023-03-27: Entrust: Delayed Revocation for EV TLS Certificate incorrect jurisdiction
https://bugzilla.mozilla.org/show_bug.cgi?id=1804753#c8
---
Late revocations are base primarily by Subscribers which have not implemented automation. We have increased our efforts to extend implementation of ACME. We are also considering implementing the ACME Renewal Information (ARI) Extension, which allow the certificate to be automatically renewed before the revocation date.

In the shorter term, we will provide Subscribers with automated reminders on the revocation date for miss-issued certificates. We will plan to allow short extensions, based on the severity of the miss-issuance.
---



2: 2023-04-21: Google’s 90 day proposal for TLS certificates
https://www.entrust.com/blog/2023/04/googles-90-day-proposal-for-tls-certificates/
---
Even if CAs and other browsers don’t share Google’s objectives, there is a chance that Google could unilaterally make this change in its root program and force the entire industry in this direction in a time of their choosing. We hope that browsers will not make this decision unilaterally, but instead allow the decision to be made with broad industry and website owner consensus.

Another issue is that Google has presented no public research or factual data showing that such a change to the ecosystem is necessary or useful in many use cases. We believe there will be much discussion before a 90-day ballot will pass at the CABF as several CAs have indicated that a requirement for 90-day certificates might have far-reaching implications. There have also been several EU governmental bodies concerned over the market and competitive implications of Google’s proposal and the impact on eIDAS Qualified Website Authentication Certificate objectives, which are now being reasserted in the EU’s update of its eIDAS legislation.

Entrust does not believe that a maximum 90-day limit for TLS certificate lifetimes is the only method to drive automation and the deployment of ACME. Additionally, Entrust doesn’t believe that ACME is the only method for automation, or that it would be accepted by some of the most complex subscriber secure server deployments. Rather, we believe subscribers should be encouraged to deploy automation, but do not need to be discouraged by the cost and complexity of certificates with 90-day maximum validity.

While Entrust is not currently in favor of a mandatory 90-day certificate limit, we have no objection to 90-day certificates if that is what a website wants to use. We are always working to improve or extend its validation, issuance, and management processes, including greater use of automation through integrations with certificate lifecycle management (CLM) solutions such as Entrust Certificate Hub, AppViewX, Venafi, and ServiceNow, as well as automation through ACME v2, CMP, SCEP, and other new methods.

We understand that this Google proposal may be causing our customers considerable concern. In accordance with Google’s instructions on its Chrome Root Program Policy, we encourage customers to direct any questions or input regarding the Google proposal to chrome-ro...@google.com; please feel free to share a copy with us at ecs.fe...@entrust.com.
---



3: 2023-07-26: Entrust Cybersecurity Institute Explains: 90-day TLS/SSL Certificate Lifecycle
https://www.youtube.com/watch?v=GcKunPD5SRw
---
0:41-1:48
Today we're using certificates that are generally valid for about a year. Technically, according to the requirements, we can go to 398 days to give you a bit of room so that you can have one day of the year at which you renew your certificates. That is something that is easy to remember, scheduling your agenda as a yearly renewal of those certificates when they need to happen.

When they're moving to shorter-lived certificates, such as the proposed 90 days, this means that you need to do this now four times a year in a window of 90 days.

Moving to 90-day certificates means that you're practically going to renew your certificates probably every 60 days. Meaning that in the end, you have to replace your certificates four to six times a year, depending on what window you allow to renew that certificate.

90-day certificates will have a few benefits. So one is that they will drive automation.


2:51-3:30
That doesn't [mean to] say that there's no problems and that 90-day certificates are actually going to solve this. Crypto agility is important, but that means more than just automating your certificate lifecycle.

For example, if the CA would give you an indication through the automated system that your certificate must be replaced, how does currently the CA know that your system is capable of running a certain algorithm? Have you updated your libraries?

So we still have a long step ahead of us really to make that crypto agility a reality.

4:41-end

We're working with the Internet Engineering Task Force, where Anthos(?) created a proposal for a sort of auto-discovery based on the most used automation, the ACME Auto Certificate Management Environment.

And that would help our customers actually to simplify this mechanism of renewing certificates in different platforms with different systems, without having to reconfigure every individual system to work with the certificate authority and the type of certificates they're dealing with. Together with[sic] very important is one of the risks of automation, is that you're not doing it as a human.

If you do a process, if you follow a step-by-step process that is defined and tested, then you know the outcome, because you're following the steps. But in automation, something might go wrong. And how do you know? You need to make sure that systems are notifying you as someone is watching the notifications.

That's also why in this similar proposal, we've included the mechanism of backup configuration. So if one process would fail, there is another process that can be followed. But other things are still in development. And we will see how that turns out. But the ecosystem seems to be supporting it.
---



4: 2023-07-26: Entrust Cybersecurity Institute Explains: Zero Trust and TLS/SSL Certificate Management
https://www.youtube.com/watch?v=nd0QCvu2F_E
---
1:20-2:18

Well, first I would say we need to adopt automation. But then at the same point, I would point them at the risk of automation. Where processes by humans are easily manageable and can have multiple steps to ensure they flow correctly. Without automation, this could be more challenging.

For example, if you implement automation for your TLS certificates, this means that you need to have a credential for your certificate authority. You need to also have a credential, for example, for your domain name provider, your DNS system. And that's all needed to prove domain control and to make sure that a certificate with the right identity is issued for your endpoints.

But what if these credentials get compromised? Look at the DNS system.
---



5: 2023-08-14: Short-lived Certificates finally approved
https://www.entrust.com/blog/2023/08/short-lived-certificates-finally-approved/
---
After more than 10 years, short-lived TLS certificates are finally permitted by the browsers based on CA/Browser Forum ballot SC-063. Gerv Markham started a short-lived certs discussion in 2014, where he advised he was reviewing the 2012 CA/Browser Forum discussion on the topic. He advised that short-lived certificates was a plank of the Mozilla revocation strategy. There was also a paper prepared in 2012 called Towards Short-Lived Certificates. The paper stated OCSP is as good as dead, so the CAs should issue certificates with a very short lifetime. I suppose no one thought it would take so much time.

Short-lived certificates are designed to help address a certificate revocation issue. Back in 2012, Adam Langley discussed the seat-belt issue, where it works fine, but snaps when you crash. This was based on the fact the browser implements soft-fail revocation checks where the CRL or OCSP response is ignored.
---



6: 2024-03-26: The Path to 90-Day Certificate Validity: Challenges Facing Organizations
https://www.entrust.com/blog/2024/03/the-path-to-90-day-certificate-validity-challenges-facing-organizations/
---
Certificate lifespan is getting shorter

Over the years the cybersecurity industry has undergone notable transformations requiring organizations to implement new best-practice standards, often at a short notice.

In 2020, Apple unilaterally opted for shorter TLS certificate durations, reducing them from three years to 398 days, thereby increasing the burden on certificate management. Subsequently, Apple introduced shorter lifespans for S/MIME certificates at the start of 2022. In the past year, both code signing and S/MIME users faced additional alterations, while Google proposed transitioning to 90-day certificates, a subject we have explored in our latest webinar. Anticipating further changes, particularly with the rise of artificial intelligence (AI) and the looming risk of post-quantum (PQ) computing, organizations must enhance their agility.

Today, TLS/SSL certificates are typically valid for about a year, according to the Certification Authority Browser (CA/B) Forum requirements. This yearly renewal cycle is convenient for organizations to manage and schedule. However, transitioning to shorter-lived certificates, like the proposed 90-day validity period, will require more frequent renewal efforts. With 90-day validity, organizations will need to renew certificates four times every 12 months within that timeframe. In practice, due to the need for buffer time, certificates may need to be renewed every 60 days. Ultimately, this change could lead to replacing certificates more than six times every 12 months, depending on the renewal window chosen.
---
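The renewal-frequency figures quoted in items 3 and 6 can be checked with a quick sketch. The function name is hypothetical and a 365-day year is assumed.

```python
def renewals_per_year(interval_days, year_days=365):
    """Average number of renewals per year for a given renewal interval."""
    return year_days / interval_days


# Renewal intervals mentioned in the quoted material.
for label, days in [("398-day maximum", 398),
                    ("90-day validity", 90),
                    ("60-day renewal window", 60)]:
    print(f"{label}: {renewals_per_year(days):.2f} renewals per year")
```

This gives roughly 0.92, 4.06, and 6.08 renewals per year respectively, consistent with the "four times" and "more than six times" figures in the quotes.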

Apologies that some of those got long, I wanted to preserve as much context as possible given how little material we have to work with.

I sincerely ask anyone if they can find any further communication on these topics by Entrust. Their helpdesk has tutorials on specific software setups, but I'm not seeing any actual push for their subscribers to do anything.

It would be very beneficial for Entrust to provide us with any information on what they've been communicating to their customers to promote shorter certificate lifespans and automation.

- Wayne

Amir Omidi (aaomidi)

May 15, 2024, 11:06:49 AM
to dev-secur...@mozilla.org
I wanted to also add that I'd like Entrust to address why they don't stop certificate issuances when they find out they're misissuing certificates?

As part of my series on Entrust, in Part 2 I found a concerning issue from Entrust that went unnoticed at the time, which shows a pattern of gross negligence when it comes to incident response: Entrust: SHA-256 hash algorithm used with ECC P-384 key

In this incident, Entrust discovers that they're misissuing certificates on 2020-06-17. Despite finding out about this incident, they continue to misissue certificates all the way till 2020-06-24: https://crt.sh/?id=2998515551&opt=zlint

There have been a couple of examples of Entrust making this decision in the past, such as: Entrust: Printable String Constraint Failure. Similar to the previous incident, they did not disable issuance on their systems that were capable of misissuing certificates (emphasis mine):
>      Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

        The CA software has a bug which encodes quotation marks in the organization field using PrintableString instead of UTF8String. This software has not been fixed at this time.


And more recently, we saw this behavior at the start of this saga: Entrust: EV TLS Certificate cPSuri missing

Wayne

Jun 7, 2024, 6:41:03 AM
to dev-secur...@mozilla.org
As an advance warning to Entrust on supplying the report, please keep in mind that MDSP has a moderation queue for new members. If the report is to be sent to the mailing list today, then please make sure to use an account that has been pre-approved, or otherwise submit it early enough that a moderator can approve it to arrive on time.

Ben Wilson

Jun 7, 2024, 12:51:48 PM
to dev-secur...@mozilla.org
All,
I have created Bugzilla Bug#1901270 as an Entrust "meta" bug for gathering all action items that will be included in their report. 
Please don't comment yet in that bug until Entrust has submitted its report and populated the Bugzilla bug with their action items.
Thanks,
Ben

Bruce Morton

Jun 7, 2024, 3:53:10 PM
to dev-secur...@mozilla.org, Ben Wilson
Please respond here with any comments you may have on our report or action items. We will track our progress against the action items list in Bugzilla under bug 1901270.
Mozilla Report - 7 Jun 2024.pdf

Amir Omidi (aaomidi)

Jun 7, 2024, 4:54:09 PM
to dev-secur...@mozilla.org, Bruce Morton, Ben Wilson
Can you please explicitly state answers to the following questions? I may have more in the future, but these are the immediate ones that come to mind:


Question 1:
Why did section 2.5.1 of your report ignore: https://bugzilla.mozilla.org/show_bug.cgi?id=1890898 - Is this because your legal counsel told you this isn't an incident?


Question 2:
In the report, for the cpsUri incident you state:

> We continued to issue affected EV certificates after reporting the incident because we believed this to be an error or unintended discrepancy between the CA/Browser Forum’s TLS Baseline Requirements and the Extended Validation Guidelines. Following feedback from the Bugzilla community, we fixed the certificates on March 18 and began notifying customers and revoking affected certificates.

This does not answer my question above which stated:

> I wanted to also add that I'd like Entrust to address why they don't stop certificate issuances when they find out they're misissuing certificates?

A lot of this saga started with that decision to not stop issuance immediately and fix the certificate profile. Beyond that, I'd also like to challenge the "Following feedback from the Bugzilla community" statement made. You only took action after Ryan Dickson from the Chrome Root Program chimed in. In fact, you actually initially said you don't want to take action: https://bugzilla.mozilla.org/show_bug.cgi?id=1883843#c4, and then once again: https://bugzilla.mozilla.org/show_bug.cgi?id=1883843#c10

When making decisions like this, and making statements like you made in that bug, I expect a CA which (your words):

> Entrust has been a certification authority for over two decades, and we have strived to be a positive influence on development and advancement of our industry’s standards since the CA/Browser Forum was created in 2005. We recognize that these incidents were unnecessary and based on our own mistakes or misjudgments – in this we fell short of the high standard we set for ourselves. We have thoughtfully considered the community’s questions and comments, and this input is reflected in our plans.

says that about themselves knows that they need to show precedent for making that decision. I was expecting some supporting documents for justifying making that decision to be shown in this report. Can you please confirm if Entrust simply "made it up" when it made that report without consulting any prior incidents, or any supporting documents?

Question 3:

In your action items, you state: "Change organizational structure to enhance support, governance, and resourcing for compliance team". What does this mean? This is a very qualitative statement and I'd like some more details on this. What was it before, how did you identify what needed to change, what changes did you execute?

Question 4:

In your action items, you state: "Expand use of linters pre-issuance for all certificate types" - what certificate types weren't being linted pre-issuance? I believe that Entrust had made a promise to lint all pre-issuances in the past (https://bugzilla.mozilla.org/show_bug.cgi?id=1448986#c7) - is that promise different from this one? If so, how is it different?

Question 5:
 
Has Entrust broken this promise they made four years ago? https://bugzilla.mozilla.org/show_bug.cgi?id=1651481#c6 - If not, can you explain how this promise doesn't apply to the recent failed/delayed revocation incidents? If so, can you please help us understand why we should believe you now?

Thanks!

Wayne

Jun 7, 2024, 5:30:31 PM
to dev-secur...@mozilla.org
Okay, for my approach I will first note select parts of this report that jumped out at me:

The highlights section only discusses topics that Entrust should have already been doing, so promising any improvement on that front is the bare minimum.

>2.1.1
>...
>This mis-issuance affected 26,641 EV certificates.
This is factually incorrect: I account for 26,668 impacted certificates and even stated as much in the 20 Incident Revocation Breakdown spreadsheet. I can understand the confusion, as Entrust themselves could not arrive at the correct figure multiple times throughout their own incident, and still cannot.

>Note: During our investigation of this issue, we noted that a subset of 1,975 EV certificates were also issued without the Entrust EV policy identifier (OID), based on our interpretation of the ballot update.
This is also a miscount, presumably due to the original figure being 1963 + 6 certs on a test site that are being double-counted.

>1.
>As of June 7, 2024, all affected certificates related to these incidents have expired or been revoked.
>...
>2.1.1
>CPS Updates: As we updated our EV Certificate profile to resolve bug #1883843, we made two oversights. First, we did not immediately update our CPS to reflect the changes made to the EV certificates on March 18; the issue was fixed in approximately three days. There were 9,045 certificates affected (aside from those already scheduled for revocation as part of bug #1883843), as described in bug #1887753.
>
>Additionally, as we updated the CPS to add the CPS URI qualifier to the EV certificate profile, we inadvertently added this to the OV TLS certificate profile instead. While these certificates were issued in accordance with the TLS Baseline Requirements and our intended certificate profile, there were 6,008 OV certificates issued before the CPS was corrected, as described in bug #1890896.
Those 6000+ certificates are still pending revocation, with Entrust having previously said they refuse to revoke.

Will they now be revoked?

>2.1.3 Root Cause Analysis
>We misinterpreted the requirement to keep cPSuri in EV certificates, and initially chose not to revoke, believing that the EV Guidelines were in error. Change procedures did not include appropriate review and quality assurance to ensure that updates to TLS Baseline Requirements and EV Guidelines were followed. Additionally, we did not use linters that included the Ballot SC-62v2 changes; we could have used pkilint, which was only in pre-release at that time.
This is an RCA of the technical cause, but not the refusal to listen to anyone telling you that you were wrong. What is not addressed is the continued mis-issuance either.


>2.2.2 Associated Bug List
>Open Date - Bugzilla - Description
>2024-04-03 - 1897630 - EV Locality Errors
Am I missing something or was this opened on 2024-05-19? Note also that the issue in question wasn't officially confirmed internally to Entrust until 2024-05-16. On checking later, this is also within the appendix, so I presume these dates are not from Bugzilla but from an internal tracker on Entrust's end.

Due to that I propose that Entrust had this incident open internally for over a month before reporting it on Bugzilla as the timeline also suggests.

>2.3.4 Improvement Plan
>...
>Automate CPR form to collect all required information at the outset from the reporter rather than relying solely on email
This goes back to policy issues discussed for years now, see:
https://github.com/mozilla/pkipolicy/issues/98
https://github.com/cabforum/servercert/issues/201
https://bugzilla.mozilla.org/show_bug.cgi?id=1650234

>2.4.1
>One of the reporters of the EV cert issue alerted Entrust that the Problem Reporting Mechanism in its CCADB entry was misaligned with its CPS and contained outdated information (bug 1894111).
To clarify, at no time have I reported an EV cert issue to Entrust per their PRM. I am unsure how this got recorded internally on Entrust's end this way. Perhaps they flag anyone commenting on an incident as a reporter?

>2.5.1
>...
>In both incidents, delayed revocation was granted to subscribers who submitted specific requests for exceptions, primarily on the basis of avoiding disruption to critical operations.
This goes against all previous statements that the systems were critical infrastructure and harm would occur.

>2.5.3
>...
>When we addressed delayed revocation in 2020, we made several statements regarding how we would avoid delayed revocation in the future. We have continued to educate our subscribers regarding the need for greater agility and resilience, but we did not make the progress necessary to enable five-day revocation without large-scale disruption to their critical operations. Based on this root cause analysis, we believe the improvement plan below will address the issues identified.
I see no improvement plan that is materially different from prior claims that Entrust has made.

>2.5.4
>...
>We will work with our subscribers to ensure awareness and minimize delayed revocation requests; such requests will be handled only on a case-by-case basis, and only under limited circumstances. We will have this plan in place by the end of June.
I am confused over how your current plan differs from this, or how it will result in revocation occurring within the required timeframes. All that I am seeing is a commitment to offer your subscribers more 'choice' in their revocation timeframe. This is fundamentally misunderstanding the role revocation plays, and that the subscriber is not offered any choices.

>Policy Updates: We ensure that policies are updated and that they are communicated to subscribers. We are considering ways to increase visibility of the CA’s right to revoke certificates on short notice beyond our contract language. We also will add a warning to manual order pages and related emails to ensure subscriber understanding of required timelines.
I was under the impression Entrust did all of this already and have been actively educating subscribers for years as a leader in the field?

>Advancing ACME: We are supporting the ongoing work to automate certificate issuance and management. Entrust experts have authored two IETF drafts around ACME auto discovery, which will help to increase automation adoption by subscribers using public certificates.
Can you make any guarantees that ACME will be a requirement for subscribers going forward, and that they will not be charged extra for using these systems?

>Driving customer adoption of automation: We believe automation is critical to enable ongoing resilience. We have begun campaigns to urge subscribers to adopt automation solutions from Entrust or other providers, including offering our CertHub certificate lifecycle management tool for 12 months at no charge. We are looking at additional ways to drive customer adoption of solutions that enable them to minimize disruption in the event of a five-day or 24-hour revocation.
Offering the first year free to encourage tie-in answers that question, I suppose. It is not on the subscriber to handle 24h or 5d revocation - it is the CA's duty, and so far this report does not understand that.

Now that that is done, let's consider what the report asked for, following the same bullet points as the start of this thread:
  • I can see the most minimal factors and root causes of initial incidents, and likewise for commonalities. No attempt exists that I can see to address or recognize systemic failures that caused these to occur, nor do any items address this going forward;
  • I see zero reference to any internal policies or protocols used by Entrust, and due to this no evaluation of their internal decision-making;
  • I see no detailed timeline of the remediation process, nor an apportionment of delays to root causes;
  • I see no mention of historical issues and how they relate to the more recent incidents Entrust has instead focused on.
Now onto the other 3 bullet points:
  • I do not personally see clear and concrete steps to address the root causes even as identified, only surface-level reviews that put us back where we started with the technical measures addressed as already required;
  • I see no measurable and objective criteria for progress, the majority is internal to Entrust and factors in issues they insist on NDAs and confidentiality over;
  • Although if we only consider their action items in a vacuum at least the last one is perhaps addressed?
To that end at most 2 out of 7 elements isn't that bad considering. The strong recommendation on going further than ACME ARI was never addressed either.

I'm sure others will comment, but these are my first impressions in a vacuum.

Watson Ladd

unread,
Jun 7, 2024, 6:42:34 PMJun 7
to Bruce Morton, dev-secur...@mozilla.org, Ben Wilson
Dear Bruce,

This report is completely unsatisfactory. It starts by presuming that
the problem is 4 incidents. Entrust is always under an obligation to
explain the root causes of incidents and what it is doing to avoid
them as per the CCADB incident report guidelines. That's not the
reason Ben and the community need this report. Rather it's to go
beyond the incident report to draw broader lessons and to say more to
help us judge Entrust's continued ability to stay in the root store.
The report falls short of what was asked for, in a way that makes me
suspect that Entrust is organizationally incapable of reading a
document, understanding it, and ensuring each of the clearly worded
requests is followed. The implications for being a CA are obvious.

To start, Ben specifically asked for an analysis involving the
historical run of issues and a comparison. I don't see that in this
report, at all. The list of incidents only has ones from 2024 listed,
there's no discussion of the two issues specifically listed by Ben in
his email.

Secondly the remedial actions seem to be largely copy pasted from
incident to incident without a lot of explanation. Saying the
organizational structure will be changed to enhance support,
governance and resourcing really doesn't leave us with a lot of
ability to judge success or explain how the changes made (sparse on
details) will lead to improvements. Similarly process weaknesses are
not really discussed in ways that make clear what happened. How can I
use this report if I was a different CA to examine my organization and
see if I can do better? How can we as a community judge the adequacy
of the remedial actions in this report?

Section 2.4 I find mystifying. To my mind there's no inherent
connection between a failure to update public information in a place
it appears, a delay in reconfiguring a responder, and a bug in the CRL
generation process beyond the organizational. These are three separate
functions of rather different complexity. If there's a similarity it's
between the latter two issues where there was a failure to notice a
change in requirements that required action, but that's not what the
report says! Why were these three grouped together, and not others?
What's the common failure here that doesn't exist with the other
incidents?

If this is the best Entrust can do, why should we expect Entrust to be
worthy of inclusion in the future? To be clear, there are CAs that
have come back from profound failures of governance and judgement. But
the first step in that process has been a full and honest accounting
of what their failures have been, in a way that has helped others
understand where the risks are and helps the community understand why
they are trustworthy.

Sincerely,
Watson Ladd

Wayne

unread,
Jun 8, 2024, 4:20:10 PMJun 8
to dev-secur...@mozilla.org
While Entrust have not provided details on their incident handling and decision-making as requested in this report, a few details have come to light in a reply to an incident today. This is specifically regarding #1886532, the delayed revocation of the cPSuri certificates.


I will do this mailing list a courtesy by not embedding it all; however, I find it troubling that similar details were not included in this report for all of these incidents. This comment shows that Entrust has all of this information available; they just did not feel it worth including despite it being asked for.

Mike Shaver

unread,
Jun 8, 2024, 5:07:08 PMJun 8
to dev-secur...@mozilla.org
On Sat, 11 May 2024 at 15:04, 'Chris Bailey' via dev-secur...@mozilla.org <dev-secur...@mozilla.org> wrote:

To that end, I want to confirm our intent to provide a full written response to you and the community prior to June 7.

o_o

a full written response to you and the community prior to June 7.

 o_O

prior to June 7


O_______O

Date: Fri, 7 Jun 2024 12:53:10 -0700 (PDT)
From: "'Bruce Morton' via dev-secur...@mozilla.org" <dev-secur...@mozilla.org>
To: "dev-secur...@mozilla.org" <dev-secur...@mozilla.org>
Cc: Ben Wilson <bwi...@mozilla.com>
Subject: Re: Recent Entrust Compliance Incidents

In another context, I would think this to not even be worth joking about, but here it's just the cherry on the top of this whole process.

I have time booked this week to go through the report in more detail (every time I start I turn over another thing that is wrong? it's fractal) but I have to say, now that we've reached the end of this part of the process, that I find Entrust's response--in specific and in general--to be well beneath not only the expectations but indeed the mere *dignity* of the Mozilla root program process, the CA/BF commitments, and the trusted role that Entrust seems to so arrogantly believe cannot be lost.

I am generally known as a pretty charitable person, and in the mists of time when I was responsible for the Mozilla root CA process I very often advocated or outright decided in favour of using incidents as a tool for learning far beyond being a tool for culling underperforming CAs from our root store. Even at the point at which Ben posted the (extremely understated) message beginning this thread, I had hoped that we would see Entrust wake up from its long operational-quality slumber. I had hoped, sincerely, that Entrust would provide plans that were transparent, concrete, thorough, and sufficiently evident of meaningful reflection that the response would be celebrated as an improvement in the health of the WebPKI. It would mean that revenue from the financial disincentive that Entrust puts in place against Subscriber automation (I believe it's called "SUB-PKI-CEG-ACME") might in some small way be put towards strengthening the integrity of the web's security. I was bewildered by the non-responses that kept appearing in the bugs, but honestly I'm a sucker so I remained hopeful. There were VPs involved, Entrust values its security brand so much, their history is so long (I was doing infosec in the Ottawa area in the early 90s)--they were going to come through now that it had been made so abundantly clear that things were structurally broken.

Sadly, I then opened the response posted by Bruce.

When I first read the CPS URI incident, it seemed that Entrust thought that the Mozilla root community wasn't watching them. (To be sure, there had been some evidence in the preceding 4 years that this was the case.)

When the demeanour of Entrust's responses changed immediately after Ryan Dickson of the Chrome Root Program entered the bug, it made me feel that Entrust thought that the Mozilla root program and community didn't matter, and that their commitments to that program were not meaningful.

When the third spokesperson, of increasing seniority, restated Entrust's earnestness and pedigree without any actual concrete, measurable commitments, I started to suspect that Entrust thought that they could just "post through it", as the kids say.

But when I read this report, and especially when I compare it to the exceptionally clear request from Ben in his original message, I can only conclude that Entrust believes that this community and its participants are in fact medically-grade stupid.

I honestly hope that someone there is ashamed of this.

Mike

Watson Ladd

unread,
Jun 8, 2024, 6:15:54 PMJun 8
to Mike Shaver, MDSP
On Sat, Jun 8, 2024 at 2:15 PM Mike Shaver <mike....@gmail.com> wrote:
>"It would mean that revenue from the financial disincentive that Entrust puts in place against Subscriber automation (I believe it's called "SUB-PKI-CEG-ACME")"

So for four years, while Entrust told us it was working to get its
subscribers to automate, it was using this as a revenue opportunity
thus continuing manual processes? There is no way to reconcile this
with any sort of commitment here on Entrust's part to getting
subscribers to automate.

Could Mozilla update the root store policy to make clear that
improvements like ACME shouldn't be extra cost items but instead
considered part of the service provided to customers.

Mike Shaver

unread,
Jun 8, 2024, 6:22:52 PMJun 8
to Watson Ladd, MDSP
On Sat, Jun 8, 2024 at 6:15 PM Watson Ladd <watso...@gmail.com> wrote:
On Sat, Jun 8, 2024 at 2:15 PM Mike Shaver <mike....@gmail.com> wrote:
>"It would mean that revenue from the financial disincentive that Entrust puts in place against Subscriber automation (I believe it's called "SUB-PKI-CEG-ACME")"

So for four years, while Entrust told us it was working to get its
subscribers to automate, it was using this as a revenue opportunity
thus continuing manual processes? There is no way to reconcile this
with any sort of commitment here on Entrust's part to getting
subscribers to automate.

I find it hard to come to any other interpretation of the facts.

Could Mozilla update the root store policy to make clear that
improvements like ACME shouldn't be extra cost items but instead
considered part of the service provided to customers.

I think that would be an exceedingly reasonable change for Mozilla to make to its root store policy, personally.

Mike

Mike Shaver

unread,
Jun 8, 2024, 6:47:39 PMJun 8
to Paul Wouters, MDSP, Watson Ladd
On Sat, Jun 8, 2024 at 6:29 PM Paul Wouters <pa...@nohats.ca> wrote:

> On Jun 8, 2024, at 18:16, Watson Ladd <watso...@gmail.com> wrote:
>
> 

> Could Mozilla update the root store policy to make clear that
> improvements like ACME shouldn't be extra cost items but instead
> considered part of the service provided to customers.

I don’t have an opinion on this but as someone who at $dayjob has been forced to request non-acme certificates manually, let me assure you that any vendor requiring me to do that quickly gets pulled in the “vendors to migrate away from” list. Any CA preferring manual issuance over automated issuance is going to find itself out of business soon (as are vendors providing web services requiring their customers to send them certs once a year manually while promising to support acme “soon”)

I guess that’s a nice assurance, but what does “soon” mean? July? Are you buying enough certs to swing the economics of a major CA?

The problem right now is Subscribers who *don’t* want to adopt automation, perhaps in part because Entrust would charge them extra for it. They are the excuse being used too frequently for the dereliction of duty.

Mike

Jeffrey Walton

unread,
Jun 8, 2024, 9:48:41 PMJun 8
to Watson Ladd, Mike Shaver, MDSP
On Sat, Jun 8, 2024 at 6:15 PM Watson Ladd <watso...@gmail.com> wrote:
>
I would caution against that. Effectively, Mozilla would be fiddling
with the market. The market should be the one to punish (or reward)
Entrust for the premiums on manual issuance, not Mozilla. When
subscribers get tired of paying too much for the service, the customer
will go elsewhere.

In my mind's eye, there are two things to observe. First is the
CA/Browser Standards ("what we do"), and second is the CA Operating
Procedures ("how we do it"). The Browsers and collective CA's should
focus on the standard (what should be done), and each individual CA
should focus on the implementation (how it is done). The Forum should
not meddle in everyday affairs of a particular CA.

I understand the community wishes to punish Entrust for its chronic
problems. The CA/Browser Forum does not have tools for that, sans
delisting a particular CA. Maybe the CA/Browser Forum needs to adopt
some punishments, like forbidding a CA from issuing OV certificates or
EV certificates for a specified period of time, like a year. Or forbid
the CA from issuing other types of certificates, like S/MIME and code
signing certificates. The year embargo and lost revenue should be
enough of a haircut to get the CA to comply. If a CA continues to defy
the Forum, then delist the CA. There is plenty of competition in the
marketplace, so any particular CA will not be missed.

And remember, there are three parties in the ecosystem. The Browsers
and CA's are only two of them. There are also 5.35 billion relying
parties who use the internet. If the Forum wishes to acknowledge the
interests of the 5.35 billion internet users, then maybe removing
Entrust would be the best course of action. That's because Entrust
only seems to care about itself and its subscribers. It does not seem
to care about the Forum, the standards produced by the Forum, or
the relying parties. Entrust has lost the trust of the community, and
that is the only commodity that matters to the relying parties.

Jeff

Mike Shaver

unread,
Jun 8, 2024, 11:11:25 PMJun 8
to nolo...@gmail.com, MDSP, Watson Ladd
On Sat, Jun 8, 2024 at 9:48 PM Jeffrey Walton <nolo...@gmail.com> wrote:
I would caution against that. Effectively, Mozilla would be fiddling
with the market. The market should be the one to punish (or reward)
Entrust for the premiums on manual issuance, not Mozilla. When
subscribers get tired of paying too much for the service, the customer
will go elsewhere.

Hey, uh, yeah…Mozilla sort of exists to “fiddle with the market” in ways that it feels protect the web’s users from the direction that The Market might otherwise take. It’s sort of “their thing”.

But that rather jarring dissonance aside, nobody is objecting to premiums on manual issuance. It is precisely the opposite: it is an objection to charging Subscribers *extra* for using *automated* tools that make the web safer (and which indeed should be cheaper for the CA to operate than a manual process, but you know how it is with rent seeking).

The CA’s primary responsibility is to the web’s users, not to its customers. They all know this. It can require that they not always optimize for short-term business outcomes, but if they are not comfortable with that *very* explicit tension, then this is not an appropriate business for them.

In my mind's eye, there are two things to observe. First is the
CA/Browser Standards ("what we do"), and second is the CA Operating
Procedures ("how we do it").

I guess that is a way that these things could have evolved in a parallel universe, but you have perhaps noticed that the BRs already have many directions as to how things must be done. The BRs are in fact growing more such directions over time as it becomes increasingly clear that not all CAs can be trusted to do the things that are best for the health of the WebPKI; see the active discussion about linting practices in the SCWG, for example.

Mike


Mike Shaver

unread,
Jun 8, 2024, 11:53:40 PMJun 8
to nolo...@gmail.com, MDSP, Watson Ladd

Apologies, I somehow managed to send white-on-white HTML from gmail mobile and I honestly have no idea how.

On Sat, Jun 8, 2024 at 9:48 PM Jeffrey Walton <nolo...@gmail.com> wrote:

> I would caution against that. Effectively, Mozilla would be fiddling
> with the market. The market should be the one to punish (or reward)
> Entrust for the premiums on manual issuance, not Mozilla. When
> subscribers get tired of paying too much for the service, the customer
> will go elsewhere.

Hey, uh, yeah…Mozilla sort of exists to “fiddle with the market” in ways that it feels protect the web’s users from the direction that The Market might otherwise take. It’s sort of “their thing”.

But that rather jarring dissonance aside, nobody is objecting to premiums on manual issuance. It is precisely the opposite: it is an objection to charging Subscribers *extra* for using *automated* tools that make the web safer (and which indeed should be cheaper for the CA to operate than a manual process, but you know how it is with rent seeking).

The CA’s primary responsibility is to the web’s users, not to its customers. They all know this. It can require that they not always optimize for short-term business outcomes, but if they are not comfortable with that *very* explicit tension, then this is not an appropriate business for them.

> In my mind's eye, there are two things to observe. First is the
> CA/Browser Standards ("what we do"), and second is the CA Operating
> Procedures ("how we do it").

I guess that is a way that these things could have evolved in a parallel universe, but you have perhaps noticed that the BRs already have many directions as to how things must be done. The BRs are in fact growing more such directions over time as it becomes increasingly clear that not all CAs can be trusted to do the things that are best for the health of the WebPKI; see the active discussion about linting practices in the SCWG, for example.

Mike

Mike Shaver

unread,
Jun 9, 2024, 3:59:13 PMJun 9
to Paul Wouters, MDSP, Watson Ladd, nolo...@gmail.com
On Sun, Jun 9, 2024 at 3:34 PM Paul Wouters <pa...@nohats.ca> wrote:
On Jun 8, 2024, at 23:53, Mike Shaver <mike....@gmail.com> wrote:

The CA’s primary responsibility is to the web’s users, not to its customers.

That is an interesting view, possibly not shared by its shareholders or the legal framework of the countries they operate in. 
If you have a different view of the BRs to which Entrust and other CAs have committed, or how they conflict in a concrete way with other legal frameworks, then that would be a fine thing to discuss with details in another thread here or perhaps on the CCADB list.

I don’t know what they tell their shareholders, but that’s also not my problem. They don’t have to be in this business, however we got to this situation historically; I think we may well find out that the web can operate just fine without Entrust acting in this capacity at all.

There are many technology businesses which are successful even with the existence of non-profit or similar competition. CAs are not owed a profitable business, especially not at the expense of the integrity of the web’s critical, fragile PKI.

I don’t see how using the DNS and a registrar (instead of a TLS handshake and a root CA) to distribute service identity information fundamentally changes the economics or pressures, but I’m happy to be pointed to something if you think it’s germane to the discussion of how we want CAs to create, or not create, incentives related to automation and certificate agility. Again, perhaps a topic more suited to the CCADB list than to this branch of a discussion of Entrust’s behaviour.

Mike

Tyrel

unread,
Jun 10, 2024, 11:06:25 AMJun 10
to dev-secur...@mozilla.org

All I can say is... wow. This report seems to treat the recent issues at Entrust as a compliance matter, rather than a matter of trust. Which says it all really. 

I find particularly galling the very clear commitment Entrust made in 2020 to not have any more delayed or failed revocations, and yet here we are in 2024 with a whole slew of them. But rather than then go on to echo the other criticisms, I would like to bring another observation:

Entrust seems to have completely missed the boat with bug #1890898. Even if their revised analysis -- that revocation was not required -- is correct (and I am not a sufficient expert to know), it does not change the fact that for 2 months Entrust believed the certificates were misissued and chose not to revoke for those two months. Which shows blatant disregard for the requirements of the BRs. Especially since this bug then doesn't even make an appearance in section 2.5.1 of this report.

I think Jeffrey Walton hit the nail on the head: "If the Forum wishes to acknowledge the interests of the 5.35 billion internet users, then maybe removing Entrust would be the best course of action." I would go further and suggest that all root programs should look to distrust the Entrust roots (for both TLS and S/MIME purposes) by end of 2024. I just don't see how Entrust, with current management, can plausibly be believed to have any road back to trust.

Tyrel

Ben Wilson

unread,
Jun 10, 2024, 4:16:28 PMJun 10
to dev-secur...@mozilla.org

All,

This is to acknowledge that we have received Entrust's June 7 Report regarding its non-compliance issues and associated remediation plans. Mozilla will thoroughly review the report, provide comments and requests for clarification, and verify that the requested items have been and will be addressed in accordance with the request for the Report and the Action Items listed in Bugzilla Bug #1901270.

Our review process will include:

  • Assessing the root cause analyses and improvement plans.
  • Ensuring that all issues listed in the report and related Bugzilla discussions have been addressed.
  • Verifying that the remediation plans align with the expectations and requirements set forth by the Mozilla Root Store Policy and the CA/Browser Forum TLS Baseline Requirements.
  • Requesting any necessary clarifications or additional information to ensure comprehensive compliance and future prevention of similar incidents.

We will follow up with specific comments and requests for clarifications as needed.

Thank you for your attention to these important matters.

Ben



Mike Shaver

unread,
Jun 10, 2024, 6:16:51 PMJun 10
to Ben Wilson, dev-secur...@mozilla.org
Does this mean that the window has closed for feedback to Mozilla on the report and its responsiveness to the request?

Will requests for clarification be in this thread, the listed bug, or elsewhere?

Does this mean that Mozilla feels that the action items listed in that bug are sufficiently detailed and concrete that they are appropriate as steps for Entrust to take at this point?

Mike

Ben Wilson

unread,
Jun 10, 2024, 7:28:15 PMJun 10
to Mike Shaver, dev-secur...@mozilla.org
See below:

On Mon, Jun 10, 2024 at 4:16 PM Mike Shaver <mike....@gmail.com> wrote:
Does this mean that the window has closed for feedback to Mozilla on the report and its responsiveness to the request?
 
No. It just means that we (Mozilla staff) will be reviewing the report and the action items and providing feedback as well.


Will requests for clarification be in this thread, the listed bug, or elsewhere?

Preferably here, but if the requests for clarification are structured in markdown in Bugzilla as replies to Comment 1, then that would be acceptable, too. Otherwise, general comments and critiques should be made here.  For efficiency, it is requested that comments and requests for clarification be collected into as few emails/posts as possible and not be posted in piecemeal fashion.

Thanks,

Ben

Mike Shaver

unread,
Jun 10, 2024, 7:41:48 PMJun 10
to Ben Wilson, dev-secur...@mozilla.org
On Mon, Jun 10, 2024 at 7:28 PM Ben Wilson <bwi...@mozilla.com> wrote:
Preferably here, but if the requests for clarification are structured in markdown in Bugzilla as replies to Comment 1, then that would be acceptable, too. Otherwise, general comments and critiques should be made here.

Sorry, I meant: will Mozilla’s requests for clarification from Entrust be posted to mdsp or the bug or etc.?

For efficiency, it is requested that comments and requests for clarification be collected into as few emails/posts as possible and not be posted in piecemeal fashion.

Touché…

Mike

Ben Wilson

unread,
Jun 10, 2024, 7:46:23 PMJun 10
to Mike Shaver, dev-secur...@mozilla.org
Hi Mike,
Requests for clarification will be posted here.
Thanks,
Ben

Ben Wilson

unread,
Jun 12, 2024, 1:38:54 PMJun 12
to dev-secur...@mozilla.org
All,

So far, we have received substantive comments and questions on Entrust’s June 7 Report from Amir, Wayne, and Watson.

Are others planning to submit comments or to request clarification and additional information from Entrust?

Thanks,

Ben

You received this message because you are subscribed to the Google Groups "dev-security-policy@mozilla.org" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev-security-policy+unsub...@mozilla.org.


Macy

unread,
Jun 13, 2024, 12:05:53 AMJun 13
to dev-secur...@mozilla.org, Ben Wilson
On Thursday, June 13, 2024 at 3:38:54 AM UTC+10 Ben Wilson wrote:
All,

So far, we have received substantive comments and questions on Entrust’s June 7 Report from Amir, Wayne, and Watson.

Are others planning to submit comments or to request clarification and additional information from Entrust?

Thanks,

Ben

Hi Ben,

I just wanted to register my confusion and disappointment at Entrust's report and general responsiveness to community concerns. You mentioned in your original statement that "This is particularly disappointing in light of previous incidents in 2020 (#1651481 and #1648472), which arose out of similar misunderstandings of the requirements, similar poor decision-making in the initial response, and lengthy remediation periods that fell well below expectations. Entrust gave commitments in those bugs to address the root problems through process improvements, and it is concerning to see so little improvement 4 years later."

A thing that is notably missing from Entrust's report, given your explicit prompting, is any retroactive discussion of the events in 2020 or the changes they implemented in response to those incidents, or how those changes were insufficient to prevent these incidents. I haven't read every bugzilla comment so I may have missed other mentions, but in Bug #1890685 comment 39 it appears that the changes Entrust internally identified 4 years ago related only to the adoption of automation, with nothing learned about the failures in their process or adherence to the BRs.

I find the report worrisome, because this shows that Entrust has not been and is still fundamentally not taking delayed (or failed) revocation as seriously as a WebPKI CA should.

To that end, some of the action items in 1901270 would strongly benefit from some explanation to the community as to what happened from the 2020 commitment to today, specifically:
  • Implement formal incident response process including incident response communication plan to meet mandatory reporting times
  • Implement specific handling processes for internal as well as external (CPR) reports
  • Review verification process for all certificate types
  • Create formal revocation event handling process
  • Establish delayed revocation criteria
  • Create revocation event communication plan
How did these seemingly get missed in 2020, and continue to be missed for 4 years of operation, after making a commitment to always revoke certificates in accordance with the BRs? What was the process breakdown that led us to be here, watching a CA that predates the formalisation of the BRs just now committing to having a formal process for revocation events? What is the current process, how has it changed from 2020, and why is it still insufficient? What changes will be made to the process to ensure its sufficiency in the future? This is the bulk of what I expected to be reading in the report (and what was explicitly requested in the prompt), but instead I saw the level of detail that should have been in the initial bugs with no acknowledgement or awareness of the overarching strategic failures that got them to this point.

In addition, I'm greatly troubled by the way that bug 1890685 was filed as a failure to revoke and treated that way by both Entrust and RPs for two months, only to have Entrust change interpretations and decide that there is no issue there. This entire bug was seemingly unaddressed by their report, and the timeline of the changed interpretation is confusing. Do other CAs share their understanding expressed in that bug that their CPSes' details don't matter if they say "unless the BRs override this"?

I have questions about the status of monitoring Bugzilla to learn from events from other CAs, but they're mooted if the CA isn't learning from its own incidents either.

I want to reiterate that the problem as I see it is not insufficient contrition, but rather insufficient self-awareness. I find it alarming that they are not showing recognition of the core problem here; namely, their continued retention of subscribers that they are willing to violate the BRs for and how that affects the trustworthiness of WebPKI in the event of future misissuances. I've looked through previous incidents where CAs were asked to participate in this process, and I have seen genuine signs of turnaround from CAs that were previously flirting with a distrust decision, but in each of those cases, there was a willingness to identify the deeper causes of failures that seems to be organisationally lacking here. The entire incident response from March onward comes off to me far more like a PR exercise than an attempt to meet other technologists in improving the WebPKI ecosystem.

--Macy

Walt

Jun 13, 2024, 9:13:51 AM
to dev-secur...@mozilla.org
All,

If we strip away the cover page, table of contents, appendices and executive summary, the 17-page report goes down to somewhere in the realm of 13 pages.

If we take a closer look at those 13 pages, there's very little new information shared in the report that wasn't already shared in the various Bugzilla incidents. I don't see internal policies or procedures used to make their decisions on delayed revocation. I see exactly one reference to the previous delrev issues in 2020, in which they are offhandedly mentioned and again pushed onto Subscribers as their responsibility, rather than the CA's. I see statements that say "we intend to revoke following the BRs, unless a subscriber says we don't want to". I see references to IETF drafts and adoption of automation, but again, those are not action items for a delayed revocation. The problem is "we failed to revoke certificates, here's what we're doing to fix that problem", and the analysis and action items should reflect that.

I also think it is absolutely impossible to judge a report at this point for the following reasons: 
1. There are still unresolved delrev / failure to revoke incidents in Bugzilla, and this report is now drifting from what exists in Bugzilla. 
2. Entrust is being even more evasive in answering questions in Bugzilla as of this report. There are numerous questions that I, other community members, and individuals at other CAs have asked of Entrust that have either been ignored completely, or ignored past the 7-day response period and only answered when prompted by another community member. When answers are provided, they appear to be copied from an internal working document of workshopped best-applicable answers, rather than being the actual answer. Off the top of my head, I have multiple questions in 1886532 that remain unsatisfactorily answered, questions that should be relatively easy to answer. See Comment 52.

If the report actually had action items and a deep understanding of the root causes that led them to this situation of failing to revoke certificates after promising it would never happen again, I would look differently upon this. Instead I see:
- a report that admits that a policy that was supposed to be implemented 4 years ago is barely implemented, if at all (if it had been implemented, it should have been documented in the report)
- a CA who is being combative and evasive (which is just wrong from a public trust point of view)
- certificates re-characterized as properly issued after being characterized as mis-issued (which should absolutely be looked at more closely: if a CA's perceived solution to avoiding delayed revocation events is to simply re-characterize the certs as issued appropriately and refuse to elaborate further, that feels like something that is Very Bad for the ecosystem)
- action items that don't seem to understand the organizational dysfunction that led us to this series of misissuance events, and even for the minimal action items that exist, most are simply "tell Subscribers to do x" in more words

In my opinion, evaluating the Entrust Report would only be possible given the following factors: 
- All delrev / failure to revokes are resolved satisfactorily (either via revocation or by an agreement by RPs that the certificates were indeed issued correctly, see 1890685)
- All questions asked in all open bugs are answered satisfactorily, with meaningful answers that take the question asker in good faith, rather than being combative and evasive
- A rewritten version of the report that A. acknowledges the missteps made in the submission of the report and the handling of the incident(s) and meta-incident around this, and B. answers the bullet points given by Ben W that should have been used as a framework to write the report, including but not limited to a thorough analysis and retrospective of the events of 2020, and learnings that were documented then, and why they weren't applied now. 

With the report as it is though, while I'm just a bystander who has become very interested in WebPKI as of the past few months, I feel that there is a clear lack of understanding of the responsibilities and duties of a CA due to organizational dysfunction at best, and at worst malicious incompetence.

Mike Shaver

Jun 14, 2024, 9:59:55 AM
to Ben Wilson, dev-secur...@mozilla.org
Apologies for the delayed response; it took longer than I expected to go through the many similar incidents and find the references I wanted, and indeed in the end I omitted many others. Thanks to Ben and the Mozilla community for their patience.

Entrust Report Comments
First, I just have to say that given Ben's very explicit expectations in the request for a response, the contents of Entrust's report are shockingly poor. They failed to address many of the requirements, and the entire exercise reads like a rushed homework assignment, not a credible plan by one of the most experienced CAs on the web to restore their operations to the level of quality and compliance expected by the Mozilla Root Program.

Shallow Action Item Specifications

The Executive Summary claims that the report provides a “detailed overview of concrete, measurable steps”, but the Action Items included in the report are often neither detailed nor measurable.


A single example, from "2.1.4 Improvement Plan": “Establish cross-functional change control board: Complete”. There is no detail as to what this board will decide, how they will be selected, or how their effectiveness can be measured. This Action Item is, as described, basically just "we made a list of some people".

Inadequate Response to Delayed Revocation Incidents

The Mozilla Root Store Policy itself does not admit of any option for delayed revocation. It references the BRs, which require (SHALL and MUST language) revocation within 5 days. Provisions for delayed revocation only come from Mozilla's "Responding To An Incident" page at https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation . Mozilla grants generous latitude to CAs in that CAs are permitted to weigh the impact of revocation against the impact of further delaying revocation, but it also makes clear what is expected from a CA when it decides that the circumstances are "exceptional", such as when the certificates are used in "critical infrastructure".


In https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c35 , Ngook Kong clarifies Entrust's position on these expectations:


Our interpretation is any delayed revocation will comply with the Mozilla revocation policy at https://wiki.mozilla.org/CA/Responding_To_An_Incident#Revocation


Unfortunately, Entrust has systematically failed to comply with the policy to which they refer.


Requirement: "The decision and rationale for delaying revocation will be disclosed in the form of a preliminary incident report immediately; preferably before the BR-mandated revocation deadline. The rationale must include detailed and substantiated explanations for why the situation is exceptional. Responses similar to “we do not deem this non-compliant certificate to be a security risk” are not acceptable. When revocation is delayed at the request of specific Subscribers, the rationale must be provided on a per-Subscriber basis."


Entrust's delayed revocation incidents have consistently failed to meet this expectation. In two of the three delayed revocation incidents that Entrust chose to include in their report, no per-subscriber information whatsoever was provided, with the exception of a comment that listed four out of more than 100 affected subscribers. When rationales are provided (https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c32), they are insufficiently detailed: they do not contain enough information to determine whether the proposed action items are likely to meaningfully affect the risk of future delayed-revocation incidents.


This item has been present, in substantially identical form, since 2019, so it was well known to Entrust when they made their commitments in 2020 to avoid delayed revocation and adhere to the BRs and root program expectations. It is alarming that a CA that boasts of its long experience in the web PKI and involvement in the community is still consistently unable or unwilling to adhere to those requirements.


Requirement: "Your CA will work with your auditor (and supervisory body, as appropriate) and the Root Store(s) that your CA participates in to ensure your analysis of the risk and plan of remediation is acceptable."


To the best of my ability to determine, no Root Stores were consulted with regards to the risk analysis or plan of remediation; Ben Wilson's comment at https://bugzilla.mozilla.org/show_bug.cgi?id=1890898#c19 seems to indicate that Mozilla's root program representatives were not consulted. There has also not been any indication of auditor involvement. If indeed Entrust worked with their auditor with respect to the decisions not to revoke captured in bugs 1890898 and 1890685, it is difficult to understand how their later analysis came to a different conclusion.

Investment in Incident Response Capacity

Ngook Kong states in https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c51:


"Yes, we are equipped to perform a wide scale revocation if needed and necessary. [...] We have the technical capability to revoke within the required timelines."


But note also https://bugzilla.mozilla.org/show_bug.cgi?id=1898848#c1 in which Entrust pleads that they lack the resources to meet their commitment to the BRs:


"In addition, the authority to launch and conduct formal investigations, confirm incidents, initiate incident reporting processes, and trigger revocation events is held by a small number of individuals within the compliance team. These same individuals were responsible for helping to respond to incidents, communicating with impacted subscribers, responding to questions from the Bugzilla community, and drafting and submitting incident reports, in addition to other day-to-day responsibilities."


The former comment comes after the latter one, so perhaps the resources available have been increased. However, there is no mention in section 2.5 of the report of any action items related to increasing the resources available for investigating or responding appropriately to misissuance events, or how it can be measured that the level of investment is appropriate on an ongoing basis. I don't feel that CAs should generally have to share staffing levels or budgeting processes with root programs, but if those staffing levels are the cause of incidents, then I think they become relevant to evaluation of the CA's ability to operate correctly (and therefore the risk posed by continuing to trust certificates that they issue).

Failure to Meet Previous Commitments

In 2020, Entrust made the following commitments to Ryan Sleevi (then a peer of the Mozilla root program module): https://bugzilla.mozilla.org/show_bug.cgi?id=1651481#c6


  • [1] We will not the make the decision not to revoke.

  • [2] We will plan to revoke within the 24 hours or 5 days as applicable for the incident.

  • [3] We will provide notice to our customers of our obligations to revoke and recommend action within 24 hours or 5 days based on the BR requirements.

  • [4] We will recommend to our customers to implement automation of certificate management.

  • [5] We will increase our ability for correct implementation and testing to ensure that certificate profiles will meet the latest CA/Browser Forum or root program requirements.

  • [6] We will monitor the Mozilla incidents and the discussion list to discover problems which other CAs have experienced and how they were resolved. This will allow us to review and react if required to our own implementation. This will also help to minimize the number of miss-issued certificates, which will reduce the risk of late revocation.

  • [7] We will manage and update our pre-issuance and post-issuance linting to discover or prevent the problem early.

(I have added the numbers for easier reference.)

These commitments have been repeatedly broken, and in some cases appear repeated again in the report 4 years later:

In 2.5.4 (Improvement Plan), we see


We intend to revoke and replace certificates that do not meet TLS Baseline Requirements or certificate-specific guidelines. We will plan to do so within the prescribed revocation periods [2]


We will work with our subscribers to ensure awareness and minimize delayed revocation requests [3]


Driving customer adoption of automation [4]


Of specific note, there is an action item "Launch communication and education to subscribers on requirements for public trust certificates" with a target completion of 2024-07-31. Why is that only being undertaken now, four years after Entrust made commitment [3]?

In 2.4.4 (Improvement Plan), we have as action items:


Run pkilint against all CRLs [7]

Update automated test to cover the added requirement [5]


which are a near-identical repetition of the indicated commitments, and are indicated in the report as "completed". How can we believe that Entrust has made this improvement, when they committed to doing so already 4 years ago? In the same bug 1651481, Bruce Morton states: "We have put in practices to close out non-conformance and late revocation issues." If they indeed kept commitments [5] and [7], then they should provide substantial detail on why they believed that the previous efforts were appropriate, why it was reasonable for the incidents in 2.4.2 to have occurred with good-faith implementations of [5] and [7], and how their approach is different in 2024 such that the community can have faith that there has been a systematic remedy for this systematic fault.

Failure to Address Conflict Between Subscriber Limitations and BR Requirements

More than 8000 of the affected certificates in bug 1886532 are indicated to be delayed due to limitations of subscriber process, but none of the action items in the report provide measurable concrete ways to eliminate those limitations. Indeed, I don't think it is reasonable to hold Entrust accountable for making its subscribers improve internal processes, and I am surprised that Entrust has proposed that they be evaluated on efforts to that end. I think it is very reasonable to hold Entrust accountable for ensuring that the BRs are upheld regardless of whether a given subscriber has chosen to invest in changes to help them replace certificates more promptly; that is to say, hold Entrust accountable for revoking, as they have said explicitly that they have the technical capability to do.


In https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c1, Paul van Brouwershaven says:


The revocation for customers is in most cases manual and complex, involving multiple internal parties to ensure that the change does not create an adverse impact on web services and back-end processing.


As a result, we have moved these customers into delayed revocation, particularly where their operations are critical to the web ecosystem


Entrust has stated that in 2020 (and presumably for the intervening four years), they felt that their processes were sufficient to meet the commitments made in 1651481. This is an assertion that they were not aware that the "revocation for customers is in most cases manual and complex", because those Subscriber limitations are still used as an excuse for delayed revocation today. Entrust has made no commitments that would limit the extent to which one of their Subscribers' operational limitations would be allowed to inflict harm on the integrity of the web PKI. Indeed, they continue to knowingly issue web PKI certificates to Subscribers who have regulatory or policy limitations that require as much as 90 days for a certificate replacement to be completed, which is entirely inconsistent with their 2020 commitments, and with a belief that the aforementioned (but unspecified) processes were an appropriate means for meeting those commitments, even over a four-year time span.

Deceptive and Contradictory Statements Regarding Subscriber Communication

In https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c36 , Ngook Kong says 


No delayed revocation options were offered.


Later in that bug, Ngook posts a sample email that was sent to a subscriber regarding a certificate that was misissued, but part of it is redacted:


Subject:URGENT ACTION REQUIRED: EV Revocation May be Required
Message Group: System Interruptions
Message Expiry Date: 3/30/24

We are writing to inform you about a recent issue that affected some of the EV digital certificates issued by Entrust. We apologize for any inconvenience this may have caused you and we are committed to ensuring the security and integrity of your online transactions.

Summary:

Entrust discovered that some Extended Validation (EV) TLS certificates (EV Multi-Domain SSL, QWAC eIDAS, QWAC PSD2) were missing a specific component required by the EV Guidelines. This component links the certificate to the Certificate Policy (CP) and the Certification Practice Statement (CPS) of the issuer.

Entrust has taken steps to address the issue and prevent it from happening again. Any of the specified certificate types issued after 21:40 UTC today, Mar 18, 2024 are not affected. Entrust is required to revoke affected certificates as soon as possible, which will occur on Saturday, March 23, 2024 at 21:00 UTC

ACTION REQUIRED:

Certificates issued after September 11th, 17:40 UTC, 2023, to March 18th, 21:40 UTC, 2024, will need to be replaced by issuing a new certificate and revoking the old certificate(s) shortly thereafter. We will assist you with this process and ensure a smooth transition.

Steps: [REDACTED]

Thank you,
Entrust Certificates Services

Later, it was revealed by a member of the community that what was redacted was an instruction to the Subscriber to revoke along a 30-day timeline, and Entrust was asked why they redacted that portion: https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c40


Perform a REISSUE, and select "Revoke within 30 days" so your production certificate maintains validity, providing sufficient time to perform the replacement


Ngook Kong responded 


The letter we shared is an example of what was sent from us directly to a subscriber and was not posted in the public domain. We were being transparent by sharing the message. The redacted section provides specific instructions to our subscribers on how to revoke and reissue certificates.


and uploaded a PDF copy of the email (https://bug1886532.bmoattachments.org/attachment.cgi?id=9406229). It is very difficult to believe that this section was omitted due to any sensitivity of the material contained; the PDF was provided promptly when it was revealed that the contents were known to the community member, and Entrust's website contains related instructions on renewal for their product already (https://www.entrust.com/knowledgebase/ssl/how-to-renew-your-entrust-certificate-using-self-service as an example). It is also very difficult to believe that this was anything but an attempt to conceal elements of Entrust's subscriber communication that contradict Entrust's position that they prioritized prompt revocation and conveyed proper urgency to their subscribers. I actually don't think the email is that bad, because their software seems to be such that 30 days is the appropriate window to be selected here, but I think that the fact that they tried to conceal it is very concerning.


Similarly concerning are their repeated uncooperative responses to requests for per-subscriber detail and rationale, which is a crystal-clear requirement of the Mozilla incident response policy for delayed revocation.

Conclusion

In conclusion, I do not feel that Entrust has demonstrated satisfactory commitment or investment in addressing systemic issues with Subscriber management and revocation policy/process. Neither their historical behaviour nor their inadequate response to the community are appropriate for a CA that chooses to issue public web PKI certificates. I strongly recommend that Entrust-issued certificates issued after June 7, 2024, not be trusted by Mozilla products.


Mike



On Fri, 7 Jun 2024 at 15:53, 'Bruce Morton' via dev-secur...@mozilla.org <dev-secur...@mozilla.org> wrote:
Please respond to comments you may have on our report or action items here.  We will track our progress against the action items list in Bugzilla under bug 1901270.


Amir Omidi

Jun 14, 2024, 10:11:34 AM
to Mike Shaver, Ben Wilson, dev-secur...@mozilla.org
I missed that they tried to conceal the part of the email where 30 day revocation was granted. How on earth is this acceptable? 

I’ll have to go double check everything in your correspondence here, but if this is all true then this is deeply unsettling and concerning.

Root program, I implore you to expedite the processing of these issues: If the concealment of the revocation information was willful, then there’s no reason to believe that Entrust hasn’t also acted maliciously in other areas. 

Amir Omidi (he/them)


Mike Shaver

Jun 14, 2024, 10:22:21 AM
to Amir Omidi, Ben Wilson, dev-secur...@mozilla.org
On Fri, 14 Jun 2024 at 10:11, Amir Omidi <am...@aaomidi.com> wrote:
I missed that they tried to conceal the part of the email where 30 day revocation was granted. How on earth is this acceptable?

I want to be clear here: I don't know that that part of the instructions was meant to convey to affected Subscribers that 30 days would be an acceptable timeline for revocation (though of course many certificates didn't even get replaced that quickly...). It may be, for example, that the software in question is limited such that it only offers "reissue with immediate revocation" and "reissue with 30 day revocation". In that case, the latter would be an appropriate choice even if the revocation was to happen on a shorter timeline.

My concern is that they chose to conceal this part of the correspondence, and I cannot come up with a good faith reason for doing so given the information that is already public about the ECS system and how to reissue. Obviously the term "30 day" is weird to see there, but if there was a good reason for it (probably a better reason than the one I imagined above), then they should have provided the reason rather than clumsily attempting to conceal part of it. (And after Wayne had indicated both in mdsp and in the incident itself that the contents were already known to some...)
 
I’ll have to go double check everything in your correspondence here, but if this is all true then this is deeply unsettling and concerning.

Please do so! There have been a lot of comments with a lot of slightly different contents and statements, and it's entirely possible that I mis-referenced something, or made an outright error in my analysis.

Mike
 

Wayne

Jun 14, 2024, 10:46:04 AM
to dev-secur...@mozilla.org
Given that the topic of the concealed '30 day' step is coming up, I do wish to clarify my intent. I had been less than subtly telling Entrust for nearly a month that this information was known, and was giving them the option to come forward about an issue that could look bad if it came to light without context. I had been hoping that a mistake was made in March and that it would be acknowledged and treated seriously. I attempted every step of the way to let Entrust provide the information themselves so that they could explain their intentions and clear up any confusion in advance.

That they chose not to is still perplexing to me. I appreciated this could be an embarrassing default text string that they never considered in the 4 years since their prior commitments. However, given their actions in response, I can only surmise that it was working as intended.

I still hope they clarify this matter at some point; they have had more than enough opportunities. On that note, what is Mozilla's policy for a CA answering questions posed on MDSP, and what is the applicable timeframe? I am sure the rest of the community is as puzzled as I am over the report received and would appreciate clarifications.

- Wayne

Ryan Hurst

Jun 14, 2024, 3:37:25 PM
to dev-secur...@mozilla.org

To me, it seems that Entrust has forgotten why we are all here. The purpose of the WebPKI is to enable end-users to trust that they are communicating with the correct website. This trust relies on the CAs that make up the WebPKI to be transparent and live up to their promises while adhering to ecosystem norms, requirements, and best practices. Root programs act as the arbitrators of who is worthy of that trust by enforcing these norms, rules, and best practices on behalf of the end-users they serve.

As I review the various incidents at hand, the bugs tracking them, the incident responses, and the associated timelines, a few key elements stand out to me:

  • Lack of Transparency and Accountability: Entrust’s incident reports and responses lacked transparency and failed to acknowledge where the fault lay.

  • Failure to Meet Previous Commitments: Despite previous commitments made in 2020 to avoid delayed revocations and adhere to BRs, Entrust continued to face similar issues.

  • Inadequate Root Cause Analysis: The root cause analyses provided by Entrust were often superficial and did not address systemic failures.

  • Insufficient Remediation Plans: Entrust's remediation plans were vague, lacking concrete, measurable steps, and often repeated previous commitments without acknowledging how they had previously failed to live up to these commitments.

  • Lack of Organizational Self-awareness: Entrust's responses indicated a lack of self-awareness about the depth of their issues. They did not show a comprehensive understanding of the systemic problems leading to repeated incidents.

  • Belief in Exemption from Norms: Entrust has demonstrated through their actions and responses that they believe the norms, policies, and requirements do not equally apply to them.

  • Slow Response Times: Looking at similar incidents, the timeline associated with Entrust's responses was slow relative to other similar-sized organizations and smaller CAs.

In Entrust’s responses to these incidents, they have leaned on their long history in this industry and the impact they have had. In that vein, I can't help but see parallels to the recent Microsoft CSRB review of STORM-0558, where the reviewers said that “Microsoft’s security culture was inadequate and requires an overhaul, particularly in light of the company’s centrality in the technology ecosystem and the level of trust customers place in the company to protect their data and operations.”


The immediate technical non-conformities we are discussing here, when looked at in isolation, are not major issues. I even understand why Entrust is hesitant to require their customers, who are largely Enterprise customers who are notoriously hard to deal with on such matters, to replace their certificates as a result of Entrust’s operational non-compliance. However, when we consider them as a body of issues, especially over time, and the way Entrust has responded to them, they reach a significant level. We need to be asking ourselves: is this the kind of behavior we want to establish as the norm for the WebPKI?

More broadly, the pattern of non-compliance demonstrated by Entrust, combined with the fact that other, smaller CAs with fewer resources have managed to respond significantly faster and more proactively, makes me ask the question: when we see systemic issues, do we wait until they are catastrophic before we, as the custodians of the WebPKI, respond?

In the end, the answers to these questions will need to be provided by the root programs. However, if I were at Entrust, I would have been doing some serious soul-searching over the past quarter about the cultural and organizational issues that led to this point.


I would also like to encourage other CAs who are watching this transpire to review their practices and ensure their incident response procedures are transparent, proactive, focused on root cause analysis, and more broadly in line with the expected norms of our community.



Ryan Hurst


Bruce Morton

Jun 14, 2024, 3:56:45 PM
to dev-secur...@mozilla.org

To the Community -

We wanted to let you know that we have been monitoring this conversation. We appreciate your feedback here and plan to share a response next week.

Best regards, Bruce.

Bruce Morton

Jun 14, 2024, 4:55:38 PM
to dev-secur...@mozilla.org, Amir Omidi, Ben Wilson, dev-secur...@mozilla.org, Mike Shaver

Amir, we will respond to the comments from the community, but I want to make it clear that Entrust was absolutely NOT trying to "conceal" anything related to how we do revocation and are disturbed that you would attribute "malicious" motives to any of our actions. The "30 day revocation" option is a standard option for subscribers in our system that allows them to replace certificates safely before revoking. In normal course, a subscriber would just leave them in this "bucket", and they would automatically be revoked. When we posted the letter originally, we shared it as an example of what was sent from us directly to a subscriber and was not posted in the public domain. We were being transparent by sharing the message. The redacted section provides specific instructions to our subscribers on how to revoke and reissue certificates.

“Revoke within 30 days” was one of two options in the tool. Certificates placed in this status were reissued within 30 days of when they were placed in this status; we revoked them sooner if their extension time was reached, or if the subscriber confirmed they had reissued.

Prior to April 4, 2024, customers could only select "Revoke immediately" or "Revoke in 30 days". The default used in the instructions on March 18, 2024 was "Revoke in 30 days". Recognizing that customers may have perceived this to mean they had 30 days rather than the 5-day timeline that was communicated, Entrust implemented a change to add "Revoke in 3 days" as the default moving forward, to be called out in the event of future mis-issuance.
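
As a rough illustration of why the default matters, the sketch below compares the two delayed options against the 5-day timeline. It is not a description of Entrust's tool: the option table is a hypothetical model built only from the labels quoted above, and the start time is an assumption (the 21:40 UTC cutoff on March 18, 2024 quoted in the subscriber notice).

```python
from datetime import datetime, timedelta, timezone

# Assumption: the clock starts at the cutoff quoted in the March 18 notice.
START = datetime(2024, 3, 18, 21, 40, tzinfo=timezone.utc)
BR_WINDOW = timedelta(days=5)  # the 5-day timeline communicated to subscribers

# Hypothetical model of the tool's delayed-revocation options, keyed by label.
OPTIONS = {
    "Revoke in 3 days": timedelta(days=3),
    "Revoke in 30 days": timedelta(days=30),
}

def revocation_time(start, delay):
    """Latest moment the old certificate stays valid under a given option."""
    return start + delay

def within_br_window(delay):
    """True if the option cannot outlast the 5-day window."""
    return delay <= BR_WINDOW

for label, delay in OPTIONS.items():
    when = revocation_time(START, delay)
    verdict = "within" if within_br_window(delay) else "beyond"
    print(f"{label}: revoked by {when:%Y-%m-%d %H:%M} UTC ({verdict} the 5-day window)")
```

Under these assumptions, a subscriber who simply leaves a certificate in the 30-day option keeps it valid well past the 5-day window, while the 3-day option cannot outlast it.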

Revoke in 3 days.png

These updated instructions with the use of the ‘3 day’ revocation button were used when communicating with subscribers for Bug 1897630.

“Complete the Reissue and select "Revoke in 3 days" so your production certificate maintains validity and provides you with sufficient time to perform the replacement. Note: This does NOT mean your certificate will be valid for another 3 days. It is just a mechanism to not immediately revoke your certificate during the replacement process.”

The full communication can be reviewed in the attached notice.

Notice.pdf

Mike Shaver

Jun 14, 2024, 5:01:41 PM
to Bruce Morton, Amir Omidi, Ben Wilson, dev-secur...@mozilla.org
On Fri, Jun 14, 2024 at 4:55 PM 'Bruce Morton' via dev-secur...@mozilla.org <dev-secur...@mozilla.org> wrote:

> Amir, we will respond to the comments from the community, but I want to make it clear that Entrust was absolutely NOT trying to "conceal" anything related to how we do revocation

You redacted a part of the email about how customers were to go about reissuing and revoking, which is *literally* concealing something related to revocation. That’s what redacting is. It’s the only reason that anything is ever redacted. What are you even trying to say here?

> The redacted section provides specific instructions to our subscribers on how to revoke and reissue certificates.

cool, cool

Why was it important to redact that section? It has no confidential information in it as far as I can tell.

I exhort you, for your own sake, to seriously read my guidance in 
https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c64 as to how to properly communicate the reasons for, and details of, the changes that are being listed as things that will improve Entrust’s operations.

Mike

Wayne

Jun 14, 2024, 5:43:06 PM
to dev-secur...@mozilla.org
Even taking Entrust's statements in the past hour at face value we have an issue. At no point have they communicated this change or even implied it was happening despite questioning over the matter for weeks. There is not a single mention like this in their formal report.

There is a serious culture issue at play internally and it needs to be addressed. I said I gave Entrust every opportunity to explain. Why did it take until now for some semblance of an excuse to appear?

Not only that, but we're being told that in incident 1897630 different incident response processes were being followed. This matches the statements there that everything was ad hoc, and emphasizes that incident response processes are not being followed internally even at this stage.

I do, however, appreciate that Entrust have finally brought in their emergency planning personnel, several months late. I wish them the best of luck.

- Wayne

Walt

Jun 18, 2024, 12:49:39 PM
to dev-secur...@mozilla.org
I'd just like to point out that we now have a situation where Entrust is in the position of seemingly valuing the opinion of other Root Programs over Mozilla: https://bugzilla.mozilla.org/show_bug.cgi?id=1890898#c42

In Comment #37, it was hinted at (and made slightly more explicit in #39) that the opinion of the Mozilla RP is that the attempt to re-characterize these certs was not going to be looked kindly upon, and only once a Google RP member explicitly said that it was the Google RP opinion that the certs remained mis-issued was any movement made on re-confirming the mis-issuance and taking action to revoke them.

Also, if we're in a position where Entrust is finally able to commit to revoking certs within a 5 day period (setting aside that these certs technically need a delayed revocation bug as the mis-issuance was known as far back as 2024-04-10), why are other incidents not able to be resolved in this amount of time? Is it because Google showed up?  

Mike Shaver

Jun 18, 2024, 1:12:19 PM
to Walt, dev-secur...@mozilla.org
On Tue, Jun 18, 2024 at 12:49 PM Walt <walter...@gmail.com> wrote:
> I'd just like to point out that we now have a situation where Entrust is in the position of seemingly valuing the opinion of other Root Programs over Mozilla: https://bugzilla.mozilla.org/show_bug.cgi?id=1890898#c42
>
> In Comment #37, it was hinted at (and made slightly more explicit in #39) that the opinion of the Mozilla RP is that the attempt to re-characterize these certs was not going to be looked kindly upon, and only once a Google RP member explicitly said that it was the Google RP opinion that the certs remained mis-issued was any movement made on re-confirming the mis-issuance and taking action to revoke them.
>
> Also, if we're in a position where Entrust is finally able to commit to revoking certs within a 5 day period (setting aside that these certs technically need a delayed revocation bug as the mis-issuance was known as far back as 2024-04-10), why are other incidents not able to be resolved in this amount of time? Is it because Google showed up?

We’ve seen this behaviour in other incidents as well, I believe including the cpsURI one that has turned into a magnet for evidence of poor operation and lack of transparency and responsiveness. I remarked on it in my initial snarky reply to the Entrust Report, in fact.

From a realpolitik perspective their behaviour could indeed be rational, especially when the only tool root programs have is distrust. Firefox would suffer substantial market disadvantage if it stopped trusting Entrust certificates when other browsers didn’t. I think people generally underestimate how much Mozilla would be willing to take near-term pain to protect users, but it’s also possible that I am overestimating it.

Related to that, I think Chrome’s root program representatives have generally been more willing to take a concrete position quickly, so Mozilla might be waiting for more explanation when Chrome decides that there’s no explanation that could suffice, or similar. The root programs tend to be in agreement more often than not (virtually always with Chrome and Mozilla, I would say, excepting some slightly different root store populations), so it may be somewhat irrelevant whose opinion spurs motion.

Realpolitik analysis aside, I do agree that Entrust has created the impression that they care much more about Chrome’s opinion than Mozilla’s, which IMO might not be the best posture to take given that Mozilla and its community are the locus for the processing and evaluation of the incidents in question.

Mike



Amir Omidi (aaomidi)

Jun 18, 2024, 1:35:48 PM
to dev-secur...@mozilla.org, Mike Shaver, dev-secur...@mozilla.org, Walt
I am not going to say with certainty that Entrust is definitely putting Chrome over Mozilla. However, I hope they know that most Linux systems out there use the Mozilla root store directly.

Bruce Morton

Jun 21, 2024, 2:59:25 PM
to dev-secur...@mozilla.org
Attached is a letter from Bhagwat Swaroop, President of Entrust Digital Security Solutions, along with an updated response to address questions from the community.

Thanks, Bruce.

Entrust CAB Letter 06.21.24 1.pdf
Report to the Mozilla Community - Update June 21, 2024 4.pdf

Mike Shaver

Jun 21, 2024, 3:21:08 PM
to Bruce Morton, dev-secur...@mozilla.org
Thanks, Bruce.

On first quick read of the response, I have some concerns about specific elements but the level of detail and specificity is much more appropriate, IMO, than with the first response. Thank you for those additions.

What is the best way to provide feedback on this improved response? I think there are a few important questions still open.

Mike


Amir Omidi (aaomidi)

Jun 21, 2024, 3:31:07 PM
to dev-secur...@mozilla.org, Mike Shaver, dev-secur...@mozilla.org, Bruce Morton
Quick preliminary question:

Is this now the final report? The final report that was due two weeks ago.

Can you explain how this document is going to reconcile the recent response we got from Entrust over this bug? https://bugzilla.mozilla.org/show_bug.cgi?id=1890685#c46

Specifically:

> Thanks, Tim. In Comment 29 posted on June 5, 2024, we issued an updated incident report for this bug stating that we no longer believe this is a mis-issuance. Given this position, there should be no need for further reporting as described in your Question 1.

Ben Wilson

Jun 21, 2024, 3:48:20 PM
to Mike Shaver, Bruce Morton, dev-secur...@mozilla.org
Thanks.
I think the best way to respond is for each person to gather all of their comments into a single email with a list of remaining issues found and then submit it to this thread.
Thanks,
Ben

Wayne

Jun 21, 2024, 5:17:30 PM
to dev-secur...@mozilla.org
This has been written without checking prior replies - there may be overlap.

First off, good work on the new report addressing more matters; however, this should have been your original report at a minimum. Before I even start, I will state outright that I hope Entrust actually improves throughout this, and while this comment will be cleaned up, it reflects an ongoing opinion formed as the report is read.

First looking at the letter I will only note this paragraph:
"We are disappointed as this does not represent Entrust values and falls short of the standards we set for ourselves. We also want to make sure it is understood that none of these lapses have been malicious or done with ill-intent to make the internet less secure. As a global CA we must walk a tightrope in balancing the requirements of the root programs and subscriber needs, especially for critical infrastructure. In some cases, we did not strike the right balance."

It does trouble me that compliance is seen as a balancing point against issuance for critical infrastructure. A common talking point in Entrust's delayed revocation incidents has been the concept of irresponsible revocation. I point out to Entrust that such a scenario only presents itself when a CA is culpable of irresponsible issuance.

As I read through this I keep seeing a repeating pattern of changing the organizational structure and creating committees and boards of cross-discipline personnel. While this is all good in theory, I am concerned that this is not addressing the actual root causes of internal decision making and that the outputs will be just the same with a different label on the team providing them.

Before I delve into any minutiae of the report itself, I do find it noteworthy that in incident #1890898 (Entrust: Failure to revoke OV TLS - CPS typographical (text placement) error) we have a functional example of the new cross-functional team evaluating compliance and coming to a decision. Now this could be a third unspecified team, but given the report I presume this is the template going forward. For brevity I'll keep this to the broad strokes:

2024-04-11: Issue opens, mis-issuance confirmed but no intent to revoke. A long conversation ensues, nothing changes until a day before the June 7th report appears.

2024-06-06: "We reviewed and consulted with independent external experts on this revised analysis, and based on this broader consultation, we now believe there was no mis-issuance and thus no need to revoke the affected certificates. A detailed analysis is below."

Following that analysis Mozilla and Chrome's Root Programs give a different opinion.

2024-06-18: "On this basis, we will treat this as a mis-issuance, and intend to complete revocation by end of day Saturday, June 22."

2024-06-19: "On the last question, our position is that there was no mis-issuance—not that there was a mis-issuance and we decided not to revoke which is the situation that recommends discussion with affected root stores."

Now, using the 06-06 opinion as a basis, we have an example of this new cross-functional team. They reviewed the original incident and came to a conclusion that a) was not the same as Entrust's in April, but crucially b) was not compatible with the viewpoints of the Root Programs who spoke up. I strongly believe in evaluating institutional changes not on their stated internal changes, but on their outputs. The decisions are all we will see; they are all that will matter in practice.

I will not go line by line, but I do notice that some factual discrepancies in the original report have been addressed. It would be good to find out how those came to be in the first place. There are still outstanding ones that I have already stated previously.

>>Note: During our investigation of this issue, we noted that a subset of 1,975 EV certificates were also issued without the Entrust EV policy identifier (OID), based on our interpretation of the ballot update.
>This is also a miscount, presumably due to the original figure being 1963 + 6 certs on a test site that are being double-counted.

On reading further, in 2.1.1 Entrust have outright stated they still stand by their incorrect analysis, as previously noted in this reply. This speaks volumes as to the decisions that will occur going forward. Within 2.1.3 there is a mention of Entrust continuing to issue certificates and advocate their position, but I am seeing no reflection on the root cause of why they advocate for their incorrect positions to this day. Not a single line of 2.1.4 addresses this either.

Oddly 2.2.3 does not mention that on April 3rd "The issue was escalated to our verification team for further investigation.". Instead it purports a subtly different timeline where nothing happened until the 15th. The April 4th issue as stated in the bugzilla timeline is also absent.

It is at this point in the report that my original reply must have gotten lost as I still have outstanding issues. I am quoting my original reply below:

>>2.3.4 Improvement Plan
>>...
>>Automate CPR form to collect all required information at the outset from the reporter rather than relying solely on email
>This goes back to policy issues discussed for years now, see:
>https://github.com/mozilla/pkipolicy/issues/98
>https://github.com/cabforum/servercert/issues/201
>https://bugzilla.mozilla.org/show_bug.cgi?id=1650234

Now, moving on. In 2.4.1 I am again mis-identified as a reporter of the EV cert issue. This does not factually matter but is amusing as the initial factual corrections show that part of my response was read and applied.

The only significant change I can see in 2.5.1 is the insistence that Entrust's analysis, which concluded that no mis-issuance existed in the OV TLS typo issue, must still be correct. As previously stated, I do not see how this is compatible with multiple Root Programs stating otherwise.

I am confused about 2.5.3, though: it is about delayed revocation, but the RCA is focused on the technical issue in the original incidents. 2.5.4 contradicts itself from paragraph to paragraph: a commitment to revocation and replacement, and then statements that delays will be managed on a case-by-case basis.

It is a bit troubling to see the conclusion state the following:
"The mis-issuances we experienced were technical non-conformities and, had any one of them happened in isolation, they would not have resulted in us taking such a hard look at our program and finding the opportunities that we did."

Regarding ACME, I previously stated this question and will repeat it now: Can you make any guarantees that ACME will be a requirement for subscribers going forward, and that they will not be charged extra for using these systems?

Looking into 4.3 Appendix 3: Success Measures I won't address each individually. I am curious how you intend to get the WebTrust annual audit results to result in 0 qualifications in the space of a year. I would suggest an element for Communication is added to address how often a question has to be restated or followed up on due to a lack of clarity and transparency. Otherwise the list presents a minimal standard for any complying CA, if this is not kept by any CA it would be further cause for concern.

Once again, in evaluating against what was requested, I am struck by how the systemic failures are not being addressed. We have commitments to committees and boards, but the decisions are what truly matter. There is no mention of what policies caused these initial issues and how they were not adhered to. The 2020 commitments are only highlighted because every comment noted them specifically; there seems to be no attempt to evaluate against historical issues.

On the 2020 commitments I am deeply troubled about this statement in particular:
"Knowledge of 2020 commitments was similarly confined to a small number of business unit employees, without broader leadership team/organizational awareness."
This should have come up in audits, which cover incidents on Bugzilla. What happened? Did the auditor only address this with the same small number of business unit employees and somehow no note of these commitments made it into any report that went further up the chain? What confidence can we have in any Bugzilla-specific commitments outside of this report going forward?

As a final note I will highlight this section:
"As part of our response process to the Mozilla community, Entrust assigned a group of three senior leaders, as well as an external consultant, to review each incident to validate and expand root cause analysis."

Can we please have a breakdown on Entrust's end of what their original opinion was at the start of each incident, and how these personnel would evaluate the situation if it were to happen today? I sincerely hope that #1890898 is not an example going forward.

The point of incident reports and action items is to ensure things do not repeat, knowing that the decision-making process is repaired would be one small step.

- Wayne

Ryan Hurst

Jun 22, 2024, 6:39:03 PM
to dev-secur...@mozilla.org, Wayne

Part of me wants to commend Entrust for this response. If we can believe its sincerity (and this is an "if", given their recent history and how this has played out), it took 13 compliance incidents and 107 days for their leadership to recognize, at least publicly, the systemic issues that have happened under their watch, and that does not even count the fact that this has been a problem since at least 2020.



The disappointing thing is that here we are, 30% into the year, and what we have is a commitment to restructure a part of Entrust and fund it to do better without concrete actions to address the specific issues. Meanwhile, they are still trusted and exposing the internet to their continued management challenges. I can’t help but think this response is too little, too late. With that said, it does indicate some level of recognition of how bad things have gotten, which is a step in the right direction.


With all that said, it’s difficult to imagine ISRG or Sectigo, for example, showing the same level of disregard for the processes at play or taking this long to get to this point. While this organizational change might help address that, at the same time as of three days ago, it appears that Entrust was still suggesting that EV certs issued in violation of their CPS weren’t actually misissued. This raises questions about whether they have truly internalized the gravity of the situation or if this public gesture is just that—a gesture.


Beyond that, the thing that I can’t help but ask myself is how long is too long. I’ve not yet gone back and looked at the average response time for other incidents, but just from looking at this thread and the associated bugs since March 6th, there are still missing responses/updates that were promised, and those that were provided were shallow. In my experience as an engineering leader, my first priority in a situation like this would be to ensure that we never missed a promised or obligated response, the second would be to make sure we had done everything possible to address the identified issues immediately. It’s unfortunate that at this point, we are not even there yet. Entrust has had every opportunity to do the right thing, but even with the world watching, they didn’t seem to prioritize it. As a result, today’s response might be better categorized as performative.


Ryan Hurst

Zacharias Björngren

Jun 23, 2024, 3:39:35 PM
to dev-secur...@mozilla.org, Ryan Hurst, Wayne
Missing the point

> As a global CA we must walk a tightrope in balancing the requirements of the root programs and subscriber needs, especially for critical infrastructure.

This is a very worrying sentence. It seems that both Entrust and many of their subscribers (even more worryingly, subscribers responsible for critical infrastructure) completely misunderstand what the purpose of the requirements of the root programs is. These rules, requirements, guidelines, policies, &c are here to keep us safe. And I don't mean us as in relying parties, I mean us as in everyone. That there is a need to balance these requirements against the needs of Entrust subscribers makes me worry about what those subscribers are doing. Why are so many organizations running critical infrastructure not prioritizing following safety regulation?

> Many of our customers represent critical infrastructure due to their roles in the financial system, government, transportation, and other industries and there are real challenges in meeting the guidelines. We recognize that it is not the responsibility of our subscribers to resolve these conflicts. It is our responsibility as part of our commitment to meeting the CA/Browser Forum requirements and protecting the WebPKI.

It's a CA's responsibility to revoke certificates when required. When this cannot be done without causing significant harm because of subscribers' lack of capability to handle such a revocation event, then ensuring that a future revocation event can be handled without causing significant harm is a shared responsibility between the CA and their subscribers. In Mozilla's Responding to an Incident, the final listed point of expectations in the case of delayed revocation states:

> * You will perform an analysis to determine the factors that prevented timely revocation of the certificates, and include a set of remediation actions in the final incident report that aim to prevent future revocation delays.

If the causes that prevent a timely revocation while avoiding significant harm are internal to the subscribers, then remediation actions must involve those subscribers. I understand that many of Entrust's customers are enormous corporations that can need time to implement the necessary changes. But once a CA becomes aware that one of their subscribers isn't capable of handling revocation as required by the BRs, then future issuance of certificates to that subscriber must be predicated on that subscriber making commitments to be able to handle timely revocation. Obviously we don't want to risk harm in a future revocation event, but without requiring the subscriber to make these commitments you are in fact making it your policy to not apply the BR revocation deadlines for that subscriber.

In Comment#82 on bug 1886532 (https://bugzilla.mozilla.org/show_bug.cgi?id=1886532#c82) Bruce Morton writes:

> Although it has been difficult, we now understand that the community places priority on strict adherence to the rules, and views revocation as a tool to influence subscribers into modifying how they use TLS certificates, and is willing to accept much more harm to subscribers and users of the internet than Entrust believed was acceptable.

I want to clarify that for me, this isn't golf. I want these stewards of critical infrastructure to adhere to the rules because if they don't, I believe that we risk much greater harm in the future. If a subscriber genuinely requires weeks and months to reissue their certificates without causing significant harm, then I agree that a delay in revocation would be prudent. But it is simply unacceptable for organizations controlling such critical infrastructure to be so extremely incapable, for technological or organizational reasons. The statement "and views revocation as a tool to influence subscribers into modifying how they use TLS certificates" implies that Entrust did not believe that revocation should be used to influence how webPKI certificates should be used, but is that not what revocation is? If a subscriber is not using a certificate according to the BRs or TLS Guidelines it must be revoked; the threat of revocation is literally a tool to influence the behavior of subscribers.

What I believe that the community is trying to communicate is that we are trying to avoid future harm from happening. Assurances from Entrust about how they **would** be able to revoke within 24h, I assume without causing significant harm, for a security issue ring very hollow when Entrust is demonstrably incapable of revocation within weeks or months when the problem is "only mis-issuance". But if we assume that we can trust that Entrust and their subscribers can handle a mass revocation event within 24h in case of a security breach, that still leaves us with this:

> In our conversations with Subscribers, we transparently disclosed that there was no security risk to relying parties if the affected certificates were not revoked, and this context understandably influenced the prioritization.

"and this context understandably influenced the prioritization." Is there any other interpretation of this sentence than: we could follow the rules, but we would rather spend our money elsewhere? Taken with the statements from the June 21st update, it is clear to me that the harm that Entrust is trying to avoid is the cost of following the requirements of the root programs.

Refusal to learn, bug 1890898

It's important to seize the opportunity to learn from your incidents. Why is Entrust so stubbornly clinging to their analysis in #1890898 that the certificates weren't mis-issued? I have not seen a single member of the webPKI community outside of Entrust share this position. Two root programs disagree with Entrust. The response from Entrust should not be: "We still think that we are right, but you're the boss", it should be: "%#?!, how could we come to such a different conclusion from everyone else?" The root cause analysis for this section is about how the certificates came to be mis-issued; it is missing completely the root cause for why Entrust ~~was~~ is not aligned with the rest of the webPKI community when it comes to interpreting how the TLS BRs and EV Guidelines interact with their CPS in this issue. In their June 7th report Entrust thanks industry expert Don Sheehy for his contributions. While it might not be polite to put him on the spot, I believe it would be very interesting to hear directly from him about whether he agrees with Entrust and, if he disagrees, whether Entrust knew but still chose to proceed with their own analysis.

It's good to see that 1890898 was included in the updated report, but that leaves us with the fact that it's one of the issues listed by Mozilla on https://wiki.mozilla.org/CA/Entrust_Issues that are the subject of the requested report. As late as June 5th they posted their "revised analysis"; while it is possible that they hadn't yet written anything in their June 7th report about the issue, it's not very likely. It would be interesting to see what they had in their drafts regarding 1890898 and, if anything, when it was removed. But the complete lack of actual learning for this issue is incredibly alarming and undermines any attempt to believe that Entrust can be a functioning member of the webPKI community.

Broken promises
I don't know exactly how to approach this issue but I haven't seen it addressed by others, and I think it needs to be confronted even if it is about actions (or inaction) of specific persons. I am open to the possibility that others when faced with the same decision came to the opposite conclusion, and that I am in fact wrong in taking up this issue. I am trying to do this as respectfully as possible.

In the June 21st report we can read:
> Second, our organizational design and governance impeded senior leadership and cross-functional evaluation and awareness of CA/B requirements. Key decisions were in the hands of a limited number of employees in the digital certificates business unit who held duties across multiple functions, including compliance. When responding to recent incidents, the team did not adequately communicate applicable requirements to senior leadership. This led to incorrect decisions and instances where we did not follow delayed revocation and reporting processes as laid out by Mozilla, including late and incomplete incident reports. Knowledge of 2020 commitments was similarly confined to a small number of business unit employees, without broader leadership team/organizational awareness.

I don't understand this explanation. Are senior leadership the ones making the decision to delay revocation or not revoke? Those decisions are communicated to the community via Bugzilla, and is that not done through the business unit employees that have knowledge of the 2020 commitments? It's the same person posting "We will not the make the decision not to revoke." in 1651481 who this year posted "we decided to not revoke due to exceptional conditions listed in this report." in 1890898. I doubt that senior leadership, or their proxies, weren't informed about those commitments; more likely they did not understand or care about how serious they were.

The organizational changes that were finally specified more clearly in the June 21st report are obviously long overdue, but I doubt that they will have the impact needed for me to trust Entrust.

Conclusions

While the June 21st report is a much better attempt than the June 7th report, I believe that it still falls short of what is expected and required.

They hear the community saying: It's your responsibility to revoke, not the subscribers'. They think they understand, but they don't. If they don't understand that the requirements of the root programs are there to keep us safe, how are they supposed to educate their subscribers of that fact?

Then they miss the point again. While they are justifiably getting a lot of criticism over their failures to revoke on time, Entrust fails to understand that what the community wants to see are improvements so that the same mistakes don't happen again. I worry about what harm could happen in the future when there is a security issue if nothing has changed for the subscribers that cannot handle revocation within 5 days.

When it comes to 1890898, perhaps Entrust feels the need to stick to their "revised final incident report" so that they can appear consistent, but if that is so I think it would be a great mistake. We are here now because of serial poor decision-making and poor incident responses. For me, Entrust's obstinate refusal to actually change is exemplified in 1890898, and completely undermines any trust I have in them.

Zacharias

Amir Omidi (aaomidi)

Jun 24, 2024, 12:43:52 PM
to dev-secur...@mozilla.org, Zacharias Björngren, Ryan Hurst, Wayne

Let's take a step back from this report. I don't think this report deserves to be taken seriously for one reason alone: You've historically proven to the community that we should not trust any statements made by Entrust. Let's look at how you've proven this:


First - Four years ago, you made a couple of promises in comment 6 of 1651481:


    • We will not the make the decision not to revoke.

    • We will plan to revoke within the 24 hours or 5 days as applicable for the incident.

    None of these promises have been realized in the past four years. Why is this time going to be different? How are we even supposed to measure your commitment to your current action items?


    Second - In this report, you're claiming that a lot of these issues stem from organizational structure. Meanwhile, 3 months ago, Entrust was claiming that:


    This issue has been prioritized at the highest levels within Entrust. We have hundreds of people across Entrust working on remediation—including our senior leadership as well as teams from Customer Support, Operations, Sales, Legal, Compliance, and Product Management, and we have been working hand in hand with executives at Global 2000 companies who are impacted. Our colleagues are working around the clock to support our customers, meet CA/B Forum expectations, and expedite revocation and re-issuance of affected


    So putting these two together, what Entrust seems to be doing is pressing shuffle on the same playlist that's led to all of this.


    Third - As you were sending out this Entrust Report Final-ForRealThisTime.pdf, we had Entrust continue to make nonsensical arguments. Even after it was pointed out by both Chrome, and Mozilla, that what you're doing is not okay.


    To be very clear here, this comment by Entrust was made on 2024-06-19 while we received this new report from Entrust on 2024-06-21. Entrust has not even bothered covering this incident in this report.


    Fourth - As evidenced in your recent incident responses, you don't really care about 1) what the community says 2) what Mozilla says. Time and time again, I've only seen Entrust change their tune on matters when Ryan Dickson (Specifically, Chrome Root Program) chimes in.


    Fifth - Some of your action items make absolutely no sense for a well-established CA:


    • Expand use of linters post-issuance for all certificate types

    • Expand use of linters pre-issuance for all certificate types

    • Implement process during incident review to stop issuing certificates when a mis-issuance event has been confirmed


    Are you claiming you didn't have the linting in place already? Did we learn nothing from all these previous incidents:


    You've had issues with, arguably one of the easiest parts of being a CA, linting. Your issues with linting go back at least six years. Seriously, how do you have so much difficulty with properly implementing pre, and post issuance linting? 


    Beyond that, "Implement process during incident review to stop issuing certificates when a mis-issuance event has been confirmed"


    At Let's Encrypt, and Google Trust Services we used the wording of "suspected". As a CA Engineer, I was empowered to stop issuance at any time if I suspected mis-issuance was happening. I've used that power both correctly, and incorrectly in the past in those CAs and it wasn't a big deal. Why are you waiting until a mis-issuance event has been confirmed? Which seems to take at least 24 hours at Entrust.


    Beyond the language used there being problematic, I'm extremely shocked that this isn't done yet? This was one of the main problems in Entrust's response to the cpsUri incident. How has it taken you nearly five months to address this?


    Sixth - Is this report now happening under the new leadership of compliance? How about the report prior to this? The tone of these two reports are so significantly different that it seems like something changed between these two. What changed between these two incident reports that caused such a significant change in the tone of the report? 


    Moving beyond the tone of these reports - does the new leadership of compliance see the substance of this current report as a satisfactory response to how much Entrust has dropped the ball recently?


    In conclusion - there were so many things you could've done that would've been significantly better than this bag of words in the format of a pdf. I'll list a couple:

    • Immediately revoke all the certificates that you're still in the process of revoking nearly three months later.

    • Reduce the lifetime of your certificates to 180 days until the end of 2024, and then to 90 days once 2025 starts.

    • Sunset the ability for non-automated certificate issuance to take place.


    These are all actions that are externally verifiable, and actually show a meaningful change by Entrust.


    To the community: Entrust, using their own words in this report, admits that they're unable to properly meet the requirements of being a CA due to their organizational limitations, and have created certain action items that are impossible to measure from the outside looking in. To me this sounds like a ship that's sinking, and instead of us being allowed to use lifeboats to get to safety, we're being told to wait while they move people around to balance the ship. I do not see any compelling reason to 1) Trust Entrust's claims here, given their past history of not being truthful and not sticking to their promises 2) Assume the risk on the entire WebPKI ecosystem while Entrust tries to figure out their organizational deficiencies.


    My suggestion here would be to distrust Entrust, and let them re-apply for inclusion in the future once they've asserted that they've fixed the deficiencies that have led Entrust and the community down this path.

    Walt

    Jun 24, 2024, 1:25:47 PM (6 days ago) Jun 24
    to dev-secur...@mozilla.org, Amir Omidi (aaomidi), Zacharias Björngren, Ryan Hurst, Wayne
    Some final thoughts on this after re-reading the updated report: 

    First, a deliverable due date is a deliverable due date. If I was asked to answer 7 questions by my management team by a given deliverable date, and provided a report that barely answered two of them, I'd be thinking long and hard about what got me to that point (and the future of my continued employment), and why an updated report took an additional two weeks to create. I would argue that Entrust should be judged based on the initial report primarily, as the due date was very clear, as well as the guidelines for assembling the report. The second report should have been what we saw the first time, but this theme of "Entrust doing closer to the right thing late" seems to be a recurring trend with this series of events. This even goes as far back as 2020, when there was allegedly a promise that this wouldn't happen again (or at least not at the scale it did in 2020) and yet here we are. 

    Two, as Amir noted, there are numerous inconsistencies between the report(s) and the incidents. 

    This issue has been prioritized at the highest levels within Entrust. We have hundreds of people across Entrust working on remediation—including our senior leadership as well as teams from Customer Support, Operations, Sales, Legal, Compliance, and Product Management, and we have been working hand in hand with executives at Global 2000 companies who are impacted. Our colleagues are working around the clock to support our customers, meet CA/B Forum expectations, and expedite revocation and re-issuance of affected

    followed by saying [paraphrased]: "The work we were doing previously wasn't good enough so we tossed it all out". Is the goal in this updated response simply to make it seem like enough changes have been made to kick the can down the road again for a few more years until Entrust makes some relatively simple mistake that could have been caught by linting and then we end up in this same boat again?

    Third, I'll reiterate the fact that Entrust seemingly doesn't take the opinion of Mozilla RP and the associated community seriously. The incident (#1883843) that started this all, was only taken seriously (12 days after posting) when the original reporter (who happened to be with Google RP) came in and said that the response did not meet Google RP's standards. Only then did mis-issuance stop.

    Fourth, I'll re-iterate as well that pre and post issuance linting feels like a pretty table stakes feature to prevent giving customers certificates that are mis-issued, avoiding the awkward conversations you seem to continually be having with your subscribers. Instead you knowingly issue certificates that might be mis-issued to subscribers, don't pause issuance, and wind up digging a bigger hole which could have been avoided if Entrust team members were empowered to pull the circuit breaker at any time.
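
The pre-issuance linting and "circuit breaker" idea described above can be sketched roughly as follows. Everything here is hypothetical: the `IssuancePipeline` class, the dict-based certificate representation, and the `ev_policy_oid_present` lint are stand-ins invented for this illustration, not the API of zlint, pkilint, or any CA's system. The point is only that issuance pauses on suspicion (any lint finding, or a manual `pause`) rather than waiting for a confirmed mis-issuance.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A lint takes a to-be-signed certificate (a plain dict in this sketch)
# and returns a list of human-readable findings; empty means clean.
Lint = Callable[[dict], List[str]]


class IssuancePaused(Exception):
    """Raised when issuance is attempted while the breaker is open."""


def ev_policy_oid_present(tbs: dict) -> List[str]:
    # Hypothetical lint: an EV request must carry at least one policy OID.
    if tbs.get("type") == "EV" and not tbs.get("policy_oids"):
        return ["EV certificate has no policy OID"]
    return []


@dataclass
class IssuancePipeline:
    lints: List[Lint]
    paused_reason: Optional[str] = None
    issued: List[dict] = field(default_factory=list)

    def pause(self, reason: str) -> None:
        """Open the breaker; any engineer may call this on mere suspicion."""
        self.paused_reason = reason

    def resume(self) -> None:
        """Close the breaker once the suspicion has been investigated."""
        self.paused_reason = None

    def issue(self, tbs: dict) -> dict:
        if self.paused_reason is not None:
            raise IssuancePaused(self.paused_reason)
        findings = [msg for lint in self.lints for msg in lint(tbs)]
        if findings:
            # Stop all further issuance until a human has looked at it.
            self.pause("; ".join(findings))
            raise IssuancePaused(self.paused_reason)
        self.issued.append(tbs)
        return tbs
```

In this design a single lint finding blocks not just the offending request but every later one until `resume()` is called, which trades some availability for the guarantee that a suspected problem cannot keep producing certificates while it is being discussed.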

    Fifth, there's very little in this report that's measurable in terms of improvement. It digs deeper into what happened sure, but most of these metrics require trust in Entrust to be sharing these metrics correctly to evaluate their compliance with the report. To put it bluntly, I trust Entrust about as much as I could throw Entrust, which is not very far. Given that, I see no reason to trust these metrics given by Entrust, and as such I see no objective way of measuring these commitments.

    In conclusion, I would second Amir's suggestion that Entrust be distrusted, and re-apply for inclusion once the organizational deficiencies have been resolved.

    Walter

    Watson Ladd

    Jun 24, 2024, 6:07:30 PM (6 days ago) Jun 24
    to Bruce Morton, dev-secur...@mozilla.org
    So I've finally gotten around to reading it. I was a little confused
    by the lack of real introduction but it is much more detailed.

    However, you're still missing one of the 2020 bugs, namely
    https://bugzilla.mozilla.org/show_bug.cgi?id=1658792. And when I look
    at that against https://bugzilla.mozilla.org/show_bug.cgi?id=1897630 I
    see the answer to what confused me about the organization of the bugs:
    Entrust is overly reliant on humans doing things. It's been organized
    in a way where a single human can err and create a mis-issuance by not
    seeing an email, or filling out a form in a natural way. It's a
    systemic weakness that this second report only sort of covered, and
    never really dug into. The answer to the geography issues wasn't
    automated checking: it was changing the UI, enabling a different kind
    of error to still happen.

    Some CAs would have used additional automation as a solution:
    validating geographical information against available lists (as
    promised in that bug). Those CAs would likely also have implemented
    monitoring themselves of their OCSP responders, mapped from the CAB
    forum requirements. Culturally they would have been more oriented to
    creating tickets on emails and using the ticket tracking system to
    help ensure that response times were being met and reported on.
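
As a rough illustration of the "validating geographical information against available lists" approach mentioned above, a check along these lines could reject inconsistent subject fields before issuance instead of relying on a person filling in a form correctly. The two small lookup tables are deliberately incomplete placeholders and the function name is an assumption; a real deployment would load full country and subdivision data from an authoritative source.

```python
from typing import Dict, List, Optional, Set

# Deliberately tiny, illustrative allow-lists (assumption: a real system
# would load complete country and subdivision data instead).
KNOWN_COUNTRIES: Set[str] = {"US", "CA"}
KNOWN_SUBDIVISIONS: Dict[str, Set[str]] = {
    "US": {"California", "Texas"},
    "CA": {"Ontario", "Quebec"},
}


def validate_geography(country: str, state: Optional[str] = None) -> List[str]:
    """Return a list of problems with the subject's country/state fields."""
    errors: List[str] = []
    if country not in KNOWN_COUNTRIES:
        errors.append(f"unknown country code: {country!r}")
        return errors  # cannot check the state without a known country
    if state is not None and state not in KNOWN_SUBDIVISIONS[country]:
        errors.append(f"{state!r} is not a known subdivision of {country}")
    return errors
```

With these placeholder tables, `validate_geography("US", "California")` returns an empty list, while a mismatched pair such as `("US", "Ontario")` returns one error that could block issuance or open a ticket automatically.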

    Sincerely,
    Watson Ladd



    On Fri, Jun 21, 2024 at 11:59 AM 'Bruce Morton' via
    dev-secur...@mozilla.org <dev-secur...@mozilla.org>
    wrote:



    --
    Astra mortemque praestare gradatim

    Suchan Seo

    Jun 24, 2024, 8:33:26 PM (5 days ago) Jun 24
    to dev-secur...@mozilla.org, Watson Ladd, dev-secur...@mozilla.org, Bruce Morton
    While technically out of scope for this thread, is there a reason for browsers to include the offending certificates in CRLite/CRLSets without waiting for the CA to agree to that?

    On Tuesday, June 25, 2024 at 7:07:30 AM UTC+9, Watson Ladd wrote:

    Bruce Morton

    Jun 25, 2024, 4:55:01 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org, Amir Omidi (aaomidi), Mike Shaver, dev-secur...@mozilla.org, Bruce Morton
    On Friday, June 21, 2024 at 3:31:07 PM UTC-4 Amir Omidi (aaomidi) wrote:
    Quick preliminary question:

    Is this now the final report? The final report that was due two weeks ago.

    This is an update to the final report we issued on June 7. The updates are based on comments and questions from this community over the past two weeks. 
     
    Can you explain how this document is going to reconcile the recent response we got from Entrust over this bug? https://bugzilla.mozilla.org/show_bug.cgi?id=1890685#c46

    As outlined in our report, in the event of a future issue, we will follow our incident response process to ensure swift cross-functional review, decision making and action consistent with CA/B requirements.  
     

    Amir Omidi (aaomidi)

    Jun 25, 2024, 4:56:36 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org, Bruce Morton, Amir Omidi (aaomidi), Mike Shaver, dev-secur...@mozilla.org
    > As outlined in our report, in the event of a future issue, we will follow our incident response process to ensure swift cross-functional review, decision making and action consistent with CA/B requirements. 

    Your report makes no mention of that incident.

    Bruce Morton

    Jun 25, 2024, 5:01:47 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org, Wayne
    On Friday, June 21, 2024 at 5:17:30 PM UTC-4 Wayne wrote:
    >>Note: During our investigation of this issue, we noted that a subset of 1,975 EV certificates were also issued without the Entrust EV policy identifier (OID), based on our interpretation of the ballot update.
    >This is also a miscount, presumably due to the original figure being 1963 + 6 certs on a test site that are being double-counted.

    On reading further in 2.1.1 Entrust have outright stated they still stand by their incorrect analysis as previously noted in this reply. This speaks volumes as to the decisions that will occur going forward. Within 2.1.3 there is a mention of Entrust continuing to issue certificates and advocate their position, but I am seeing no reflection as to the root cause of what causes them to advocate for their incorrect positions to this day. Not a single line of 2.1.4 addresses this either.

    We have clarified our position on this bug https://bugzilla.mozilla.org/show_bug.cgi?id=1890898#c61.
     
    Regarding ACME, I previously stated this question and will repeat it now: Can you make any guarantees that ACME will be a requirement for subscribers going forward, and that they will not be charged extra for using these systems?

     We are offering ACME free of charge now and are adding support for ARI. We will increase efforts to educate and promote adoption, but we cannot guarantee it as not all subscriber environments support ACME.  

    Looking into 4.3 Appendix 3: Success Measures I won't address each individually. I am curious how you intend to get the WebTrust annual audit results to result in 0 qualifications in the space of a year. I would suggest an element for Communication is added to address how often a question has to be restated or followed up on due to a lack of clarity and transparency. Otherwise the list presents a minimal standard for any complying CA, if this is not kept by any CA it would be further cause for concern.

    Entrust will have qualifications which have already been reported in our current incident reports. These qualifications are being addressed and will be closed in this audit period. Our goal moving forward is to have 0 new qualifications in our current audit period.

    Once again in evaluating against what was requested I am struck at how the systemic failures are not being addressed. We have commitments to committees and boards, but the decisions are what truly matter. There is no mention of what policies caused these initial issues and how they were not adhered to. The 2020 commitments are only highlighted due to every comment noting it specifically, no attempt seems to exist to evaluate against historical issues.

    On the 2020 commitments I am deeply troubled about this statement in particular:
    "Knowledge of 2020 commitments was similarly confined to a small number of business unit employees, without broader leadership team/organizational awareness."
    This should have come up in audits which cover incidents on Bugzilla. What happened? Did the auditor only address this with the same small number of business unit employees and somehow no note of these commitments made it into any report that went further up the chain? What confidence can we have in any Bugzilla-specific commitments outside of this report going forward?

    Yes, that is correct. The issues were addressed with a small number of business unit employees. We have made significant changes (as described in our response) to ensure that Bugzilla commitments are tracked and met moving forward through our corporate compliance governance process. It's a process we use effectively today for other Entrust business units. 
     
    As a final note I will highlight this section:
    "As part of our response process to the Mozilla community, Entrust assigned a group of three senior leaders, as well as an external consultant, to review each incident to validate and expand root cause analysis."

    Can we please have a breakdown on Entrust's end of what their original opinion was at the start of each incident, and how these personnel would evaluate the situation if it were to happen today? I sincerely hope that #1890898 is not an example going forward.

    Each of these issues had a different set of circumstances, but the common element was that they all were decided by a small number of business unit employees without broader cross-functional review and alignment on requirements and the process for decision making. Moving forward, we will initiate our incident response process (outlined in our response), to ensure swift action consistent with CA/B requirements.  

    Wayne

    Jun 25, 2024, 5:07:19 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org
    Further to the latest report making no mention of that incident, in #1890898 there is a statement of intent of there being a second amended final report:
    > Yes, we will amend the final incident report to reflect that we did our analysis

    Is there any intent on there being a final report or is this cycle going to continue? If a judgment on Entrust's ability to make concrete statements will be made, there needs to be a final statement at some point.

    And in regard to your response to my statement, let us not turn this into an ongoing back-and-forth with individuals. I will only address the last question, whereby Entrust will not share how their new team will show any divergence from previous mistakes. The point of asking for an analysis of the 'new team' versus what happened historically is to provide some trust that the same mistakes will not be repeated with a new badge in place. The lack of confidence in being able to provide that is distressing at this stage.

    - Wayne

    Bruce Morton

    Jun 25, 2024, 5:07:29 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org, Zacharias Björngren, Ryan Hurst, Wayne
    On Sunday, June 23, 2024 at 3:39:35 PM UTC-4 Zacharias Björngren wrote:
    Missing the point
    > As a global CA we must walk a tightrope in balancing the requirements of the root programs and subscriber needs, especially for critical infrastructure.

    This is a very worrying sentence. It seems that both Entrust and many of their subscribers (even more worryingly subscribers responsible for critical infrastructure) completely misunderstand what the purpose of the requirements of the root programs are. These rules, requirements, guidelines, policies, &c are here to keep us safe. And I don't mean us as in relying parties, I mean us as in everyone. That there is a need to balance these requirements against the needs of Entrust subscribers makes me worry about what those subscribers are doing. Why are so many organizations running critical infrastructure not prioritizing following safety regulation?

    We serve many of the world’s largest banks, governments, and enterprises and are confident that they do prioritize safety and compliance requirements from a wide variety of regulatory bodies. We are working with them to ensure they are clear on WebPKI compliance requirements moving forward.  If there are use cases in which a privately-rooted environment would be more suitable, we will have those discussions.

    Refusal to learn, bug 1890898

    It's important to seize the opportunity to learn from your incidents. Why is Entrust so stubbornly clinging to their analysis in #1890898 that the certificates weren't mis-issued?

     That is not the case. Please see https://bugzilla.mozilla.org/show_bug.cgi?id=1890898#c61 where this has been addressed.

    I don't understand this explanation, are senior leadership the ones making the decision to delay revocation or not revoke?

     With the process changes described in our updated response, these decisions are now made through a cross-functional compliance review process with senior leadership. This will provide more proactive oversight.  

    But those decisions are communicated to the community via Bugzilla, and is that not done through the business unit employees that have knowledge of the 2020 commitments? It's the same person posting: "We will not the make the decision not to revoke." in 1651481, that this year posted: "we decided to not revoke due to exceptional conditions listed in this report." in 1890898.

     Yes, the business unit employees who made the posts knew of our 2020 commitments. They have been moved into our corporate compliance organization for increased communication and governance.


    Bruce Morton

    Jun 25, 2024, 5:12:13 PM (5 days ago) Jun 25
    to dev-secur...@mozilla.org, Amir Omidi (aaomidi), Zacharias Björngren, Ryan Hurst, Wayne
    On Monday, June 24, 2024 at 12:43:52 PM UTC-4 Amir Omidi (aaomidi) wrote:

    Let's take a step back from this report. I don't think this report deserves to be taken seriously for one reason alone: You've historically proven to the community that we should not trust any statements made by Entrust. Let's look at how you've proven this:


    First - Four years ago, you made a couple of promises in comment 6 of 1651481:


      • We will not the make the decision not to revoke.

      • We will plan to revoke within the 24 hours or 5 days as applicable for the incident.

      None of these promises have been realized in the past four years. Why is this time going to be different? How are we even supposed to measure your commitment to your current action items?


      Our updated report addresses this. What will be different is addressed in the overview section. Action items and metrics are in Appendices 2 & 3. Per Bhagwat Swaroop’s letter to the community, we will provide regular progress reports to the community. If we receive additional comments that should be incorporated into our action plan, we will do that.

      Second - In this report, you're claiming that a lot of these issues stem from organizational structure. Meanwhile, 3 months ago, Entrust was claiming that:


      This issue has been prioritized at the highest levels within Entrust. We have hundreds of people across Entrust working on remediation—including our senior leadership as well as teams from Customer Support, Operations, Sales, Legal, Compliance, and Product Management, and we have been working hand in hand with executives at Global 2000 companies who are impacted. Our colleagues are working around the clock to support our customers, meet CA/B Forum expectations, and expedite revocation and re-issuance of affected


      So putting these two together, what Entrust seems to be doing is pressing shuffle on the same playlist that's led to all of this.


      Third - As you were sending out this Entrust Report Final-ForRealThisTime.pdf, we had Entrust continue to make nonsensical arguments. Even after it was pointed out by both Chrome, and Mozilla, that what you're doing is not okay.


      To be very clear here, this comment by Entrust was made on 2024-06-19 while we received this new report from Entrust on 2024-06-21. Entrust has not even bothered covering this incident in this report.


      Fourth - As evidenced in your recent incident responses, you don't really care about 1) what the community says 2) what Mozilla says. Time and time again, I've only seen Entrust change their tune on matters when Ryan Dickson (Specifically, Chrome Root Program) chimes in.


      Fifth - Some of your action items make absolutely no sense for a well-established CA:


      • Expand use of linters post-issuance for all certificate types

      • Expand use of linters pre-issuance for all certificate types

      • Implement process during incident review to stop issuing certificates when a mis-issuance event has been confirmed


      Are you claiming you didn't have the linting in place already? Did we learn nothing from all these previous incidents:


      You've had issues with, arguably one of the easiest parts of being a CA, linting. Your issues with linting go back at least six years. Seriously, how do you have so much difficulty with properly implementing pre, and post issuance linting? 


      Linting has been standard operating process at Entrust since March 2018 for post-linting and May 2019 for pre-linting. We use zlint, and it was in use for pre and post linting during this incident. We plan to expand our linting capability by adding pkilint. With the next platform release (July 30, 2024), we will have linting coverage from both tools pre and post issuance.

      Beyond that, "Implement process during incident review to stop issuing certificates when a mis-issuance event has been confirmed"


      At Let's Encrypt, and Google Trust Services we used the wording of "suspected". As a CA Engineer, I was empowered to stop issuance at any time if I suspected mis-issuance was happening. I've used that power both correctly, and incorrectly in the past in those CAs and it wasn't a big deal. Why are you waiting until a mis-issuance event has been confirmed? Which seems to take at least 24 hours at Entrust.


      Beyond the language used there being problematic, I'm extremely shocked that this isn't done yet? This was one of the main problems in Entrust's response to the cpsUri incident. How has it taken you nearly five months to address this?


       This happened in the past. As outlined in the report, we’ve taken measures to improve our processes and compliance moving forward. And yes, we are empowered to stop issuing while we investigate possible mis-issuance.  


      Sixth - Is this report now happening under the new leadership of compliance? How about the report prior to this? The tone of these two reports are so significantly different that it seems like something changed between these two. What changed between these two incident reports that caused such a significant change in the tone of the report? 


       The same team wrote both reports. The revised report has additional content to address community comments.  

      Bruce Morton

      Jun 25, 2024, 5:15:20 PM (5 days ago) Jun 25
      to dev-secur...@mozilla.org, Walt, Amir Omidi (aaomidi), Zacharias Björngren, Ryan Hurst, Wayne
      On Monday, June 24, 2024 at 1:25:47 PM UTC-4 Walt wrote:
      Some final thoughts on this after re-reading the updated report: 

      First, a deliverable due date is a deliverable due date. If I was asked to answer 7 questions by my management team by a given deliverable date, and provided a report that barely answered two of them, I'd be thinking long and hard about what got me to that point (and the future of my continued employment), and why an updated report took an additional two weeks to create. I would argue that Entrust should be judged based on the initial report primarily, as the due date was very clear, as well as the guidelines for assembling the report. The second report should have been what we saw the first time, but this theme of "Entrust doing closer to the right thing late" seems to be a recurring trend with this series of events. This even goes as far back as 2020, when there was allegedly a promise that this wouldn't happen again (or at least not at the scale it did in 2020) and yet here we are. 

      Two, as Amir noted, there are numerous inconsistencies between the report(s) and the incidents. 

      This issue has been prioritized at the highest levels within Entrust. We have hundreds of people across Entrust working on remediation—including our senior leadership as well as teams from Customer Support, Operations, Sales, Legal, Compliance, and Product Management, and we have been working hand in hand with executives at Global 2000 companies who are impacted. Our colleagues are working around the clock to support our customers, meet CA/B Forum expectations, and expedite revocation and re-issuance of affected

      followed by saying [paraphrased]: "The work we were doing previously wasn't good enough so we tossed it all out". Is the goal in this updated response simply to make it seem like enough changes have been made to kick the can down the road again for a few more years until Entrust makes some relatively simple mistake that could have been caught by linting and then we end up in this same boat again?

       The updated report included additional details based on comments and feedback that helped us understand the level of detail the community wants to have visibility to.

      Mike Shaver

      Jun 25, 2024, 5:19:22 PM (5 days ago) Jun 25
      to dev-secur...@mozilla.org
      While you’re addressing comments, I’d appreciate an answer to my question here: what was the motivation behind redacting that portion of the email to customers, if not to conceal information related to redaction procedures?

      You want to make it clear that you aren’t concealing anything, but you haven’t given us any reason to believe otherwise.

      Mike

      Mike Shaver

      Jun 26, 2024, 6:39:29 PM (4 days ago) Jun 26
      to Bruce Morton, dev-secur...@mozilla.org
      I apologize for the length, but I didn't find places where I could remove things that were not IMO material to Mozilla's evaluation of Entrust as a root, or constructive advice that I genuinely wish Entrust to incorporate into their plans.

      Thoughts on Entrust’s revised report

      I'm not going to quote and cite individual missteps extensively, but if I'm misrepresenting something I'm happy to dig up the things that I used as inputs, and make corrections. Instead, I'm trying to summarize some themes that continue to concern me even after Entrust's updated report and responses.


      In summary, while I think Entrust was wise to reuse the Navex guide to Compliance Program assessment, I don't feel they have done so in a way that sufficiently allays concerns about their willingness and indeed ability to operate within the BRs and MRSP, consistently and in good faith. Looking critically at how Entrust has described its actions and decisions shows, in my opinion, not only a company that has been unable to meet the expectations to which CAs must be held, but indeed a company that either doesn't understand or doesn't accept what those expectations are.

      Apportionment of Risk and Responsibility

      Entrust continues to be mistaken about their responsibilities in the WebPKI and MRSP. They say that the subscriber cannot be held responsible for operational limitations on certificate deployment. On the contrary, only the subscriber can be responsible for their own operations, within the CA/root-program/Subscriber/relying-party dynamic of WebPKI. Only the subscriber can ultimately make changes to their own operations, and they will bear the cost. Entrust as a security vendor can take on some responsibility for enabling better Subscriber operations, but their degree of success with that should not bear positively or negatively on the evaluation of Entrust as a CA.


      They also describe being in "tension" between WebPKI and critical systems but omit that this tension is entirely of their own continuous creation, versus reducing, since 2020, the cases in which they encourage or permit subscribers to use web PKI certificates in circumstances incompatible with the operational properties of the web PKI.


      Entrust's action items for delayed revocation often have elements of encouraging subscribers to adopt automation. I think that is a fine thing for them to do, but IMO it should not be considered to be a remediation for a delayed revocation incident, because it's not up to subscribers to decide on delayed revocation. To treat it as remediation is to say that subscriber choices to invest (or not) in improvements to their certificate management operations are allowed to prevent a CA from upholding the requirements. The message needs to be "you might want to improve your operations so you don't have an outage the next time we need to revoke one of your certificates, which we will do on time", not "please improve your operations so that we are able to revoke the certificate we chose to issue you".


      The Improvement Plan says that they will "work with our subscribers to ensure awareness and minimize delayed revocation requests". This language echoes the 2020 commitments, but does not give any meaningful description of a future state they’re seeking. I don't care how many delayed revocation requests they get. That's a function of whether they choose to issue and reissue Web PKI certificates into environments that they know "cannot" actually tolerate immediate revocation in spite of the subscriber's legal commitment to the contrary. Entrust can't solve this problem by reducing the number of times they get asked for exceptions. They have to ensure that they have appropriately narrow exception criteria regardless of how many requests they get. Are we to believe that if Entrust gets more requests in the future, they will permit more exceptions? It's hard to understand why that is relevant to their improvement except that it might cause fewer customers to be disappointed or frustrated; that would be an improvement to Entrust's experience as a vendor, but not material to how Entrust or its Subscribers interact with the web PKI.


      When Entrust decides that it must delay revocation due to intolerable impact on the web ecosystem (paging the Definitions & Glossary WG), proper remediation actions would include an enforced timeline for the subscriber being able to tolerate prompt revocation, possibly by being moved off of web PKI certificates. After that timeline, delayed revocation should no longer be permitted; if Entrust doesn't want to be in that situation, then they should not re-issue to that subscriber.


      As recently as June 21, presumably with the knowledge that Entrust was admitting that its practices had been deficient in the updated report, a representative of Entrust was still defending their previous decisions to delay revocation. "If the strict enforcement of rules begins to take priority over the facilitation of safe and smooth internet transactions, it brings discredit on the entire ecosystem. Entrust has been trying to avoid that, by showing a nuanced understanding of the complex issues faced by subscribers." I disagree with most of that, but that hardly matters. It's a sign that Entrust has not actually accepted that their behaviour was inappropriate.


      Notably, there is no commitment from Entrust to avoid future issuance to systems that will not tolerate prompt revocation. They have stated, "for the record", their position that any CA should be free to issue certificates to subscribers who they know will not be able to tolerate the BRs being upheld, in terms of prompt revocation. This would give CAs license to knowingly create situations where they will need "exceptional" delays to revocation. That must be held to be incompatible with the CA then delaying revocation due to the situation they helped create, or else we find ourselves in a situation that risks rendering meaningless the BRs' timelines and subscriber commitments. (It would also put conformant, good-faith CAs at a commercial disadvantage relative to those who give more weight to customer convenience. While I do not think that financial considerations for CAs should hold much weight in policy making here, incentives are real and we should be careful about creating incentives for undesirable behaviour.)


      Entrust claims that they "generally recommend that subscribers have a backup CA", but I can't find that recommendation anywhere in Entrust's public materials. And again, Entrust opposes making this an industry standard, though IMO it would be a valuable partial mitigation against misuse of web PKI certificates in contexts which cannot tolerate web PKI revocation rules and timelines.


      The "Policy Updates" section of the improvement plan says that they are "considering ways to increase visibility of the CA’s right to revoke certificates on short notice beyond our contract language", but to be frank their Subscriber Agreement itself really does not make that clear. Customers would have to read the BRs directly to get that information. (I consulted https://www.entrust.com/sites/default/files/documentation/licensingandagreements/sept-2020-entrust-site-launch/ecs-subscription-agreement-sep-10-2020.pdf.)

      Insufficient Detail Regarding Leadership and Process Changes

      Entrust has told us that they've made many organizational changes, including of "leadership" and "staffing". Unfortunately, they have not really detailed what was wrong with the old leadership and staffing, and how the new leadership and staffing plans will improve on them, in spite of very specific requests for that information from the community. This is a basic and essential element of leading effective change in any context. They have described operational error, but haven't explained why a leadership change this time will be more successful than the promises made in 2020. I recognize that talking about errors by company leaders can be awkward, but if Entrust doesn't want to tell us why we should believe that the new leadership is better, then they probably shouldn't expect us to count the change in their favour. (Entrust has stated that this context for the leadership changes was in the June 7 report, and seemingly on that basis refused to restate it when asked, but if so it is very well-hidden.) While I have certainly expressed my frustration with Entrust very directly at times, I would personally be happy to help with such an assessment in a blameless, learning-oriented process.


      On the topic of leadership, I want to emphasize that I am not concerned here with individual performance, be it by individuals or leaders, and I do not feel that the Mozilla community should be either. A problem of this magnitude and duration, and reluctance to address it in a deep and transparent way, must be a systemic problem and not an individual one. Entrust has created a system that in turn created forces that directed people to act in certain ways. When someone is agreeing to a delayed revocation without writing down the reason, they are doing that for one of two reasons: either they are making an error out of misunderstanding or incompetence (rare, very easy to resolve), or they are doing what they believe Entrust wants them to do (almost universally the case, difficult to remedy). Incentives are very powerful, and if Entrust's employees believe that Entrust wants them to put customer convenience ahead of the integrity of the web PKI, then those employees will do so. If a member of CA business unit staff was not escalating to senior leadership their difficulties with staffing, or investment in linting, or seeing customers successfully adapt their operations to be tolerant of prompt revocation, that's because they believed that practically Entrust did not want them to escalate that way. It's very common for organizations to loudly say "we want to uphold some value", while communicating in a much stronger way that they wish to compromise on that value. That stronger communication to the individuals who ultimately implement the company's operations comes through organizational actions and even explicit policies (such as related to performance evaluation or budgeting). A change in leadership is only effective to the extent that it changes the context in which individual humans and their tools will make decisions and take action.
      There has been disappointingly little evidence that Entrust actually understands what led (at a truly "root cause" level) to the sustained, unacceptable behaviour.


      Similarly, they promise to "review procedures" and "synchronize with compliance team" but we haven't established that they even know what the procedures should be. Surely, these procedures have been reviewed before, and that has not remedied the situation. In a recent bug, Entrust said that while they were going to revoke the certificates, they didn't agree with the analysis of the root programs. How will they be analyzing similar situations in the future, given that disagreement?


      The updated report says that "Entrust has the technical capability to meet the 24-hour and 5-day revocation requirements", but does not describe what they think is required of that capability, and importantly does not describe non-technical capability that might be required to perform such revocations (such as executive willingness to lose a customer). Even very recently we have seen confusion in Entrust's determination of certificates affected by an incident, and mismatched lists provided.


      If I were in their situation and wished to demonstrate a better alignment with root program and community expectations, I would do this: go back over the cpsURI cert list and say how each would be decided today. I might have to reach out to customers to get more information about how the certificates are used and what the operational context is, but it would let me demonstrate the concrete differences in decision-making that these changes cause. This would also give me a basis to work from for informing subscribers who were granted delayed revocation this time that they would not be in the future. IMO, Entrust doing this would be a valuable contribution to the CA community's understanding of what "exceptional" cases are, and possibly help move towards helpful clarifications of the wording.


      Entrust has said that between 2020 and 2024, they believed that their "existing policies and procedures would be sufficient to support [their] commitments". This is an extremely worrying perspective to take, because it means that the people who were operating the relevant systems thought that everything was OK for four years; as we know from cpsUri and friends, there was no material improvement in handling revocation delay requests. Materially, according to Entrust's explanation, over those four years on the heels of a major incident about excessive delays in revocation, Entrust never wrote down a policy on how to adjudicate delay requests. In 2020, Entrust had been a public CA for twenty years. They had participated in CABF since its inception. They were told by root programs, and agreed, that they were making incorrect decisions about delays, and committed to doing better, but never wrote a policy. I'm honestly disappointed that auditors don't require that policy to be concrete and published, but even beyond that: imagine operating a security and compliance company for twenty years and then getting rinsed because of a failure to decide something correctly, and after that still not writing down how to make the decisions in the future, or not capturing customer communication related to something as critical as revocation and certificate correctness. It's very hard not to interpret that as Entrust being flawed in a fundamental way that needs more than moving the org around.


      I would honestly rather that Entrust was lying about not having a policy, and instead choosing not to disclose it because of content that would not reflect well on them. Similarly, I would prefer to believe that they simply chose to be untruthful about the customer justifications for delays not being captured (and in fact we know that some were, because they were later provided as cherry-picked examples). The alternative is to believe some really unfortunate things about a CA whose certificates, by their own account, are used in very important systems.

      Relationship to Mozilla Root Program

      A concern that has been raised repeatedly and ignored by Entrust is their different behaviour with respect to the Mozilla and Chrome root programs. We have seen them act promptly upon Chrome's team entering a discussion, when that team is saying the same thing that Mozilla has been saying previously. This continues to be unaddressed in the updated report, even though it was called out explicitly in response to the initial report, and repeatedly in incident bugs.


      In spite of frequent reminders of the Mozilla incident response requirements, they continued to not meet those requirements even after recommitting on June 1, 2024. They have not provided the required per-subscriber detail: first claiming that it wasn’t captured, then sharing some unattributed examples and hiding behind confidentiality that they control. AFAIK they have not gone back to subscribers to get that information and capture it properly. There has been no elaboration on expected harm, just vague industry categorizations and claims of subscriber slowness. This refusal to cooperate fulsomely with the Mozilla incident response policy makes oversight by the root program and community more difficult, because it becomes impossible to identify cases in which Entrust's analysis of the risk tradeoff is out of alignment with that of the community. (To be sure, either side or both might be "wrong" here! But we have to see the gap in order to even start figuring that out.)


      There are a lot of CAs and very few full-time staff allocated to Mozilla's program, which makes it very important that the community be able to assist in that analysis. Even when I was overseeing the program, we needed community assistance, and that was with many fewer CAs, with Chrome's direct assistance with the shared program, and to be frank without an actor behaving in as challenging a manner as Entrust has been. This lack of transparency also makes it impossible for other CAs to learn from or teach Entrust about how to better handle similar situations in the future.


      Similarly, they say that the redaction of a portion of an email to customers (said portion detailing revocation procedures), was "absolutely not" them trying to "conceal" anything about revocation. But of course they did literally conceal things, and they have not responded when asked to explain what their other motivation for redaction might be.


      In short I do not believe that Entrust has operated in good faith through these incidents, with respect to Mozilla and its policies. If Entrust's behaviour became the norm for trusted certificate authorities, I think it would be disastrous for the program and the web.

      Conclusion

      I genuinely worry that it would be irresponsible to trust future Entrust-issued certificates without concrete evidence that improvements have been made, and not just planned. They have not even detailed the plans or success criteria for these initiatives, generally, and there is no way for root programs or the community to evaluate whether the efforts are even pointed in the right direction without much more detail.


      If a new CA using a cross-signed root had Entrust's track record and applied to add their own root to the Mozilla root store today, I believe very strongly that we would first require them to demonstrate improved competence and alignment. Entrust has no more inherent right to operate a CA business than such a newcomer, and I think that we should be very careful not to overweight the duration of Entrust's inclusion as a root, or the size of its business, as justifying even more patience or risk tolerance. To permit Entrust to continue to issue certificates as it has been to date would be to subject the web PKI and Mozilla's users to substantial risk.


      Mike


      Tyrel

      Jun 27, 2024, 11:54:51 AM
      to dev-secur...@mozilla.org
      While this revised report and letter from leadership is a substantial improvement compared to what was submitted on June 7th, I share a number of the concerns raised by others here:

      1) Swaroop's letter contains the line "As a global CA we must walk a tightrope in balancing the requirements of the root programs and subscriber needs, especially for critical infrastructure." There is no tightrope to walk here. As a custodian of public trust, a CA must transparently and strictly enforce the rules under which they operate. Subscribers will then take appropriate actions to reach an acceptable level of risk for their operations, which, for critical infrastructure subscribers, could mean anything from having hot spare certificates from an alternate CA ready to go, to migration to a private PKI. That nearly the highest level of Entrust's leadership still does not recognize this has me worried that this is lipstick on a pig rather than actual meaningful change.

      2) Though very pretty in language, these documents are shockingly light on quantitative, externally measurable metrics by which to define improvement. I would have at a minimum expected Entrust to commit to almost never delay revocation again (e.g. "we commit to delaying revocation of no more than one certificate per thousand in future incidents"), to commit to not knowingly issue certificates to subscribers for whom revocation in 24 hr / 5 days would cause significant harm, and to commit to other actions that help meaningfully move the webPKI ecosystem towards agility and resilience (such as issuing certificates with lifetimes of no more than 1/2 year starting "now", and 1/4 year starting January 2025). Even the commitments to automation and ARI are shallow, missing, e.g., a commitment to not charge customers more for using them, and missing quantitative targets for the fraction of subscribers to be on fully automated certificate pipelines by particular dates. Instead this report commits to almost nothing that is publicly measurable, and comments from Entrust representatives on various open bugs, even after June 21st, make clear that they are unwilling to commit to a quantitative limit on revocation delays, and all but state they will continue to issue certs to subscribers who cannot tolerate a 24 hr / 5 day revocation timeline. I think that the lack of concrete metrics is a serious deficiency.

      3) It is impossible to tell given the shifting language, but it seems to me that, as of this report, Entrust still disagrees with https://bugzilla.mozilla.org/show_bug.cgi?id=1890898 and https://bugzilla.mozilla.org/show_bug.cgi?id=1890685 being misissuance events. This I think also speaks volumes about the proposed changes in operation not being changes that meaningfully alter the outcomes of how Entrust behaves as a CA.
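
      To illustrate the kind of externally measurable metric that point 2 asks for, here is a minimal sketch of how a delayed-revocation rate could be computed and checked against a ceiling. It is only an illustration under stated assumptions: the record layout (`Revocation`), the function names, and the idea of measuring from a "detected" timestamp are hypothetical, not anything Entrust or a root program has published. The 24-hour and 5-day windows and the one-per-thousand ceiling are the figures mentioned in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Revocation windows referred to in this thread (24 hours / 5 days).
DEADLINE_24H = timedelta(hours=24)
DEADLINE_5D = timedelta(days=5)


@dataclass
class Revocation:
    """Hypothetical per-certificate record, for illustration only."""
    serial: str
    detected_at: datetime   # when the CA became aware of the problem (assumed start of the clock)
    revoked_at: datetime    # when the certificate was actually revoked
    deadline: timedelta     # which window applies to this certificate

    def is_delayed(self) -> bool:
        # A revocation is "delayed" if it happened after the applicable window.
        return self.revoked_at - self.detected_at > self.deadline


def delayed_fraction(records):
    """Fraction of affected certificates whose revocation missed its deadline."""
    if not records:
        return 0.0
    return sum(r.is_delayed() for r in records) / len(records)


def meets_commitment(records, ceiling=1 / 1000):
    """True if the delayed fraction stays at or below the committed ceiling."""
    return delayed_fraction(records) <= ceiling
```

      A number like this could be reported per incident, which is what would make a commitment checkable from the outside rather than a matter of narrative.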

      Based on these observations, and those of others on this thread, I strongly encourage Mozilla and other root store operators to distrust Entrust as soon as practicable, e.g. by immediately distrusting all Entrust certificates with a not-before of July 1, 2024 or later, and rapidly phasing out the roots. If Entrust believes they have made meaningful changes that make them again deserving of public trust, they can reapply for inclusion at a later date.
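
      For anyone unfamiliar with how a not-before cutoff works in practice, a rough sketch follows. It assumes the cutoff is midnight UTC on the example date above and that dates arrive in the form OpenSSL prints in its text output (e.g. "Jul  1 00:00:00 2024 GMT"); the function names are hypothetical, and real root stores implement this kind of constraint in their own ways.

```python
from datetime import datetime, timezone

# Example cutoff from the paragraph above; midnight UTC is an assumption.
CUTOFF = datetime(2024, 7, 1, tzinfo=timezone.utc)


def parse_openssl_date(text):
    """Parse a date such as 'Jun 30 23:59:59 2024 GMT' into an aware datetime."""
    parsed = datetime.strptime(text, "%b %d %H:%M:%S %Y GMT")
    return parsed.replace(tzinfo=timezone.utc)


def issued_before_cutoff(not_before, cutoff=CUTOFF):
    """True if the certificate's notBefore falls strictly before the cutoff.

    Under a cutoff-based distrust, certificates for which this returns False
    would no longer be accepted, while older ones keep working until expiry.
    """
    return not_before < cutoff
```

      For example, a certificate with notBefore "Jun 30 23:59:59 2024 GMT" would still be accepted under this sketch, while one with "Jul  1 00:00:00 2024 GMT" would not.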

      Tyrel

      PS: For the Entrust business leadership: this is not the end of the world in providing certificates for your customers. Nothing prevents you from becoming a reseller (even a branded reseller) of certificates from another public CA.

      Kurt Seifried

      Jun 27, 2024, 4:31:26 PM
      to dev-secur...@mozilla.org
      Also, do we know what is happening with their VMC root cert? CN = Entrust Verified Mark Root Certification Authority - VMCR1 is used for Verified Mark Certificates (aka BIMI logos) and is currently supported in Gmail. Do we know if Gmail will be removing support for Entrust-based VMC certificates, and thus BIMI logos done via Entrust? Seeing as your choices for buying a BIMI/VMC cert are Entrust (or a reseller) and Digicert, the removal of trust in CN = Entrust Verified Mark Root Certification Authority - VMCR1 will basically break most BIMI logos in any email platform that supports BIMI and decides to remove Entrust.

      Example:

      $ while openssl x509 -noout -text; do :; done < certchain.pem

      And for additional context on who uses Entrust: https://bimiradar.com/glob#logos

      --
      Kurt Seifried (He/Him)
      ku...@seifried.org

      Mike Shaver

      Jun 27, 2024, 4:47:06 PM
      to Kurt Seifried, dev-secur...@mozilla.org
      AFAIK, BIMI certs are not related to the browser root stores in any way, and aren’t signed by server certificate roots.

      Mike


      Kurt Seifried

      Jun 27, 2024, 4:51:57 PM
      to Mike Shaver, dev-secur...@mozilla.org
      We've never had a situation like this, partly due to the fact there are only two VMC sellers, Entrust and Digicert (as I understand it everyone else selling these is a reseller). But I can't see why the issues at Entrust would be restricted to their web cert business and not the VMC business (which are virtually identical products/processes). And thus I can't imagine why the rest of Google wouldn't remove their trust in Entrust as well.

      Mike Shaver

      Jun 27, 2024, 5:04:03 PM
      to Kurt Seifried, dev-secur...@mozilla.org
      I don't know what the calculus will be for Google's trust of Entrust-issued BIMI certificates, but I am pretty sure that they won't be announcing that policy on MDSP—you could ask in a Google forum of some kind, but I think most likely you just have to wait for the announcement if/when it comes.

      (I personally think Entrust will not keep the BIMI business around by itself even if the root somehow stays trusted, but it's possible they were completely compliant with all BIMI-related requirements!)

      Mike
