Hi folks!
A core value for the Chrome CT team is transparency (not just in certificates), so we wanted to take an opportunity to share how we are thinking about possible future changes to CT, and encourage community discussion and feedback.
We're quite grateful to today's CT log operators, but we're strongly interested in efforts to make it easier and cheaper for log operators to run durable logs. This is both to make the lives of our existing operators easier, and also to encourage new operators. We're also interested in efforts that increase the transparency guarantees of CT, increase certificate submitters' confidence across log operators to avoid single points of failure, and better align policies with what's needed for a robust ecosystem. In the short term, we're hoping that two efforts will advance these goals.
We're excited about efforts to build out CT logs that have better caching characteristics and generally reduced expected costs. It's particularly encouraging that there are shaping up to be at least two independent implementations of the new static-ct-api spec.
Our hope is to be able to accept logs that implement the new spec into the ecosystem soon, and we've already taken steps to make that possible. Our compliance monitoring infrastructure has been updated to support static-ct-api logs alongside RFC6962 logs, and we're in the process of updating other critical infrastructure as well.
There are many details to be ironed out, but we are currently envisioning a plan that rolls out in the stages outlined below. This plan is us "thinking out loud" and we may change course to ensure broad compatibility with the rest of the ecosystem. For instance, it's important that it's straightforward to get a single set of SCTs that satisfies all CT-enforcing user agents, and our plans may shift as needed to ensure that this remains the case.
While static-ct-api logs are not trusted by Chrome today, we strongly encourage everyone in the CT ecosystem to experiment with these new log types, and as much as possible, use them alongside existing RFC6962 logs. Certificate submitters should submit certificates to static-ct-api logs. CAs are particularly encouraged to ensure that they can embed SCTs from static-ct-api logs alongside existing RFC6962 log SCTs.
We also encourage those interested to run their own static-ct-api log, and to share their log details on the ct-policy@ list so others can poke at it. We are also happy to monitor any static-ct-api log with our compliance monitoring infrastructure to help shake out any issues during this critical experimentation phase.
In this phase (Stage 1), Chrome will update our log list to provide a list of static-ct-api logs, and allow requests for inclusion of static-ct-api logs as trusted and usable by Chrome. Chrome will then be updated to accept TLS connections using a CT policy similar to today's, but permitting up to one of the required SCTs to come from a static-ct-api log.
This phase allows log operators and CAs excited about static-ct-api to begin using these logs in production. At the same time, monitors that only understand RFC6962 logs will still retain full visibility into the certificate ecosystem (so long as the RFC6962 logs remain healthy).
We expect our own tooling to be ready for this transition in the coming months. However, this phase accepts some decreased availability over RFC6962 to support static-ct-api, bringing static-ct-api into a load-bearing position in the ecosystem. As a result, before transitioning to this phase, we'd hope to see the static-ct-api spec formally migrate to a v1.x.y semantic versioning designation, and a few prominent third-party CT log consumers (monitors and auditors) add support for static-ct-api logs.
In this stage (Stage 2), Chrome's CT policy would no longer include the "at-most-one static-ct-api log" restriction from Stage 1, allowing certificates to validate in Chrome with SCTs entirely from static-ct-api logs, so long as all other requirements were met. Logging to exclusively RFC6962 logs, exclusively static-ct-api logs, or a mixture, would all be fully supported.
Before entering this phase, we'd want to see more static-ct-api support among prominent CT monitors and auditors, as this stage represents the first where unmodified existing (i.e. RFC6962-only) monitors lose visibility over certificates logged to CT. We are not looking to rush into this phase, but would hope that it would be about a year after entering Stage 1.
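To make the difference between today's policy, Stage 1, and Stage 2 concrete, here is a minimal Python sketch of how the static-ct-api restriction changes which SCTs count toward the required total. It is only an illustration of the rule as described in this email, not Chrome's implementation: the names `Phase`, `Sct`, `countable_scts`, and `satisfies_sct_count`, and the `required` parameter, are all hypothetical, and every other CT policy requirement is deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class LogKind(Enum):
    RFC6962 = "rfc6962"
    STATIC_CT_API = "static-ct-api"


class Phase(Enum):
    TODAY = 0    # static-ct-api logs are not trusted by Chrome
    STAGE_1 = 1  # at most one required SCT may come from a static-ct-api log
    STAGE_2 = 2  # no restriction on the mix of log types


@dataclass(frozen=True)
class Sct:
    log_id: str  # hypothetical identifier; assumed to name distinct logs
    kind: LogKind


def countable_scts(scts: list[Sct], phase: Phase) -> int:
    """Number of SCTs that may count toward the required total in a phase."""
    rfc = sum(1 for s in scts if s.kind is LogKind.RFC6962)
    static = sum(1 for s in scts if s.kind is LogKind.STATIC_CT_API)
    if phase is Phase.TODAY:
        return rfc
    if phase is Phase.STAGE_1:
        return rfc + min(static, 1)
    return rfc + static


def satisfies_sct_count(scts: list[Sct], required: int, phase: Phase) -> bool:
    """Check only the SCT-count part of the policy (an assumption of this sketch)."""
    return countable_scts(scts, phase) >= required


if __name__ == "__main__":
    mixed = [Sct("log-a", LogKind.RFC6962), Sct("log-b", LogKind.STATIC_CT_API)]
    static_only = [Sct("log-b", LogKind.STATIC_CT_API), Sct("log-c", LogKind.STATIC_CT_API)]
    for phase in Phase:
        print(phase.name,
              satisfies_sct_count(mixed, 2, phase),
              satisfies_sct_count(static_only, 2, phase))
```

In this sketch, a certificate with one RFC6962 SCT and one static-ct-api SCT only starts counting as sufficient in Stage 1, and a certificate with SCTs exclusively from static-ct-api logs only in Stage 2.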
If operators end up preferring RFC6962 logs, we may never mandate the use of static-ct-api logs. If a critical mass of log operators migrate to static-ct-api logs, we may investigate retiring support for RFC6962. No matter what, we do not expect to be in any rush to move beyond Stage 2, but will see how the ecosystem evolves.
In addition to the policy changes needed to support static-ct-api logs, we're also exploring other changes to better align Chrome's CT policy requirements with what the ecosystem needs, making it easier to run logs and strengthening CT's transparency guarantees.
The faster a misissued certificate can be detected, the faster it can be mitigated. Chrome's current CT policy effectively limits logs to MMDs (maximum merge delays) of at most 24 hours, and all current log operators provide logs that use the full 24 hours. These long MMDs hark back to a time when building the Merkle tree was considered expensive, and logs were expected to batch pending inclusions on some infrequent cadence.
Today, MMDs mostly serve to give log operators time to resolve availability issues before falling afoul of log policies. static-ct-api logs, in particular, provide an architecture that's well suited to be able to limit typical merge delays to a few seconds, rather than a full day. We're exploring whether we can significantly reduce the maximum MMD permitted by our program by acknowledging that limited MMD violations may not necessitate the distrust and removal of a log.
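As a rough illustration of what an MMD bounds, the following Python sketch compares the time between an SCT's timestamp and the moment the entry is first observed in the log against a configurable MMD. The function names and the example values (5 seconds, 1 minute) are assumptions made for illustration; they are not proposed policy numbers.

```python
from datetime import datetime, timedelta, timezone


def merge_delay(sct_timestamp: datetime, first_seen_in_log: datetime) -> timedelta:
    """Time between the SCT being issued and the entry being observed in the log."""
    return first_seen_in_log - sct_timestamp


def violates_mmd(
    sct_timestamp: datetime,
    first_seen_in_log: datetime,
    mmd: timedelta = timedelta(hours=24),  # the 24-hour limit mentioned above
) -> bool:
    """True if the entry took longer than the MMD to appear in the log."""
    return merge_delay(sct_timestamp, first_seen_in_log) > mmd


if __name__ == "__main__":
    issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    merged = issued + timedelta(seconds=5)  # hypothetical merge time
    print(violates_mmd(issued, merged))                            # against 24 hours
    print(violates_mmd(issued, merged, mmd=timedelta(minutes=1)))  # against a hypothetical shorter MMD
```

A log that typically merges within seconds would pass both checks; only an outage-length delay would exceed a shorter MMD, which is the kind of limited violation described above.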
CT today is a collectively managed distributed system. No single log operator is a single point of failure. We believe that the best way to increase robustness within the ecosystem is to increase the number of log operators. We expect log operators to ensure the integrity of their logs, and be diligent in fixing issues with their logs when issues arise, but requiring that each log is itself a large and complex distributed system ensures that log operation is out of reach for all but a very few operators capable of managing such systems.
We are exploring ways that it might be possible to separate requirements for the write and read paths to better reflect what the ecosystem actually needs from operators, while permitting safer forms of downtime for individual logs. We believe this may leave more room for performant, but simpler, log designs.
For instance, on the read path, high availability of log data is important. Reductions in log read availability go directly against the core objective of CT -- to enable transparency of certificate issuance. Even if a log is not accepting new submissions, it remains important that the log's data be available to monitors and other consumers. Our current availability standard of two nines (99%) lags significantly behind industry standards, and tiled logs may provide an opportunity to significantly increase our availability requirements.
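For a sense of scale, this small Python sketch converts a "number of nines" into a downtime budget. The one-year window (365 days) is an assumption chosen for illustration, not the window Chrome's compliance monitoring uses, and `downtime_budget_hours` is a hypothetical helper.

```python
def downtime_budget_hours(nines: int, window_hours: float = 365 * 24) -> float:
    """Hours of permitted downtime in the window for an availability of N nines.

    Two nines means 99% availability, three nines 99.9%, and so on.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines
    return unavailability * window_hours


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n} nines: {downtime_budget_hours(n):.2f} hours of downtime per year")
```

Under that assumed window, two nines allows roughly 87.6 hours of read unavailability per year, and each additional nine cuts the budget by a factor of ten.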
The write path has quite different properties. Certificate submitters should always be able to acquire a policy-conforming set of SCTs, which necessitates availability of multiple log operators but does not require availability from multiple logs within a single operator. Among operators with multiple logs, it's arguably sufficient for the ecosystem to ensure that there is always at least one log that can accept the submission of any certificate. We believe that an ecosystem with log operators that can ensure this property for the write path is as robust as an ecosystem with one highly-available log. This would make it substantially easier for log operators to handle expected or unexpected downtime safely, and without penalty. Currently, operating multiple logs comes at a significant storage penalty, and we are exploring approaches to reduce or eliminate that penalty.
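The intuition that several simpler logs can match one highly-available log on the write path can be sketched with a little arithmetic. The Python below assumes, purely for illustration, that each of an operator's logs is independently able to accept submissions with the same probability; real outages are often correlated, so treat the result as an upper bound rather than a prediction. The function name is hypothetical.

```python
def at_least_one_writable(per_log_availability: float, log_count: int) -> float:
    """Probability that at least one of `log_count` logs accepts a submission.

    Assumes independent failures, which is a simplification made for this sketch.
    """
    if not 0.0 <= per_log_availability <= 1.0:
        raise ValueError("per_log_availability must be between 0 and 1")
    if log_count < 1:
        raise ValueError("log_count must be at least 1")
    all_down = (1.0 - per_log_availability) ** log_count
    return 1.0 - all_down


if __name__ == "__main__":
    for count in (1, 2, 3):
        combined = at_least_one_writable(0.99, count)
        print(f"{count} log(s) at 99% each: {combined:.6f} combined write availability")
```

Under the independence assumption, two logs that are each writable 99% of the time give about 99.99% combined write availability, which is why taking one log down for maintenance need not harm submitters as long as another remains writable.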
Think this whole thing is silly? Of course conversation here is welcome, but you can also talk to us about it in person!
We're excited about and committed to CT's future, but there are still lots of uncertainties in the topics above and well beyond. Future challenges like supporting post-quantum cryptography will require significant changes to CT.
In years past, we've organized CT Days to bring the community together to brainstorm ideas, discuss upcoming changes, and share perspectives from different members of the ecosystem. This year, we've eschewed having a discrete CT Days event and instead will be participating in the Transparency.dev Summit. The summit will have dedicated time for talks and discussions about CT in particular, not just transparency systems in general. While this event is being planned by our friends at TrustFabric, Chrome will be there, and we encourage everyone in the CT ecosystem (log operators, monitors, auditors, interested CAs) to participate!
As I mentioned at the beginning of this email, we're just thinking out loud here. Do you have thoughts? Let's chat about it!
- Joe, on behalf of the Chrome CT team