
Sunlight v0.4.0, including a Static CT API client


Filippo Valsorda

May 16, 2025, 12:16:25 PM
to Certificate Transparency Policy
Hello!

I have just tagged Sunlight v0.4.0.

It includes the local POSIX filesystem backend and Skylight read-path server that power the Tuscolo CT log.

Sunlight logs now also expose a /log.v3.json endpoint from both the monitoring and the submission prefixes, as previously discussed. It's a per-log endpoint, as it was not clear where to host a per-operator one if an operator runs multiple Sunlight instances. If this looks good, I can also propose it as part of a future Static CT API v1.1.0.
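For readers who haven't seen one, here is a rough, illustrative sketch of the kind of metadata blob under discussion. The field names are my assumption, modeled on the log entries in the Chrome log list v3 schema; the actual /log.v3.json served by a log is authoritative and may differ.

```json
{
  "description": "Example Org 'example2026h1' log",
  "log_id": "BASE64-LOG-ID-PLACEHOLDER=",
  "key": "BASE64-SPKI-PLACEHOLDER=",
  "submission_url": "https://example2026h1.ct.example.org/",
  "monitoring_url": "https://example2026h1.ct.example.org/",
  "mmd": 60,
  "temporal_interval": {
    "start_inclusive": "2026-01-01T00:00:00Z",
    "end_exclusive": "2026-07-01T00:00:00Z"
  }
}
```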

Finally, the filippo.io/sunlight package now includes a Static CT API client. It exposes a simple Go iterator that produces (index, LogEntry) pairs, starting from arbitrary indexes and automatically validating entries against the provided checkpoint. It supports Retry-After, context, and timeouts. To avoid fetching redundant partial tiles, it stops iterating at the end of the last full tile, unless asked to start from there. (The idea is that clients that fetch new checkpoints in a loop will naturally only fetch full tiles if they fill fast enough, and automatically fetch the partial tile if a checkpoint doesn't progress the full tiles.)
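To make the shape of that loop concrete, here is a minimal sketch of how such an iterator might be consumed in a checkpoint-polling loop. The client interface and entry type below are illustrative stand-ins defined just for this example, not the actual filippo.io/sunlight API (the real one is documented at https://pkg.go.dev/filippo.io/sunlight).

```go
package tailer

import (
	"context"
	"fmt"
	"iter"
	"time"
)

// LogEntry and entryClient are illustrative stand-ins for this sketch, not
// the real filippo.io/sunlight types.
type LogEntry struct {
	LeafIndex int64
	Cert      []byte
}

type entryClient interface {
	// Checkpoint fetches the latest signed checkpoint (to be verified by the
	// caller before use).
	Checkpoint(ctx context.Context) ([]byte, error)
	// Entries yields (index, entry) pairs starting at start, validating each
	// entry against the provided checkpoint and stopping at the end of the
	// last full tile.
	Entries(ctx context.Context, checkpoint []byte, start int64) iter.Seq2[int64, *LogEntry]
}

// tail polls for new checkpoints and consumes entries as they appear.
func tail(ctx context.Context, c entryClient, start int64) error {
	next := start
	for {
		ckpt, err := c.Checkpoint(ctx)
		if err != nil {
			return err
		}
		for idx, e := range c.Entries(ctx, ckpt, next) {
			fmt.Printf("entry %d: %d bytes\n", idx, len(e.Cert))
			next = idx + 1
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(10 * time.Second): // arbitrary poll interval
		}
	}
}
```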

The client is a high-level wrapper around the Client in filippo.io/torchwood, a new (well, renamed) collection of tlog tools.

Alla prossima,
Filippo

Andrew Ayer

May 22, 2025, 10:32:28 AM
to Filippo Valsorda, Certificate Transparency Policy
On Fri, 16 May 2025 18:15:00 +0200
"Filippo Valsorda" <fil...@ml.filippo.io> wrote:

> Sunlight logs now also expose a /log.v3.json endpoint
> <https://tuscolo2025h2.sunlight.geomys.org/log.v3.json> from both the
> monitoring and the submission prefixes, as previously discussed
> <https://groups.google.com/a/chromium.org/g/ct-policy/c/936lR3MEUDU>.
> It's a per-log endpoint, as it was not clear where to host the
> per-operator one, if an operator has multiple Sunlight instances. If
> this looks good, I can also propose it as part of a future Static CT
> API v1.1.0.

Thanks to the log.v3.json endpoint, this was the easiest log ever to configure in my monitor.

I think this should be added to the spec. I also think log policy should require the log application to contain this JSON file, since the current unstructured application format seems to be quite prone to human error.

Regards,
Andrew

Joe DeBlasio

May 22, 2025, 12:32:58 PM
to Andrew Ayer, Filippo Valsorda, Certificate Transparency Policy
We agree that the json adds a ton of value (and I even thought so before I made a ton of typos while manually adding the Tuscolo logs 🙃), and requiring structured log metadata is at the top of our policy TODO list for Chrome. I am hopeful we'll send more specific guidance next week. Providing a log's json blob in the log inclusion bug (even if just manually pasted in) will be the first step, and since that's straightforward, will be required more or less immediately.

We'd also like log operators to have a single unified location where they publish a list of their logs. That could enable easier/automatable log discovery, which we expect to be useful as certificate lifetimes continue to decrease. I had originally planned on asking log operators to publish a single endpoint serving a version of the "operator" entry used in our log lists (augmented to indicate desired UA inclusion status, instead of UA-specific inclusion state), though if folks think there's strong value in each log serving its own details, that operator endpoint could just point out to the logs for log-specific data. I'd be interested in what folks think.

Best,
Joe


Andrew Ayer

May 22, 2025, 4:40:31 PM
to Joe DeBlasio, Certificate Transparency Policy
On Thu, 22 May 2025 09:32:38 -0700
Joe DeBlasio <jdeb...@chromium.org> wrote:

> We agree that the json adds a ton of value (and I even thought so
> before I made a ton of typos while manually adding the Tuscolo logs),
> and requiring structured log metadata is at the top of our
> policy TODO list for Chrome. I am hopeful we'll send more specific
> guidance next week. Providing a log's json blob in the log inclusion
> bug (even if just manually pasted in) will be the first step, and
> since that's straightforward, will be required more or less
> immediately.

Excellent!

> We'd also like log operators to have a single unified location where
> they publish a list of their logs. That could enable
> easier/automatable log discovery, which we expect to be useful as
> certificate lifetimes continue to decrease. I had originally planned
> on asking log operators to publish a single endpoint serving a
> version of the "operator" entry used in our log lists (augmented to
> indicate desired UA inclusion status, instead of UA-specific
> inclusion state), though if folks think there's strong value in each
> log serving its own details, that operator endpoint could just point
> out to the logs for log-specific data. I'd be interested in what
> folks think.

I think there's huge value in logs serving their own metadata, to ensure that it accurately reflects the log's configuration. Per-operator lists would probably be manually managed and prone to mistakes.

As a monitor operator, I'm actually more excited by per-log metadata endpoints than per-operator lists. We already have the UA-published log lists for finding and automatically configuring logs that absolutely must be monitored. Monitoring additional logs is discretionary and I manually evaluate additional logs on a case-by-case basis before deciding to monitor them; the pain comes not from finding the logs, but from configuring them with the correct details.

Regards,
Andrew

Joe DeBlasio

May 29, 2025, 1:21:49 PM
to Andrew Ayer, Certificate Transparency Policy
Chrome's CT log inclusion application process has been updated to require a JSON object with log metadata, either pasted directly into the bug or available via URL (like the Geomys logs provide). This is a small and tactical step into this space -- we'd be supportive of static-ct-api requiring a log metadata endpoint, and we may consider additional changes in the future to streamline our log inclusion processes.

(In particular, we may still investigate requiring per-operator lists in the future, even if those lists amount to little more than a list of log URLs. This could eventually facilitate automation of parts of the log inclusion process, which would save Chrome folks a bunch of time, ensure more timely turnaround times for applicants, and may be nice for some CT-relevant designs for a PQ WebPKI that Chrome is investigating.)

Joe

Philippe Boneff

Jun 18, 2025, 10:10:34 AM
to Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
> I think there's huge value in logs serving their own metadata, to ensure that it accurately reflects the log's configuration. Per-operator lists would probably be manually managed and prone to mistakes.
+1 for logs having their own metadata, if they can. static-ct-api submission servers might not be able to provide the monitoring endpoint, and even for RFC6962 logs, serving it from the root of the host is a bit of a stretch. True that a per-operator list would have to be managed, but it could be done with a small script that itself curls some of the data from the per-log endpoint to avoid copy-paste errors.

A good end goal, I think, would be a fully automated lifecycle process that works well for operators, enforcing User Agents, and monitors. Having a per-log endpoint is a good stepping stone, though it does not allow everything to be fully automated, as a log operator still has to paste the log's URL somewhere to announce their new log. Today, that would be in crbug and in an email to certificate-tran...@group.apple.com. I don't know for sure how other CT-enforcing User Agents and monitors configure their services; I'd assume that they either follow crbug and/or Chrome's log list.

If these per-log URLs, or the full metadata objects representing an operator's logs, showed up in a single per-operator endpoint, would Chrome, Apple, and monitors consume it directly to start monitoring logs and kick off their inclusion process automatically? The only downside I can see is that it would remove human acks of inclusion requests, which is always reassuring... but then automation is precisely the end goal here. Would other log operators be comfortable with this?

Cheers,
Philippe


Jeremy Rowley

Jun 18, 2025, 10:19:35 AM
to Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
I like it! I assume any information provided on an incident report or inclusion request would be descriptive at that point? I.e., if there is a conflict between the metadata and anything else provided, the metadata would control?

Pierre Barre

Jun 18, 2025, 10:26:23 AM
to Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
It'd be great if the endpoint were standardized before there's more adoption. On mine I've settled on https://compact-log.ct.merklemap.com/inclusion_request.json before I saw the log.v3.json endpoint served by Sunlight.
It would also probably make more sense to have this as /ct/v1/{something} instead of at the root, but I'm unsure how much work it would be to amend the RFC.

Best,
Pierre

Matthew McPherrin

Jun 18, 2025, 11:54:39 AM
to Pierre Barre, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
> I'm unsure how much work it would be to amend the RFC.

RFCs are not amendable, with the only allowance being for "Errata" attached to track errors.
Adding a new feature like this would require a new RFC to be published, if you wanted to go through IETF.
The TRANS working group which standardized CT and CTv2 has been concluded, so there's not an obvious place to do so either.

At this point, with static-ct being documented as a C2SP spec, I think it would be fine to add the "log metadata JSON" as another C2SP spec that both static-ct and RFC6962 logs implement.


Pierre Barre

Jun 18, 2025, 5:14:39 PM
to Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy

> At this point, with static-ct being documented as a C2SP spec, I think it would be fine to add the "log metadata JSON" as another C2SP spec that both static-ct and RFC6962 logs implement.

I have concerns about this approach.

A policy requiring "implement RFC6962 plus this C2SP spec" sets a bad precedent. We risk creating a patchwork of requirements spread across RFCs, GitHub repos, and various specs. The CT ecosystem already suffers from information being scattered across multiple sources - we shouldn't make this worse.

C2SP explicitly states: "C2SP decisions are not based on consensus. Instead, each spec is developed by its maintainers... Since C2SP produces specifications, not standards, technical disagreements can be ultimately resolved by forking."

This is the opposite of how interoperability standards and/or specifications should work. We need stability and broad input, not the ability to fork when we disagree.

Best,
Pierre

Winston de Greef

Jun 18, 2025, 5:41:04 PM
to Pierre Barre, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Hi Pierre,

If Google wants (or if we want Google) to require logs to host a metadata endpoint, there are practically two ways Google could do this:
1. They incorporate the requirements in the CT log policy directly.
2. They could write a separate document, call it a "standard", and add compliance with this standard as a requirement in the CT log policy.

As mentioned, RFC6962 cannot be modified.

I prefer option 2, because this makes it easy for other log program operators (ie Apple) to include the requirement for hosting a metadata endpoint, because they can just reference the standard in their policy.

On C2SP not being based on consensus, I do not feel like that is that important, because ultimately Google's CT log policy is not based on consensus, but on the whims of Google. This standard would effectively be an extension of Google's CT log policy, and is also directly based on Google's whims. (I.e. if a standard were reached via consensus but Google didn't like it, they would just choose not to include it as a requirement in their log policy, and presumably then it wouldn't get implemented by logs.)

Sincerely,
Winston de Greef


Pierre Barre

Jun 18, 2025, 5:58:31 PM
to Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy

Winston de Greef

Jun 18, 2025, 7:01:47 PM
to Pierre Barre, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Hi Pierre,

I misunderstood what you meant by updating RFC 6962. I thought you wanted to change the text of this RFC, but you mean publishing a new RFC for CT v1.1 that would add a metadata endpoint. I feel like there is definitely a place for CT v1.1, because it would be nice for the extra algorithms for validating in V2 to be in the same document as the endpoints that are actually implemented.

Such an update however would presumably want to fix multiple issues with RFC6962 in one go, and might take quite a while to develop, so I still believe that having a separate standard for the metadata endpoint (which might get rolled up into a future RFC update) is preferable.

Sincerely,
Winston de Greef

Pierre Barre

Jun 18, 2025, 7:07:14 PM
to Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
On Thu, Jun 19, 2025, at 00:11, Winston de Greef wrote:
> Hi Pierre,
>
> I misunderstood what you meant by updating RFC 6962. I thought you wanted to change the text of this RFC, but you mean publishing a new RFC for CT v1.1 that would add a metadata endpoint. I feel like there is definitely a place for CT v1.1, because it would be nice for the extra algorithms for validating in V2 to be in the same document as the endpoints that are actually implemented.
>
> Such an update however would presumably want to fix multiple issues with RFC6962 in one go, and might take quite a while to develop, so I still believe that having a separate standard for the metadata endpoint (which might get rolled up into a future RFC update) is preferable.

By the way, I've been curious about the limited adoption of CT v2; it seems to have gained little to no traction in practice.

Does anyone have insights into why the ecosystem has largely stuck with v1?

Best,
Pierre

Ben Cartwright-Cox

Jun 18, 2025, 7:10:06 PM
to Pierre Barre, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Pierre, the CT v2/Sunlight spec is quite new, and there is now at least one Sunlight log that is about to be included in Chrome as of today.

Pierre Barre

Jun 18, 2025, 7:23:30 PM
to Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Hi Ben,

I am specifically talking about CT v2 (https://www.rfc-editor.org/rfc/rfc9162.html), not static CT.

I realize this may be a controversial opinion given the current ecosystem's tendency to favor static CT, but I hope static logs won't ever replace classic CT.

The API is extremely difficult to use, conceptually complex, and challenging to verify (requiring each client to essentially engineer an entire CT log implementation), and it shifts the implementation burden entirely onto clients. More fundamentally, static CT has always felt unnecessary to me because an efficient classic CT implementation is entirely possible (one of the reasons why I wanted to work on CompactLog).

Best,
Pierre

Filippo Valsorda

Jun 18, 2025, 11:59:49 PM
to Pierre Barre, Ben Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
2025-06-19 01:23 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> The API is extremely difficult to use, conceptually complex, and challenging to verify (requiring each client to essentially engineer an entire CT log implementation), and it shifts the implementation burden entirely onto clients.

https://pkg.go.dev/filippo.io/sunlight#example-Client is an example of a fully verifying high-level Static CT client.

According to scc, the whole filippo.io/sunlight, filippo.io/torchwood, golang.org/x/mod/sumdb/tlog, and even golang.org/x/mod/sumdb/note packages combined are 2397 lines of code. This is a gross overestimation of the code required just for the client, or for Static CT in particular, but it helps show we are not hiding the complexity somewhere. For rough comparison, CompactLog (without counting SlateDB, which it is based on) is 9493 lines.

The rest is a subjective assessment, but "requiring each client to essentially engineer an entire CT log implementation" is inaccurate.

> More fundamentally, static CT has always felt unnecessary to me because an efficient classic CT implementation is entirely possible (one of the reasons why I wanted to work on CompactLog).

There are tradeoffs, otherwise one could argue CompactLog is unnecessary because an efficient Static CT client implementation is entirely possible.

The ecosystem has had long-standing issues with log operation cost, complexity, and scalability—and not with client complexity—so Static CT optimizes for the former.

Pierre Barre

Jun 19, 2025, 1:01:27 AM
to Filippo Valsorda, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Hi Filippo,

Looking at your sunlight client code, I think it actually reinforces my concerns about complexity distribution. While the LOC count appears modest, the client achieves simplicity primarily by delegating the hard parts: checkpoint verification is pushed to callers ("should have been verified by the caller"), and most cryptographic work happens in the torchwood dependency. This creates a thin veneer of simplicity while still requiring consumers to understand checkpoints, signatures, tile models, and verification flows.
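For concreteness, a rough sketch of the kind of checkpoint-verification step that ends up on the caller's plate, using the note signature flow from golang.org/x/mod/sumdb/note. One caveat: real Static CT checkpoints carry an RFC 6962-style ECDSA note signature, so in practice the verifier would come from a CT-aware helper rather than the plain key string shown here; this only illustrates the shape of the work.

```go
package ctsketch

import (
	"fmt"

	"golang.org/x/mod/sumdb/note"
)

// verifyCheckpoint opens a signed checkpoint note with a caller-supplied
// verifier key. The vkey is a placeholder in this sketch: Static CT logs sign
// checkpoints with an RFC 6962-style signature, so the verifier is normally
// constructed by a CT-specific helper, not from a raw note key string.
func verifyCheckpoint(checkpoint []byte, vkey string) (*note.Note, error) {
	verifier, err := note.NewVerifier(vkey)
	if err != nil {
		return nil, fmt.Errorf("bad verifier key: %w", err)
	}
	n, err := note.Open(checkpoint, note.VerifierList(verifier))
	if err != nil {
		return nil, fmt.Errorf("checkpoint signature did not verify: %w", err)
	}
	// n.Text contains the origin, tree size, and root hash lines, which the
	// caller still has to parse and pin before fetching tiles.
	return n, nil
}
```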

I think we're looking at complexity from different angles. While your example demonstrates that a working Static CT client can be implemented, the cognitive complexity for log consumers remains significantly higher compared to RFC 6962.

With RFC 6962, a consumer can:

- Make straightforward HTTP requests to well-defined endpoints
- Receive directly usable responses without additional processing
- Implement basic verification with minimal cryptographic knowledge

The Static API, even with your client library, requires consumers to:

- Understand tile-based data structures and their implications
- Implement tile fetching and assembly logic
- Handle the inherent complexity of reconstructing log state from static components

Your LOC comparison actually reinforces my point: while the server implementation becomes simpler (which may benefit operators), the aggregate complexity for the ecosystem increases because every consumer now needs more sophisticated client logic. RFC 6962 centralizes this complexity in the log operator, where it can be implemented once and shared by all consumers.

My concern isn't that Static CT is technically impossible to implement, but rather that this shift in complexity creates barriers for smaller log consumers and reduces the diversity of implementations in the ecosystem. The RFC 6962 ecosystem is already arguably slim in terms of independent implementations. With the additional complexity barriers of static APIs, I fear we'll see even less diversity, with most consumers simply defaulting to reference implementations rather than building their own clients.

Your sunlight client example actually illustrates this pattern - it's a thin wrapper that most developers will likely use as-is rather than understanding the underlying static model well enough to build alternatives. This creates a concerning dependency pattern where the ecosystem's health becomes tied to the maintenance and governance of a few reference libraries, ultimately undermining CT's decentralization goals by creating de facto centralization at the client library level.

Best,
Pierre

Filippo Valsorda

Jun 19, 2025, 3:24:33 AM
to Pierre Barre, Ben Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
2025-06-19 07:01 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> Hi Filippo,
>
> Looking at your sunlight client code, I think it actually reinforces my concerns about complexity distribution. While the LOC count appears modest, the client achieves simplicity primarily by delegating the hard parts: checkpoint verification is pushed to callers ("should have been verified by the caller"), and most cryptographic work happens in the torchwood dependency. This creates a thin veneer of simplicity while still requiring consumers to understand checkpoints, signatures, tile models, and verification flows.

No, the line count I provided includes all those dependencies: "the whole filippo.io/sunlight, filippo.io/torchwood, golang.org/x/mod/sumdb/tlog, and even golang.org/x/mod/sumdb/note packages combined are 2397 lines of code".

> I think we're looking at complexity from different angles. While your example demonstrates that a working Static CT client can be implemented, the cognitive complexity for log consumers remains significantly higher compared to RFC 6962.
>
> With RFC 6962, a consumer can:
>
> - Make straightforward HTTP requests to well-defined endpoints
> - Receive directly usable responses without additional processing
> - Implement basic verification with minimal cryptographic knowledge
>
> The Static API, even with your client library, requires consumers to:
>
> - Understand tile-based data structures and their implications
> - Implement tile fetching and assembly logic
> - Handle the inherent complexity of reconstructing log state from static components
>
> Your LOC comparison actually reinforces my point: while the server implementation becomes simpler (which may benefit operators), the aggregate complexity for the ecosystem increases because every consumer now needs more sophisticated client logic. RFC 6962 centralizes this complexity in the log operator, where it can be implemented once and shared by all consumers.

Again, relative tradeoffs, not absolutes: "Static CT centralizes complexity in the client library, where it can be implemented once and shared by all consumers" is the mirror-image argument.

> My concern isn't that Static CT is technically impossible to implement, but rather that this shift in complexity creates barriers for smaller log consumers and reduces the diversity of implementations in the ecosystem. The RFC 6962 ecosystem is already arguably slim in terms of independent implementations. With the additional complexity barriers of static APIs, I fear we'll see even less diversity, with most consumers simply defaulting to reference implementations rather than building their own clients.
>
> Your sunlight client example actually illustrates this pattern - it's a thin wrapper that most developers will likely use as-is rather than understanding the underlying static model well enough to build alternatives. This creates a concerning dependency pattern where the ecosystem's health becomes tied to the maintenance and governance of a few reference libraries, ultimately undermining CT's decentralization goals by creating de facto centralization at the client library level.

I am much less concerned about client library diversity than about log implementation diversity. A client can be reimplemented and replaced in a weekend with a tight feedback loop, a log takes at least 90 + 70 days to become usable, plus all the time it takes to productionize and debug, as you're finding out with CompactLog.

[ Aside: the more clients the better, though! A Rust client for example is something you could contribute, although you should check with the Cloudflare folks if they already have one as part of their Azul Static CT log implementation. ]

I keep trying to explain we are optimizing for different things. This is very common and totally natural! The productive conversation to have is then around what are the ecosystem's priorities and pain points, which should drive the tradeoffs. This is the conversation that happened last year and led to the adoption of Static CT (and to answer one of your questions upthread, didn't lead to the adoption of CT v2).

Pierre Barre

Jun 19, 2025, 5:15:57 AM
to Filippo Valsorda, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
The LOC argument feels like a red herring. We could write a "hello world" CT client in 10 lines that does nothing useful, or a 10,000-line client that's bulletproof and easy to use. What matters is the cognitive complexity required to use CT correctly, not how many characters the reference implementation contains.

You dismiss client implementation diversity by claiming clients can be "reimplemented in a weekend" - but this fundamentally misunderstands the problem. If implementing compliant, secure CT clients were truly trivial, we'd see much more diversity in the RFC 6962 ecosystem than we currently do. The fact that we don't suggests that even the "simpler" RFC 6962 protocol presents real barriers to implementation.

Moreover, I could equally argue that a basic RFC 6962 server could be implemented as a weekend project too - yet you emphasize the operational complexity that makes this impractical in production. This suggests we're applying different standards to client versus server complexity when evaluating the static CT tradeoffs.

The Rust client suggestion kind of proves my point - we're already talking about needing specialized implementations for each ecosystem rather than simple HTTP clients that any developer can write. Even ignoring cryptographic verification entirely, a basic RFC 6962 client just makes HTTP GET requests to well-defined endpoints and receives directly usable JSON responses. Static CT requires understanding tile structures, checkpoint formats, fetching multiple tiles, reconstructing log state from distributed components, and handling partial tiles where iteration may need to stop and resume across checkpoint boundaries.

I think you're underestimating the practical barriers to client implementation. "Weekend reimplementation" might be possible for experts like yourself, but most CT consumers aren't cryptography specialists. They need approachable APIs, not weekend homework assignments. The ecosystem health concerns I'm raising are about real-world adoption patterns, not theoretical implementation speed.

Static CT is essentially exposing CT log internals as the public API. The tile format and structure become immutable interface contracts that all implementations must support. If we ever want to optimize storage differently - say, using different batching strategies, storage layouts, or data structures - we're now stuck implementing translation layers to maintain the static CT interface. 

This is the opposite of good API design, which should abstract away implementation details to preserve future flexibility. RFC 6962 lets log operators innovate on storage and performance while maintaining a stable, simple interface. Static CT locks us into specific storage patterns as public contracts.

Static CT would also make zero Maximum Merge Delay architecturally impossible to achieve efficiently. The tile structure requires batching to fixed boundaries - you can't publish individual certificates immediately without creating excessive numbers of tiny tiles that defeat the caching benefits. This demonstrates how exposing storage patterns as immutable public contracts constrains what kinds of performance characteristics are even possible, regardless of implementation innovation.

The fundamental premise that RFC 6962 servers are inherently too complex operationally isn't necessarily true. If server-side complexity can be addressed through better implementation approaches, then shifting that complexity to clients seems like the wrong direction for the ecosystem.

Best,
Pierre

Bas Westerbaan

Jun 19, 2025, 5:43:51 AM
to Pierre Barre, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
On Thu, Jun 19, 2025 at 1:23 AM Pierre Barre <pierre...@barre.sh> wrote:
> Hi Ben,
>
> I am specifically talking about CT v2 (https://www.rfc-editor.org/rfc/rfc9162.html), not static CT.

CT v2 does not solve any pressing problems with the ecosystem: the cost of adoption outweighs the gains.

Static CT on the other hand does address acute problems of CT operators (making the read-API easily cacheable.)
 
> I realize this may be a controversial opinion given the current ecosystem's tendency to favor static CT, but I hope static logs won't ever replace classic CT.
>
> The API is extremely difficult to use,

If you're a mirroring monitor, you don't need to bother with the trees proper: you only need to pull the data tiles, which is pretty straightforward. To wit: for our monitor to support static CT logs, we only changed the get-entries call. A pretty short patch overall. The logic to recompute treeheads is all the same.
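As a rough illustration of what that change amounts to, here is a sketch of fetching a data tile by index under the tlog-tiles path convention that static-ct-api uses. This is my reading of c2sp.org/static-ct-api; the path-encoding helper is hand-rolled for the example rather than taken from a library, and decoding the returned entry bundle is left out.

```go
package monitor

import (
	"fmt"
	"io"
	"net/http"
)

const tileWidth = 256 // entries per full data tile (tile height 8)

// pathForDataTile builds "<monitoring prefix>/tile/data/<N>[.p/<W>]", with
// the tile index split into 3-digit groups, all but the last prefixed with
// "x" (e.g. tile 1234067 becomes x001/x234/067), per the tlog-tiles layout.
func pathForDataTile(monitoringPrefix string, n uint64, width int) string {
	var groups []string
	for {
		groups = append([]string{fmt.Sprintf("%03d", n%1000)}, groups...)
		n /= 1000
		if n == 0 {
			break
		}
	}
	p := monitoringPrefix + "/tile/data"
	for i, g := range groups {
		if i < len(groups)-1 {
			p += "/x" + g
		} else {
			p += "/" + g
		}
	}
	if width < tileWidth {
		p += fmt.Sprintf(".p/%d", width)
	}
	return p
}

// fetchDataTile downloads one entry bundle; parsing it into individual
// entries is out of scope for this sketch.
func fetchDataTile(monitoringPrefix string, n uint64, width int) ([]byte, error) {
	resp, err := http.Get(pathForDataTile(monitoringPrefix, n, width))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("fetching tile %d: unexpected status %s", n, resp.Status)
	}
	return io.ReadAll(resp.Body)
}
```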
 
> conceptually complex, and challenging to verify (requiring each client to essentially engineer an entire CT log implementation), and it shifts the implementation burden entirely onto clients. More fundamentally, static CT has always felt unnecessary to me because an efficient classic CT implementation is entirely possible (one of the reasons why I wanted to work on CompactLog).

You're not solving the caching problem.

Best,

 Bas
 

Pierre Barre

Jun 19, 2025, 5:48:50 AM
to Bas Westerbaan, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
> You're not solving the caching problem.
When you can serve 1+ GB/s per CPU core (as CompactLog can), the caching problem static CT solves becomes irrelevant. We're adding protocol complexity to work around a performance limitation that no longer exists.

Best,
Pierre

Bas Westerbaan

Jun 19, 2025, 6:41:13 AM
to Pierre Barre, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
You're certainly not compute bound at that stage. I'd say 1 GB/s is an upper limit on bandwidth for somewhat accessible hosting of non-cached CT logs. With Hetzner you pay $48 per month for the 10Gbit uplink upgrade. Then another $1.20 for every TB over 20TB per month, which at a sustained 1 GB/s (roughly 2,600 TB/month) adds another ~$3000/month.

That 1 GB/s allows about 1500 monitors to tail the log today (entries are on average 5kB today). Let's look ahead. Say the number of certificates doubles every three years. That cuts the number of monitors by a factor of eight come 2034 (187 monitors). By 2034 we'll have short-lived certs, so we're looking at at least a further 15-fold increase, which puts us down to 12 monitors. By 2034 we might also see a bunch of post-quantum chains. An ML-DSA-44 chain with one intermediate adds 7.4kB.
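To spell out the arithmetic behind those figures, here is a small back-of-the-envelope sketch; the issuance rate is my own assumption (~130 new entries per second), picked so the numbers line up with the 1500-monitor estimate above.

```go
package napkin

// Back-of-the-envelope numbers from the figures above. The issuance rate is
// an assumed value (~130 new entries/s), chosen to match the "about 1500
// monitors" estimate, not a measured one.
const (
	budgetBytesPerSec = 1e9 // 1 GB/s egress budget
	bytesPerEntry     = 5e3 // ~5 kB per entry today
	entriesPerSec     = 130 // assumed current issuance rate
)

// monitorsSupported estimates how many monitors can tail the log in real
// time if each one downloads every new entry, for a given growth factor in
// issuance volume.
func monitorsSupported(growthFactor float64) float64 {
	perMonitorBytesPerSec := bytesPerEntry * entriesPerSec * growthFactor
	return budgetBytesPerSec / perMonitorBytesPerSec
}

// monitorsSupported(1) ≈ 1500 today; monitorsSupported(8) ≈ 190 if issuance
// doubles every three years through 2034; monitorsSupported(8*15) ≈ 13 once
// short-lived certificates multiply issuance a further ~15x.
```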

Clearly something has to happen before 2034.

I'd say StaticCT is a great first step, but doesn't go far enough yet. It helps a lot with deduplicating intermediates; making the read-API cacheable and not serving base64, but it doesn't future proof us just yet.

Best,

 Bas

Pierre Barre

Jun 19, 2025, 7:27:22 AM
to Bas Westerbaan, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
This discussion has shifted from LOC complexity to implementation time to server complexity to bandwidth costs, with each new concern emerging when the previous one is addressed. This suggests we're working backwards from a conclusion rather than genuinely evaluating the tradeoffs.

If bandwidth efficiency is the primary concern, RFC 6962 could have evolved to address this - for example, serving chains with hash references to certificates rather than complete chains, achieving the same deduplication benefits. RFC 6962 responses can also be HTTP cached effectively since most monitors tail recent entries or request similar ranges, making cache hit rates comparable to static tiles. For point GETs or specific ranges, RFC 6962 can actually be more efficient since you fetch exactly what you need rather than entire tiles.

Static CT isn't inherently more efficient - these are implementation choices that could have been applied to RFC 6962 without the client complexity overhead. The bandwidth argument assumes we had to choose between "current RFC 6962 with full chains" and "static CT," when we could have evolved RFC 6962 to be more efficient while keeping the simple client interface.

If static CT "doesn't go far enough yet" and won't solve the long-term scaling challenges, that suggests we shouldn't be splitting the ecosystem with an intermediate solution that will require further changes anyway. Also, I get an 8 Gbps shared uplink to my home - treating 1 Gbps as a meaningful constraint for enterprise CT infrastructure is not realistic. Using Cloudflare's own analysis of wholesale pricing, 1 Gbps costs about $80/month in US/Europe at wholesale rates - not the $3000 crisis scenario. These scaling concerns might solve themselves without requiring protocol changes.


Best,
Pierre

Bas Westerbaan

Jun 19, 2025, 7:40:19 AM
to Pierre Barre, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
On Thu, Jun 19, 2025 at 1:27 PM Pierre Barre <pierre...@barre.sh> wrote:
> This discussion has shifted from LOC complexity to implementation time to server complexity to bandwidth costs, with each new concern emerging when the previous one is addressed.

You're speaking to different people with different concerns.
 
> This suggests we're working backwards from a conclusion rather than genuinely evaluating the tradeoffs.

Please, I think everyone here is just trying their best to keep this ecosystem running well, and it's been tough. The problems we're dealing with aren't just technical, but also about keeping our organisational sponsors interested in this. I think I'm not the only one that is really happy that you joined with Merklemap.
 
> If bandwidth efficiency is the primary concern, RFC 6962 could have evolved to address this - for example, serving chains with hash references to certificates rather than complete chains, achieving the same deduplication benefits.

That is what StaticCT is doing.
 
> RFC 6962 responses can also be HTTP cached effectively since most monitors tail recent entries or request similar ranges, making cache hit rates comparable to static tiles. For point GETs or specific ranges, RFC 6962 can actually be more efficient since you fetch exactly what you need rather than entire tiles.

In theory yes, but there are few caching reverse proxies that you can configure to do so.
 
> If static CT "doesn't go far enough yet" and won't solve the long-term scaling challenges, that suggests we shouldn't be splitting the ecosystem with an intermediate solution that will require further changes anyway. Also, I get an 8 Gbps shared uplink to my home - treating 1 Gbps as a meaningful constraint for enterprise CT infrastructure is not realistic.

You dropped an 8: I was talking about 1GB/s. That's btw also roughly the traffic we're serving across our logs today.

Best,

 Bas

Pierre Barre

Jun 19, 2025, 8:01:48 AM
to Bas Westerbaan, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
You're right about the units - I meant 1 GB/s, not Gbps. At wholesale rates that's ~$640/month in US/Europe, still far from a $3000 crisis scenario. The core point stands: bandwidth costs aren't the fundamental constraint they're portrayed as.

I really appreciate your kind words about Merklemap and this community - it's been great to contribute. That's exactly why I care so much about these decisions. As someone building monitoring infrastructure, what I need most is stability and predictable APIs that don't require constant reimplementation as the ecosystem evolves.

While different people may have different concerns, the pattern of constantly shifting justifications - from LOC to complexity to operational issues to bandwidth costs - still suggests we're retrofitting technical arguments rather than having a principled discussion about ecosystem tradeoffs.

On caching: RFC 6962 responses are just standard HTTP GET requests - they can be cached with normal HTTP caching headers without any special reverse proxy configuration. Any standard CDN or caching layer can handle this (even if we may get some inefficiencies from query-param ordering when operating that way). The claim that "few caching reverse proxies can configure this" doesn't align with how basic HTTP caching works.

I appreciate the organizational challenges, but "keeping sponsors interested" shouldn't be the primary driver of protocol design decisions. If static CT is an intermediate solution that "doesn't go far enough yet," fragmenting the ecosystem seems counterproductive. As a monitor, I want stability, not complexity that keeps changing.

I think we probably need to agree to disagree at this point, but I appreciate the thoughtful discussion.

Best,
Pierre

Bas Westerbaan

Jun 19, 2025, 10:16:54 AM
to Pierre Barre, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
 
> I really appreciate your kind words about Merklemap and this community - it's been great to contribute. That's exactly why I care so much about these decisions. As someone building monitoring infrastructure, what I need most is stability and predictable APIs that don't require constant reimplementation as the ecosystem evolves.

The caching issue has been expressed by several log operators for a while. When Sunlight was announced, there was quite a generous period of discussion and consultation (on these mailing lists and other venues such as the Transparency.dev summit) before Chrome proceeded to cautiously allow it. Generally everyone was happy with the move. Personally I'd have preferred to have seen more fundamental changes done at once (e.g. https://datatracker.ietf.org/doc/draft-davidben-tls-merkle-tree-certs/04/) to reduce the number of migrations, but it's become clear an incremental approach works better for those in the ecosystem today.
 
> On caching: RFC 6962 responses are just standard HTTP GET requests - they can be cached with normal HTTP caching headers without any special reverse proxy configuration.

As you already alluded to, with a naive implementation the following will all be disjoint cache entries:

[...]?start=1000&end=2000
[...]?start=1000&end=2001
[...]?start=1000&end=1000
[...]?start=1001&end=2000

From our testing, caching like this doesn't help.

> I appreciate the organizational challenges, but "keeping sponsors interested" shouldn't be the primary driver of protocol design decisions.

To clarify I didn't intend to suggest that's the primary reason.
 
> If static CT is an intermediate solution that "doesn't go far enough yet," fragmenting the ecosystem seems counterproductive. As a monitor, I want stability, not complexity that keeps changing.
>
> I think we probably need to agree to disagree at this point, but I appreciate the thoughtful discussion.

Same.

Pierre Barre

Jun 20, 2025, 3:09:49 PM
to Bas Westerbaan, Ben Cartwright-Cox, Winston de Greef, Matthew McPherrin, Philippe Boneff, Joe DeBlasio, Andrew Ayer, Certificate Transparency Policy
Hi all,

After our discussion, I wanted to explore the practical implications further. I've added static CT support to CompactLog alongside the RFC 6962 API.


CompactLog now serves both APIs from the same LSM-tree backend while maintaining similar performance characteristics. I should note that my previous assumption about tiles making zero MMD architecturally impossible was wrong - I'm still achieving immediate certificate incorporation with both APIs.

After implementing both, I'm thinking this might actually be a nice middle ground. Rather than deprecating RFC 6962 or rejecting static CT entirely, logs could offer both:

- Tiles are generated on the fly.
- Monitors doing bulk synchronization can use the tile API for efficient batch downloads
- Interactive queries and single-certificate lookups can use the simpler RFC 6962 endpoints
- Operators could apply different rate limiting strategies based on use case

Happy to share implementation details if anyone's interested in the approach.

Best,
Pierre