New rfc6962 implementation

Pierre Barre

Jun 2, 2025, 11:15:09 AM
to certificate-transparency
Hi everyone,

I wanted to share an experimental CT log implementation I've been working on that takes a different approach to the scalability challenges frequently discussed here.
The project is fully RFC 6962-compliant and explores whether we can achieve the same scalability benefits as tile-based architectures, but with a simpler operational model.

Key approach:
- Uses LSM-trees for storage
- Native support for object storage (S3, Azure, local)
- Leverages LSM-tree properties: natural write batching, efficient compaction...

In rudimentary testing on lightweight hardware (8 vCPUs), the server easily handles 10k+ writes per second and 20k+ reads per second.

Pierre Barre

Jun 2, 2025, 5:39:22 PM
to certificate-transparency
To follow up on this with something testable:

I've hosted compact_log at https://compact-log.pre-test.ct.merklemap.com/. The tree grows by a few hundred entries a second on a very cheap VM (16 euros a month), with barely any CPU usage (10-20% of 4 slow cores).

Get-entries should be pretty fast too:
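
For anyone who wants to time this themselves, here is a minimal sketch in Python. It assumes only the standard RFC 6962 `get-entries` endpoint under `/ct/v1/`; the helper names (`get_entries_url`, `time_get_entries`) are made up for illustration and are not part of compact_log.

```python
import json
import time
import urllib.parse
import urllib.request

# Base URL taken from the message above; /ct/v1/ is the standard RFC 6962 prefix.
LOG_URL = "https://compact-log.pre-test.ct.merklemap.com"


def get_entries_url(base: str, start: int, end: int) -> str:
    """Build an RFC 6962 get-entries URL (start and end are both inclusive)."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    query = urllib.parse.urlencode({"start": start, "end": end})
    return f"{base.rstrip('/')}/ct/v1/get-entries?{query}"


def time_get_entries(base: str, start: int, end: int) -> tuple[int, float]:
    """Fetch one page and return (entries received, seconds elapsed)."""
    began = time.monotonic()
    with urllib.request.urlopen(get_entries_url(base, start, end), timeout=30) as resp:
        body = json.load(resp)
    return len(body["entries"]), time.monotonic() - began


# Example (performs a real network request, so it is left commented out):
# count, seconds = time_get_entries(LOG_URL, 0, 255)
# print(f"{count} entries in {seconds:.3f}s")
```

A log may return fewer entries than requested, so the count in the result is worth checking before computing a rate.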
Pierre

--
You received this message because you are subscribed to a topic in the Google Groups "certificate-transparency" group.
To unsubscribe from this group and all its topics, send an email to certificate-transp...@googlegroups.com.

Pierre Barre

Jun 6, 2025, 4:23:06 AM
to certificate-transparency
Hello,

After a bunch of improvements, I am happy to report that compact_log is able to process thousands of `add-(pre)-chain` requests per second.

Here are some logs:

INFO slatedb::compactor: i: Batch stats: 26 batches flushed (avg size: 904, avg time: 183ms), throughput: 4703 entries/sec, queue: 436/2000
INFO slatedb::compactor: i: Batch stats: 28 batches flushed (avg size: 925, avg time: 177ms), throughput: 5181 entries/sec, queue: 428/2000
INFO slatedb::compactor: i: Batch stats: 18 batches flushed (avg size: 884, avg time: 265ms), throughput: 3185 entries/sec, queue: 430/2000

So far, I think it's a very successful experiment that could offer a path towards logs that are very cheap to operate and more highly available.

I have a little more work to do around the CA root validation and then, I'll apply for a test log.

Best,
Pierre

Ryan Dickson

Jun 6, 2025, 9:00:01 AM
to certificate-...@googlegroups.com
Hi Pierre,

> I have a little more work to do around the CA root validation and then, I'll apply for a test log.

Related to root validation, and in support of log operators, we recently launched two CCADB reports that list the CA certificates included in the root stores being managed in the CCADB (i.e., Apple, Chrome, Microsoft, and Mozilla).
  • Production Logs (includes CAs trusted by at least one of the CCADB root stores)
  • Test Logs (includes CAs that have applied to at least one of the CCADB root stores and are not included in the above report)
The lists are dynamically updated with each retrieval.

Hopefully these reports can simplify at least one aspect of log management going forward.

- Ryan


Pierre Barre

Jun 6, 2025, 9:27:33 AM
to certificate-transparency
Hi Ryan,

Thank you _so much_ for your email. I completely missed that and was going to implement something not-optimal in comparison.

Should this remain static in the context of a log? Or is it permitted to fetch the list in a worker every so often to update get-roots, and related verification path?

Best,
Pierre 

Filippo Valsorda

Jun 6, 2025, 9:31:00 AM
to certificate-...@googlegroups.com
2025-06-06 15:27 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> Hi Ryan,
>
> Thank you _so much_ for your email. I completely missed that and was going to implement something not-optimal in comparison.
>
> Should this remain static in the context of a log? Or is it permitted to fetch the list in a worker every so often to update get-roots, and related verification path?

I was going to ask the same. The browsers ask for the initial set of roots as part of the application process, but I can't find clear rules on how often they can be updated.

Indeed, we (Geomys) also have a sub-optimal process to collate roots, and would rather fetch this list in a cronjob, although we'd probably make it only additive, so no roots are ever removed, both to avoid outages if CCADB returns the wrong output, and to provide ongoing transparency for recently distrusted roots.

Pierre Barre

Jun 6, 2025, 9:49:36 AM
to Filippo Valsorda, certificate-transparency
Wouldn't we need something like `/get-removed-roots`?

Best,
Pierre

Philippe Boneff

Jun 6, 2025, 10:42:40 AM
to certificate-...@googlegroups.com, Filippo Valsorda
> only additive, so no roots are ever removed

+1, my plan is to do this for Tessera+CT. I'd like to prune roots as much as we can each time we start a new temporal log though: there's no need to keep expired roots around if certs chaining to this root will never make it into the log. 
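
The additive approach discussed here, plus pruning only when a new temporal log starts, could be sketched as below. This is an illustration under assumptions, not anyone's actual implementation: the function names are hypothetical, roots are keyed by the SHA-256 fingerprint of their DER bytes, and expiry dates are assumed to be supplied by the caller rather than parsed from the certificates.

```python
import hashlib
from datetime import datetime


def merge_roots(accepted: dict[str, bytes], fetched: list[bytes]) -> list[str]:
    """Add newly fetched roots to `accepted` in place; never remove anything.

    Keys are hex SHA-256 fingerprints of the DER bytes.
    Returns the fingerprints that were added by this call.
    """
    added = []
    for der in fetched:
        fingerprint = hashlib.sha256(der).hexdigest()
        if fingerprint not in accepted:
            accepted[fingerprint] = der
            added.append(fingerprint)
    return added


def roots_for_new_shard(not_after: dict[str, datetime], shard_start: datetime) -> set[str]:
    """Fingerprints to carry into a new temporal log: roots not yet expired at its start."""
    return {fp for fp, expiry in not_after.items() if expiry >= shard_start}
```

Because `merge_roots` only ever adds, a truncated or wrong response from the report cannot shrink the accepted set; pruning happens only at the explicit shard boundary.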

Ryan Dickson

Jun 6, 2025, 3:31:52 PM
to certificate-...@googlegroups.com
Hi Pierre,

We’re so glad to hear the reports will be helpful!

The Chrome CT Log Policy states: “In order to maintain broad utility to Chrome and its users, CT Logs are expected to accept logging submissions from CAs that are trusted by default in Chrome across all its supported platforms, including ChromeOS, Android, Linux, Windows, macOS, iOS.”

The approach outlined by Philippe appears to balance the above with reasonable operational considerations quite well.

We’ll also look to update our existing policy language for more clarity re: expectations for accepted roots. 

- Ryan, on behalf of the Chrome CT team


Pierre Barre

Jun 6, 2025, 3:47:53 PM
to certificate-transparency
Thank you! 

I’ll go that route then.

By the way, out of curiosity, is the main reason a log doesn't just accept anything to prevent spam and to keep monitors from wasting resources? Or is there also something else?

Best,
Pierre

Filippo Valsorda

Jun 6, 2025, 3:53:42 PM
to certificate-...@googlegroups.com
2025-06-06 21:47 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> Thank you!
>
> I’ll go that route then.
>
> By the way, out of curiosity, is the main reason a log doesn't just accept anything to prevent spam and to keep monitors from wasting resources? Or is there also something else?

There are related but even more critical anti-poisoning concerns: if logs incorporate illegal material or PII there are no mechanisms to take it down while preserving auditability of the log.

Pierre Barre

Jun 6, 2025, 4:02:31 PM
to Filippo Valsorda, certificate-transparency
Couldn’t you already encode illegal things and PII as Punycode in SANs?

Corey Bonnell

Jun 6, 2025, 4:13:34 PM
to certificate-...@googlegroups.com, Filippo Valsorda

Pierre Barre

Jun 6, 2025, 4:18:21 PM
to certificate-transparency, Filippo Valsorda
Fun, I was toying with the idea of using CT logs to notarize documents a few months back, with signatures stored in SANs. I got told off, but I still like the idea ;P

Pierre Barre

Jun 7, 2025, 8:47:01 PM
to certificate-transparency, Filippo Valsorda
Hello,

I've received several questions about CompactLog's architectural approach, particularly regarding how MMD and versioning are handled. Here's a detailed explanation of the key design decisions:

HOW MMD IS ELIMINATED ENTIRELY

Many CT log implementations have a Maximum Merge Delay - a window after SCT issuance where submitted certificates aren't yet included in the Merkle tree. This exists because traditional implementations issue SCTs immediately, then incorporate certificates later via background processes.

CompactLog eliminates MMD by reversing this order - certificates are incorporated before SCTs are issued:

Submission 1  ─┐
Submission 2  ─┼─ Wait up to 500ms ─→ Batch tree update ─→ All SCTs returned
Submission 3  ─┘                       └── Certificates already incorporated

The 500ms delay is submission delay, not MMD. Once SCTs are issued, certificates are already in the tree (MMD <= 0).

TRADITIONAL VS COMPACTLOG TIMING

Traditional CT implementations:
Submit cert → Issue SCT immediately → [MMD period] → Incorporate in tree

CompactLog:
Submit cert → [Batch delay ≤500ms] → Incorporate in tree → Issue SCT

Result: Traditional logs have MMD after SCT issuance; CompactLog has zero MMD.
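
A toy sketch of the "incorporate first, issue afterwards" ordering, in Python. The Merkle Tree Hash follows RFC 6962 section 2.1, but everything else is an assumption made for illustration: the class and method names are hypothetical, the receipts are unsigned dicts rather than real SCTs, and the root is recomputed from scratch on every flush, which a real log would not do.

```python
import hashlib
import time


def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle Tree Hash as defined in RFC 6962, section 2.1."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    k = 1
    while k * 2 < n:  # largest power of two strictly less than n
        k *= 2
    left, right = merkle_root(leaves[:k]), merkle_root(leaves[k:])
    return hashlib.sha256(b"\x01" + left + right).digest()


class IncorporateFirstLog:
    """Toy log: submissions queue up, and receipts exist only after a flush."""

    def __init__(self) -> None:
        self.leaves: list[bytes] = []
        self.pending: list[bytes] = []

    def submit(self, cert: bytes) -> None:
        self.pending.append(cert)  # the submitter is still waiting; nothing is issued yet

    def flush(self) -> list[dict]:
        """Append the whole batch to the tree, then build one receipt per submission."""
        first_index = len(self.leaves)
        batch_size = len(self.pending)
        self.leaves.extend(self.pending)
        self.pending = []
        root = merkle_root(self.leaves)  # the batch is covered by this root
        now_ms = int(time.time() * 1000)
        return [
            {"leaf_index": first_index + i, "timestamp": now_ms,
             "tree_size": len(self.leaves), "root": root.hex()}
            for i in range(batch_size)
        ]
```

Every receipt returned by `flush` refers to a leaf index below the tree size it reports, which is the property the timing comparison above describes: by the time a submitter gets an answer, the entry is already in the tree.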

STH-BOUNDARY VERSIONING

CompactLog versions nodes only at STH publication boundaries:
- Update nodes in-memory during batch operations  
- Store O(log n) versioned nodes only at STH publication
- With STHs every k certificates: reduces versioned node storage from O(n log n) to O(n log n / k)

Example: Publishing STHs every 1000 certificates reduces versioned node storage overhead by 1000x.
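
The 1000x figure can be checked with a few lines of Python. This takes the accounting above at face value (roughly log2(n) versioned nodes per versioning point) and ignores constant factors; `versioned_nodes` is a name made up for this sketch.

```python
import math


def versioned_nodes(n: int, k: int = 1) -> float:
    """Approximate versioned nodes stored for n entries with one STH per k entries.

    Uses the accounting from the post: about log2(n) nodes per versioning point.
    """
    if n < 2 or k < 1:
        raise ValueError("need n >= 2 and k >= 1")
    return (n / k) * math.log2(n)


n = 1_000_000_000
per_entry = versioned_nodes(n, k=1)
per_sth = versioned_nodes(n, k=1000)
print(f"versioned at every entry: {per_entry:.3e} nodes")
print(f"versioned every 1000:     {per_sth:.3e} nodes")
print(f"reduction factor:         {per_entry / per_sth:.0f}x")
```

The log2(n) term cancels in the ratio, so the reduction factor is simply k regardless of tree size.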

Best,
Pierre

Winston de Greef

Jun 7, 2025, 9:49:30 PM
to certificate-...@googlegroups.com
Hi Pierre,

Small nitpick: MMD is a promise by the log to incorporate a certificate in the log within an amount of time (the MMD) after issuing the Signed Certificate Timestamp. If you are saying your log has no MMD, you are effectively saying you are not making a promise about how long it takes a certificate to be included after an SCT is issued (i.e. you might wait a day, or never include it), making your log invalid for inclusion in any log program.


You are effectively promising to have any certificate included in the log within 0 time after issuing the SCT, so your log has a MMD of 0.


Sincerely,
Winston de Greef

Winston de Greef

Jun 7, 2025, 9:52:11 PM
to certificate-...@googlegroups.com
No MMD isn't the same as an MMD of 0

Sincerely,
Winston de Greef

Pierre Barre

Jun 7, 2025, 10:05:41 PM
to Winston de Greef, certificate-transparency
Hi Winston,

Ah, that makes sense! Thank you, I’ll clear up the wording in the project’s readme.

What I was trying to say is that because the tree is updated _before_ the SCTs are returned, the concept of merge delay doesn’t exist (if CompactLog cannot integrate a new entry, the SCTs just won’t be returned to the clients that are currently waiting).

Best,
Pierre 

Pierre Barre

Jun 8, 2025, 8:45:26 AM
to certificate-transparency, Winston de Greef
Winston, 

I've updated the readme with what I hope is some more accurate and clearer wording: https://github.com/Barre/compact_log/commit/a632c92ef292ba6cc1f6e8c8bbe3a102e209acf4

I've also added a new section as I just implemented chain deduplication:


========================

Certificate Chain Deduplication

CompactLog stores certificate chains using content-addressable storage:

  1. Entry structure: Each log entry stores SHA-256 hashes of certificates rather than the certificates themselves
  2. Certificate store: Certificates are stored separately under cert:{hash} keys
  3. Deduplication: Multiple entries referencing the same certificate (e.g., intermediate CA certs) share the same stored copy
  4. Reconstruction: The API reconstructs full certificate chains by resolving hash references during retrieval

The DeduplicatedLogEntry structure contains:

  • Certificate hash (32 bytes)
  • Chain certificate hashes (array of 32-byte hashes)
  • Original metadata (timestamp, index, entry type)
========================
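
As an illustration of the scheme described in that section, here is a small in-memory sketch in Python. It is not the project's code: a plain dict stands in for the key-value store, the hex encoding inside the `cert:{hash}` key is an assumption, and the field and method names are guesses based on the description above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DeduplicatedLogEntry:
    cert_hash: bytes                 # SHA-256 of the leaf certificate (32 bytes)
    chain_hashes: tuple[bytes, ...]  # SHA-256 of each chain certificate
    timestamp: int
    index: int
    entry_type: str


class ChainStore:
    """In-memory stand-in for the key-value store (a dict instead of an LSM-tree)."""

    def __init__(self) -> None:
        self.kv: dict[str, bytes] = {}
        self.entries: list[DeduplicatedLogEntry] = []

    def _put_cert(self, der: bytes) -> bytes:
        digest = hashlib.sha256(der).digest()
        self.kv.setdefault(f"cert:{digest.hex()}", der)  # stored once per distinct cert
        return digest

    def add_entry(self, leaf: bytes, chain: list[bytes], timestamp: int,
                  entry_type: str = "x509_entry") -> DeduplicatedLogEntry:
        entry = DeduplicatedLogEntry(
            cert_hash=self._put_cert(leaf),
            chain_hashes=tuple(self._put_cert(c) for c in chain),
            timestamp=timestamp,
            index=len(self.entries),
            entry_type=entry_type,
        )
        self.entries.append(entry)
        return entry

    def get_chain(self, index: int) -> list[bytes]:
        """Resolve hash references back into the full chain, leaf first."""
        entry = self.entries[index]
        hashes = (entry.cert_hash, *entry.chain_hashes)
        return [self.kv[f"cert:{h.hex()}"] for h in hashes]
```

Two entries issued under the same intermediate and root end up with four stored certificates instead of six, and the saving grows with every further entry that shares the chain.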

Now I'll ingest a few hundred million certificates to check how efficient all this actually is :)

Best,
Pierre

Pierre Barre

Jun 8, 2025, 1:51:48 PM
to certificate-transparency, Winston de Greef
Now some performance numbers, using an 8-core (6-year-old) CPU and 16GB of memory:

CompactLog happily serves:

- 5gb/s of get-entries calls:

- A little less than 10k add-(pre)-chain requests per second:

2025-06-08T17:46:43.654825Z  INFO compactlog::storage: Batch stats: 46 batches flushed (avg size: 922, avg time: 91ms), throughput: 8486 entries/sec, queue: 382/2000
2025-06-08T17:46:48.642678Z  INFO compactlog::storage: Batch stats: 46 batches flushed (avg size: 967, avg time: 86ms), throughput: 8904 entries/sec, queue: 429/2000
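
A small Python sketch for sanity-checking such lines: it parses one "Batch stats" line and derives the reporting window implied by the numbers. The regular expression is written against the log format shown above and is an assumption about that format, not something taken from CompactLog itself.

```python
import re

BATCH_STATS = re.compile(
    r"Batch stats: (?P<batches>\d+) batches flushed "
    r"\(avg size: (?P<avg_size>\d+), avg time: (?P<avg_ms>\d+)ms\), "
    r"throughput: (?P<throughput>\d+) entries/sec, "
    r"queue: (?P<queued>\d+)/(?P<capacity>\d+)"
)


def parse_batch_stats(line: str) -> dict[str, int]:
    """Extract the numeric fields from one 'Batch stats' log line."""
    match = BATCH_STATS.search(line)
    if match is None:
        raise ValueError("not a batch stats line")
    return {name: int(value) for name, value in match.groupdict().items()}


line = ("2025-06-08T17:46:43.654825Z  INFO compactlog::storage: Batch stats: "
        "46 batches flushed (avg size: 922, avg time: 91ms), "
        "throughput: 8486 entries/sec, queue: 382/2000")
stats = parse_batch_stats(line)
entries = stats["batches"] * stats["avg_size"]  # approximate: avg size is rounded
window_s = entries / stats["throughput"]
print(f"~{entries} entries over ~{window_s:.1f}s")
```

The implied window comes out at roughly five seconds, which matches the gap between the two timestamps in the log excerpt.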

Best,
Pierre

Pierre Barre

Jun 10, 2025, 1:37:29 PM
to certificate-transparency
Hello everyone,

I've ingested 54M certificates in CompactLog from real sources (current and past logs).

Current storage: 122GB for 54M+ entries (https://compact-log.pre-test.ct.merklemap.com/ct/v1/get-sth). This translates to approximately 236GB per 100M certificates or 2.36TB per billion. I was able to ingest continuously at around 4k-8k entries per second on commodity hardware.

For reference, Let's Encrypt's 2019 blog post (https://letsencrypt.org/2019/11/20/how-le-runs-ct-logs/) mentioned their implementation used around 1TB per 100M entries. I wonder if their current numbers are similar.

Best,
Pierre

Filippo Valsorda

Jun 10, 2025, 1:46:29 PM
to certificate-...@googlegroups.com
2025-06-10 19:36 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> Hello everyone,
>
> I've ingested 54M certificates in CompactLog from real sources (current and past logs).
>
> Current storage: 122GB for 54M+ entries (https://compact-log.pre-test.ct.merklemap.com/ct/v1/get-sth). This translates to approximately 236GB per 100M certificates or 2.36TB per billion. I was able to ingest continuously at around 4k-8k entries per second on commodity hardware.
>
> For reference, Let's Encrypt's 2019 blog post (https://letsencrypt.org/2019/11/20/how-le-runs-ct-logs/) mentioned their implementation used around 1TB per 100M entries. I wonder if their current numbers are similar.

tuscolo2025h2 (which should be indicative of any Sunlight log, including Let's Encrypt's ones) is just past 500M entries and taking 427GB, so about 86GB per 100M entries.

root@ctlog-tuscolo:~# head -n 3 /tank/logs/tuscolo2025h2/data/checkpoint
500741615
lH5FUAjLI0/Te0Ufwo+flpiVJAWYE67aEyMqroniykk=

root@ctlog-tuscolo:~# zfs list tank/logs/tuscolo2025h2
NAME                      USED  AVAIL  REFER  MOUNTPOINT
tank/logs/tuscolo2025h2   427G  13.0T   427G  /tank/logs/tuscolo2025h2

Pierre Barre

Jun 10, 2025, 1:48:40 PM
to certificate-transparency
Interesting! Thank you for the numbers.

Pierre Barre

Jun 10, 2025, 7:44:10 PM
to certificate-transparency
Hi Ryan,

I've successfully implemented logic to fetch the CCADB report, and the process went very smoothly. This report has made the implementation much more straightforward than I anticipated. I'm grateful it exists, as I expected significant challenges in fetching and assembling roots.

Following Filippo and Philippe's suggestion, I've implemented this as a purely additive store:



I've encountered one edge case that I wanted to raise.

The CCADB report doesn't include certain monitoring certificates, such as "CN=Merge Delay Monitor Root,OU=Certificate Transparency,O=Google UK Ltd.,ST=London,C=GB".

While I understand why these may be out of scope for the report, their absence creates a small challenge for my implementation.

Rather than adding special handling for these monitoring certificates (which would compromise the clean design), I wanted to ask: Is there any possibility these could be included in the report, or are they intentionally excluded from its scope?

Thank you again for your help.

Best,
Pierre

Ryan Dickson

Jun 10, 2025, 9:55:13 PM
to certificate-...@googlegroups.com
Hi Pierre,

> I've successfully implemented logic to fetch the CCADB report, and the process went very smoothly. This report has made the implementation much more straightforward than I anticipated. I'm grateful it exists, as I expected significant challenges in fetching and assembling roots.

This is great to hear. We appreciate your feedback, and further welcome any suggestions you or other members of the community have.

> Rather than adding special handling for these monitoring certificates (which would compromise the clean design), I wanted to ask: Is there any possibility these could be included in the report, or are they intentionally excluded from its scope?

We'll look into this further and will report back.

More to come!

- Ryan


Pierre Barre

Jun 11, 2025, 11:42:02 AM
to Filippo Valsorda, certificate-transparency
Just curious, how much is "Logical Used" in ZFS? And which compression algorithm and recordsize do you use, if you can share that :)

Best,
Pierre

On Tue, Jun 10, 2025, at 19:46, Filippo Valsorda wrote:

Filippo Valsorda

Jun 11, 2025, 1:00:03 PM
to Pierre Barre, certificate-transparency
2025-06-11 17:41 GMT+02:00 Pierre Barre <pierre...@barre.sh>:
> Just curious, how much is "Logical Used" in ZFS? And which compression algorithm and recordsize do you use, if you can share that :)

logicalused is 535G, compression is the "on" default (lz4), and recordsize is the default (128K). Here's a full listing.

Note that Sunlight simply stores Static CT files on disk, and data tiles in Static CT are compressed with gzip. I am actually kinda surprised by the 1.28x compress ratio.

Pierre Barre

Jun 11, 2025, 5:58:17 PM
to Filippo Valsorda, certificate-transparency
Thank you again Filippo!

> logicalused is 535G, compression is the "on" default (lz4), and recordsize is the default (128K). Here's a full listing.

After switching to Zstd and using larger SSTs, I am now around 42GB for 20,622,946 entries.

So that'd be:

- 100M certificates: ~204 GB
- 1 billion certificates: ~2.04 TB

Probably less in real life, due to the compounding effects of deduplication. Less efficient than yours, but still not bad considering it supports the full RFC6962 (or at least, as far as I know...).
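
The extrapolation above is a straight linear scaling, which can be reproduced in Python as follows. The helper name `gb_per` is made up for this sketch, and the calculation deliberately ignores any additional savings from deduplication at larger sizes.

```python
def gb_per(entries_target: int, measured_gb: float, measured_entries: int) -> float:
    """Linear extrapolation of storage; ignores any extra savings from deduplication."""
    return measured_gb / measured_entries * entries_target


measured_gb, measured_entries = 42, 20_622_946  # figures from the message above
print(f"100M entries: ~{gb_per(100_000_000, measured_gb, measured_entries):.0f} GB")
print(f"1B entries:   ~{gb_per(1_000_000_000, measured_gb, measured_entries) / 1000:.2f} TB")
```

This reproduces the ~204 GB and ~2.04 TB figures quoted in the list.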

> Note that Sunlight simply stores Static CT files on disk, and data tiles in Static CT are compressed with gzip. I am actually kinda surprised by the 1.28x compress ratio.

That's indeed very interesting. I now wonder if LZ4 generally compresses gzip-compressed data well, or if it's just a quirk due to the nature of the data.

Best,
Pierre

Pierre Barre

Jun 13, 2025, 11:27:30 AM
to certificate-transparency
Hello,

The CCADB report endpoints appear to be erroring out quite frequently. While not critical, this may be worth investigating. Unfortunately, I don't have specific statistics as I haven't been keeping logs, but I've seen this often enough to notice.

Here's an example error:

2025-06-13T15:18:29.167980Z ERROR compactlog::ccadb: CCADB update failed: Internal error: Failed to read CCADB response: error decoding response body

Best,
Pierre

Chris Clements

Jun 20, 2025, 3:27:16 PM
to certificate-...@googlegroups.com
Hi Pierre,

We’ve added the compliance monitoring root to the Production Logs report (the report that includes CAs trusted by at least one of the CCADB Root Store Operators). We still need to investigate the endpoint errors you referenced, and we’ll plan to report back shortly after reviewing.


Thank you

-Chris, on behalf of the Chrome CT team


Pierre Barre

Jun 20, 2025, 4:26:06 PM
to certificate-transparency
Thank you, in the meantime I’ll add more verbose logging to see what the problem actually is!

Best,
Pierre 

Pierre Barre

Jun 30, 2025, 11:38:55 AM
to certificate-transparency
Hi Chris,

I was able to catch it:

2025-06-30T15:35:20.797442Z ERROR compactlog::ccadb: CCADB update failed: Internal error: Failed to read CCADB response: reqwest::Error { kind: Decode, source: hyper::Error(Body, Custom { kind: Other, error: Error { code: ErrorCode(1), cause: Some(Ssl(ErrorStack([Error { code: 134217884, library: "elliptic curve routines", function: "ossl_ecdsa_simple_verify_sig", reason: "bad signature", file: "crypto/ec/ecdsa_ossl.c", line: 491 }, Error { code: 109576198, library: "asn1 encoding routines", function: "ASN1_item_verify_ctx", reason: "EVP lib", file: "crypto/asn1/a_verify.c", line: 218 }]))) } }) }

I hope that will be helpful.

Best,
Pierre

Pierre Barre

Jun 30, 2025, 12:39:16 PM
to certificate-transparency
Saw it for RSA too, just in case...

2025-06-30T16:36:55.170920Z ERROR compactlog: Initial CCADB update failed: Internal error: Failed to read CCADB response: reqwest::Error { kind: Decode, source: hyper::Error(Body, Custom { kind: Other, error: Error { code: ErrorCode(1), cause: Some(Ssl(ErrorStack([Error { code: 33554570, library: "rsa routines", function: "RSA_padding_check_PKCS1_type_1", reason: "invalid padding", file: "crypto/rsa/rsa_pk1.c", line: 79 }, Error { code: 33554546, library: "rsa routines", function: "rsa_ossl_public_decrypt", reason: "padding check failed", file: "crypto/rsa/rsa_ossl.c", line: 796 }, Error { code: 478674948, library: "Provider routines", function: "rsa_verify_directly", reason: "RSA lib", file: "providers/implementations/signature/rsa_sig.c", line: 1041 }, Error { code: 109576198, library: "asn1 encoding routines", function: "ASN1_item_verify_ctx", reason: "EVP lib", file: "crypto/asn1/a_verify.c", line: 218 }]))) } }) }

Best,
Pierre