Optimizing RFC 6962 CT Log Data Acquisition Efficiency Based on the Static CT API Format


Xiaoming Yang

May 29, 2026, 7:10:14 AM
to Certificate Transparency Policy
1. Problem Statement
    For a long time, while operating RFC 6962 CT log servers, we have faced severe performance challenges with the usage patterns of the get-entries endpoint:
        Low Cache Hit Rate: Because monitors choose arbitrary, unpredictable start and end parameters, few requests share a cache key. The cache hit rate at both the CDN and gateway levels is therefore extremely low, and caching delivers little acceleration.
        Heavy Backend Load: Every request requires a range query against the backend database and on-the-fly data compression. This causes high database I/O load and significantly increases Gzip processing overhead on the gateway.
        Inefficient Bandwidth Utilization: Frequent data requests from certain monitors result in excessively high network bandwidth consumption, exhibiting a long-tail effect.
        Retrieval Latency Grows with Data Size: As CT logs continue to grow, retrieval latency for monitors keeps climbing, which delays monitors' access to up-to-date log data.

2. Solution Strategy
    Given that the Static CT API is recognized within the CT community and mainstream monitor tools already support this format, we have decided to deploy a set of data acquisition interfaces that comply with the Static CT API specification.
    Core Logic: Leverage the static, immutable, and pre-cacheable characteristics of Tile data to alleviate backend database query pressure and network bandwidth bottlenecks through CDN caching.

3. Implementation
    Deploy an HTTP service specifically dedicated to serving Tile-formatted data.
    3.1 Architectural Design
        Interface Design:  
            /tile/data/<N>: Provides standard Tile block data.      
            /issuer/<SHA256>: Provides CA issuer information.
            Interfaces and data formats:
                 https://github.com/C2SP/C2SP/blob/main/static-ct-api.md#log-entries
                 https://github.com/C2SP/C2SP/blob/main/static-ct-api.md#issuers
            Among these, the /tile/data interface serves only full tiles in the form /tile/data/<N>; it does not support the partial-tile suffix [.p/<W>].
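
As an illustration of the /tile/data/<N> form, here is a minimal sketch of how a client might map an entry index to a full data tile URL. It assumes the tile index encoding described in the C2SP tlog-tiles specification (zero-padded 3-digit path elements, all but the last prefixed with "x") and 256 entries per full tile. The function names and `TILE_WIDTH` are illustrative and not part of the deployment described above.

```python
TILE_WIDTH = 256  # entries per full data tile


def tile_index_path(n: int) -> str:
    """Encode tile index n as zero-padded 3-digit path elements.

    All elements except the last carry an "x" prefix,
    so 1234067 becomes "x001/x234/067".
    """
    if n < 0:
        raise ValueError("tile index must be non-negative")
    groups = [f"{n % 1000:03d}"]
    n //= 1000
    while n > 0:
        groups.append(f"x{n % 1000:03d}")
        n //= 1000
    return "/".join(reversed(groups))


def data_tile_url(log_prefix: str, entry_index: int) -> str:
    """Return the URL of the full data tile containing entry_index."""
    tile = entry_index // TILE_WIDTH
    return f"{log_prefix.rstrip('/')}/tile/data/{tile_index_path(tile)}"
```

Because every client that needs a given entry derives the same URL, requests for the same data share a single cache key, unlike arbitrary get-entries ranges.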

        Data Flow:
            Data Acquisition: The HTTP service communicates with the trillian_log_server via the gRPC protocol to retrieve raw data segments.
            Format Reassembly: The service repackages the JSON data retrieved via gRPC from the RFC 6962 format into the Tile format.
            Caching and Distribution: The packaged Tile data is distributed and cached by CDN nodes.
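
The data flow above might be sketched roughly as follows. This is not the actual service code: `fetch_leaves` is a hypothetical callable standing in for the gRPC call to trillian_log_server, and `encode_tile_leaf` is a hypothetical function that converts one RFC 6962 leaf into the Static CT TileLeaf encoding. The sketch only shows the tile-level logic: a data tile is the concatenation of 256 encoded leaves, and a tile is produced only once it is full.

```python
from typing import Callable, List, Optional

TILE_WIDTH = 256  # entries per full data tile


def build_full_tile(
    tile_index: int,
    tree_size: int,
    fetch_leaves: Callable[[int, int], List[bytes]],
    encode_tile_leaf: Callable[[int, bytes], bytes],
) -> Optional[bytes]:
    """Build the body of data tile `tile_index`, or None if it is not yet full.

    fetch_leaves(start, count) and encode_tile_leaf(index, leaf) are
    hypothetical helpers supplied by the caller.
    """
    start = tile_index * TILE_WIDTH
    if start + TILE_WIDTH > tree_size:
        # Partial tiles are not served, so there is nothing to build yet.
        return None
    leaves = fetch_leaves(start, TILE_WIDTH)
    if len(leaves) != TILE_WIDTH:
        raise RuntimeError(f"expected {TILE_WIDTH} leaves, got {len(leaves)}")
    # A data tile is the concatenation of its encoded leaves, in index order.
    return b"".join(
        encode_tile_leaf(start + i, leaf) for i, leaf in enumerate(leaves)
    )
```

Since a full tile never changes once built, the result can be cached indefinitely by the CDN.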

    3.2 Verification Mechanism
        After a monitor retrieves the Tile data, the processing workflow is as follows:
            Parsing: Parse the Tile data packets according to the Static CT API format.
            Auditing and Verification: Monitors must independently compute the Merkle Tree Leaf Hashes as specified in RFC 6962 so that they can verify consistency proofs and audit proofs.
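
A minimal sketch of the hashing step, assuming the monitor has already parsed each tile entry and extracted the serialized TimestampedEntry bytes (the parsing itself is not shown). The leaf and node hash definitions follow RFC 6962; the function names are illustrative.

```python
import hashlib
from typing import List


def leaf_hash(timestamped_entry: bytes) -> bytes:
    """RFC 6962 leaf hash: SHA-256(0x00 || MerkleTreeLeaf)."""
    # MerkleTreeLeaf = version (v1 = 0) || leaf_type (timestamped_entry = 0)
    #                  || TimestampedEntry
    merkle_tree_leaf = b"\x00\x00" + timestamped_entry
    return hashlib.sha256(b"\x00" + merkle_tree_leaf).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """RFC 6962 interior node hash: SHA-256(0x01 || left || right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def subtree_root(hashes: List[bytes]) -> bytes:
    """Root of a complete subtree; the number of hashes must be a power of two.

    A full data tile holds 256 entries, so its leaf hashes satisfy this.
    """
    n = len(hashes)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of hashes must be a non-zero power of two")
    level = list(hashes)
    while len(level) > 1:
        level = [
            node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)
        ]
    return level[0]
```

The resulting hashes can then be checked against a signed tree head, for example by combining them with proofs obtained from the log's RFC 6962 endpoints.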

4. Expected Benefits
    Community Benefits: Monitors can sync full and incremental data much faster, keeping pace with continued growth in log size.
    Performance Improvement: With CDN caching, data retrieval latency will be dramatically reduced from "database processing time + transmission time + bandwidth congestion" down to "CDN edge response time".
    Efficiency Gains:
        Lowers database I/O utilization for the operator.
        Enhances overall service stability.

Xiaoming Yang

May 29, 2026, 7:18:54 AM
to Certificate Transparency Policy, Xiaoming Yang
We have deployed this interface at ct2026-a.trustasia.com/log2026a and ct2026-b.trustasia.com/log2026b.
With this interface, each request can retrieve 256 entries without modifying our configuration.

Request Examples:
```bash
curl -v https://ct2026-a.trustasia.com/log2026a/tile/data/000 --output log2026a_000
curl -v https://ct2026-b.trustasia.com/log2026b/tile/data/000 --output log2026b_000
```

Rob Stradling

May 29, 2026, 12:02:07 PM
to Xiaoming Yang, Certificate Transparency Policy
This is great!

Are you intending to implement /checkpoint too?
Or are you envisaging that each monitor will adopt a hybrid RFC6962/StaticCT strategy specifically for TrustAsia's logs, i.e., call /ct/v1/get-sth, then fetch tiles?
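
For illustration only, one way such a hybrid strategy could be planned on the monitor side: take the tree_size from get-sth, fetch every full tile it covers, and fall back to get-entries for the trailing entries that do not yet fill a tile. The function name and return shape are hypothetical, and 256 entries per tile is assumed.

```python
from typing import List, Optional, Tuple

TILE_WIDTH = 256  # entries per full data tile


def plan_hybrid_fetch(
    next_index: int, tree_size: int
) -> Tuple[List[int], Optional[Tuple[int, int]]]:
    """Split the entries [next_index, tree_size) into tiles and a tail.

    Returns (tile_indices, tail), where tail is an inclusive (start, end)
    pair suitable for get-entries, or None if no tail is needed.
    """
    full_tiles = tree_size // TILE_WIDTH
    first_tile = next_index // TILE_WIDTH
    # Empty when the monitor is already past the last full tile.
    tiles = list(range(first_tile, full_tiles))
    tail_start = max(next_index, full_tiles * TILE_WIDTH)
    tail = (tail_start, tree_size - 1) if tail_start < tree_size else None
    return tiles, tail
```

For a monitor starting at index 0 against a tree of size 600, this yields tiles 0 and 1 plus a get-entries tail covering entries 512 through 599.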



Xiaoming Yang

Jun 1, 2026, 5:17:43 AM
to Certificate Transparency Policy, Rob Stradling, Xiaoming Yang
Our original intention in designing the tile/data endpoint was to provide a more efficient data retrieval interface, ensuring that monitors that need faster data access can obtain it in a timely manner.

We believe that for pure data retrieval, one can determine whether the maximum tree_size has been reached simply by checking the HTTP status codes (200 vs. 404) of the tile/data endpoint. Furthermore, we currently do not serve the final, incomplete tile data via tile/data.
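
A minimal sketch of this status-code approach, assuming the tile path encoding from the C2SP tlog-tiles specification. The helper names are illustrative, and the `fetch` parameter is injectable so the loop can be exercised without network access. Note that a 404 here only tells the client that no further full tile is currently available; it is not a signed statement of the tree size.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterator, Optional, Tuple


def tile_index_path(n: int) -> str:
    """Encode a tile index as 3-digit path elements, e.g. x001/x234/067."""
    groups = [f"{n % 1000:03d}"]
    n //= 1000
    while n > 0:
        groups.append(f"x{n % 1000:03d}")
        n //= 1000
    return "/".join(reversed(groups))


def fetch_tile_http(url: str) -> Optional[bytes]:
    """Return the tile body on HTTP 200, or None on HTTP 404."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise  # any other status is treated as a real error


def iter_full_tiles(
    log_prefix: str,
    first_tile: int,
    fetch: Callable[[str], Optional[bytes]] = fetch_tile_http,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (tile_index, body) for consecutive tiles until the first 404."""
    n = first_tile
    while True:
        url = f"{log_prefix.rstrip('/')}/tile/data/{tile_index_path(n)}"
        body = fetch(url)
        if body is None:
            return
        yield n, body
        n += 1
```

A monitor would remember the last tile index it received and resume from the next one on its following polling cycle.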

The /checkpoint endpoint involves cryptographic signing using a private key, and we are uncertain whether we can (or should) use the current private key to sign the data for this additional endpoint. Note that our implementation does not involve modifying Trillian's codebase; instead, it is achieved via a separate application that retrieves Trillian's results using gRPC.

We would love to hear feedback and suggestions from the community and the monitors on this matter.