A small number of entries (leaves) in the Oak 2022 CT log were being served with incorrect timestamps: a `get-entries` request for the surrounding sequence numbers would show entry timestamps generally increasing and within a few seconds of each other (as expected), but the sequence number in question would have a timestamp of several days or weeks later. Our investigation showed that this was additional, previously-undetected fallout from the December 9th, 2021 database migration and incident.
All times are UTC. Timestamps for only one affected entry (the entry at index 135227797, one of those originally reported by Cloudflare) are given for the sake of brevity; a full list of affected entries can be found below.
2021-11-24 04:39: The precertificate with serial 3204903542915527182 is submitted to Oak 2022 and gets sequence number 135227797. Google mirrors this entry into their letsencrypt_oak2022 log.
$ ctclient -log_uri https://ct.googleapis.com/logs/eu1/mirrors/letsencrypt_oak2022/ -first 135227797 -last 135227797 getentries | egrep 'Index|Serial'
Index=135227797 Timestamp=1637728753775 (2021-11-24 04:39:13.775 +0000 UTC) pre-certificate from issuer with keyhash f11c3dd048f74edb7c45192b83e5980d2f67ec84b4ddb9396e33ff5173ed698f:
Serial Number: 3204903542915527182 (0x2c7a1acabedef20e)
2021-12-08 19:00: Scheduled maintenance to split Oak’s unified database into separate per-shard databases begins.
2021-12-08 20:24: Google detects errors from certain API methods. Investigation shows that the errors are due in part to missing LeafData rows that were not properly copied by the migration tooling. The LeafData row referenced by the SequencedLeafData with sequence number 135227797 is among the missing rows.
2021-12-08 20:26: First duplicate certificate integrated (see below for example). This incident begins.
2021-12-09 09:29: The precertificate with serial 3204903542915527182 is re-submitted to Oak 2022 and gets sequence number 158698536. This request only resulted in a new sequence number because the LeafData row for this certificate was missing; if the LeafData row had been present the request would have been correctly detected as a duplicate. This creates a new LeafData row with this new timestamp, but with the same certificate data. Google mirrors this entry.
$ ctclient -log_uri https://oak.ct.letsencrypt.org/2022 -first 158698536 -last 158698536 getentries | egrep 'Index|Serial'
Index=158698536 Timestamp=1639042196214 (2021-12-09 09:29:56.214 +0000 UTC) pre-certificate from issuer with keyhash f11c3dd048f74edb7c45192b83e5980d2f67ec84b4ddb9396e33ff5173ed698f:
Serial Number: 3204903542915527182 (0x2c7a1acabedef20e)
$ ctclient -log_uri https://ct.googleapis.com/logs/eu1/mirrors/letsencrypt_oak2022/ -first 158698536 -last 158698536 getentries | egrep 'Index|Serial'
Index=158698536 Timestamp=1639042196214 (2021-12-09 09:29:56.214 +0000 UTC) pre-certificate from issuer with keyhash f11c3dd048f74edb7c45192b83e5980d2f67ec84b4ddb9396e33ff5173ed698f:
Serial Number: 3204903542915527182 (0x2c7a1acabedef20e)
2021-12-09 09:29: Because the re-submitted precertificate has the same certificate data, it also has the same “LeafIdentityHash”, a Trillian internal implementation value which acts as the foreign key linking LeafData rows and SequencedLeafData rows. Therefore the existing SequencedLeafData with sequence number 135227797 now also references this new LeafData row, and requests for this sequence number start returning incorrect leaf data instead of an error.
$ ctclient -log_uri https://oak.ct.letsencrypt.org/2022/ -first 135227797 -last 135227797 getentries | egrep 'Index|Serial'
Index=135227797 Timestamp=1639042196214 (2021-12-09 09:29:56.214 +0000 UTC) pre-certificate from issuer with keyhash f11c3dd048f74edb7c45192b83e5980d2f67ec84b4ddb9396e33ff5173ed698f:
Serial Number: 3204903542915527182 (0x2c7a1acabedef20e)
2021-12-09 13:59: Backfill of the missing data begins. Because a new LeafData row with this same LeafIdentityHash exists, it does not appear to be a missing row, and does not get backfilled.
2021-12-10 15:13: Last duplicate certificate integrated.
2021-12-10 19:19: Backfill of the missing data ends.
2022-03-22 17:05: Email is sent to ct-policy noting the discrepancy. Investigation begins.
2022-03-23 00:05: Begin writing tool to audit logs for additional discrepancies of the same kind.
2022-03-23 16:30: Root cause (as described above) identified.
2022-03-23 20:48: Complete scan of Oak 2023 begins.
$ time go run ./crawl -log_uri https://oak.ct.letsencrypt.org/2023 -num_workers 100 -start_index 0 -end_index 30301266 > 2023.log
5608.64s user 404.47s system 150% cpu 1:06:44.16 total
2022-03-23 21:55: Full scan of Oak 2023 completes. Two candidate out-of-order entries found. Note that both are out of order by only a couple of seconds each: these do not represent data corruption, just normal consequences of the log’s batching behavior. No action needed here.
$ grep -A 3 "Found out-of-order entry" 2023.log
Found out-of-order entry:
Index: 76744
Timestamps: 1604660534448, 1604660533422
Serial: 6221299746771331086
--
Found out-of-order entry:
Index: 206702
Timestamps: 1633969854747, 1633969852344
Serial: 1633969822394416
2022-03-23 22:05: Scan of Oak 2022 up through 2021-12-10 19:20 (end of prior incident) begins.
$ ctclient -log_uri https://ct.googleapis.com/logs/eu1/mirrors/letsencrypt_oak2022/ -timestamp 1639164000000 bisect | grep "First entry"
First entry with timestamp>=1639164000000 (2021-12-10 11:20:00 -0800 PST) found at index 160530962
$ time go run ./crawl -log_uri https://oak.ct.letsencrypt.org/2022 -num_workers 100 -start_index 0 -end_index 160530962 > 2022.log
30542.10s user 1992.84s system 115% cpu 7:49:08.96 total
2022-03-24 05:54: Scan of Oak 2022 up through 2021-12-10 19:20 completes. 3483 candidate out-of-order entries found.
$ grep "Found out-of-order entry" 2022.log | wc -l
3483
2022-03-24 22:17: Comparison of candidate entries to Google’s mirror of those entries completes. 370 entries identified as corrupted.
2022-03-24 22:46: Correct leaf_input data for all 370 entries retrieved from Google’s mirror.
$ time go run ./compare -log_uri https://ct.googleapis.com/logs/eu1/mirrors/letsencrypt_oak2022/ -num_workers 100 -entry_file 2022_mismatches.csv -leafdata_file 2022_leafdata.csv
4.85s user 0.98s system 46% cpu 12.492 total
$ wc -l 2022_leafdata.csv
370 2022_leafdata.csv
2022-03-25 23:50: This incident report published.
During normal operation, when a CT log receives a submission for a certificate that has previously been submitted, it accepts the submission and returns an SCT, but it does not incorporate the certificate into the log a second time: it returns the same SCT it issued for the original submission and makes no changes to the Merkle tree.
Trillian’s MySQL storage backend implements this behavior via a unique key constraint on one of its two primary data tables. For our purposes, Trillian’s schema consists of two related tables: `LeafData`, which contains opaque blobs of the “leaf_input” and “extra_data” fields returned by the log’s `get-entries` endpoint; and `SequencedLeafData`, which contains entry indices (sequence numbers) and precomputed Merkle leaf hashes of the leaf data. These two tables are connected by a shared foreign key, `LeafIdentityHash`, which is an internal implementation detail that (for Trillian’s CT profile) happens to be a hash over the certificate (but not the full leaf_input) contained in the leaf. As the Trillian docs say:
…[I]n Certificate Transparency each certificate submission is associated with a submission timestamp, but subsequent submissions of the same certificate should be considered identical. This is achieved by setting the leaf identity hash to a hash over (just) the certificate, whereas the Merkle leaf hash encompasses both the certificate and its submission time -- allowing duplicate certificates to be detected.
So when a duplicate certificate is submitted, the log computes the LeafIdentityHash over the certificate, sees that a row with that same LeafIdentityHash already exists in the LeafData table, and returns the already-sequenced data instead of performing a new insertion.
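To make that flow concrete, here is a minimal sketch in Go, assuming direct SQL access and the simplified table/column names used in this report; it is not Trillian’s actual code, and the hash derivation shown is an approximation:

```go
package oaksketch

import (
	"crypto/sha256"
	"database/sql"
)

// storeLeaf sketches the dedup flow described above. It is not Trillian's
// actual code: table and column names follow the simplified schema in this
// report, and the LeafIdentityHash derivation (a SHA-256 over just the
// certificate DER) is an approximation of the CT profile's behavior.
func storeLeaf(db *sql.DB, treeID int64, certDER, leafValue, extraData []byte) (isDuplicate bool, err error) {
	idHash := sha256.Sum256(certDER) // hash over the cert only, not the full leaf_input

	// The unique key on LeafData makes the insert a no-op when a row with
	// this (TreeId, LeafIdentityHash) already exists.
	res, err := db.Exec(
		`INSERT IGNORE INTO LeafData (TreeId, LeafIdentityHash, LeafValue, ExtraData) VALUES (?, ?, ?, ?)`,
		treeID, idHash[:], leafValue, extraData)
	if err != nil {
		return false, err
	}
	n, err := res.RowsAffected()
	if err != nil {
		return false, err
	}
	if n == 0 {
		// Duplicate submission: the caller returns the already-sequenced
		// data (original timestamp and all) and changes nothing else.
		return true, nil
	}
	// New submission: the sequencer will later assign an index by writing a
	// SequencedLeafData row keyed on this same LeafIdentityHash.
	return false, nil
}
```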
During the December 9th incident, there was a period of time where Oak 2022’s SequencedLeafData table was complete, but rows were missing from the LeafData table. This meant that the above logic did not work properly: when a duplicate certificate was submitted, the log computed its LeafIdentityHash, saw that no matching row already existed, and proceeded with integrating the submission. In the process of integrating the submission, the log created a new LeafData row with the same LeafIdentityHash and ExtraData as the missing row, but a slightly different LeafValue: because the LeafValue blob includes both the certificate in question and the timestamp at which it was submitted, the timestamp portion of the LeafValue was different. It also created a new SequencedLeafData row with a LeafIdentityHash pointing at this “new” LeafData row. This meant that there were now two SequencedLeafData rows with the same LeafIdentityHash.
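To make the “same certificate, different LeafValue” point concrete: per RFC 6962, the submission timestamp sits near the front of the MerkleTreeLeaf structure that the leaf_input/LeafValue blob encodes, so two submissions of the same certificate produce blobs that differ in (at least) those eight bytes. A small sketch of pulling the timestamp out of a raw leaf_input:

```go
package oaksketch

import (
	"encoding/binary"
	"fmt"
	"time"
)

// leafTimestamp pulls the submission timestamp out of a raw leaf_input /
// LeafValue blob. Per RFC 6962, MerkleTreeLeaf begins with a 1-byte version
// and a 1-byte leaf type, followed by the TimestampedEntry, whose first field
// is a big-endian uint64 timestamp in milliseconds since the Unix epoch.
func leafTimestamp(leafInput []byte) (time.Time, error) {
	if len(leafInput) < 10 {
		return time.Time{}, fmt.Errorf("leaf_input too short: %d bytes", len(leafInput))
	}
	ms := binary.BigEndian.Uint64(leafInput[2:10])
	return time.UnixMilli(int64(ms)).UTC(), nil
}
```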
The good news is that Google mirrors the Oak 2022 log, and that mirror copied the original leaf_input values long before they went missing or were overwritten. The bad news is that one can’t trivially fix the error by rewriting a LeafData row back to its original value: doing so would break the duplicate SequencedLeafData row which references it. Fortunately, with a little work, we can fix the error. See Remediations below.
We maintain and run a log-monitoring tool called ct-woodpecker. Along with checking inclusion proofs as new STHs are seen, it also has a configuration parameter which causes it to periodically re-submit previously-seen entries. We run ct-woodpecker with this configuration flag enabled.
What this code hopes to see is that the log returns the same SCT as it returned the first time the certificate was submitted. However, if it gets a non-duplicate SCT (indicating that the log actually integrated the cert a second time) it doesn’t error out – it simply stores the new SCT as a new entry that it will keep an eye on in the future. So ct-woodpecker didn’t report any errors while the duplicate certificates were being accepted.
As a side-effect, this also means that our own monitoring infrastructure was responsible for submitting most (if not all) of the duplicate certificate entries. For a while during this investigation we were confused as to why a precertificate originally issued in Feb 2020 would be re-submitted to Oak 2022 (a log it is already in!) in December 2021. Remembering that ct-woodpecker incorporates this behavior allowed us to be confident in our root cause diagnosis.
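For illustration only (this is not ct-woodpecker’s actual code), here is a sketch of the resubmission check described above, written against the certificate-transparency-go client API, including the error we now intend to raise when a duplicate submission gets integrated as a new entry:

```go
package oaksketch

import (
	"context"
	"fmt"

	ct "github.com/google/certificate-transparency-go"
	"github.com/google/certificate-transparency-go/client"
)

// resubmitAndCheck is illustrative only (not ct-woodpecker's actual code).
// It re-submits a previously-seen chain and compares the returned SCT with
// the SCT stored when the entry was first observed. A matching timestamp
// means the log deduplicated the submission; a new timestamp means the log
// integrated the certificate a second time.
func resubmitAndCheck(ctx context.Context, lc *client.LogClient, chain []ct.ASN1Cert, stored *ct.SignedCertificateTimestamp) error {
	sct, err := lc.AddChain(ctx, chain) // AddPreChain for precertificates
	if err != nil {
		return fmt.Errorf("re-submission failed: %v", err)
	}
	if sct.Timestamp != stored.Timestamp {
		// Today this case is silently recorded as a new entry to monitor;
		// per the remediations below, it should instead raise an alert.
		return fmt.Errorf("log integrated a duplicate: new SCT timestamp %d, original %d",
			sct.Timestamp, stored.Timestamp)
	}
	return nil
}
```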
We believe most CT log monitors act in a mode analogous to “tail -f”, similar to ct-woodpecker: they watch for new entries to appear in a log, check the consistency proofs against previously-seen tree heads, store the result for future comparisons, rinse, and repeat. This means that they successfully combat one of CT’s core threat models: that a log would sign an SCT, integrate the cert, allow that inclusion to be proved, and then later go back and try to erase the entry from history. The certificate has been logged and the monitors have seen it, and so it cannot be successfully removed to hide misissuance or malfeasance.
However, this monitoring model does not prevent a log from simply serving incorrect data for old entries. If a log wanted to make a certificate seem innocent to all but the most trained eyes, it could correctly incorporate it, wait for the various log monitors to audit that inclusion, and then start serving the wrong data to all other requestors. The only reason this wouldn’t work in practice today is that most people do their day-to-day certificate lookups on crt.sh rather than directly from logs.
A model of log monitoring more akin to “cat” would be able to detect this. By scanning the whole log from scratch and verifying that the tree hashes remain consistent with some observed STH, a log monitor would see discrepancies like this where the underlying data has been changed long after it was originally published. But doing so obviously costs additional time and computational resources.
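As a rough sketch of what such a “cat”-style check boils down to, assuming the monitor has already downloaded every leaf_input via get-entries and holds a trusted STH root hash for that exact tree size:

```go
package oaksketch

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// rootHash recomputes the RFC 6962 Merkle Tree Hash over raw leaf_input blobs:
// MTH({d0}) = SHA-256(0x00 || d0), and for n > 1 leaves the tree splits at k,
// the largest power of two strictly less than n.
func rootHash(leaves [][]byte) [sha256.Size]byte {
	switch len(leaves) {
	case 0:
		return sha256.Sum256(nil)
	case 1:
		return sha256.Sum256(append([]byte{0x00}, leaves[0]...))
	}
	k := 1
	for k*2 < len(leaves) {
		k *= 2
	}
	left, right := rootHash(leaves[:k]), rootHash(leaves[k:])
	h := sha256.New()
	h.Write([]byte{0x01})
	h.Write(left[:])
	h.Write(right[:])
	var out [sha256.Size]byte
	copy(out[:], h.Sum(nil))
	return out
}

// verifyAgainstSTH compares the recomputed root with the sha256_root_hash
// from a previously observed STH covering exactly len(leaves) entries.
func verifyAgainstSTH(leaves [][]byte, sthRootHash []byte) error {
	got := rootHash(leaves)
	if !bytes.Equal(got[:], sthRootHash) {
		return fmt.Errorf("tree hash mismatch: recomputed %x, STH says %x", got, sthRootHash)
	}
	return nil
}
```

Recomputing the root over every entry in Oak 2022 (roughly 160 million leaves at the time of the scan above) is what makes this approach expensive in time, bandwidth, and memory.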
Our remediations fall into three categories: repairing the current log database to restore the original leaf data, better preventing problems like this, and better detecting problems like this.
We have already identified the set of SequencedLeafData rows whose LeafIdentityHash is pointing to a LeafData row whose embedded timestamp differs from the timestamp originally mirrored by Google. We have also already retrieved the correct/original leaf_input data for each of these sequence numbers from Google’s mirror. For each of the incorrect indices, we intend to do the following:
1. Write a new LeafData row with the LeafValue and ExtraData retrieved from Google’s mirror, but with a new unique LeafIdentityHash (constructed deliberately to be different from the original, and not a hash over the original contents).
2. Overwrite the LeafIdentityHash column of the original SequencedLeafData row to contain the new unique value from the previous step.
This will restore the correct leaf data to all of the currently-broken sequence numbers, without breaking the duplicate entries at the same time.
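For illustration, here is a sketch of the per-index repair, assuming direct SQL access to the shard’s database; the column names are the simplified ones used in this report, and the synthetic-hash construction is illustrative rather than the exact scheme we will use:

```go
package oaksketch

import (
	"crypto/sha256"
	"database/sql"
)

// repairIndex restores the original leaf data for one corrupted sequence
// number without disturbing the later duplicate entry. leafValue and
// extraData are the bytes retrieved from Google's mirror; origIDHash is the
// LeafIdentityHash both entries currently share. The prefix used to build
// the replacement hash is illustrative, not the exact scheme we will use.
func repairIndex(tx *sql.Tx, treeID, seq int64, origIDHash, leafValue, extraData []byte) error {
	// A deliberately different LeafIdentityHash: NOT the hash Trillian would
	// derive from the certificate, so it can never collide with the row now
	// used by the duplicate entry.
	h := sha256.New()
	h.Write([]byte("oak-2022-repair:"))
	h.Write(origIDHash)
	newIDHash := h.Sum(nil)

	// Step 1: write a fresh LeafData row holding the original bytes.
	if _, err := tx.Exec(
		`INSERT INTO LeafData (TreeId, LeafIdentityHash, LeafValue, ExtraData) VALUES (?, ?, ?, ?)`,
		treeID, newIDHash, leafValue, extraData); err != nil {
		return err
	}

	// Step 2: point the original SequencedLeafData row at it.
	_, err := tx.Exec(
		`UPDATE SequencedLeafData SET LeafIdentityHash = ? WHERE TreeId = ? AND SequenceNumber = ?`,
		newIDHash, treeID, seq)
	return err
}
```

Running both statements in a single transaction means the log never serves a sequence number with no matching LeafData row partway through the repair.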
To prevent this from happening again, we intend to submit a PR upstream to the Trillian repo which adds a unique key constraint on the LeafIdentityHash column of the SequencedLeafData table. This will prevent multiple SequencedLeafData rows (and therefore multiple sequence numbers) from referencing the same underlying LeafData row.
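A hypothetical sketch of what that constraint could look like; the exact DDL (including whether the key is scoped per tree) would be settled in the upstream PR:

```go
package oaksketch

import "database/sql"

// Hypothetical migration sketch; the exact DDL would be settled in the
// upstream Trillian PR. A per-tree unique key over LeafIdentityHash turns a
// second SequencedLeafData row for the same leaf into a hard INSERT error.
const addUniqueLeafIdentityHash = `
ALTER TABLE SequencedLeafData
  ADD UNIQUE KEY UniqueLeafIdentityHash (TreeId, LeafIdentityHash)`

func applyConstraint(db *sql.DB) error {
	_, err := db.Exec(addUniqueLeafIdentityHash)
	return err
}
```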
To better detect future instances of this kind of data corruption, we intend to stand up additional log monitoring infrastructure that regularly re-scans and rebuilds our own logs, verifying that the Merkle tree hashes still match the underlying entries. We will also update ct-woodpecker’s resubmission logic to raise an error if the duplicate submission is accepted.
The full list of affected entries is provided in the attached text file.