Inconsistencies in CRL fields in AllCertificateRecordsCSVFormatv4


Hanno Böck

Jan 29, 2026, 9:00:42 AM
to pub...@ccadb.org
Hi,

I recently did some checks with the CRL data from CCADB contained in
the AllCertificateRecordsCSVFormatv4 file and noted some
inconsistencies.

CRL locations can be given either as a single URL (column "Full CRL
Issued By This CA") or as a JSON list (column "JSON Array of
Partitioned CRLs").

* In the JSON list, it appears multiple different values are used to
indicate that the field is empty. It is a mix of empty strings (""),
JSON lists with an empty string ('[""]'), or JSON lists with a
double-double-quoted empty string ('[""""]'). In one particularly
peculiar case (DigiCert/Microsoft TLS G1 ECC CA 01), it is a list
containing a double-double-quoted zero-width space
('[""\\u200b""]').

* In the single URL column, there are two cases that are missing the
protocol, i.e., no http:// or https://:
www.acabogacia.org/crl/aca_arl.crl and ssl.gpki.go.kr/certs/ssl-ca.cer
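
To illustrate the first point, here is a minimal Python sketch of how a
consumer might currently have to normalize the JSON column. The function
name partitioned_crl_urls is hypothetical, and the sketch assumes that
the doubled quotes are leftover CSV escaping and that the cell has
already been read with a CSV parser. It is not how CCADB processes the
data.

```python
import json
import string

# Characters treated as blank. U+200B (zero-width space) is not
# whitespace for str.strip(), so it has to be listed explicitly.
_BLANK = string.whitespace + "\u200b\u00a0\ufeff"


def partitioned_crl_urls(cell: str) -> list[str]:
    """Return the non-empty entries of one partitioned-CRL cell."""
    text = cell.strip()
    if not text:
        return []
    try:
        values = json.loads(text)
    except json.JSONDecodeError:
        # Assumption: doubled quotes are CSV escaping residue.
        # Undo the doubling and try once more.
        values = json.loads(text.replace('""', '"'))
    if not isinstance(values, list):
        raise ValueError("expected a JSON list")
    cleaned = [str(v).strip(_BLANK) for v in values]
    return [v for v in cleaned if v]
```

With this, all of the empty variants listed above collapse to an empty
list, while a cell with real URLs is returned unchanged.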

I would suggest adding some basic sanity checks to the data. I don't
care which value is used to indicate an empty field in the JSON
column, but it should be consistent. Furthermore, I'd suggest
checking that URLs are URLs, and possibly also rejecting
Unicode/non-ASCII characters.
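
A URL check of this kind could be as simple as the following Python
sketch. The function name crl_url_problems and the choice to accept
only http and https are assumptions made for illustration. It is not an
existing CCADB validation.

```python
from urllib.parse import urlsplit


def crl_url_problems(url: str) -> list[str]:
    """Return a list of problems found in a CRL URL (empty if none)."""
    problems = []
    if not url.isascii():
        problems.append("non-ASCII character")
    parts = urlsplit(url)
    # Assumption: only http and https are acceptable for CRL URLs.
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    elif not parts.netloc:
        problems.append("missing host")
    return problems
```

Both entries mentioned above would be flagged by this check, because
without a scheme the whole string is parsed as a path.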


Note that there's a somewhat related issue that many of these CRLs are
not reliably accessible due to dubious blocking based on user-agents,
and that they are often served with incorrect MIME types. That's
recently been discussed on mdsp:
https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/PZTEB49qsHY/m/8vm3-C3oFgAJ

--
Hanno Böck - Independent security researcher
https://itsec.hboeck.de/
https://badkeys.info/

Dustin Hollenback

Jan 29, 2026, 1:24:20 PM
to Hanno Böck, pub...@ccadb.org
Hello Hanno,

Thanks so much for this detailed analysis and for the feature suggestion! We really appreciate you taking the time to dig into the data quality issues.

The specific inconsistencies you've identified (the various empty field representations, missing protocols, special characters) are definitely problems we need to address.

Regarding your suggestion to add automated sanity checks to prevent these issues, I agree that this is needed! To help us evaluate how to implement this, would you be able to share some additional thoughts on:
- Which validations you see as highest priority
- Whether these should be enforced at data entry (blocking submission) or flagged for review
- Any edge cases or exceptions we should consider

To ensure we track both the current data issues and the feature request long-term, could you file this in Bugzilla under the Common CA Database component? Here's the link:
https://bugzilla.mozilla.org/buglist.cgi?product=CA%20Program&component=Common%20CA%20Database&resolution=---&list_id=17830792

Once filed with those details, the CCADB Steering Committee will review it, add it to our backlog, and prioritize accordingly.

Thanks again for thinking about ways to make CCADB better!

Best regards,


Dustin
On behalf of CCADB

Ben Wilson

Jan 29, 2026, 4:41:26 PM
to Dustin Hollenback, Hanno Böck, pub...@ccadb.org
Hi Dustin and Hanno,
I started a bug for this already, here: https://bugzilla.mozilla.org/show_bug.cgi?id=2013294, but certainly we can add to it as necessary.
Ben
