On Wed, 19 Apr 2023 08:39:24 -0700 (PDT)
"'Leonardo Toshinobu Kimura' via Certificate Transparency Policy"
<ct-p...@chromium.org> wrote:
> Recently, I read the article <https://doi.org/10.1109/TNET.2021.3123507>
> "Exploring the Reliability of Monitors in the Wild", and I discovered
> that monitors
> "frequently fail to return a complete set of certificates issued for a
> domain of interest". In other words, even if I request all certificates for
> a domain, some certificates may not be returned by the monitor, raising the
> risk of invisible fraudulent certificates.
I (Cert Spotter author / SSLMate founder) corresponded with an author
of this paper in May 2019. During our correspondence, I discovered that:
1. They were misusing the Cert Spotter API by only considering the first
page of multi-page responses, despite the API responses containing a
Link header with rel=next, and the API documentation explaining how to
paginate results.
2. They were querying the URL of an undocumented experimental API endpoint
rather than the URL in the documentation.
3. Some of the certificates that they claimed were missing from
Cert Spotter were issued by untrusted CAs, were not logged in
browser-recognized logs, and would not be accepted by browsers.
4. Many of the "missing" certificates were precertificates deliberately
omitted from Cert Spotter results because the corresponding certificate
is returned instead.
They were ultimately unable to furnish a single CT-compliant
certificate that was actually missing from Cert Spotter.
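To illustrate the first point: a client that reads only the first page will undercount on any domain with more results than fit in one response. The following is a minimal sketch of following rel=next links, not Cert Spotter's own client code. The `fetch` callable and the api.example URLs are hypothetical placeholders, and the Link header parser is deliberately simple (it does not handle commas inside quoted parameter values).

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple
from urllib.parse import urljoin

# One link-value of a Link header, e.g. <https://host/page2>; rel="next"
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)')


def next_link(link_header: Optional[str]) -> Optional[str]:
    """Return the target of the rel=next link, or None if there is none."""
    if not link_header:
        return None
    for url, params in _LINK_RE.findall(link_header):
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.lower() == "rel" and "next" in value.strip('"').split():
                return url
    return None


def fetch_all(
    first_url: str,
    fetch: Callable[[str], Tuple[List[str], Optional[str]]],
) -> Iterator[str]:
    """Yield items from every page, not just the first one.

    `fetch(url)` is a hypothetical helper that returns the decoded items
    of one page together with the raw Link response header (or None).
    """
    url: Optional[str] = first_url
    seen = set()  # guard against a server that links back to an earlier page
    while url and url not in seen:
        seen.add(url)
        items, link_header = fetch(url)
        yield from items
        target = next_link(link_header)
        # Resolve relative targets against the URL of the current page.
        url = urljoin(url, target) if target else None
```

Calling `list(fetch_all(start_url, fetch))` collects the results of all pages; stopping after the first call to `fetch` reproduces the undercount described above.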
Although I explained all of this to them over a total of 6 emails,
they wrote in their paper:
> SSLMate and crt.sh replied that they would initiate investigations on
> the reported issues, but we have not received further feedbacks.
I'm happy to provide the email correspondence in order to disprove the
above claim.
They did incorporate some of my feedback in their paper, but problems
remain.
First, their method of matching certificates and precertificates is flawed:
> We define the four-tuple (NotBefore, NotAfter, SerialNumber,
> Issuer) as the index to identify a (pre)certificate, and deduplicate
> the data from each monitor as well as the union of all
> searched certificates. Finally, we obtain a dataset of 382,051
Using the Issuer in the four-tuple fails when the precertificate is issued
by a Precertificate Signing Certificate, because the precertificate's
Issuer then names the signing certificate rather than the CA that issues
the final certificate. The two tuples differ, so the authors count a
certificate as missing from Cert Spotter when it really isn't. Instead
of inventing their own method of matching certificates to precertificates,
the authors should have followed the guidance in RFC 6962 Section 3.2.
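A toy model makes the mismatch concrete. RFC 6962 Section 3.2 says that when a precertificate is not signed by the CA certificate that will issue the final certificate, the issuer in its TBSCertificate is changed to that of the final issuing CA before it is logged. The sketch below models only that issuer rewrite. The `Entry` type, its fields, and the example names are hypothetical. Real matching works on the reconstructed TBSCertificate (poison extension removed, issuer and Authority Key Identifier adjusted), which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entry:
    """Simplified stand-in for a parsed certificate or precertificate."""
    issuer: str        # Issuer name exactly as it appears in the entry
    serial: int
    not_before: str
    not_after: str
    # Set only for a precertificate signed by a Precertificate Signing
    # Certificate: the issuer of that signing certificate, i.e. the CA
    # that issues the final certificate.
    final_issuer: Optional[str] = None


Key = Tuple[str, str, int, str]


def naive_key(e: Entry) -> Key:
    """The four-tuple quoted from the paper, using the Issuer as-is."""
    return (e.not_before, e.not_after, e.serial, e.issuer)


def normalized_key(e: Entry) -> Key:
    """Same tuple, but with the issuer rewritten to the final issuing CA."""
    return (e.not_before, e.not_after, e.serial, e.final_issuer or e.issuer)


cert = Entry("CN=Example CA", 0x1234, "2019-01-01", "2020-01-01")
precert = Entry(
    "CN=Example CA Precertificate Signing",
    0x1234,
    "2019-01-01",
    "2020-01-01",
    final_issuer="CN=Example CA",
)

# The naive tuple treats these as two unrelated entries, so a monitor that
# returns only `cert` appears to be "missing" `precert`.
print(naive_key(cert) == naive_key(precert))
# With the issuer normalized, they are recognized as the same issuance.
print(normalized_key(cert) == normalized_key(precert))
```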
> In SSLMate, a query of zendesk.com,
> for which many (pre)certificates are supposed to be returned,
> receives the error "This query took too long to complete."
As the full text of the error message explains, you can contact
SSLMate to request (for a fee) that a dedicated index be provisioned
to enable the searching of these huge domains.
> SSLMate claims it only accepts domain names of registered domains or their
> subdomains [24], and deliberately excludes these 4 domains to
> protect user privacy (e.g., customers' hostnames in amazonaws.com,
> cloudfront.net, blogspot.com and fbsbx.com) [28].
The part about deliberately excluding domains to protect user privacy
is a complete fabrication ([28] cites a totally unrelated paper
written by authors who are not affiliated with SSLMate). It's true
that full-domain search of entire eTLDs is restricted by default
because these queries are expensive to service. As with the timeout
error, the complete text of the error message says you can contact
SSLMate to enable access. The authors did not contact SSLMate to
request access; had I known that they were hitting this, I would have
been happy to work with them.
In any case, just because the Cert Spotter API paywalls certain domains
which are disproportionately expensive to search does not mean that
Cert Spotter is unreliable or is missing certificates.
> There is already any mitigation for this? Or maybe some proposal? It looks
> like a serious issue to me. I did some research, but strangely I have found
> little material on the subject. Maybe I am looking at the wrong places?
I do not think there is any credible evidence that this is a serious
issue.
Regards,
Andrew