crt.sh incomplete results for huge domains


Andreas Bernhofer

Sep 4, 2021, 5:17:44 AM
to crt.sh
Hey,
I noticed that querying huge domains on https://crt.sh/, such as facebook.com or appspot.com, returns incomplete results: the most recent certificate shown was logged in 2020, and newer certificates don't show up at all. The same happens when I exclude expired certs in the advanced search.
Since there is no indication on the result page, I guess this is not intentional?
Is this a bug or is there a reason for that?

Thx,
Andreas

r...@sectigo.com

Sep 16, 2021, 5:53:30 AM
to crt.sh
Hi Andreas.  This is a known issue, and I'm afraid I haven't yet found any potential solution.

See https://groups.google.com/g/crtsh/c/J6xTretTJ5M/m/90xbdscUCgAJ

Andreas Bernhofer

Sep 19, 2021, 6:22:43 AM
to crt.sh
Hey,
well, I'm not sure if I get the nature of the issue...
When I use the advanced search and show the SQL used, there is a "LIMIT 10000" clause within the sub-query. So this clearly limits the result. How about removing it?
Also, to avoid very large result sets, wouldn't it be easy to add date filters on the advanced search, so one can e.g. only query certs issued within the last year or so?
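
To illustrate how a LIMIT inside a sub-query can hide recent certificates, here is a minimal sketch using Python's built-in sqlite3 module. It is not the crt.sh schema or query: the `cert` table, its columns, and the sample data are hypothetical, and it assumes the sub-query yields matches in an order unrelated to recency (modelled here as id order, oldest first).

```python
import sqlite3

def newest_visible(limit):
    """Return the newest issuance year still visible when the
    sub-query is capped at `limit` rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cert (id INTEGER PRIMARY KEY, name TEXT, issued_year INTEGER)"
    )
    # Hypothetical data: 30 certificates for one name, 10 per year (2019-2021),
    # inserted oldest first.
    rows = [(i + 1, "example.com", 2019 + i // 10) for i in range(30)]
    conn.executemany("INSERT INTO cert VALUES (?, ?, ?)", rows)
    # ASSUMPTION: the inner query returns matches in id order, not newest first,
    # so the cap is applied before any sorting by date can happen.
    query = """
        SELECT MAX(issued_year) FROM (
            SELECT issued_year FROM cert
            WHERE name = ?
            ORDER BY id
            LIMIT ?
        )
    """
    (year,) = conn.execute(query, ("example.com", limit)).fetchone()
    conn.close()
    return year

if __name__ == "__main__":
    print(newest_visible(20))  # cap below the number of matches
    print(newest_visible(30))  # cap large enough to cover every match
```

With the cap below the number of matching rows, the newest certificates never reach the outer query, which would match the symptom described above if the real sub-query behaves similarly.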

r...@sectigo.com

Sep 20, 2021, 7:25:32 AM
to crt.sh
Hi Andreas.  Here's a better link to a previous description of this issue: https://groups.google.com/g/crtsh/c/PJBu5cvm0G8/m/iCIy0vBPAwAJ

Removing the "LIMIT 10000" would cause the service's performance to degrade horribly for everyone, and most likely none of the searches for very large result sets would actually complete before being killed off by the database replication.

The identity searches are powered by PostgreSQL's Full Text Search.  I've not yet(*) been able to find a way to use Full Text Search as part of a composite index (on e.g., issuance date), so adding date filters isn't as simple as you might think.

(*) This is just a half-baked idea in my head so far (unfortunately $DAYJOB hasn't afforded me much time to spend on any crt.sh projects recently): The "certificate" table currently has a partition for each calendar year (and each certificate is stored in the partition that corresponds to the certificate's year of expiry).  I've read that newer versions of PostgreSQL can handle huge numbers of partitions, so I'm wondering about changing the "certificate" table to have a separate partition for each day.  Since each partition has its own Full Text Search index, my thinking is that this approach might help to provide a fairly crude but hopefully effective date filter to accommodate large result sets.
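
The per-day partition idea above can be sketched as a toy model in plain Python. This is not PostgreSQL and not the crt.sh implementation: the `PartitionedCerts` class and its methods are hypothetical, each "partition" is just a dictionary standing in for a per-partition Full Text Search index, and the filter is on expiry day (the partitioning key described above), which is why it would only be a crude proxy for issuance date.

```python
from collections import defaultdict
from datetime import date

class PartitionedCerts:
    """Toy model: one partition per expiry day, each with its own name index."""

    def __init__(self):
        # expiry day -> {name -> [certificate ids]}
        self.partitions = defaultdict(lambda: defaultdict(list))

    def add(self, cert_id, name, not_after):
        # Store the certificate in the partition for its expiry day.
        self.partitions[not_after][name].append(cert_id)

    def search(self, name, expires_from, expires_to):
        """Look up `name` only in partitions whose day lies in the range.

        Returns the matching ids and the number of partitions consulted.
        """
        hits, scanned = [], 0
        for day in sorted(self.partitions):
            if expires_from <= day <= expires_to:
                scanned += 1
                hits.extend(self.partitions[day].get(name, []))
        return hits, scanned

if __name__ == "__main__":
    store = PartitionedCerts()
    store.add(1, "example.com", date(2020, 5, 1))
    store.add(2, "example.com", date(2021, 12, 1))
    store.add(3, "other.org", date(2021, 12, 1))
    store.add(4, "example.com", date(2022, 1, 15))
    print(store.search("example.com", date(2021, 6, 1), date(2022, 12, 31)))
```

Partitions outside the requested range are never consulted, so the work done scales with the date window rather than with the full history of a huge domain.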