How API-able is crt.sh?

Hanno Böck

Jul 16, 2017, 2:33:56 PM
to crt.sh
Hi,

I have a couple of related questions about potential automated use of
the crt.sh webpage:

* Is it generally considered "okay" when automated services make use of
  crt.sh? Any comment on "reasonable use"? Are there any concerns when
  requests number in the thousands, or is that something the service
  can easily handle and that is considered acceptable? Is there any
  situation in which you'd like to be contacted beforehand?

* How much stability of output and URLs can one expect? From what I
  can see, the closest thing to structured output is currently the
  Atom feed. Can one expect that the basic structure (i.e.
  "there's an entry element with a sub-element id that links to the
  cert entry" etc.) stays the same?

* Are cert ids considered stable identifiers of certificates on crt.sh?

Rob Stradling

Jul 17, 2017, 10:40:51 AM
to crt.sh
On Sunday, July 16, 2017 at 7:33:56 PM UTC+1, Hanno Böck wrote:
> Hi,

Hi Hanno.

> I have a couple of related questions about potential automated use of
> the crt.sh webpage:
>
> * Is it generally considered "okay" when automated services make use of
>   crt.sh?

Yes.

> Any comment on "reasonable use"?

My only comment is: If you're trying to download pretty much all of crt.sh's data (as some crt.sh users seem to be!), then please strongly consider monitoring the CT logs directly instead!

> Are there any concerns when requests number in the thousands, or is
> that something the service can easily handle and that is considered
> acceptable? Is there any situation in which you'd like to be
> contacted beforehand?

Thousands of requests in what timeframe?

We currently serve ~250,000 requests per day comfortably.  I haven't actually tried to benchmark what we could currently scale to, but I suspect we would struggle to service (for example) thousands of requests per second.
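To put the daily figure in per-second terms, and to show one way a client could stay well clear of the "thousands of requests per second" range, here is a minimal sketch. The `Throttle` helper and the one-second spacing in the usage note are illustrative assumptions, not a crt.sh policy or API.

```python
import time

REQUESTS_PER_DAY = 250_000       # figure quoted in the message above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

# Average load that the service is described as handling comfortably.
average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY


class Throttle:
    """Hypothetical helper: enforce a minimum spacing between calls."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock      # injectable so the helper can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
        return now
```

The average works out to a little under three requests per second across all users. A script that calls `Throttle(1.0).wait()` before each request would add at most one request per second to that load; the right spacing for a given project is a judgment call.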

> * How much stability of output and URLs can one expect? From what I
>   can see, the closest thing to structured output is currently the
>   Atom feed. Can one expect that the basic structure (i.e.
>   "there's an entry element with a sub-element id that links to the
>   cert entry" etc.) stays the same?

You can expect stability from the Atom feeds.

You can expect all crt.sh URLs to continue to function and serve a representation of the same data, but please don't expect the HTML to remain unchanged.
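Since the Atom feed is the stable interface and the HTML is not, an automated client is better off parsing the feed than scraping pages. The sketch below reads the `id` sub-element of each `entry`, as described in the question. The sample feed is invented for illustration (the IDs and any fields beyond `entry`/`id` are assumptions), and the code parses a string rather than fetching anything from crt.sh.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287).
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample; a real feed will carry more elements per entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>illustrative feed</title>
  <entry>
    <id>https://crt.sh/?id=1001</id>
    <title>example certificate A</title>
  </entry>
  <entry>
    <id>https://crt.sh/?id=1002</id>
    <title>example certificate B</title>
  </entry>
</feed>"""


def entry_ids(feed_xml):
    """Return the text of the <id> sub-element of every <entry> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip()
        for entry in root.findall("atom:entry", ATOM_NS)
    ]


if __name__ == "__main__":
    for cert_url in entry_ids(SAMPLE_FEED):
        print(cert_url)
```

Relying only on the `entry` and `id` elements keeps the client tied to the part of the structure the question asks about.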

> * Are cert ids considered stable identifiers of certificates on crt.sh?

crt.sh certificate IDs are stable identifiers, but they are of course specific to crt.sh.  In contrast, SHA-256(Certificate) fingerprints are stable identifiers for certificates both on crt.sh and elsewhere.

The benefits of using crt.sh IDs are that the URLs are shorter and they provide a rough idea of how long each certificate has been known to crt.sh (since the IDs are assigned sequentially).  However, to promote interop with other systems, there is an argument (which I have some sympathy with) that I should hide the IDs in the user interface and steer users towards using SHA-256(Certificate) instead.
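For readers who want the portable identifier, SHA-256(Certificate) is the SHA-256 digest of the certificate's DER encoding. A minimal sketch using only the Python standard library follows; the function names are illustrative, and the lowercase hex output is a choice made here, not a format prescribed by crt.sh.

```python
import hashlib
import ssl


def sha256_fingerprint(der_bytes):
    """Hex SHA-256 digest of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def sha256_fingerprint_from_pem(pem_text):
    """Same fingerprint, starting from a PEM-encoded certificate string."""
    # PEM is base64-wrapped DER; the fingerprint is taken over the DER bytes.
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return sha256_fingerprint(der_bytes)
```

Because the digest depends only on the certificate bytes, any system that holds the same certificate will compute the same value, which is what makes it usable across services.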

BTW, depending on what you're trying to build, you might want to consider connecting to the crt.sh DB directly rather than via https://crt.sh/.  See https://groups.google.com/forum/#!topic/crtsh/sUmV0mBz8bQ for details.