Auditing Logs at Large Scale


Luis

Aug 18, 2024, 8:22:34 PM
to crt.sh
Hello there,

I want to do some research on CT logs for my bachelor thesis. Specifically, I want to check SCTs against their corresponding log entries: that the entry exists at all, and that its timestamp is consistent with the SCT.
My goal is to verify, for a large dataset of certificates, that each certificate is properly included in the logs.

I ran into some problems while researching and writing some code myself.
  1. The timestamp used in the leaf_hash of an entry can fall anywhere in a relatively large range, from the timestamp given in the SCT to that timestamp plus the MMD, so I am thinking about how to query the logs efficiently to find the correct entry. Is there already an implementation of this, so that I only have to supply the certificate and the SCT timestamp and it finds the exact entry?
    1. If there is no known implementation: I was thinking about doing a binary search in the CT log on the rough timestamp given by the SCT to get a starting point, then a linear search from there to find the entry that contains the certificate I am looking for. Is this feasible, or is there a smarter way to do this?
  2. As an alternative, I would like to use the crt.sh DB on a large scale. Is there any way to contact the operators and request some temporary high-frequency access for research purposes?
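
To make the idea in point 1 concrete, here is a minimal Python sketch of the binary-search-then-linear-scan approach. It is only an illustration under assumptions: `get_entry` is a hypothetical accessor that returns `(timestamp_ms, cert_der)` for a leaf index (in practice it would wrap and cache a log's get-entries endpoint), `find_entry` and `slack` are names made up for this sketch, and the 24-hour default MMD is an assumption that has to be checked per log. Log entries are only roughly ordered by timestamp, so the binary search gives an approximate starting point rather than an exact one.

```python
from typing import Callable, Optional, Tuple

# Hypothetical accessor type: leaf index -> (timestamp in ms, DER certificate).
# Assumption: a real version would fetch and cache entries from the log.
Entry = Tuple[int, bytes]
GetEntry = Callable[[int], Entry]

DAY_MS = 24 * 60 * 60 * 1000  # assumed MMD of 24 hours; check each log


def find_entry(
    get_entry: GetEntry,
    tree_size: int,
    cert_der: bytes,
    sct_ts_ms: int,
    mmd_ms: int = DAY_MS,
    slack: int = 0,
) -> Optional[int]:
    """Return the index of the entry holding cert_der, or None if not found."""
    # Binary search for the first index whose timestamp is >= the SCT timestamp.
    lo, hi = 0, tree_size
    while lo < hi:
        mid = (lo + hi) // 2
        if get_entry(mid)[0] < sct_ts_ms:
            lo = mid + 1
        else:
            hi = mid

    # Step back by `slack` entries, since ordering is only approximate.
    start = max(0, lo - slack)

    # Linear scan until the entry timestamps leave the SCT + MMD window.
    for idx in range(start, tree_size):
        ts, der = get_entry(idx)
        if ts > sct_ts_ms + mmd_ms:
            break  # simplification: stops at the first entry past the window
        if der == cert_der:
            return idx
    return None
```

With an in-memory list standing in for a log, `find_entry(lambda i: entries[i], len(entries), cert_der, sct_ts_ms)` returns the matching index or `None`. The number of fetches is logarithmic for the search plus however many entries fall inside the window, so the window size dominates the cost on busy logs.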

Thanks in advance! Love the project
Best regards.

r...@sectigo.com

Aug 19, 2024, 6:58:19 AM
to crt.sh
Hi Luis.

> 1. The timestamp used in the leaf_hash of an entry can fall anywhere in a relatively large range, from the timestamp given in the SCT to that timestamp plus the MMD, so I am thinking about how to query the logs efficiently to find the correct entry. Is there already an implementation of this, so that I only have to supply the certificate and the SCT timestamp and it finds the exact entry?
>   1. If there is no known implementation: I was thinking about doing a binary search in the CT log on the rough timestamp given by the SCT to get a starting point, then a linear search from there to find the entry that contains the certificate I am looking for. Is this feasible, or is there a smarter way to do this?

This question doesn't seem to be specific to crt.sh, so I recommend taking it to https://groups.google.com/g/certificate-transparency to reach a wider audience of CT experts.

> 2. As an alternative, I would like to use the crt.sh DB on a large scale. Is there any way to contact the operators and request some temporary high-frequency access for research purposes?

We don't have the resources to be able to offer any access beyond what's currently available via https://crt.sh/ and crt.sh:5432.  Sorry about that.
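
For readers who have not used the crt.sh:5432 interface mentioned above, here is a rough Python sketch of querying it with psycopg2. Everything specific in it is an assumption to verify before relying on it: the database name `certwatch`, the user `guest`, the need for autocommit, and the `ct_log_entry` table and column names in the example query. `crtsh_connect_kwargs`, `run_query` and `EXAMPLE_SQL` are hypothetical names for this sketch only.

```python
from typing import Any, Dict, List, Sequence, Tuple


def crtsh_connect_kwargs() -> Dict[str, Any]:
    """Connection settings for the public crt.sh database (assumed values)."""
    return {
        "host": "crt.sh",
        "port": 5432,
        "dbname": "certwatch",  # assumption: public database name
        "user": "guest",        # assumption: read-only guest account
    }


# Example query. The table and column names are assumptions about the schema.
EXAMPLE_SQL = """
SELECT ct_log_id, entry_id, entry_timestamp
  FROM ct_log_entry
 WHERE certificate_id = %s
"""


def run_query(sql: str, params: Sequence[Any] = ()) -> List[Tuple]:
    """Run one read-only query and return all rows."""
    import psycopg2  # third-party package, imported only when actually used

    conn = psycopg2.connect(**crtsh_connect_kwargs())
    try:
        # Assumption: running outside an explicit transaction is preferable here.
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()
    finally:
        conn.close()
```

A call would look like `run_query(EXAMPLE_SQL, (certificate_id,))`, where `certificate_id` is a crt.sh ID. Since the service has limited resources, queries should be kept small and infrequent.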
