Dataset of certificate transparency


غ

May 21, 2023, 11:30:26 AM
to certificate-transparency
Hello, I am working on a graduation project that uses machine learning. Is it possible to get a dataset of the certificate transparency logs that includes the timestamp at which each certificate was added to the public logs?

Matt Palmer

May 21, 2023, 7:35:56 PM
to certificate-...@googlegroups.com
The entries in the CT logs contain the timestamp of when the certificate was
added to the log, so you can scrape the logs and come up with the data
you're after.
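
As a rough illustration of where that timestamp lives: each entry returned by the RFC 6962 get-entries endpoint carries a base64 `leaf_input` (a MerkleTreeLeaf structure), and for v1 logs the timestamp is a 64-bit big-endian count of milliseconds since the Unix epoch at byte offset 2. Strictly speaking, RFC 6962 defines it as the timestamp of the SCT issued for the certificate, which may be slightly earlier than the moment the entry was merged into the tree. The function name `leaf_timestamp` is made up for this sketch; it is not taken from any existing tool.

```python
import base64
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def leaf_timestamp(leaf_input_b64: str) -> datetime:
    """Extract the timestamp from a base64 RFC 6962 MerkleTreeLeaf.

    Layout of the first 12 bytes (all big-endian):
      version (1 byte), leaf_type (1 byte),
      timestamp in ms (8 bytes), entry_type (2 bytes).
    """
    leaf = base64.b64decode(leaf_input_b64)
    if len(leaf) < 12:
        raise ValueError("leaf_input is too short to be a MerkleTreeLeaf")
    version, leaf_type, timestamp_ms, _entry_type = struct.unpack(">BBQH", leaf[:12])
    # version 0 is "v1" and leaf_type 0 is "timestamped_entry" in RFC 6962.
    if version != 0 or leaf_type != 0:
        raise ValueError("not a v1 timestamped_entry leaf")
    # Integer milliseconds via timedelta avoids float rounding.
    return EPOCH + timedelta(milliseconds=timestamp_ms)
```

Calling `leaf_timestamp(entry["leaf_input"])` on each scraped entry would give a timezone-aware UTC datetime that can go straight into a dataset column.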

- Matt

Kevin Jorquera

Jun 2, 2023, 4:21:45 AM
to certificate-transparency
How do you access this data?

I can't seem to find any clear documentation on how to poll the logs.

Philippe Boneff

Jun 2, 2023, 4:44:36 AM
to certificate-...@googlegroups.com
The API for polling logs is defined in RFC 6962 at https://www.rfc-editor.org/rfc/rfc6962#page-20, and I gather from the other thread that you're already using it.
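
For anyone else landing here, a minimal sketch of that polling loop, assuming only the two RFC 6962 client endpoints `ct/v1/get-sth` (for the current `tree_size`) and `ct/v1/get-entries?start=..&end=..` (inclusive, zero-based indices). The names `iter_entries` and `http_get_json`, the batch size, and the `log.example` URL are placeholders for illustration. Logs are allowed to return fewer entries than requested, so the loop advances by however many actually came back.

```python
import json
import urllib.request
from urllib.parse import urlencode

def http_get_json(url):
    """Fetch a URL and decode the JSON body (no retries or rate limiting)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)

def iter_entries(log_url, start=0, batch=256, fetch=http_get_json):
    """Yield (index, entry) pairs from a log, from `start` up to the current STH.

    `fetch` is injectable so the paging logic can be exercised offline.
    """
    base = log_url.rstrip("/") + "/ct/v1/"
    tree_size = fetch(base + "get-sth")["tree_size"]
    index = start
    while index < tree_size:
        # `end` is inclusive, and must stay below tree_size.
        end = min(index + batch, tree_size) - 1
        query = urlencode({"start": index, "end": end})
        entries = fetch(base + "get-entries?" + query)["entries"]
        if not entries:
            break  # defensive: avoid looping forever on an empty reply
        for offset, entry in enumerate(entries):
            yield index + offset, entry
        # The log may have served fewer entries than asked for.
        index += len(entries)

# Hypothetical usage (placeholder URL, not a real log):
# for i, entry in iter_entries("https://log.example/2023"):
#     print(i, entry["leaf_input"][:16])
```

This is deliberately sequential and simple; a real scraper would want retries, backoff, and some parallelism.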

Cheers,
Philippe


Matt Palmer

Jun 3, 2023, 6:50:04 PM
to certificate-...@googlegroups.com
You'll need a tool to do the scraping, one that uses the HTTP endpoints
described in RFC 6962. I've recently published a tool that attempts to do
this efficiently and at the maximum available speed; see
https://github.com/mpalmer/scrape-ct-log.

- Matt

Kevin Jorquera

Jun 4, 2023, 3:08:38 PM
to certificate-transparency
Wow, this is awesome. Thank y'all for the feedback; I will look into it.