The crawl archives of May 2022 are now available.
The data was crawled May 16 – 29 and contains 3.45 billion web
pages, or 420 TiB of uncompressed content. It includes page captures
of 1.35 billion new URLs, not visited in any of our prior crawls.
As usual, more details about the crawl and information on how to access
and use the data can be found on the Common Crawl blog.