Hi all,
The May 2025 crawl archive and corresponding Web Graph release are now available.
The May crawl (CC-MAIN-2025-21), crawled between May 11th and May 25th 2025, contains 2.47 billion web pages (429 TiB of uncompressed content). Page captures are from 46.9 million hosts or 38.2 million registered domains and include 654 million new URLs.
The Web Graph release (cc-main-2025-mar-apr-may) contains 326.8 million nodes and 2.9 billion edges at the host level, and 156.1 million nodes and 2.1 billion edges at the domain level.
See these links for further info:
May 2025 Crawl Announcement
May 2025 Web Graph Announcement
Web Graph Statistics
Enjoy.
TV