Hi everyone,
The April 2025 crawl archive and corresponding Web Graph release are now available.
The crawl (CC-MAIN-2025-18) contains 2.74 billion web pages (or 468 TiB uncompressed); page captures are from 47.5 million hosts or 38.8 million registered domains and include 838 million new URLs.
The Web Graph release (cc-main-2025-feb-mar-apr) contains 309.2 million nodes and 2.9 billion edges at the host level, and 157.1 million nodes and 2.1 billion edges at the domain level.
See these links for further info:
🔗 April 2025 Crawl Announcement
🔗 April 2025 Web Graph Announcement
🔗 Web Graph Statistics (A reminder: here you can explore the top 1K ranked domains and hosts from all of our graph releases, with searchable and sortable tables. Many have commented that they missed this feature from our old website after we introduced the new one.)
We hope you enjoy exploring the data.
TV