July 2025 Crawl and Web Graphs

Thom Vaughan

Jul 27, 2025, 9:42:30 AM
to Common Crawl
Hi folks,

Our July 2025 crawl and Web Graphs are now available. The July 2025 crawl (CC-MAIN-2025-30), fetched between the 7th and 21st of July, contains 2.42 billion web pages, or 419 TiB of uncompressed content. Page captures are from 47.6 million hosts or 39 million registered domains and include 763 million new URLs not visited in prior crawls.

The July 2025 Web Graph (cc-main-2025-may-jun-jul) consists of 481.6 million nodes and 3.4 billion edges at the host level, and 209.5 million nodes and 2.6 billion edges at the domain level.
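
For a rough sense of scale, the figures above can be turned into a few derived ratios. The snippet below is only a back-of-the-envelope sketch: the dictionary and function names are made up for illustration, the inputs are the rounded numbers quoted in this post, and the "new URL share" assumes page captures and URLs can be compared one-to-one, which is an approximation.

```python
# Back-of-the-envelope ratios from the rounded figures quoted in this post.
# All names here are illustrative and not part of any Common Crawl tooling.

TIB = 2**40  # bytes in one tebibyte

crawl = {
    "pages": 2.42e9,                  # web pages in CC-MAIN-2025-30
    "uncompressed_bytes": 419 * TIB,  # 419 TiB of uncompressed content
    "new_urls": 763e6,                # URLs not visited in prior crawls
}

graphs = {
    "host": {"nodes": 481.6e6, "edges": 3.4e9},
    "domain": {"nodes": 209.5e6, "edges": 2.6e9},
}


def avg_page_kib(c):
    """Average uncompressed size per page capture, in KiB."""
    return c["uncompressed_bytes"] / c["pages"] / 1024


def new_url_share(c):
    """Approximate fraction of page captures whose URL is new (assumes 1 page ~ 1 URL)."""
    return c["new_urls"] / c["pages"]


def edges_per_node(g):
    """Average number of edges per node in a graph."""
    return g["edges"] / g["nodes"]


if __name__ == "__main__":
    print(f"avg page size: {avg_page_kib(crawl):.1f} KiB")
    print(f"new URL share: {new_url_share(crawl):.1%}")
    for level, g in graphs.items():
        print(f"{level} graph: {edges_per_node(g):.2f} edges per node")
```

Because the inputs are rounded, the results are only good to two or three significant digits.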

Further info below:


Cheers,
TV