June 2026 Crawl Archive and Corresponding Web Graph are now available


Luca Foppiano

Jun 25, 2026, 6:26:20 AM
to Common Crawl
Hi everyone,

Our June 2026 Crawl Archive and corresponding Web Graph are now available.

The June 2026 crawl consists of 2.10 billion web pages (354 TiB of uncompressed content). The captures come from 40.8 million hosts, or 33.6 million registered domains.

The corresponding Web Graph release consists of 247.3 million nodes and 6.3 billion edges at the host level, and 121.1 million nodes and 3.9 billion edges at the domain level.
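
For a rough sense of scale, the figures above can be turned into a few per-unit averages. This is only a back-of-the-envelope sketch: the function and variable names are made up for illustration, the inputs are the rounded numbers quoted in this message, and the results are plain means that say nothing about the (likely very skewed) distributions. Note also that the Web Graph has more host nodes than the crawl has captured hosts, so the two sets of numbers are not directly interchangeable.

```python
# Rounded figures copied from the announcement text (not read from the data itself).
PAGES = 2.10e9            # captured web pages
UNCOMPRESSED_TIB = 354    # uncompressed content, in TiB
HOSTS = 40.8e6            # hosts with at least one capture
DOMAINS = 33.6e6          # registered domains with at least one capture

# Web Graph sizes as (nodes, edges) per aggregation level.
GRAPH = {
    "host": (247.3e6, 6.3e9),
    "domain": (121.1e6, 3.9e9),
}


def derived_metrics():
    """Return simple averages derived from the announced totals (illustrative only)."""
    bytes_total = UNCOMPRESSED_TIB * 2**40  # 1 TiB = 2**40 bytes
    metrics = {
        "pages_per_host": PAGES / HOSTS,
        "pages_per_domain": PAGES / DOMAINS,
        "avg_page_kib": bytes_total / PAGES / 1024,
    }
    for level, (nodes, edges) in GRAPH.items():
        # Mean number of edges per node at this aggregation level.
        metrics[f"{level}_edges_per_node"] = edges / nodes
    return metrics


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.1f}")
```

Because the inputs are rounded to two or three significant digits, the outputs should be read as approximate.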

🔗 June 2026 Crawl Announcement
🔗 June 2026 Web Graph Announcement
🔗 Crawl Statistics
🔗 Web Graph Statistics

Live long and prosper! 🖖
Luca