Hi everyone,
The December crawl contains 2.64 billion web pages (or 394 TiB of uncompressed content) fetched between the 1st and the 15th of December. Page captures are from 47.5 million hosts or 38.3 million registered domains, and include 1.05 billion new URLs not visited in any of our prior crawls.
The Web Graph consists of 283.7 million nodes and 2.6 billion edges at the host level, and 98.7 million nodes and 1.8 billion edges at the domain level.
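For a rough sense of scale, the figures above can be turned into a few averages. This is only a back-of-the-envelope sketch: the constants are the rounded numbers quoted in this message, the variable and function names are made up for illustration, and the derived values are not official Common Crawl statistics.

```python
# Rounded figures copied from this announcement.
PAGES = 2.64e9                 # web pages in the December crawl
UNCOMPRESSED_TIB = 394         # uncompressed content, in TiB
NEW_URLS = 1.05e9              # URLs not visited in any prior crawl
HOST_NODES, HOST_EDGES = 283.7e6, 2.6e9      # host-level web graph
DOMAIN_NODES, DOMAIN_EDGES = 98.7e6, 1.8e9   # domain-level web graph


def avg_page_kib(tib: float, pages: float) -> float:
    """Mean uncompressed size per page in KiB (1 TiB = 2**40 bytes)."""
    return tib * 2**40 / pages / 1024


def share(part: float, whole: float) -> float:
    """Fraction of `whole` accounted for by `part`."""
    return part / whole


def edges_per_node(edges: float, nodes: float) -> float:
    """Mean number of edges per node, counting each edge once."""
    return edges / nodes


print(f"avg page size:         {avg_page_kib(UNCOMPRESSED_TIB, PAGES):.0f} KiB")
print(f"new URL share:         {share(NEW_URLS, PAGES):.1%}")
print(f"host edges per node:   {edges_per_node(HOST_EDGES, HOST_NODES):.1f}")
print(f"domain edges per node: {edges_per_node(DOMAIN_EDGES, DOMAIN_NODES):.1f}")
```

With the rounded inputs this works out to roughly 160 KiB per page, about 40% new URLs, and about 9 and 18 edges per node at the host and domain level respectively.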
More information can be found in the announcement blog post for each, and on our new statistics page, all linked below:
Wishing you all a very happy end to the year, from all of us at Common Crawl.
TV