March 2018 Crawl Archive Now Available

Sebastian Nagel

Mar 29, 2018, 6:52:32 AM
to Common Crawl
Hi all,

the March 2018 crawl archive is now available. The crawl was run from March 17 to 25, 2018
and covers 3.2 billion web pages or 250 TiB of uncompressed content. More details about the
crawl and information on how to access and use the data can be found on our blog [1].

You'll find statistics and metrics about the current and previous crawls on [2].

The URL index of the March crawl is available at [3]. The columnar index [4] now also
contains the March crawl as a new partition.

Best,
Sebastian


[1] http://commoncrawl.org/2018/03/march-2018-crawl-archive-now-available/
[2] https://commoncrawl.github.io/cc-crawl-statistics/
[3] http://index.commoncrawl.org/CC-MAIN-2018-13/
[4] http://commoncrawl.org/2018/03/index-to-warc-files-and-urls-in-columnar-format/
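
For readers who want to look up captures in the URL index at [3], here is a minimal sketch of building a query URL. It is not part of the announcement. The "-index" endpoint suffix and the "url" and "output" query parameters are assumptions about the index server's API, and build_index_query is a hypothetical helper name. The sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Host taken from reference [3] of the announcement.
INDEX_SERVER = "http://index.commoncrawl.org"


def build_index_query(crawl_id, url_pattern, output="json"):
    """Return a query URL for the URL index of a single crawl.

    Assumption: the index API is served at "<server>/<crawl_id>-index"
    and accepts "url" and "output" query parameters.
    """
    params = urlencode({"url": url_pattern, "output": output})
    return f"{INDEX_SERVER}/{crawl_id}-index?{params}"


# Crawl identifier as it appears in [3]; the URL pattern is only an example.
query_url = build_index_query("CC-MAIN-2018-13", "commoncrawl.org/*")
print(query_url)
```

Fetching the resulting URL with any HTTP client should, under the assumptions above, return one record per matching capture.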