Hi everyone,
many of you on this list have asked for URL indexes covering the older crawl archives as well.
We've started to generate indexes for the crawl archives of 2013 and 2014:
- indexes for the two 2013 crawls are ready
- some indexes for the monthly crawls in 2014 are ready as well
- the remaining 2014 indexes will be available early in November
As usual, you can access the URL indexes at
http://index.commoncrawl.org/
or fetch them from AWS S3 under the prefix
s3://commoncrawl/cc-index/collections/
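For scripted access, a query URL for the index server can be put together roughly as shown here. This is only a sketch: the collection name "CC-MAIN-2013-20", the "-index" endpoint suffix and the "url"/"output" parameters are assumptions that are not stated in this message, so please check the index page for the collections actually listed. The helper name build_index_query is made up for illustration, and the snippet only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Index server named in this announcement.
INDEX_SERVER = "http://index.commoncrawl.org/"


def build_index_query(collection, url_pattern, output="json"):
    """Build a query URL for one index collection (no network access).

    Assumption: each collection is served under "<collection>-index" and
    accepts "url" and "output" query parameters.
    """
    params = urlencode({"url": url_pattern, "output": output})
    return f"{INDEX_SERVER}{collection}-index?{params}"


# Hypothetical collection name for one of the 2013 crawls.
query = build_index_query("CC-MAIN-2013-20", "commoncrawl.org/*")
print(query)
```

The resulting URL can be opened in a browser or passed to any HTTP client. The files under the S3 prefix are meant for bulk processing instead of single lookups.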
In addition, the old URL index server for the 2012 crawl archives is up again. For the next few days
it is temporarily reachable at
http://ec2-54-221-249-42.compute-1.amazonaws.com/
However, we'll be moving it again, to
http://urlsearch.commoncrawl.org/
If in doubt, please try both URLs for the time being.
Best,
Sebastian