Winter and Spring Crawl


Robert Meusel

Mar 4, 2014, 2:03:46 AM
to common...@googlegroups.com
Hi,

I was wondering whether the two crawls of 2013 are related in any way. Were the same seeds used, or is the winter crawl an extension of the spring crawl? Or were the same URLs crawled in both datasets?

Would be really interesting to know.

Thanks a lot,
Robert

Jordan Mendelson

Mar 12, 2014, 5:14:39 PM
to common...@googlegroups.com
Both crawls used URL lists donated by Blekko from their search engine crawler. They should contain a good number of URLs that overlap, but they were separately generated lists.

Jordan

Robert Meusel

Mar 23, 2014, 6:22:14 PM
to common...@googlegroups.com
Thanks a lot. And the selection strategy is also the same, I guess?