Host- and domain-level web graph data sets of May/June/July 2017 crawl


Sebastian Nagel

Aug 18, 2017, 4:21:17 AM
to common...@googlegroups.com
Hi all,

we're pleased to announce the second release of our webgraph data set.
More details and links to download the data set can be found on our blog [1].

In addition to the host graph we now also include a domain graph.

For both the host and domain graph data sets we provide:
- the web graph in text format
- the web graph as BVGraph, for use with the WebGraph framework [2]
- nodes ranked by harmonic centrality and PageRank
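
For readers unfamiliar with harmonic centrality, here is a minimal sketch of the idea on a toy host graph. It is not the code used to produce the rankings, and the host names, the edge list, and the function name `harmonic_centrality` are made up for illustration. It assumes the usual definition, where a node's score is the sum of 1/d(y, x) over all other nodes y that can reach x.

```python
from collections import defaultdict, deque


def harmonic_centrality(edges):
    """Return {node: sum of 1/d(y, node) over every other node y that can reach it}."""
    # Store reversed adjacency so a BFS from a node walks its incoming links.
    reverse = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        reverse[dst].append(src)
        nodes.update((src, dst))

    scores = {}
    for target in nodes:
        dist = {target: 0}
        queue = deque([target])
        total = 0.0
        while queue:
            current = queue.popleft()
            for pred in reverse[current]:
                if pred not in dist:
                    dist[pred] = dist[current] + 1
                    total += 1.0 / dist[pred]
                    queue.append(pred)
        scores[target] = total
    return scores


# Hypothetical toy graph: each pair is (linking host, linked host).
toy_edges = [
    ("a.example", "b.example"),
    ("b.example", "c.example"),
    ("a.example", "c.example"),
]
scores = harmonic_centrality(toy_edges)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

This runs one BFS per node, which is fine for a toy example but would not scale to a graph of this size.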

The web graphs are constructed from the hyperlinks of the last three monthly
crawls (May, June, and July 2017).

Best,
Sebastian


[1] http://commoncrawl.org/2017/08/webgraph-2017-may-june-july/
[2] http://webgraph.di.unimi.it/