Distributing a demo with wikipedia graph data.

burtonator

May 28, 2012, 8:50:14 PM
to peregrine...@googlegroups.com
It turns out that Wikipedia distributes a full dump of their page graph.

It's about 8GB compressed.

I'm going to throw together a working demo that imports the data and then computes PageRank over it.

Right now we are generating random data, which isn't really helpful for testing anything. You also can't eyeball the results to check that they are correct.

Wikipedia also has category information, so I could implement topic/category-based teleportation for PageRank.
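
To make the teleportation idea concrete, here is a rough single-machine sketch in Python. It is not Peregrine code and does not read the Wikipedia dump format; the `pagerank` function, the `teleport` parameter, and the four-page toy graph are all made up for illustration. With no teleport vector it behaves like ordinary PageRank with uniform random jumps; passing weights for the pages in one category biases the jumps toward that category.

```python
def pagerank(links, teleport=None, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    teleport is an optional {page: weight} dict (hypothetical parameter);
    the random jump lands on pages in proportion to these weights.
    """
    # Collect every page that appears as a source or a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)

    if teleport is None:
        # Standard PageRank: jump to any page with equal probability.
        teleport = {n: 1.0 / len(nodes) for n in nodes}
    else:
        # Topic-based variant: normalize the supplied weights to sum to 1.
        total = sum(teleport.values())
        teleport = {n: teleport.get(n, 0.0) / total for n in nodes}

    rank = dict(teleport)
    for _ in range(iterations):
        incoming = {n: 0.0 for n in nodes}
        dangling = 0.0
        for n in nodes:
            out = links.get(n, [])
            if out:
                share = rank[n] / len(out)
                for t in out:
                    incoming[t] += share
            else:
                # Pages with no outlinks hand their rank to the teleport set.
                dangling += rank[n]
        rank = {
            n: (1 - damping) * teleport[n]
            + damping * (incoming[n] + dangling * teleport[n])
            for n in nodes
        }
    return rank


# Toy graph (assumption, not real Wikipedia data).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

uniform = pagerank(links)
# Pretend page "D" is the only page in the category we care about.
topic = pagerank(links, teleport={"D": 1.0})
```

Comparing `uniform` and `topic` on a graph this small is exactly the kind of check you can't do with random data: "D" has no inbound links, so its score comes only from teleportation and should go up once the jumps are biased toward it.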

Kevin