Wikipedia rank working and another memory issue.

May 29, 2012, 11:08:39 PM

to peregrine...@googlegroups.com

I have most of the work done for Wikipedia rank.

I'm really happy with how this is working out, because it's a good, sizable data set with real-world applications.

I wonder if Spinn3r should release another dataset similar to this but for blogs.

Anyway, while testing it I found another memory leak, and this one I can reproduce reliably.

I'm not sure what's causing it. The JVM's memory keeps growing, so maybe I'm doing something native and not releasing a resource.

Since it's easy to reproduce, I'm going to track down exactly what's happening and hopefully have a patch soon.
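
One way to narrow down this kind of leak is to sample heap usage and direct (off-heap) buffer usage separately while the job runs. If the heap stays flat while the direct pool keeps climbing, the growth is likely on the native side. The sketch below is only an illustration under assumptions: the class name MemorySnapshot is hypothetical and not part of Peregrine, it needs Java 7 or later for BufferPoolMXBean, and the pool name "direct" is what HotSpot reports and may differ on other JVMs. It also will not see memory allocated by native code outside NIO buffers.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Peregrine: reports heap vs. direct-buffer usage.
final class MemorySnapshot {

    /** Bytes currently used on the Java heap. */
    public static long heapUsed() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** Bytes the JVM estimates are held by direct NIO buffers (assumes the pool is named "direct"). */
    public static long directUsed() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                total += pool.getMemoryUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = directUsed();

        // Allocate 1 MiB off-heap; this should show up in the direct pool, not on the heap.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);

        long after = directUsed();
        System.out.println("heap used (bytes):   " + heapUsed());
        System.out.println("direct before/after: " + before + " / " + after);
        System.out.println("buffer capacity:     " + buf.capacity());
    }
}
```

Calling heapUsed() and directUsed() periodically from the test that reproduces the leak, and logging both, would show which of the two numbers is actually growing.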

This Wikipedia snapshot is a good test case: at 35GB uncompressed, it's a decent chunk of data to play with.
