I also finished up the Cassandra job output... so you can read AND write to Cassandra now.
I'm going to update the PageRank code so that you can specify Cassandra as both the input and the output, with the intermediate data written to Peregrine.
I also have a lot of work done on a combiner... however, my main goal is to get something very functional, even if it's slower for now, and then improve the performance.
Right now I'm working on some unit tests to verify the mathematical accuracy of our PageRank implementation.
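
A minimal sketch of what such an accuracy check could look like, in Python rather than Peregrine's own code. The `pagerank` function, its `damping` and `iterations` parameters, and the tiny graphs are all hypothetical and exist only for illustration. It uses plain power iteration as a reference and spreads the rank of dangling nodes evenly, which may differ from how the Peregrine job handles them.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Reference power-iteration PageRank (illustrative, not Peregrine code).

    graph maps each node to a list of its out-links; every link target
    is assumed to also appear as a key in graph.
    """
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by dangling nodes (no out-links) is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            out_links = graph[node]
            for target in out_links:
                new_rank[target] += damping * rank[node] / len(out_links)
        rank = new_rank
    return rank


# Check 1: on a symmetric 3-node cycle every node should get the same rank.
cycle = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
assert all(abs(r - 1.0 / 3.0) < 1e-9 for r in cycle.values())

# Check 2: ranks should sum to 1 even when the graph has a dangling node.
mixed = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []})
assert abs(sum(mixed.values()) - 1.0) < 1e-9
```

Checks like these compare against properties that are known in advance (uniform ranks on a symmetric graph, total rank of 1), so they don't depend on trusting any one implementation.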
It would be nice to get k-means and maybe naive Bayes implemented too.
I was thinking that an easier way to get people to use Peregrine is to ship with a bunch of machine learning algorithms out of the box...
Kevin