[machine learning] RDD & cross validation


Rand Hindi

Nov 27, 2012, 7:33:54 AM
to spark...@googlegroups.com
Hi!

Is there a way to do cross validation on a training set loaded in an RDD? I know I can take a sample, but what I want is n-fold cross validation. 

Thanks for any tips!

Rand


Matei Zaharia

Nov 27, 2012, 5:16:31 PM
to spark...@googlegroups.com
Hi Rand,

Unfortunately, there isn't anything built in for this right now. You'd have to either take samples by hand or partition the data into folds yourself before running over each split. It would be interesting to add an operation similar to sample() that produces multiple samples for cross validation, but there isn't one yet.

Matei
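
For readers who land on this thread: below is a minimal sketch of the "partition by hand" idea, written in plain Python rather than against the Spark API so that it runs anywhere. It assumes each record carries a stable integer index, which is an assumption and not something an RDD gives you automatically. The names fold_of and k_fold_splits are hypothetical helpers made up for illustration. On an RDD, the same two predicates could presumably be passed to filter() once per fold.

```python
def fold_of(index, n_folds):
    """Deterministically assign a record to a fold from its integer index."""
    return index % n_folds


def k_fold_splits(records, n_folds):
    """Yield (fold, train, test) once per fold.

    `records` is a list of (index, value) pairs; the index is assumed
    to be stable across passes so every pass sees the same assignment.
    """
    for fold in range(n_folds):
        # Records whose fold matches are held out for testing...
        test = [v for i, v in records if fold_of(i, n_folds) == fold]
        # ...and everything else is used for training.
        train = [v for i, v in records if fold_of(i, n_folds) != fold]
        yield fold, train, test


# Toy data standing in for the training set.
records = list(enumerate(["a", "b", "c", "d", "e", "f"]))
splits = list(k_fold_splits(records, n_folds=3))
```

Each record ends up in exactly one test set and in the training set of every other fold, which is the property n-fold cross validation needs. Assigning folds by index modulo n keeps the folds roughly equal in size, but it does not shuffle, so if the data is ordered you would want to randomize the indices first.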

Rand Hindi

Nov 27, 2012, 10:31:20 PM
to spark...@googlegroups.com
Great, thanks!
