Daniel Lamblin
Oct 24, 2011, 10:52:42 AM
to NYC Machine Learning Review
If I recall correctly, one thing to try is to divide your sample data
(the one with 50,000 samples and 200 features) into two randomized
sets of, say, 35,000 training samples and 15,000 test samples. Now fit
polynomials of degree 1, 2, 3, 4 ... up to some reasonable upper bound
that won't take all day, like 8 (using the normal equation or gradient
descent, whichever) on the training samples. Then use the learned
theta of each fit to compute the squared error on the test samples,
and pick the polynomial degree with the minimal test error. The
training-set error will keep dropping with each higher degree, but at
some point you're actually over-fitting, so the model no longer
generalizes; that's where the test samples help you retain
applicability.
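The procedure above can be sketched like this. It's a minimal toy version: I'm using a 1-D synthetic dataset and numpy.polyfit as a stand-in for your real 50,000 x 200 data and whatever solver you're using, so the names and sizes here are illustrative, not your actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D stand-in for the real data (the actual set has 50,000
# samples and 200 features; polyfit is just for illustration).
x = rng.uniform(-3, 3, 500)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

# Randomized 70/30 split, like the 35,000 / 15,000 suggestion.
idx = rng.permutation(x.size)
train, test = idx[:350], idx[350:]

best_degree, best_err = None, np.inf
for degree in range(1, 9):                 # try degrees 1..8
    theta = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(theta, x[test])
    err = np.mean((pred - y[test]) ** 2)   # squared error on held-out set
    if err < best_err:
        best_degree, best_err = degree, err

print(best_degree, best_err)
```

The training error alone would keep improving all the way to degree 8; the held-out error is what flags where the extra degrees stop helping.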
If later you want to double-check your choice of polynomial degree,
run the whole process again on a different random split of the data;
the chosen degrees will probably match, though the thetas might be a
tad different. Now that you have a degree that's about right, fit the
coefficients on all the sample data you have at hand.
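The double-check and final refit might look like the sketch below, again on hypothetical 1-D toy data rather than your real feature matrix. pick_degree is a helper name I'm making up here, not something from a library.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, 500)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

def pick_degree(x, y, rng, max_degree=8):
    """One randomized 70/30 split; return the degree with lowest test error."""
    idx = rng.permutation(x.size)
    cut = int(0.7 * x.size)
    train, test = idx[:cut], idx[cut:]
    errs = []
    for d in range(1, max_degree + 1):
        theta = np.polyfit(x[train], y[train], d)
        errs.append(np.mean((np.polyval(theta, x[test]) - y[test]) ** 2))
    return int(np.argmin(errs)) + 1

# Two independent random splits -- the chosen degrees will usually agree.
d1 = pick_degree(x, y, rng)
d2 = pick_degree(x, y, rng)

# Final model: refit the chosen degree on ALL the data.
theta_final = np.polyfit(x, y, d1)
```

The key point is the last line: the held-out split is only for choosing the degree; once it's chosen, you don't throw away the 15,000 test samples, you refit on everything.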