How did you choose parameter values for the learning algorithms?

jeffgress34

May 13, 2014, 3:13:40 PM

to bikl...@googlegroups.com

Each of the learning algorithms has some input parameters that affect its performance. How did you choose these parameter values?

Dan Bikle

May 14, 2014, 4:53:57 AM

to bikl...@googlegroups.com

For LIBSVM I ran grid.py on some old (1970s) data. I figured anything newer than that would be cheating.

https://www.google.com/search?q=LIBSVM+grid.py
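For readers who have not used grid.py: it tries a grid of (C, gamma) pairs on a log2 scale and keeps the pair with the best cross-validation score. Below is a minimal sketch of that idea in plain Python, not the grid.py source. The exponent ranges are what I understand the grid.py defaults to be (treat them as an assumption), and `toy_score` is a hypothetical stand-in for the cross-validation accuracy you would get from training LIBSVM on the old data.

```python
import itertools
import math


def log2_grid(start, stop, step):
    """Return 2**e for e = start, start+step, ... up to and including stop."""
    values = []
    e = start
    while (step > 0 and e <= stop) or (step < 0 and e >= stop):
        values.append(2.0 ** e)
        e += step
    return values


def grid_search(score_fn, c_values, gamma_values):
    """Score every (C, gamma) pair and return (best_score, best_C, best_gamma)."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


# Assumed exponent ranges (believed to mirror grid.py's defaults).
C_VALUES = log2_grid(-5, 15, 2)       # 2^-5, 2^-3, ..., 2^15
GAMMA_VALUES = log2_grid(3, -15, -2)  # 2^3, 2^1, ..., 2^-15


def toy_score(c, gamma):
    """Hypothetical score surface; a real run would return CV accuracy."""
    return -abs(math.log2(c) - 1) - abs(math.log2(gamma) + 3)


best_score, best_c, best_gamma = grid_search(toy_score, C_VALUES, GAMMA_VALUES)
print(best_c, best_gamma)
```

In a real run, `score_fn` would train on the 1970s data only, so the chosen C and gamma never see the data used later for testing.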

In this file:

https://github.com/danbikle/spy611scripts/blob/master/script/cr_my_vectors.sql#L82

The values I pick to define the class boundaries are medians of older y-values.

I did try setting the class boundary at 0, which would have been convenient.

I would have had 2 classes: gain greater than 0 and gain less than 0.

But I get better results if each class has the same number of observations.
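A small sketch of why the median boundary balances the classes while a boundary at 0 may not. The gain values are made up for illustration and `label_by_boundary` is a hypothetical helper, not code from cr_my_vectors.sql.

```python
from statistics import median


def label_by_boundary(gains, boundary):
    """Label each gain 1 if it is above the boundary, else 0."""
    return [1 if g > boundary else 0 for g in gains]


# Made-up "older" gains, skewed positive as an assumption for the example.
older_gains = [-1.2, -0.4, 0.1, 0.3, 0.5, 0.8, 1.1, 1.6]

zero_labels = label_by_boundary(older_gains, 0.0)
median_boundary = median(older_gains)
median_labels = label_by_boundary(older_gains, median_boundary)

# Count of observations in the upper class under each boundary.
print(sum(zero_labels), len(zero_labels) - sum(zero_labels))
print(sum(median_labels), len(median_labels) - sum(median_labels))
```

With these numbers the boundary at 0 gives an uneven 6 versus 2 split, while the median boundary gives 4 versus 4.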

Make sense?

Dan

obispus

May 14, 2014, 10:21:07 AM

to bikl...@googlegroups.com

Why not use *interleaved* runs of trading days for training and testing the models? That is, use x days in the past for training, then keep the following x days apart for testing (i.e., they don't get looked at during model creation), then use the following ones for training, repeat?
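To make the interleaving concrete, here is a minimal sketch that assigns day indices to alternating training and testing blocks. The function name and parameters are hypothetical; `train_len` and `test_len` are kept separate so the blocks need not be the same size.

```python
def interleaved_split(n_days, train_len, test_len):
    """Split day indices 0..n_days-1 into alternating train and test blocks."""
    train_idx, test_idx = [], []
    start = 0
    while start < n_days:
        # Training block, clipped at the end of the data.
        train_end = min(start + train_len, n_days)
        # Held-out testing block that immediately follows it.
        test_end = min(train_end + test_len, n_days)
        train_idx.extend(range(start, train_end))
        test_idx.extend(range(train_end, test_end))
        start = test_end
    return train_idx, test_idx


train_days, test_days = interleaved_split(n_days=10, train_len=3, test_len=2)
print(train_days)
print(test_days)
```

For 10 days with blocks of 3 and 2, days 0-2 and 5-7 go to training and days 3-4 and 8-9 are held out for testing.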

A couple of remarks: you typically benefit from using more data for training than for testing, so it shouldn't be an even 50-50 split, but that's easy to arrange. And if you want to incorporate seasonal variations you can set x="a whole trading year". But I wouldn't do what I (perhaps mistakenly) read in your email: train with old data, then test with uniformly newer data.
Guillermo
