Controlling complexity of functions based on RMSE


Hayden

Feb 10, 2014, 6:30:49 PM
to pyev...@googlegroups.com
OK, first post on Google Groups here. Pyevolve is a great library, BTW.

Anyway, my current situation:

I want to fit a response surface to a set of training data, that is, generate a function that passes through all the training points, and then check that function against a separate validation set. I have written a program with Pyevolve to do this. As a test run, I picked a known function, generated a set of data points from it, and randomly selected a training set from those points to see how well GP does at recovering a suitable function. The accuracy is great: 99%+ on most runs. However, even for a simple function like sin^2(x)*cos(y), it generates horribly complicated trees, sometimes with unnecessary terms (like x - x, which is just 0).

I'm wondering: is there a way to compare a set of functions that have the same fitness and throw out the more complicated ones? Furthermore, is there a way to weight criteria such as complexity and accuracy against each other in the fitness?
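To make the weighting idea concrete, here is a rough sketch of what I have in mind. It is plain Python and does not use Pyevolve at all: the nested-tuple tree encoding, the operator table, and the names `penalized_score`, `simplest_of_best`, and `alpha` are all made up for illustration. In a real run the node count would have to come from the GP genome itself, and `alpha` would need tuning.

```python
import math

# Hypothetical tree encoding (not Pyevolve's): a leaf is a variable name or
# a number; an internal node is a tuple (operator_name, child, child, ...).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "sin": math.sin,
    "cos": math.cos,
}

def evaluate(tree, env):
    """Recursively evaluate a tree for the variable bindings in env."""
    if isinstance(tree, tuple):
        args = [evaluate(child, env) for child in tree[1:]]
        return OPS[tree[0]](*args)
    if isinstance(tree, str):
        return env[tree]
    return float(tree)

def node_count(tree):
    """Complexity measure: total number of nodes in the tree."""
    if isinstance(tree, tuple):
        return 1 + sum(node_count(child) for child in tree[1:])
    return 1

def rmse(tree, samples):
    """Root mean square error over (x, y, target) samples."""
    squared = [(evaluate(tree, {"x": x, "y": y}) - t) ** 2 for x, y, t in samples]
    return math.sqrt(sum(squared) / len(squared))

def penalized_score(tree, samples, alpha=0.01):
    """Score to minimize: accuracy term plus alpha times tree size."""
    return rmse(tree, samples) + alpha * node_count(tree)

def simplest_of_best(trees, samples, tol=1e-9):
    """Among trees whose RMSE is within tol of the best, return the smallest."""
    errors = [rmse(tree, samples) for tree in trees]
    best = min(errors)
    tied = [tree for tree, err in zip(trees, errors) if err - best <= tol]
    return min(tied, key=node_count)

# sin^2(x)*cos(y), and the same function with a useless (x - x) term added.
target = ("mul", ("mul", ("sin", "x"), ("sin", "x")), ("cos", "y"))
bloated = ("add", target, ("sub", "x", "x"))

grid = [i * 0.5 for i in range(-4, 5)]
samples = [(x, y, evaluate(target, {"x": x, "y": y})) for x in grid for y in grid]

winner = simplest_of_best([bloated, target], samples)
```

Both trees fit the samples equally well here, so the tie-break returns the smaller one, and the penalized score also ranks the bloated tree worse purely because of its size. Whether an additive penalty like this is the best choice, or whether the tie-break alone is enough, is exactly what I am unsure about.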

Another problem: as soon as I add more functions to the function set (log, tan, sqrt, etc.), the program has difficulty finding a good solution at all. Does anyone know why this happens, and is there a good way around it?
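One thing I suspect, though I have not verified it, is that log, sqrt, and tan blow up or fail on part of their input range, so many candidate trees get wrecked by a single bad evaluation. A common GP workaround is to use "protected" versions of such operators. Below is a minimal sketch in plain Python; the function names, the fallback values, and the clamp `limit` are my own choices for illustration, not anything taken from Pyevolve.

```python
import math

def protected_log(a):
    """log of the magnitude; falls back to 0.0 where log is undefined."""
    if a == 0 or not math.isfinite(a):
        return 0.0
    return math.log(abs(a))

def protected_sqrt(a):
    """Square root of the magnitude, so negative inputs do not raise."""
    if not math.isfinite(a):
        return 0.0
    return math.sqrt(abs(a))

def protected_tan(a, limit=1e6):
    """tan clamped to [-limit, limit] so values near the poles stay bounded."""
    if not math.isfinite(a):
        return 0.0
    return max(-limit, min(limit, math.tan(a)))
```

These change the meaning of the operators outside their normal domain (for example, `protected_sqrt(-4)` gives 2.0), so any evolved expression has to be read with that in mind. It also does nothing about the search space simply getting bigger with every extra function, which may be the larger part of the problem.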

Cheers,

Hayden