Controlling complexity of functions generated


Hayden

Feb 10, 2014, 7:01:25 PM
to pyev...@googlegroups.com
Hey guys, first post here. Pyevolve is awesome. I was hoping I could get a hand with my current implementation.

I have a problem where I'm given a training set drawn from a larger (universal) data set, and I need to generate a response surface that fits the training points and also cross-validates well against the remaining points.

As a test, I picked a well-defined function, sampled uniformly distributed points from it to form the universal set, and then randomly selected a training set from those points. The GP implementation reaches very high accuracy (99%+ in most cases). There are two problems, though:

1. For a very simple predefined function, e.g. sin(x)*cos(y), the generated solutions become very large and complicated, and sometimes contain useless subexpressions (e.g. x - x, which is always 0).

2. When I add more functions to the function set (e.g. log, tan, sqrt), the program really struggles to reach any semblance of accuracy.

For the first one, I was thinking I could introduce a threshold, store the functions that pass it in an array, and then at the end compare them and keep only the least complex one, throwing away the rest. I'm not sure how I would implement this in code, though. I don't have the slightest idea for the second problem.
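
A minimal sketch of the selection step described above, in plain Python with no Pyevolve calls. It assumes the threshold is on accuracy and that complexity is measured by the number of nodes in the expression tree; the `Candidate` record, the `pick_simplest` helper, and the sample values are hypothetical placeholders, not results from an actual run:

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    expression: str   # printable form of the evolved tree
    accuracy: float   # cross-validation accuracy in [0, 1]
    node_count: int   # number of tree nodes, used here as the complexity measure


def pick_simplest(candidates: List[Candidate], threshold: float) -> Optional[Candidate]:
    """Return the least complex candidate whose accuracy meets the threshold."""
    passing = [c for c in candidates if c.accuracy >= threshold]
    if not passing:
        return None  # nothing was accurate enough
    # Fewest nodes wins; ties go to the more accurate candidate.
    return min(passing, key=lambda c: (c.node_count, -c.accuracy))


# Made-up candidates for illustration only.
candidates = [
    Candidate("sin(x)*cos(y) + (x - x)", 0.995, 9),
    Candidate("sin(x)*cos(y)", 0.994, 5),
    Candidate("x*y", 0.610, 3),
]
best = pick_simplest(candidates, threshold=0.99)
print(best.expression)
```

How the candidate list would be filled during evolution (for example, once per generation) is left open here, since that depends on how the GP run is set up.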

Thanks for reading,

Hayden

