Patterns of models in convergence

engr.student

Apr 24, 2012, 6:15:35 AM

to Eureqa Group

So I have run a few fits on strange data sets and there seems to be a
pattern. I'm not sure about this, so can someone tell me if this is
right?

At first there is a bunch of random, low-complexity models. After
that stage there is a stage where it produces a lot of
high-complexity models, mostly along the same general themes. I think it
gets the gist of the data and just pushes as far as it can go. In the
final stage there is a culling of the models and it settles down to a
fixed set, and all "new" development is just the program spinning its
wheels. In the list of candidates there is usually one very bright star:
a model that stands out as the ideal one.

If this isn't just me convincing myself to see things that aren't
there then it might be useful if one could tell what stage of the
process the model is in.

Is this what I think it is?

How does it relate to the "stability" and "maturity" indicators?

engr.student

Apr 25, 2012, 4:20:44 AM

to Eureqa Group

I have also noticed that it is useful to work for some time with one
error metric (minimize squared error), then stop the search, change the
criterion (maximize R^2), and continue the search. What this seems to
do is give a good diversity of error-minimizing models as seeds for
fit-maximizing models.

Has anyone else explored switching the criteria to get substantial
results more quickly?

On Apr 24, 3:15 am, "engr.student" <engr.stud...@gmail.com> wrote:
<snip>

L

Apr 25, 2012, 10:25:58 AM

to eureqa...@googlegroups.com

I've toyed with that, with mixed results. (I toy with things.) And I've toyed with other strategies, like smoothing and then turning smooth off.

I get good results by running it separately under various criteria, then combining the results by using the outputs as suggestions to a "finalist" run under the desired criterion. But it's not clear to me that this is any better than simply running for the same number of cycles, flat-out, with the desired criterion all along.

--
Eureqa Formulize ( http://www.nutonian.com )
-------------------------------------------------
Unsubscribe: eureqa-group+unsub...@googlegroups.com
View Group: http://groups.google.com/group/eureqa-group

Michael Schmidt

Apr 30, 2012, 4:44:08 PM

to eureqa...@googlegroups.com

I've noticed this pattern also, but it is not intentional or done explicitly. I think it arises from the fact that it's really easy to find extremely simple models, and it's easy to find accurate complex models, but it's most challenging to find models that are simple and accurate at the same time. So the frontier is explored in a similar order.
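To make "the frontier" concrete, here is a minimal sketch of an accuracy-versus-complexity Pareto front. It is not Eureqa's code; the function name `pareto_front` and the candidate numbers are made up for illustration. A model stays on the front only if no other model is both at least as simple and at least as accurate.

```python
from typing import List, Tuple

# Each candidate is (complexity, error); lower is better for both.
Model = Tuple[int, float]

def pareto_front(models: List[Model]) -> List[Model]:
    """Return the models that no other model dominates."""
    front: List[Model] = []
    best_error = float("inf")
    # Sweep from simplest to most complex. A model earns its place only
    # if it is strictly more accurate than every simpler model seen so far.
    for complexity, error in sorted(models):
        if error < best_error:
            front.append((complexity, error))
            best_error = error
    return front

# Hypothetical candidates: a few crude simple models, a few accurate
# complex ones, and some that are dominated.
candidates = [(1, 9.0), (3, 4.0), (3, 5.5), (7, 4.2),
              (12, 0.8), (20, 0.8), (25, 0.5)]
print(pareto_front(candidates))
```

The two ends of such a front (very simple, very complex) are cheap to populate; the middle, where a model is both simple and accurate, is the part that takes the search longest to fill in.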

Michael

Michael Schmidt

Apr 30, 2012, 5:07:38 PM

to eureqa...@googlegroups.com

I recall reading a recent paper that explored this technique (cycling through various fitness metrics). Apparently it works quite well. The analogy is varying environments in biological evolution and optimization. So there might be something to this.

Michael

L

Apr 30, 2012, 6:40:49 PM

to eureqa...@googlegroups.com

Well, my degree of falutin'ness is somewhat lower than that.

I simply let it run under the various criteria, to get all the best it had for me. Then I let the results compete against each other, under the criterion I really needed (in this case, minimize the worst case).
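A rough sketch of that "compete under the criterion I really need" step, outside of Eureqa. Everything here is hypothetical: the data, the three candidate formulas, and the helper names `max_abs_error` and `rank_finalists`. It only re-scores finished models by worst-case error; it does not reproduce seeding a new search with them.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def max_abs_error(model: Callable[[float], float],
                  xs: Sequence[float], ys: Sequence[float]) -> float:
    """Worst-case (maximum absolute) error of a model over the data."""
    return max(abs(model(x) - y) for x, y in zip(xs, ys))

def rank_finalists(models: Dict[str, Callable[[float], float]],
                   xs: Sequence[float],
                   ys: Sequence[float]) -> List[Tuple[float, str]]:
    """Score every candidate under the final criterion, best first."""
    return sorted((max_abs_error(f, xs, ys), name) for name, f in models.items())

# Made-up data (y = x^2) and made-up best models from three separate runs.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 1.0, 4.0, 9.0, 16.0]
finalists = {
    "from_squared_error_run": lambda x: x * x + 0.5,
    "from_abs_error_run": lambda x: 4.0 * x - 2.0,
    "from_r2_run": lambda x: 1.1 * x * x,
}

for score, name in rank_finalists(finalists, xs, ys):
    print(f"{name}: worst-case error {score:.2f}")
```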

This is not aw-shucks self-effacement. The last time I was in academia, I was there to drop my son off at class.

From: Michael Schmidt <michael.dou...@gmail.com>
To: eureqa...@googlegroups.com
Sent: Monday, April 30, 2012 5:07 PM
Subject: Re: Eureqa - Re: Patterns of models in convergence

Dave Nunez

Apr 30, 2012, 11:14:19 PM

to eureqa...@googlegroups.com

On that note, does tweaking the building block complexity
values do anything to the model selection? I've been getting mixed
results, btw. Most of the time I have found that setting the building
block complexities to, say, all 1's yields more accurate but more
complex models with a tendency toward overfitting.

Cheers, -d

Dave Nunez

Apr 30, 2012, 11:08:15 PM

to eureqa...@googlegroups.com

As I recall, I tried that and got some interesting results last year;
nothing concrete, but interesting nonetheless. What I have been
playing around with is changing weight values when I suspect the
modeling has reached a local minimum somewhere. It's really hard to
tell where it's stuck though, and I wish there was some way of at least
visualizing in an intuitive manner where the modeling is at. I find
the stability and maturity metrics quite helpful though. Throw in a
random set of weights with your cases once in a while to see if it
speeds up modeling. I'm just curious to see if that trick works on
different kinds of data.
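A minimal sketch of what random case weights do to the error being minimized, assuming "weights" here means one weight per data row. The helpers `weighted_mse` and `random_weights` and the 0.5 to 1.5 range are illustrative choices, not anything taken from Eureqa.

```python
import random
from typing import Callable, List, Sequence

def weighted_mse(model: Callable[[float], float], xs: Sequence[float],
                 ys: Sequence[float], weights: Sequence[float]) -> float:
    """Mean squared error where each row counts in proportion to its weight."""
    total = sum(weights)
    return sum(w * (model(x) - y) ** 2
               for x, y, w in zip(xs, ys, weights)) / total

def random_weights(n: int, seed: int,
                   low: float = 0.5, high: float = 1.5) -> List[float]:
    """Draw one positive weight per row; a new seed gives a new emphasis."""
    rng = random.Random(seed)
    return [rng.uniform(low, high) for _ in range(n)]

# Made-up data and a made-up candidate model.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.3, 2.8]

def candidate(x: float) -> float:
    return x

# The same model scores slightly differently under each weighting, which
# is what nudges a stalled search toward different rows of the data.
for seed in (1, 2, 3):
    print(seed, weighted_mse(candidate, xs, ys, random_weights(len(xs), seed)))
```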

Cheers, -d

L

May 1, 2012, 10:02:04 AM

to eureqa...@googlegroups.com

It does: the building-block complexity values favor or disfavor the associated functions. That is to say, raise the assigned complexity of a function and the search will "penalize" using it accordingly, and so disfavor it. Lower the assigned complexity and the opposite will occur.
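As an illustration only, here is a sketch that assumes a model's complexity is the sum of the values assigned to each building block in its expression, with variables and constants costing 1. I have not verified that Eureqa computes it exactly this way, and the weight tables below are hypothetical.

```python
from typing import Dict, Tuple, Union

# A leaf is a variable name or a constant; a node is (operation, arg, ...).
Expr = Union[str, float, Tuple]

def complexity(expr: Expr, weights: Dict[str, int], leaf_cost: int = 1) -> int:
    """Sum the assigned complexity of every operation and leaf in expr."""
    if not isinstance(expr, tuple):
        return leaf_cost
    op, *args = expr
    return weights[op] + sum(complexity(a, weights, leaf_cost) for a in args)

square = ("mul", "x", "x")   # x * x
cosine = ("cos", "x")        # cos(x)

all_ones = {"add": 1, "mul": 1, "cos": 1}     # every building block set to 1
raised_cos = {"add": 1, "mul": 1, "cos": 3}   # cosine made more "expensive"

for label, table in (("all ones", all_ones), ("raised cos", raised_cos)):
    print(label, complexity(square, table), complexity(cosine, table))
```

Under these assumptions, with everything set to 1 the cosine looks cheaper than a single multiply (2 versus 3), so it is favored; raising its value to 3 makes it cost 4 and tips the balance back toward the multiply.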

I've had occasion to lower the logical functions radically, because I was trying to crack a category problem. Best luck was to lower them into the range of the very basic functions: addition, multiplication, and the like. IIRC, the best results came from setting the logicals all to 2, but I don't know if this is generalizable or somehow coupled to my specific data. Nor do I even know if that's optimal, but it was good enough to be successful.

As a broader statement, it seems obvious that when it gets stuck, anything you can do to disrupt the "stuckedness" is worth a try, even apparently unreasonable things. Then port the outcome back to a reasonable set of options and if it's really better, it will stand up or guide the search in that "general direction" in search-space. At the risk of a homey analogy, try to jiggle the pinball without tilting it.

There are, presumably, an infinite number of possible fit-functions it might locate, if left to run long enough - or at any rate, a vastly large number. The name of the game is to find one that both fits the data very well and can be understood by the target audience (which might be yourself, might be others). I never use exponentials, for example, in my final result, because I know my target audience - but I'm very happy to use them in the search when they seem useful, because they'll carve up the search-space usefully.

Michael Schmidt

May 7, 2012, 3:46:53 PM

to eureqa...@googlegroups.com

It could have an indirect effect, yes, but the primary effect is on the final solution list. Setting each to 1 is a reasonable setting, I think. In most cases, though, you would prefer a couple of multiplies over a cosine or similar.
