Here are my thoughts:
> 1. I thought that R^2 was supposed to represent the percentage
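R^2 is often interpreted as the percentage of variance explained by
the model. But it's possible to have a negative R^2 if the model has
higher squared error than the output variance, since the formula is:
R^2 = 1 - SS_err / SS_tot
where SS_err is the sum of squared residuals and SS_tot is the sum of
squared deviations of the data from its mean. See:
http://en.wikipedia.org/wiki/Coefficient_of_determination#Definitions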
In other words, just predicting the mean gives an R^2 of 0. A
negative R^2 implies the model does worse than predicting the mean,
as measured by squared error.
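To make that concrete, here is a minimal Python sketch that computes R^2 straight from the definition R^2 = 1 - SS_err / SS_tot. It is not Eureqa code; the function name and the toy data are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_err / SS_tot (hypothetical helper, not Eureqa's code)."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared residuals of the model
    ss_err = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Sum of squared deviations of the data from its mean
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_err / ss_tot

# Toy data, chosen only for illustration
y = [1.0, 2.0, 3.0, 4.0, 5.0]

mean_pred = [3.0] * len(y)            # always predict the mean of y
bad_pred = [5.0, 4.0, 3.0, 2.0, 1.0]  # a model that gets the trend backwards

r2_perfect = r_squared(y, y)          # 1.0
r2_mean = r_squared(y, mean_pred)     # 0.0
r2_bad = r_squared(y, bad_pred)       # -3.0

print(r2_perfect, r2_mean, r2_bad)
```

The backwards model has four times the squared error of the mean predictor (40 versus 10), so its R^2 is 1 - 4 = -3.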
You might also see negative or low R^2 if you're optimizing for
something other than squared error, such as absolute error.
It's possible for a model to do well on one metric but poorly on
another. Different metrics make different assumptions about the
distribution of the residual noise (squared error corresponds to
Gaussian noise, absolute error to Laplace noise).
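As a rough illustration of a model doing well on one metric and poorly on another, the sketch below compares two constant predictors on data with one outlier. The median minimizes absolute error and the mean minimizes squared error. The data and helper names are invented for the example and have nothing to do with Eureqa's internals.

```python
from statistics import mean, median

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_err / SS_tot (hypothetical helper)."""
    m = mean(y_true)
    ss_err = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - m) ** 2 for y in y_true)
    return 1.0 - ss_err / ss_tot

def mean_abs_error(y_true, y_pred):
    """Mean absolute error (hypothetical helper)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

# Toy data with a single large outlier
y = [1.0, 2.0, 3.0, 4.0, 100.0]

pred_mean = [mean(y)] * len(y)      # best constant for squared error
pred_median = [median(y)] * len(y)  # best constant for absolute error

mae_mean = mean_abs_error(y, pred_mean)
mae_median = mean_abs_error(y, pred_median)
r2_mean = r_squared(y, pred_mean)
r2_median = r_squared(y, pred_median)

print("MAE:", mae_mean, mae_median)  # the median predictor has the lower MAE
print("R^2:", r2_mean, r2_median)    # but its R^2 is negative (about -0.24)
```

So the predictor that wins on absolute error ends up with a negative R^2, even though nothing is wrong with it by its own criterion.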
> 2. Is it possible to optimize on R^2
Yes, minimizing squared error is equivalent to maximizing R^2.
For a fixed data set, R^2 is just a linear (affine) transformation
of the mean squared error: R^2 = 1 - MSE / Var(y). You can choose to
optimize squared error in the "fitness metric" setting in Eureqa.
> 3. Is the R^2 value independent of the optimization criterion? That is, are the R^2 values computed with different optimization criteria comparable?
Yes, the R^2 statistic depends only on the data and the model, like
other error metrics, so R^2 values from runs with different
optimization criteria are comparable.
> two different values of R^2
The R^2 calculation displayed in Eureqa is the ordinary R^2, not the
adjusted version, which distorts the "explained variance" interpretation.
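If you want to compare against the adjusted value reported in your book, the usual textbook formula is adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of data points and p the number of predictors (not counting the intercept). A minimal sketch follows. The function name is made up, and what to count as p for a Eureqa expression is an assumption you would have to settle yourself.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 from ordinary R^2 (hypothetical helper).

    n: number of data points
    p: number of predictors, excluding the intercept (an assumption;
       choosing p for a symbolic expression is a judgment call)
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Illustrative numbers only
r2 = 0.9
adj = adjusted_r_squared(r2, n=20, p=3)
print(adj)  # slightly below the ordinary R^2
```

The adjustment penalizes extra predictors, so for p >= 1 the adjusted value is never above the ordinary R^2.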
Hope that helped,
Michael
On Apr 17, 3:10 pm, Russ Abbott <russ.abb...@gmail.com> wrote:
> I apologize again for my naivety about statistics, but I have a few
> additional questions.
>
> 1. I thought that R^2 was supposed to represent the percentage of variation
> in the data that the function explained and that it varied from 0 to 1. In
> some of my runs, I'm getting a negative R^2. How is that possible?
>
> 2. Is it possible to optimize on R^2? Is the closest thing to doing that to
> optimize on correlation coefficient?
>
> 3. Is the R^2 value independent of the optimization criterion? That is, are
> the R^2 values computed with different optimization criteria comparable? In
> other words, if I want to optimize R^2 does it make sense to try
> different optimization criteria until I find one that does the best job?
>
> 4. In the book we are using (Data Mining with R
> <http://www.liaad.up.pt/~ltorgo/DataMiningWithR/>)
> two different values of R^2 are given for a run. One value depends on
> degrees of freedom (adjusted R^2); the other doesn't. Which one is computed
> by Eureqa?
>
> Or am I completely misunderstanding what R^2 means?
>
> Thanks.
>
> *-- Russ *