Choosing among the "best models"

KVininska

Sep 23, 2015, 6:42:55 PM
to HyperNiche and NPMR
3 General Questions about choosing the "Best" Model

1. I have completed my first round of several NPMR analyses (experimenting with the overfitting settings, etc.). However, I'm not at all confident in choosing the best model for a response variable. If NPMR finds 4 or 5 "best" models with increasing xR2 values and numbers of predictors, how do I select the "best" one for interpretation? I figure you need to weigh the jump in xR2 between models against having to incorporate more variables into your interpretation. Does anyone have guidance on how they choose?

2. Am I correct in assuming that the order of the variables is NOT analogous to a PCA, where the first variable identified is the "best" predictor? I've looked through the included manual, but am still not confident I know how to treat the output. The first variable in model 4 is often different from that in model 2, etc.

3. Finally, I've had mixed results with "tuning a model." The xR2 changes, but why? And is tuning the procedure I should be using once I choose my best model? Someone, point me in the right direction!

Thanks for any help and advice.

Bruce McCune

Sep 24, 2015, 11:51:45 AM
to hyper...@googlegroups.com
Here are some responses...

1. This is a decision that depends on the analyst and the nature of the problem.  If I am striving for a parsimonious model with a small number of predictors, a tiny increment (for example, 0.01 or 0.03) in xR2 is too little gain to merit adding another predictor. On the other hand, if I want to maximize the xR2 and don't care about adding predictors, then I would stop adding predictors only when I run out of them or the xR2 goes to zero or negative.  Most of the time I'm in the former situation.
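This stopping rule can be sketched in a few lines. The function name, the threshold, and the xR2 values below are all hypothetical, not HyperNiche output or settings:

```python
# Parsimony rule sketched from the advice above: keep the smallest model,
# and move to a larger one only when its cumulative gain in xR2 over the
# currently accepted model exceeds a chosen threshold (e.g., 0.03).
# Function name and threshold are invented for this sketch.

def pick_parsimonious(models, min_gain=0.03):
    """models: list of (n_predictors, xR2) pairs, sorted by n_predictors."""
    best = models[0]
    for cand in models[1:]:
        if cand[1] - best[1] >= min_gain:
            best = cand  # the added predictors earn their keep
    return best

# Hypothetical xR2 sequence for 1-, 2-, 3-, and 4-predictor models:
models = [(1, 0.257), (2, 0.274), (3, 0.464), (4, 0.482)]
print(pick_parsimonious(models))  # -> (3, 0.464)
```

Here the 2-predictor model gains only 0.017 over the 1-predictor model, but the 3-predictor model gains 0.207 cumulatively, so it is accepted; the 4-predictor model's extra 0.018 is not.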

2. Correct. To push this a little farther... when there are two predictors in the model, not only is the order in the model list irrelevant, but the model includes their interaction, which may be stronger than either predictor alone.

3. The idea of tuning is making small adjustments in the tolerances to improve xR2.  HyperNiche initially makes a broad-brush attempt to maximize xR2 -- it searches for the maximum likelihood estimates of the smoothing parameters (tolerances), while penalizing for overfitting. The concept of tuning is that we explore the predictor space in the same way, but only near the chosen model, and using smaller increments in the tolerances. So normally this results in a _slight_ improvement in xR2. I don't bother tuning a model unless I have studied it pretty carefully and am happy with it as the model I want to present. Often the tuning makes no improvement to xR2.
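The coarse-search-then-tune idea can be illustrated with a toy one-predictor example, assuming NPMR's local-mean form with a Gaussian smoother scored by leave-one-out cross-validated R2 (xR2). This is a sketch of the concept only, not HyperNiche's actual search algorithm:

```python
# Toy illustration of "tuning": search tolerances on a coarse grid, then
# re-search with smaller increments near the coarse winner. Assumes a
# Gaussian local-mean smoother and leave-one-out xR2; data are simulated.
import math
import random

def loo_xr2(x, y, tol):
    """Leave-one-out cross-validated R2 for a Gaussian local-mean smoother."""
    ybar = sum(y) / len(y)
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    ss_res = 0.0
    for i in range(len(x)):
        wsum = wysum = 0.0
        for j in range(len(x)):
            if j == i:
                continue  # leave the target point out of its own estimate
            w = math.exp(-0.5 * ((x[i] - x[j]) / tol) ** 2)
            wsum += w
            wysum += w * y[j]
        pred = wysum / wsum if wsum > 0 else ybar
        ss_res += (y[i] - pred) ** 2
    return 1.0 - ss_res / ss_tot

random.seed(1)
x = [random.uniform(0, 10) for _ in range(60)]
y = [math.sin(xi) + random.gauss(0, 0.2) for xi in x]

# Broad-brush search over tolerances 0.2 to 3.0 in steps of 0.4...
coarse = max((t / 10 for t in range(2, 31, 4)), key=lambda t: loo_xr2(x, y, t))
# ...then "tuning": increments of 0.01 near the chosen tolerance.
fine = max((coarse + d / 100 for d in range(-19, 20) if coarse + d / 100 > 0),
           key=lambda t: loo_xr2(x, y, t))
print(coarse, fine, loo_xr2(x, y, fine) - loo_xr2(x, y, coarse))
```

Because the fine grid includes the coarse winner itself, tuning can only nudge xR2 up or leave it unchanged, matching the slight (or zero) improvement described above.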

Hope this helps.
Bruce McCune



KVininska

Sep 28, 2015, 6:23:00 PM
to HyperNiche and NPMR
Thanks for your help, Bruce. Your answer to #2 was especially illuminating. Perhaps I could trouble you for some more advice and clarification.

1. My goal with NPMR is to identify environmental variables that predict abundance, biomass, or presence of target species. We hope this information can be applied to the management of species through the management or monitoring of a few specific variables. I suppose we aren't interested in maximizing xR2 if that means our model includes 5+ variables, since that doesn't really aid our management. Does this mean that there aren't a few variables that are easily identified as predictive of our target species' responses? Here are some examples of what I'm seeing with respect to xR2 values:
Pred.Ct 1, 2,..: 0.257, 0.274, 0.464, 0.482
Another example: 0.023, 0.538, 0.573, 0.595
Given my goals as stated above, what would you recommend?

2. Should I be wary of models with high xR2 values (0.8-0.95)? Does this indicate there may be overfitting, or did I just hit the jackpot?

3. Finally, when I have tuned some models, I have gotten surprising jumps (0.5 to 0.8). Does this mean the NPMR did a poor job of searching model space? Are there some parameters I should change to rerun the analysis?

Thank you again for taking the time to answer my questions.

Bruce McCune

Sep 29, 2015, 11:14:08 PM
to hyper...@googlegroups.com
1. On your goals. Yes, I'd say you are looking for a parsimonious model.

For this example,
Pred.Ct 1, 2,..: 0.257, 0.274, 0.464, 0.482
I'd say you are getting a nice bump with a three-predictor model, suggesting that a 3-way interaction is critical in your system.

Another example: 0.023, 0.538, 0.573, 0.595
Here you have very weak 1-predictor effects, but the 2-predictor model is great. Beyond that I see little gain. I find this kind of system really fascinating: if you didn't represent the interaction, you'd have essentially nothing.
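A hypothetical interaction-only system makes this concrete: take a response y = x1 * x2. Each predictor alone carries essentially no signal (near-zero correlation with y), yet the two together determine y exactly. The variable names and data here are invented for the demonstration:

```python
# Pure 2-way interaction: y depends entirely on x1 and x2 jointly,
# but neither predictor alone is (linearly) related to y.
import math
import random

def corr(u, v):
    """Pearson correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

random.seed(7)
x1 = [random.uniform(-1, 1) for _ in range(5000)]
x2 = [random.uniform(-1, 1) for _ in range(5000)]
y = [a * b for a, b in zip(x1, x2)]  # interaction only, no main effects

print(round(corr(x1, y), 3), round(corr(x2, y), 3))  # each near 0
```

This mirrors the second xR2 sequence: single-predictor models find almost nothing, while the 2-predictor model, which can represent the interaction, fits well.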

2. If your sample size is smallish and you get a very high xR2 like that, it says that the relationship is not driven by overemphasizing _single_ points. But I would still be suspicious -- say you had two separate piles of points in the predictor space; then the cross-validated R2 might still be high, but be an artifact of, for example, non-independence of the points within each group. I recommend exploring the model with Graph | Response Points. There are a number of options there that take some experimentation, but it is a powerful way of tracking down what is actually happening.

3. Your third example is curious -- I have never seen this. Because the tuning process is just making small adjustments in the smoothing parameters (tolerances), without changing the variables involved, the fact that you get a big shift in the xR2 is concerning. I don't know why this is happening, but I do know that strong clumping of points in the predictor space can give weird model behavior. If you figure it out, please let us know!

Bruce McCune
