AICc and Maxent replications; DISMO ENMeval Model Selection


Darin Kopp

Sep 22, 2015, 4:16:09 PM
to Maxent
Hi Folks, 

I'm curious about how to handle model selection using AIC.  I am using the dismo package in R to generate a series of competing models (a full model with all environmental layers and a series of subsequent models produced by removing the variable with the lowest percent contribution, until only a single variable remains).  Each variable set is run with 100 bootstrap replications.

To select the best performing model, I am then using the ENMeval package to calculate AICc for each model.  My problem is that the calculation of AICc is based on a single model run.  If the model is run multiple times using the same variables, different AICc values are given (the maxent model is not withholding any test occurrence points).  Is this a problem?  Should I consider averaging AICc values for each of the 100 maxent replicates and choose the lowest average? Or just use the first AIC value?  Any advice/consensus with respect to maxent and model selection would be much appreciated. Thank you in advance!

Sincerely, 
Darin 

Husam El Alqamy

Sep 22, 2015, 8:19:21 PM
to max...@googlegroups.com

Dear Darin
I have to comment on your approach. First, you should eliminate the variable that contributes least to the training gain in the jackknife test rather than relying on percent contribution. Second, you cannot average the AIC values. The alternative is to use the model that gives the lowest AIC among your 100 replicates, but I am afraid that is not practical, because you would need to know the set of sample points used in that particular replicate, which you cannot get from Maxent. I don't know about dismo. So using replicated models together with AIC does not seem to be the right approach.
Another thing: how large is your sample of presence points? To use 100 replicates you should have a large number of records, at least over 100 points.
Regards

--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+un...@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at http://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/d/optout.

Jamie M. Kass

Sep 25, 2015, 11:23:42 PM
to Maxent
First off, why are you removing variables yourself at all? Some researchers choose variables based on ecological knowledge of what would be appropriate for their species, and others sometimes remove correlated variables (even though they might not need to), but I don't know if your technique makes sense -- just because a variable does not have a high contribution doesn't mean it's worthless. Maxent does its own variable selection, and rarely uses all the variables you give it. If you check the lambda file and look at the second column, anything not zero means Maxent used it. See this pdf for more on the lambda file. Therefore, you should be checking to see what was included in the model before thinking about any of this, because a variable with low contribution at least has a contribution, unlike other variables that might not make it in at all.
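The lambda-file check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the excerpt is hypothetical (invented feature names and values), and it assumes the usual layout in which feature rows have four comma-separated fields (feature, lambda, min, max) while the summary rows at the end of the file have two. The helper name `nonzero_features` is made up for this example.

```python
# Hypothetical excerpt of a Maxent .lambdas file (values invented for illustration).
# Assumed layout: feature rows are "name, lambda, min, max"; summary rows have 2 fields.
SAMPLE_LAMBDAS = """\
bio1, 0.0, -12.0, 289.0
bio12, 1.8431, 0.0, 4512.0
bio1^2, -0.5127, 0.0, 83521.0
(bio4<512.5), 0.0, 0.0, 1.0
linearPredictorNormalizer, 1.3305
densityNormalizer, 85.21
numBackgroundPoints, 10000
entropy, 8.74
"""


def nonzero_features(lambdas_text):
    """Return the names of features whose lambda (second column) is not zero."""
    used = []
    for line in lambdas_text.splitlines():
        # Split from the right so the last three fields are always the numbers.
        fields = [f.strip() for f in line.rsplit(",", 3)]
        if len(fields) != 4:
            continue  # blank lines and the two-field summary rows are skipped
        name, lam = fields[0], float(fields[1])
        if lam != 0.0:
            used.append(name)
    return used


used = nonzero_features(SAMPLE_LAMBDAS)
print(len(used), used)
```

The count of nonzero lambdas is also what is commonly used as the number of parameters when AICc is computed for a Maxent model, so it is worth looking at before comparing models.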

As for AIC values, Alqamy is right that you cannot average them. When you run a range of models on the same data (over the same extent), you can choose the one with the lowest AIC value as the best fit. This is one way to do model selection. My lab also uses a sequential approach: minimize omission rate (OR), then, among models tied on OR, maximize test AUC. This method, we've found, works quite well in choosing models that are not too overfit and have good predictability.
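A minimal sketch of that sequential rule in Python, assuming each candidate model has already been evaluated on the same data. The model names and numbers are invented for illustration, and `select_sequential` is a hypothetical helper, not part of dismo or ENMeval.

```python
# Hypothetical evaluation results for candidate models fitted to the same data.
candidates = [
    {"name": "m1", "omission_rate": 0.10, "test_auc": 0.86},
    {"name": "m2", "omission_rate": 0.05, "test_auc": 0.81},
    {"name": "m3", "omission_rate": 0.05, "test_auc": 0.84},
]


def select_sequential(models):
    """Pick the lowest omission rate; break ties by the highest test AUC."""
    # Tuples compare element by element, so AUC only matters when the
    # omission rates are equal. Negating AUC turns "maximize" into "minimize".
    return min(models, key=lambda m: (m["omission_rate"], -m["test_auc"]))


best = select_sequential(candidates)
print(best["name"])
```

Here m2 and m3 tie on omission rate, so the higher test AUC decides between them. With real results the omission rates are floating-point numbers, so exact ties may be rare; rounding them first is one option, and that choice is an assumption of this sketch.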

Jamie Kass
PhD student
City College, NYC

Nick Matzke

Sep 26, 2015, 1:30:23 AM
to max...@googlegroups.com
If I might pitch in briefly --

I've observed that averaging AIC seems to be a common instinct among many people, in phylogenetics also (e.g., analyses run across many possible trees).  The key things that seem to not get communicated when people learn likelihood/AIC/AICc etc are that:

- likelihood is the probability of the data, given the model (NOT the reverse)
- therefore, likelihoods (and derived statistics like AIC/AICc) can only be used to compare how well *different* models explain the *same* data, meaning *the identical* data
- if the data are different, the likelihood/AIC/AICc are not comparable, and it is meaningless to average them
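Written out, the points above look like this. The formula is the standard small-sample AICc; the notation (M for a model, D for a data set, D_r for the sample drawn in bootstrap replicate r) is introduced here only for illustration. For Maxent, as far as I know, k is usually taken as the number of nonzero lambdas and n as the number of occurrence records.

```latex
% AICc of model M on data D: k parameters, n observations in D
\mathrm{AIC}_c(M \mid D) = 2k - 2\ln\hat{L}(M \mid D) + \frac{2k(k+1)}{n-k-1}

% Meaningful: different models, identical data
\mathrm{AIC}_c(M_1 \mid D) \quad \text{vs.} \quad \mathrm{AIC}_c(M_2 \mid D)

% Not meaningful: each term below refers to a different data set D_r
\frac{1}{R}\sum_{r=1}^{R} \mathrm{AIC}_c(M \mid D_r)
```

Each bootstrap replicate resamples the occurrence points, so the likelihood in every term of the last line is the probability of a different data set, which is why the average has no clear interpretation.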

Longer semi-rant on the topic:

Cheers, Nick




Darin Kopp

Sep 28, 2015, 3:47:07 PM
to Maxent
Thanks for your comments.  My intent in using a backwards, iterative approach is to eliminate the subjective selection of correlated variables, following Van Gils et al. (2014, J. Nature Conservation). I am not removing them myself:

"The least contributing predictors were removed stepwise (Baldwin 2009), until the most parsimonious number of predictors resulting in an AUC > 0.80 was reached (van Gils et al. 2012). Parsimony in the number of predictors reduces the risk of over-fitting (Anderson & Gonzales 2011). Further, the hierarchical procedure removes one pair of collinear variables a posteriori and transparently rather than the customary a priori, subjective selection of one of the two variables to be eliminated (van Gils et al. 2012)."

The use of average AICc was to select the variable combination associated with the best performing model. Now I stand corrected on the AICc usage (assuming my misunderstanding stems from the difference between deterministic and probabilistic modeling approaches?).

I will check into your recommendations.  

Sincerely, 
Darin 

Darin Kopp

Sep 28, 2015, 4:04:36 PM
to Maxent, mat...@nimbios.org
Thanks Nick - this helps! 

Peter Glasnovic

Sep 29, 2015, 2:58:36 PM
to max...@googlegroups.com
Dear all, is it correct to use AIC values to select among models fitted to the same sample with different sets of variables, chosen after testing for correlation among them?

regards, Peter 

Lorenzo Bertola

Jul 14, 2016, 5:00:30 AM
to Maxent
Hello Peter

No, it is not correct. Nick explained why above: