AUC values for good fit

Megan S

Aug 14, 2014, 3:28:05 PM

to max...@googlegroups.com

Hi all,

I am writing up a manuscript and my AUC value is 0.72. I know that is not extremely high, but I am modeling a widespread plant, and widespread species usually have lower AUCs in Maxent models. I am looking for a citation that says AUC values above 0.7 indicate a good fit. I am positive I have seen one before, but I can't seem to find it now. Does anyone know of any?

Thank you,

Megan

Yoan Fourcade

Aug 15, 2014, 8:55:45 AM

to max...@googlegroups.com

Hi Megan,

AUC values are usually interpreted with reference to the following papers:

Araújo, MB, RG Pearson, W Thuiller, and M Erhard. 2005. “Validation of Species–climate Impact Models under Climate Change.” Global Change Biology 11: 1504–1513, which adapted the classification of Swets, JA. 1988. “Measuring the Accuracy of Diagnostic Systems.” Science 240: 1285–93.
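As a rough illustration, the classification could be written as a small lookup. The band boundaries and labels below are an assumption: they are the values commonly quoted from Araújo et al. (2005), reproduced from memory, so please check them against the paper before citing. The function name `classify_auc` is made up for this sketch.

```python
def classify_auc(auc):
    """Map an AUC value to a qualitative label.

    ASSUMPTION: the bands are the ones commonly quoted from
    Araujo et al. (2005), adapted from Swets (1988). Verify them
    against the original papers before using them in a manuscript.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc > 0.9:
        return "excellent"
    if auc > 0.8:
        return "good"
    if auc > 0.7:
        return "fair"
    if auc > 0.6:
        return "poor"
    return "fail"


for value in (0.72, 0.85):
    print(value, classify_auc(value))
```

If these bands are right, 0.72 would fall in the 'fair' class rather than 'good', which is worth checking before looking for a citation for a 0.7 cutoff.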

However, I would advise you not to rely too much on this single value, as it can only measure the fit of the model to your data and fails to represent its biological significance.
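To make that concrete: in a presence/background setting, AUC can be read as the probability that a randomly chosen presence point receives a higher score than a randomly chosen background point. Here is a minimal sketch of that rank-based definition. The function name and the example scores are made up for illustration, and this is not how Maxent itself computes the value internally.

```python
def auc_from_scores(presence_scores, background_scores):
    """Rank-based AUC: share of (presence, background) pairs in which
    the presence point has the higher score. Ties count as half a win."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores, for illustration only.
presences = [0.9, 0.4]
background = [0.5, 0.3, 0.2]
print(auc_from_scores(presences, background))
```

Nothing in this calculation knows about the biology. It only checks whether the presence points you supplied rank above the background points you supplied, so biased inputs can still give a high value.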

There are several papers showing that AUC is almost pointless for evaluating SDMs, for example: Lobo, J M., A Jiménez-Valverde, and R Real. 2008. “AUC: A Misleading Measure of the Performance of Predictive Distribution Models.” Global Ecology and Biogeography 17: 145–151.

You can easily check that even 'bad' models (built with biased or meaningless data) can produce high AUC values. For example, in this paper: Fourcade, Y, J O. Engler, A G. Besnard, D Rödder, and J Secondi. 2013. “Confronting Expert-Based and Modelled Distributions for Species with Uncertain Conservation Status: A Case Study from the Corncrake (Crex crex).” Biological Conservation 167: 161–171, a clearly wrong and biased model obtained an AUC value of 0.85, which would be classified as 'good' following the usual recommendations.

I hope I helped you.

Y

Megan S

Aug 15, 2014, 11:07:59 AM

to max...@googlegroups.com, yoanfo...@gmail.com

Hi Yoan,

Thank you. I feel confident in my model because I did as much as I could to account for spatial bias, and the predicted current distribution is close to what we expect. I know there are tons of papers on other metrics, but no consensus seems to have been reached and most papers just report AUC. Are there any metrics that you recommend? I know that low omission is also good, and my mean omission curve is below the predicted omission line, so I thought that was a good sign.

Thanks so much for the help,

Megan

Sunil Kumar

Aug 15, 2014, 11:37:28 AM

to Maxent

Hi Megan,

I suggest using the partial AUC ratio (pAUC) recommended by Peterson et al. (2008), and test sensitivity.

In the paper below, I used the pAUC ratio to rank the models, and the results showed that pAUC detected subtle differences in model performance (Table 3, page 1038) that were not shown by the MaxEnt-generated AUC.

http://warnercnr.colostate.edu/~sunil/Kumar%20et%20al_2014_Assessing%20potential%20establishment_WCFF%20using%20Niche%20models_with_Appendices.pdf

Peterson, A. T., M. Papes, and J. Soberon. 2008. Rethinking receiver operating characteristic analysis applications in ecological niche modeling. Ecological Modelling. 213: 63-72.
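For anyone who wants to see the idea in code, below is a simplified sketch of a partial AUC ratio, not the exact procedure from Peterson et al. (2008). It assumes the x-axis is the proportion of the study area predicted present, the y-axis is sensitivity on test presences, and only the part of the curve with omission at or below a chosen level E (`max_omission`) is used. The ratio compares that area with the area under the random-expectation line over the same range. The bootstrapping step usually applied to the ratio is left out, and all names and numbers are hypothetical.

```python
def partial_auc_ratio(grid_scores, test_scores, max_omission=0.05):
    """Simplified partial-ROC ratio (model area / null area).

    grid_scores: suitability for every cell of the study area.
    test_scores: suitability at independent test presences.
    Only thresholds keeping sensitivity >= 1 - max_omission are used.
    A ratio above 1 suggests better-than-random performance.
    """
    n_grid, n_test = len(grid_scores), len(test_scores)
    thresholds = sorted(set(grid_scores) | set(test_scores), reverse=True)

    # One curve point per threshold:
    # (proportion of area predicted present, sensitivity).
    points = []
    for t in thresholds:
        area = sum(s >= t for s in grid_scores) / n_grid
        sens = sum(s >= t for s in test_scores) / n_test
        points.append((area, sens))

    # Keep the high-sensitivity end; the last point is always (1, 1).
    min_sens = 1.0 - max_omission
    kept = [(x, y) for x, y in points if y >= min_sens]

    # Trapezoid rule for the model curve over the kept range.
    model_area = sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(kept, kept[1:])
    )
    # Area under the null line y = x from x_start to 1.
    x_start = kept[0][0]
    null_area = (1.0 - x_start ** 2) / 2.0
    if null_area == 0.0:
        return float("nan")  # no usable range at this omission level
    return model_area / null_area


# Hypothetical scores: 100 grid cells, 5 test presences.
grid = list(range(100))
print(partial_auc_ratio(grid, [90, 92, 95, 97, 99]))  # informative model
print(partial_auc_ratio(grid, [10, 30, 50, 70, 90]))  # near-random model
```

With these made-up inputs, the first ratio comes out well above 1 and the second close to 1.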

Above all, see if the species response curves make biological sense and make sure the model is not too complex.

See Fig. A3 (page 23) in our paper below.

http://www.esajournals.org/doi/pdf/10.1890/ES14-00050.1

Good luck with your modeling adventures.

Sunil

Fig. A3. Effects of MaxEnt’s default settings on model complexity, model predictions and response curves. (A) MaxEnt_Env model (default settings) had seven variables, 45 parameters, and AICc value of 2845.54; (B) MaxEnt_Env model (Simple settings) had seven variables, 16 parameters, and AICc value of 2732.51. Simple settings included only Linear, Quadratic and Product features. Default settings resulted in very complex response curves (C, E), and Simple settings resulted in simple non-linear response curves (D, F). Simple settings model had higher accuracy than default settings model (see Tables 2 and 3).
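As a side note on the AICc values in the caption, here is a minimal sketch of the standard small-sample corrected AIC formula, AICc = 2k - 2 ln(L) + 2k(k + 1) / (n - k - 1). The log-likelihoods and sample size behind the figure are not given in this thread, so the numbers in the example are invented. Treating k as the number of model parameters and n as the number of occurrence records is an assumption about how the formula is applied here.

```python
def aicc(log_likelihood, n_params, n_occurrences):
    """Small-sample corrected AIC.

    log_likelihood: log-likelihood of the fitted model.
    n_params: number of model parameters (k).
    n_occurrences: number of records used for fitting (n).
    """
    k, n = n_params, n_occurrences
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Invented values: same fit, different parameter counts, 200 records.
print(aicc(-1300.0, 45, 200))  # complex model
print(aicc(-1300.0, 16, 200))  # simpler model
```

With an equal log-likelihood, the model with more parameters always gets the higher (worse) AICc, which is the penalty on complexity that the figure illustrates.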

-----------------------------------------------

Sunil Kumar, PhD
Research Scientist II, Natural Resource Ecology Lab.,

Affiliate Faculty, Department of Ecosystem Science and Sustainability
B242 NESB, 1499 Campus mail, Colorado State University,
Fort Collins, Colorado-80523 USA.
Tel: 970-491-7056, Fax: 970-491-1965
Home page: http://warnercnr.colostate.edu/~sunil/

NREL: http://www.nrel.colostate.edu/

DESS: http://warnercnr.colostate.edu/ess-home

swammy50

Aug 15, 2014, 1:52:46 PM

to max...@googlegroups.com

My two cents is that any statistic will fail to assess accuracy correctly if the input data (training and/or testing) are biased. I don't see why AUC would be more or less susceptible. I personally find AUC useful but acknowledge it is just one metric. The trouble with many other metrics is that they require selecting a threshold to convert predictions to binary grids, which I have found to be an arbitrary decision without much support when using presence-only data. I have never used partial AUC so I can't comment on that, but if I recall the Peterson paper correctly, it seemed there were cases where it could be useful, though it is not necessarily a metric to use in all situations. I think the argument was something about weighting errors in regions of the confusion matrix more or less depending on your study question. I'm not sure you always have a reason to do this.

There are also randomization procedures described in the literature, as well as comparisons to a purely spatial model (no environmental covariates) in which presence is a function of distance to a known occurrence.
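A rough sketch of the randomization idea, under my own assumptions: draw many sets of random "presence" cells from the study area, fit and score a model on each set, and see how often those null AUCs reach the AUC of the real model. The `fit_and_score` argument is a hypothetical callable that you would have to supply (for example, a wrapper that runs Maxent on the given points and returns the AUC). The toy version below only stands in for it so the example runs.

```python
import random


def null_model_test(observed_auc, candidate_cells, n_points,
                    fit_and_score, n_reps=99, seed=0):
    """Compare an observed AUC with AUCs from randomly placed points.

    fit_and_score: HYPOTHETICAL user-supplied function taking a list of
    cells and returning the AUC of a model fitted to them.
    Returns (p_value, null_aucs); the p-value is the share of
    replicates, plus the observed model, scoring at least as high.
    """
    rng = random.Random(seed)
    null_aucs = []
    for _ in range(n_reps):
        fake_presences = rng.sample(candidate_cells, n_points)
        null_aucs.append(fit_and_score(fake_presences))
    n_as_high = sum(a >= observed_auc for a in null_aucs)
    p_value = (1 + n_as_high) / (1 + n_reps)
    return p_value, null_aucs


def toy_fit_and_score(cells):
    """Stand-in only: returns a made-up AUC between 0.50 and 0.60."""
    return 0.5 + 0.001 * (sum(cells) % 100)


cells = list(range(1000))  # hypothetical cell IDs of the study area
p, nulls = null_model_test(0.72, cells, 50, toy_fit_and_score)
print(p, max(nulls))
```

Drawing the random points from the same biased sampling footprint as the real data, rather than from the whole area, would make the null model a stricter test, but that is a design choice left out of this sketch.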

Hope this helps.

Sam
