Interpretation and reporting of the "maximum possible test AUC"

Maxent Newbie

Mar 8, 2013, 5:20:15 AM

to max...@googlegroups.com

Hello,

I'm analysing results from a Maxent run for a number of species, and I'm wondering how to interpret the "maximum possible test AUC" reported in the Maxent output. I assume it is closely related to the following:

"with presence-only data, the maximum achievable AUC is less than 1 (Wiley et al., 2003). If the species’ distribution covers a fraction ‘a’ of the pixels, then the maximum achievable AUC can be shown to be exactly 1 − a/2. Unfortunately, we typically do not know the value of ‘a’, so we cannot say how close to optimal a given AUC value is." (Phillips et al. 2006)
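
To make the 1 − a/2 bound concrete, here is a minimal sketch of where it comes from. It assumes a perfect model that gives every occupied pixel the same score, higher than every unoccupied pixel, and background points drawn uniformly from all pixels (occupied ones included). The function name and the pixel counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def ideal_presence_only_auc(n_pixels, n_occupied):
    """Exact AUC of a perfect model when presences are compared against
    background pixels drawn uniformly from ALL pixels (occupied ones too)."""
    # A presence point beats a background pixel that is unoccupied (a win)
    # and ties with a background pixel that is occupied (counted as 1/2).
    p_win = Fraction(n_pixels - n_occupied, n_pixels)
    p_tie = Fraction(n_occupied, n_pixels)
    return p_win + p_tie / 2

# Hypothetical landscape: 100 pixels, 20 of them occupied, so a = 0.2.
a = Fraction(20, 100)
auc = ideal_presence_only_auc(100, 20)
print(auc, float(auc))  # 9/10 0.9
```

With a = 0.2 the result is 0.9, which matches 1 − a/2. The bound drops towards 0.5 as the species occupies more of the study area.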

Here is the output from one of my species:

“If test data is drawn from the Maxent distribution itself, then the maximum possible test AUC would be 0.688 rather than 1; in practice the test AUC may exceed this bound.”

Training data AUC: 0.707

Test data AUC: 0.684

Now, what should be reported in this case? Is this a ‘bad’ model, because an AUC < 0.7 is considered ‘poor’ by most classifications? Or is it a ‘good’ model, because the AUC of the test data is close to the maximum possible test AUC?

I can’t remember any paper that reported the maximum possible test AUC from the Maxent output, but wouldn't this be extremely important information?

By the way, I also calculated the true skill statistic (TSS = 0.43) and ran null models; the AUC for this species is highly significant compared to the null.
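
For reference, the true skill statistic is sensitivity + specificity − 1, computed from a thresholded confusion matrix. A minimal sketch follows; the counts are hypothetical and are not the data behind the 0.43 above.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = sensitivity + specificity - 1, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of presences predicted present
    specificity = tn / (tn + fp)  # fraction of absences predicted absent
    return sensitivity + specificity - 1

# Hypothetical counts: sensitivity 0.8 and specificity 0.7.
tss = true_skill_statistic(tp=80, fn=20, fp=30, tn=70)
print(round(tss, 3))
```

TSS ranges from −1 to 1, with 0 meaning no better than random at the chosen threshold.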

A similar question was asked 4 years ago (here), but without any conclusive answer.

Thanks in advance for all answers! :)

PS: Here is S. Phillips's explanation of how the maximum possible test AUC is calculated:

"Anne Overgaard asked how Maxent computes the estimate of maximum achievable AUC. I think you're talking about the number given in the html output files, just before the ROC curve. This estimate is not the same as the (1-prevalence/2) estimate from my 2006 Ecological Modelling paper. The html output says this estimate is calculated by assuming that "test data is drawn from the Maxent distribution itself". In more detail: imagine drawing a collection of test data from Maxent's raw output distribution. Some of the test points would be in places with very suitable conditions for the species, at least according to the Maxent output. Others would have only moderately suitable conditions, or even fairly marginal conditions, again according to Maxent's estimate. The former points have high Maxent output values, so they contribute towards a higher AUC. The latter points have moderate or fairly low Maxent output values, which lowers the AUC."
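
As far as I understand the explanation above, the calculation can be sketched as follows. This is one reading of the quoted description, not Maxent's actual code: test points are assumed to land on each pixel with probability proportional to its raw output, background points are assumed uniform over the same pixels, and ties count as 1/2. The function name and the toy raw-output values are hypothetical.

```python
def max_test_auc(raw_output):
    """Expected AUC if test presences were drawn from the raw output itself.

    raw_output: per-pixel raw values (non-negative; normalised here to sum to 1).
    Background points are assumed to be uniform over the same pixels.
    """
    total = sum(raw_output)
    n = len(raw_output)
    auc = 0.0
    for p in raw_output:
        below = sum(1 for q in raw_output if q < p)  # background pixels ranked lower
        ties = sum(1 for q in raw_output if q == p)  # includes the pixel itself
        # Weight by the chance that a test point lands on this pixel.
        auc += (p / total) * (below + 0.5 * ties) / n
    return auc

flat = max_test_auc([0.25, 0.25, 0.25, 0.25])  # no contrast between pixels
peaked = max_test_auc([0.4, 0.3, 0.2, 0.1])    # some contrast between pixels
print(flat, peaked)
```

In this toy case the flat distribution gives about 0.5 and the peaked one about 0.625. Under this reading, a low "maximum possible test AUC" such as 0.688 indicates that the predicted distribution is spread fairly evenly over the study area.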
