ENMeval and Maxent results


Mr. Arpit Deomurari

Jun 16, 2021, 8:26:06 AM
to Maxent
Hi All,

I am running SDMs for many species for my PhD work. I just wanted to ask about my workflow using ENMeval:

Step 1. Build ENMeval models and find the RM and FC values for delta AICc = 0
Step 2. Build a Maxent model using dismo with the derived RM & FC values and 10-fold CV

Will the predictions change if I use the ENMeval best model directly instead of running Maxent again?
Will the AUC values differ between the ENMeval delta AICc = 0 model and a Maxent model built with the ENMeval-derived RM & FC?

Regards

gafna jeff

Jun 16, 2021, 8:41:40 AM
to Maxent
Dear Arpit,
My experience with the ENMeval package is that I just use the selected model directly, without building another model with the selected parameters (i.e., FC and RM). Any difference in the predictions should be negligible. And remember that there is no perfect model. On your second question: select the model with the lowest AICc, then evaluate that model on the testing data set, and report the AUC from that evaluation.

Jeff

Jamie M. Kass

Jun 17, 2021, 3:11:49 AM
to Maxent
Arpit,

Jeff is right that you can just use the model from ENMeval without having to build it again. Regardless of how many times you run it, you will get the same model using the same fc and rm settings every time unless you change the background records.

About using AICc: you would choose the model with the lowest value (delta AICc = 0), unless there are models with delta AICc less than 2, which are statistically just as good as the delta AICc = 0 model. In these cases you need to decide which settings are best -- you could pick the simplest model, or the one with the highest validation AUC, etc. However, be aware that if you plan to transfer the model to other times or places, AICc will not tell you much about transferability. See the Discussion section of our new paper on ENMeval 2.0: https://doi.org/10.1111/2041-210X.13628.
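
For illustration, that selection routine could be sketched roughly like this in R (the column names `delta.AICc`, `avg.test.AUC`, and `settings` follow ENMeval 0.3.x and are assumptions here -- check names() on your own @results, as they changed in version 2.0):

```r
# Hedged sketch: `res` is assumed to be the @results data frame from ENMevaluate()
cand <- res[res$delta.AICc < 2, ]                          # statistically comparable models
cand <- cand[order(cand$avg.test.AUC, decreasing = TRUE), ]  # tie-break on validation AUC
best.settings <- cand$settings[1]                          # settings label of the chosen model
```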

Jamie

Mr. Arpit Deomurari

Jun 18, 2021, 9:47:56 AM
to Maxent
Hi Jamie,

I have to rebuild the model with the derived parameters because I wanted to carry out 10 replicates. Surprisingly, the AUC values are different.

I have kept the background constant.
eval.results <- ENMevaluate(occ = pres.data.clean[, c("longitude", "latitude")],
                            env = stk_Current, a = abs.data,
                            RMvalues = seq(0.5, 5, by = 0.5),
                            fc = c("L", "LQ", "H", "LQH", "LQHP", "LQHPT"),
                            method = "randomkfold", kfolds = 5,
                            rasterPreds = TRUE, algorithm = "maxent.jar",
                            bin.output = TRUE, clamp = TRUE,
                            parallel = TRUE, numCores = ncore, progbar = FALSE)

and here then I carry out 10 replicates using derived FC and RM values.

Here are the ENMeval results for delta AICc == 0:
train.AUC: 0.9209
avg.test.AUC: 0.904296

Whereas the 10 Maxent replicates give me:
train.AUC: 0.8606, 0.8575, 0.8502, 0.8514, 0.8499, 0.8463, 0.8561, 0.8616, 0.8564, 0.8552
test.AUC: 0.7816, 0.768, 0.8457, 0.7977, 0.791, 0.9485, 0.809, 0.767, 0.8437, 0.8268

I'm using maxent.jar 3.3.3k

Regards

Jamie M. Kass

Jun 18, 2021, 1:45:51 PM
to max...@googlegroups.com
Arpit,

So you're making replicates to see how AUC varies across random k-fold runs, I take it. Training AUC should not change given the same background, but test (validation) AUC should, so I'm not sure why you are seeing varying AUC.train values. Are you running these replicates by rerunning ENMevaluate() multiple times, or by using the maxent.jar GUI? 

The model objects stored in the ENMevaluation object slot "models" are built with all the data, so these should change for different values of fc and rm but remain the same for different runs with random k-fold -- all that should change there are AUC.test values.
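
To illustrate, the full-data model for any settings combination can be pulled straight from that slot (slot and column names here follow ENMeval 0.3.x, matching the ENMevaluate() call above; the "LQH_2" label is a made-up example):

```r
# Hypothetical sketch: grab the full-data model for one settings combination
i <- which(eval.results@results$settings == "LQH_2")  # example settings label
m.full <- eval.results@models[[i]]
# Rerunning ENMevaluate() with the same occurrences, background, fc, and rm
# reproduces this model; only the random fold assignments (and therefore the
# AUC.test values) change between runs.
```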

Jamie


Mr. Arpit Deomurari

Jun 19, 2021, 2:32:27 PM
to Maxent
Hi Jamie,

I'm re-running the final 10 replicates in R using dismo with maxent.jar. It's surprising to see the train AUC change, since the background is the same, along with the FC and RM values.

Any insights?

Regards

Jamie M. Kass

Jun 22, 2021, 10:01:45 PM
to max...@googlegroups.com
Arpit,

Okay, I figured it out. The "replicates" option in the Java software maxent.jar (which can be accessed through dismo::maxent() with args = "replicates=10", for example) has several modes (see A Brief Tutorial to Maxent <https://biodiversityinformatics.amnh.org/open_source/maxent/Maxent_tutorial_2021.pdf>), but the default is random cross-validation. This splits your presence data into k randomly assigned groups (here, k would be 10 for 10 replicates), builds a model on k-1 groups, then evaluates that model on the left-out group, repeating until every group has been evaluated once.

The training AUC values reported for the replicates thus correspond to each model built with k-1 groups, and so they differ slightly. Because the training data (the k-1 groups) differ less between rounds of cross-validation than the left-out group does, training AUC should vary less than test AUC (also called "validation AUC"). Usually, one takes the average of test AUC over all replicates to gauge model performance.

For users of ENMeval, the partition choice "randomkfold" refers to this same procedure. However, the training AUC reported for each combination of model settings is calculated on the full dataset with those settings, not as the average of the per-fold training AUC values --- the per-fold values are instead used to calculate AUC diff for each fold (the fold's train AUC minus its validation AUC). This can understandably get confusing.

The bottom line is that you should be able to do whatever model selection routine you choose (for you, I believe it was AICc), extract the model corresponding to those settings, and use it for your analysis. Just remember that AICc does not use cross-validation results at all, so the choice of "randomkfold" here would not affect the AICc values. Other stats such as validation AUC, validation Continuous Boyce Index, and omission rates would, however, reflect average model performance for the assigned partitions.
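
A rough sketch of that bottom line (again assuming ENMeval 0.3.x column names such as `delta.AICc`, `avg.test.AUC`, and `avg.test.orMTP` -- verify these against your own @results table):

```r
# Sketch: AICc-based selection ignores the partitions entirely, while the
# avg.test.* columns summarize performance over the random k folds
res <- eval.results@results
opt <- which(res$delta.AICc == 0)[1]      # same answer whatever "method" was used
final.mod <- eval.results@models[[opt]]   # full-data model for the chosen settings
res[opt, c("avg.test.AUC", "avg.test.orMTP")]  # average fold-based performance
```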

Hope this clarifies a bit.

-----------------------------------------------------------
Jamie M. Kass, Ph.D.
JSPS Postdoctoral Scholar
Okinawa Institute of Science and Technology Graduate University


Jamie M. Kass

Jun 22, 2021, 10:03:10 PM
to max...@googlegroups.com
By the way, here is code to show what I am talking about. It will work as long as the dismo package is installed. Train AUC values should be more similar to each other than values of test AUC over the replicates.

# Bradypus example data shipped with dismo
occs <- read.csv(file.path(system.file(package = "dismo"), "ex/bradypus.csv"))[, 2:3]
envs <- raster::stack(list.files(path = file.path(system.file(package = "dismo"), "ex"),
                                 pattern = "grd", full.names = TRUE))
# 1000 random background points, named to match the occurrence columns
bg <- as.data.frame(dismo::randomPoints(envs, 1000))
names(bg) <- names(occs)

# Linear + quadratic features (all others turned off), rm = 2,
# 10 replicates with the default random cross-validation
m <- dismo::maxent(x = envs, p = occs, a = bg, factors = "biome",
                   args = c("noremoveDuplicates", "noautofeature",
                            "nohinge", "noproduct", "nothreshold",
                            "betamultiplier=2", "replicates=10"))

res <- data.frame(m@results)
res[5, ]  # row 5: Training.AUC for each replicate
res[8, ]  # row 8: Test.AUC for each replicate

-----------------------------------------------------------
Jamie M. Kass, Ph.D.
JSPS Postdoctoral Scholar
Okinawa Institute of Science and Technology Graduate University
