Interpret result of Maxent


Stefany Vega

Jul 20, 2018, 10:14:48 AM
to Maxent
Hi,

I am a beginner with Maxent, and I am modelling the behavioural distribution of birds. I have four questions.

1) I ran Maxent with 10 replicates, 500 interactions and 25% of crossvalidate. I have a result for each replicate and one for the average. I don't know which output to use: the average of the replicates, or should I choose the best replicate according to the AUC?

I am confused because the average has an AUC of 0.57, while each replicate has an AUC of 0.80-0.89.
2) Do I have to work with the average map?
3) Which is the best graph to describe the variables: the jackknife of regularized training gain or the jackknife of AUC? The latter is very different from the table that estimates the relative contribution of the variables.
4) In the graphs of a categorical variable, I don't know how to relate the category number to the category on the map, because I use a different coding.

Thank you so much for your help!
Passer_domesticus_des_habitat_only.png

Heiko

Jul 24, 2018, 5:28:00 AM
to Maxent
Hi Stefany.
First of all, what is "500 interactions and 25% of crossvalidate"?
What exactly is it you want to do? There is a difference between leaving a certain percentage out of the training for testing (e.g. 25%) and cross validation.

"Do I have  to work with the average map?"
In my opinion, if you do a cross validation you are interested in the average. That said, if there is a huge difference between the single runs, then there is something wrong with the model. Of course you could also just choose the best model, but that might blur the truth somewhat.

The difference in the AUC might be due to the reason that one is the AUC for the training and the other for the test data? Please check that.

I am struggling with your question 3 myself at the moment. Jackknife and the percent contribution / permutation importance can sometimes tell a very different story.

4) Yes, that is an issue for me as well. I always have to go back to my original data and see how it is coded there. A "2" in the original data should be a 2 in Maxent as well. Or am I wrong on this one? But of course, if you have names that are automatically renamed, it's tricky. Maybe one way is to give the categories numbers from the start.
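
One way to keep track of the coding is a small lookup table kept next to the raster. This is only a minimal Python sketch, assuming the integer codes in the raster are passed through to Maxent unchanged (the very point questioned above). The category names and codes are hypothetical and not taken from the data in this thread.

```python
# Hypothetical mapping from the integer codes stored in the categorical
# raster to the category names used in the map legend.
HABITAT_CODES = {
    1: "forest",
    2: "grassland",
    3: "urban",
}


def label_for(code: int) -> str:
    """Return the legend name for a category code shown on a response plot."""
    return HABITAT_CODES.get(code, f"unknown code {code}")


print(label_for(2))
```

Codes missing from the table are reported as unknown rather than guessed, which makes a mismatch between the map coding and the model input easy to spot.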

Hope this helps a little!

Stefany Vega

Jul 24, 2018, 6:15:08 AM
to max...@googlegroups.com
Thank you so much Heiko, you are helping me a lot.

I am a little confused about cross validation and the training test. Could you please explain what each one does?

"The difference in the AUC might be due to the reason that one is the AUC for the training and the other for the test data? Please check that."

I am not sure. The lowest AUC is the average: for example, the average has an AUC of 0.57, but when I check each replicate I get AUCs between 0.8 and 0.9. I don't know why the average AUC is so low. Do you think there is something wrong with the model?

Thanks so much for your help :).





--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+unsubscribe@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at https://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/d/optout.

Heiko

Jul 24, 2018, 6:35:56 AM
to Maxent
"I am a little confused about cross validation and the training test. Could you please explain what each one does?"
It's pretty easy. Subsampling just randomly sets aside a certain percentage of your points for later testing (they are not included in the model training). You can repeat this step multiple times.
Cross validation separates your whole data set into x bins (x = number of "replicates" in the Maxent settings). So let's say you have 100 presence points and choose cross validation with 5 replicates. Your 100 points will be divided into groups of 20. Then the first 20 are set aside for testing and groups 2, 3, 4, 5 are used for training. In the next round the second group is set aside for testing and groups 1, 3, 4, 5 are used for training. And so on... The advantage is that you use EVERY point for testing. With subsampling it's random. But if you only have, I don't know, 20 points, then cross validation might be less applicable.
So: if you choose 25% "random test percentage" and 5 replicates with cross validation, Maxent will complain. It's either or.
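
The difference between the two schemes can be sketched in a few lines of Python. This only illustrates the idea described above; how Maxent itself assigns points to bins is not shown in this thread, so the shuffling and the function names here are assumptions.

```python
import random


def subsample_split(points, test_fraction, rng):
    """Randomly set aside a fraction of the points for testing (one replicate)."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, test)


def cross_validation_splits(points, n_folds, rng):
    """Split the points into n_folds bins; each bin is the test set exactly once."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        splits.append((train, test))
    return splits


rng = random.Random(0)
points = list(range(100))  # stand-ins for 100 presence records

cv = cross_validation_splits(points, 5, rng)
test_sizes = [len(test) for _, test in cv]
tested = sorted(p for _, test in cv for p in test)
print(test_sizes)        # five test groups of 20 points each
print(tested == points)  # every point is tested exactly once

train, test = subsample_split(points, 0.25, rng)
print(len(train), len(test))  # 75 training points, 25 test points
```

With repeated subsampling, a given point may land in the test set several times or never. With cross validation, each point is tested exactly once.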


OK, but what's the TEST AUC in your single runs?

Stefany Vega

Jul 24, 2018, 7:44:47 AM
to max...@googlegroups.com
The table below shows the AUC for each replicate and the mean AUC. Why is the mean AUC low? And what do you think: is it better to work with the mean AUC, or should I choose the best model among the replicates?

Replicate   Training data AUC   Test data AUC
1           0.88                0.623
2           0.886               0.6
3           0.86                0.822
4           0.88                0.54
5           0.886               0.731
6           0.882               0.496
7           0.866               0.741
8           0.88                0.837
9           0.884               0.647
10          0.868               0.783

Mean AUC: 0.683
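
A quick arithmetic check on the table, as a sketch: averaging the two columns suggests that the reported mean AUC is the mean of the test AUCs, not of the training AUCs. The small difference from 0.683 is presumably due to rounding in the per-replicate values, but that interpretation is an assumption.

```python
from statistics import mean

# Per-replicate values copied from the table above.
training_auc = [0.88, 0.886, 0.86, 0.88, 0.886, 0.882, 0.866, 0.88, 0.884, 0.868]
test_auc = [0.623, 0.6, 0.822, 0.54, 0.731, 0.496, 0.741, 0.837, 0.647, 0.783]

mean_training = round(mean(training_auc), 3)
mean_test = round(mean(test_auc), 3)

print(mean_training)  # 0.877
print(mean_test)      # 0.682
print(min(test_auc), max(test_auc))  # spread of the test AUC across replicates
```

The training AUC is nearly constant across replicates, while the test AUC ranges from about 0.5 to 0.84, so the low mean comes from the test data.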

Thank you so much.


Stefany Vega

Jul 24, 2018, 8:02:59 AM
to max...@googlegroups.com
Heiko, 

About the cross validation: I am doing an SDM of bird behaviour, and I have 41 feeding points and 54 resting points for one species, and 8 feeding points and 30 resting points for the second one. In this case, which is better to do in crossvalidation?

I made my model with random test percentage 25% and 10 replicates.

Those are the results I mentioned, and that is the question I first asked: why is the mean AUC low? And what do you think: is it better to work with the mean AUC, or should I choose the best model among the replicates?

Heiko

Jul 24, 2018, 10:45:47 AM
to Maxent
Hey Stefany.
Feeding and resting? Well, I don't know how to implement that in an SDM.
For me, these are presence points, no matter what the species does there. Maybe there is a way to use that information, but I never tried.
"In this case, which is better to do in crossvalidation?"
I don't understand the question. But let's say for species A you have 95 occurrence points, if I understood you right. That is not bad. You can do cross validation or subsampling, I would say. Which is better, I can't comment on.

"I made my model with random test percentage 25% and 10 replicates."
OK, so you are not using cross validation but subsampling!

"Why is the mean AUC low?"
Well, why exactly the AUC is low is hard to say. Are you sure you covered the majority of the ecological drivers of that species? Or could there be other variables that affect its distribution but are missing in your model? That would be the most obvious but not the only possible reason.

Stefany Vega

Jul 24, 2018, 11:17:40 AM
to max...@googlegroups.com
Heiko,

Thank you so much, you are helping me a lot. I didn't explain it well: I have 4 behaviours, so I have one sample with 6 points and others with 40 and 30 points, and I am running a separate model for each one.

I was reading that for small samples I have to do a leave-one-out cross validation. Do I have to set the number of replicates equal to the number of samples? And does that mean the random test percentage in Maxent should be 0?

And the last question: is it better to use the 10 percentile training presence as a threshold, or what do you recommend for small samples?




Heiko

Jul 25, 2018, 5:04:22 AM
to Maxent
Like I said, I have no experience with including behavioural data in an SDM. I guess you could try using this data to show where the species is when it is doing a certain thing, e.g. feeding. Might be interesting.
The easiest way of course would be to just put together all the data you have and use it as presence data. But that would ignore any information on WHY the species is present there, of course.

6 points will probably not be enough to model anything useful, I would say.

Yes, if you want to use cross validation (in whatever form), the "random test percentage" has to be zero. But try putting something in there. If you choose "cross validation" and a number > 1 for replicates, MaxEnt will tell you when you start the run that "random test percentage" is set to zero.
I have never used leave-one-out CV, because I have enough points to leave out more than 1 point for testing. Just try it out.
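
For reference, leave-one-out is simply cross validation with as many bins as there are points, so the number of replicates would equal the number of presence records. Here is a minimal Python sketch of the splits it produces. The point labels are placeholders, and this only illustrates the scheme rather than how Maxent implements it.

```python
def leave_one_out_splits(points):
    """Each point is the single test record once; all others are used for training."""
    return [
        (points[:i] + points[i + 1:], [points[i]])
        for i in range(len(points))
    ]


points = ["p1", "p2", "p3", "p4", "p5", "p6"]  # stand-ins for 6 presence records
splits = leave_one_out_splits(points)

print(len(splits))  # one replicate per point
for train, test in splits:
    print(test, len(train))
```

With 6 points each model is trained on only 5 records and tested on 1, so the per-replicate test statistics are very noisy.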

Your last question is about thresholds, a whole different topic that I don't really feel comfortable giving advice about. But one thing I can say: a question like "is it better to use..." is always difficult. If one option were superior to the others, why would there be options? It always depends on the data, the goal, and so on.
Maybe someone else can give some advice on what to use when.

Bye and good luck!