Choosing Maximum number of Background Points


Lyn Howe

Aug 5, 2017, 10:19:54 PM
to Maxent
I am working on a species distribution model with only 50 presences. I have been unable to find much information on how to choose the number of background points. The default is 10,000, but I don't know whether that is optimal for so few presences. Any literature or advice on the matter would be very helpful.
Thanks in advance

Iman Momeni

Jan 5, 2018, 4:31:04 AM
to Maxent



No solution yet?

Dimitris Poursanidis

Jan 5, 2018, 4:34:16 AM
to max...@googlegroups.com
The default of 10,000 background points is fine to use.
See the attached paper on that issue.

--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+unsubscribe@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at https://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/d/optout.

Selecting pseudo-absences for species distribution.pdf

Iman Momeni

Jan 5, 2018, 5:39:58 AM
to Maxent
Thank you, Dimitris.
But in this paper, 10,000 is the largest number of background points they evaluated.
What about 20,000 or 30,000 points?



Dimitris Poursanidis

Jan 5, 2018, 5:43:16 AM
to max...@googlegroups.com
Try it and see.

Also keep in mind that only one background point is placed in each cell, so the number of cells in your raster files limits how many points can fit.
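A minimal sketch of that check, assuming the raster has already been read into a list of rows and that NoData cells are marked with -9999 (a common convention, but check your own files). The function names are made up for illustration and are not part of Maxent.

```python
def count_valid_cells(grid, nodata=-9999):
    """Count cells that hold data, i.e. cells eligible for a background point."""
    return sum(1 for row in grid for value in row if value != nodata)


def usable_background_points(requested, grid, nodata=-9999):
    """Cap the requested number of background points at the number of valid cells."""
    return min(requested, count_valid_cells(grid, nodata))


# Toy 3 x 4 raster; -9999 marks NoData (assumed convention, e.g. sea cells).
toy_grid = [
    [12.1, 13.4, -9999, -9999],
    [11.8, 12.9, 14.2, -9999],
    [-9999, 12.5, 13.7, 15.0],
]

print(count_valid_cells(toy_grid))                 # 8 valid cells
print(usable_background_points(10000, toy_grid))   # capped at 8
```

On a real study area the same comparison tells you whether asking for 20,000 or 30,000 points is even possible.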

Iman Momeni

Jan 5, 2018, 5:46:31 AM
to Maxent
Thanks

Weverton Carlos

Jan 5, 2018, 3:44:10 PM
to Maxent

The more background points you set, the higher the entropy of the model will be, so the distribution will be more spread out.

You can tick the "Write background predictions" option under Experimental settings to get a file with the points randomly selected to represent the background. This file is useful for checking whether the random points are a good representation of the environmental variables. If they are not, you should increase the number of background points.
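One way to run such a check is sketched below. It assumes you have already loaded the values of one environmental variable into two lists, one for the background points and one for every cell of the raster. It compares their empirical distributions with a two-sample Kolmogorov-Smirnov statistic, which is my choice of measure and not something Maxent reports. The data are synthetic stand-ins.

```python
import bisect
import random


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in set(a) | set(b):
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Synthetic stand-in for one variable over all raster cells (not real data).
rng = random.Random(0)
all_cells = [rng.gauss(15.0, 5.0) for _ in range(50_000)]

# Smaller statistic = background sample resembles the full raster more closely.
for n in (100, 1_000, 10_000):
    background = rng.sample(all_cells, n)
    print(n, round(ks_statistic(background, all_cells), 4))
```

Repeating this for each predictor gives a rough picture of whether the chosen number of background points covers the available environment.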

Kaija Gahm

Jan 11, 2018, 3:27:47 PM
to Maxent
I'm facing a similar problem in choosing the number of background points to use. One thing that I'm trying is to run a bunch of models (same presence data each time), each with a different number of background points (1000 through 20000 in increments of 500, say) and calculate AUC for each model using 10-fold cross-validation. 

Then I plot the number of background points on the x-axis and the AUC value on the y-axis. I'm finding that AUC increases up to about 10000 or 15000 background points and then levels off. I have about 19000 occurrence points. I don't know if this pattern is a function of my sample size or whether it's relatively universal. 

Does this seem like a valid way of investigating different numbers of background points?
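For anyone wanting to try the same sweep, here is a rough sketch of the two pieces involved: a presence-versus-background AUC computed from prediction scores, and a helper that reports where the curve levels off. The tolerance of 0.005 and the AUC values in the example are invented for illustration, not results from any model.

```python
def auc(presence_scores, background_scores):
    """Probability that a random presence outscores a random background point.

    Ties count one half. Quadratic time, which is fine for a sketch.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def plateau_start(n_background, auc_values, tolerance=0.005):
    """Smallest background size after which no later AUC is better by more
    than `tolerance` (an arbitrary threshold chosen for this example)."""
    if not n_background:
        raise ValueError("need at least one model run")
    for i, n in enumerate(n_background):
        if max(auc_values[i:]) - auc_values[i] <= tolerance:
            return n


# Made-up mean cross-validated AUCs, one per model run (illustrative only).
sizes = [1000, 5000, 10000, 15000, 20000]
aucs = [0.71, 0.78, 0.82, 0.823, 0.824]
print(plateau_start(sizes, aucs))  # 10000 for these made-up numbers
```

Averaging the AUC over the cross-validation folds before calling `plateau_start` keeps fold-to-fold noise from being mistaken for a trend.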

Dimitris Poursanidis

Jan 12, 2018, 1:16:16 AM
to max...@googlegroups.com
What Carlos says is correct.
Also, what geographic extent do the 19,000 occurrence points cover (is it a global study?), and what is the spatial resolution of the predictors?


Weverton Carlos

Jan 15, 2018, 4:25:51 PM
to Maxent
At what value does the AUC level off?
When you increase the number of background points, you get a model with higher entropy. Thus, fewer points tend to fall outside suitable areas, and the AUC should be higher. There are probably some occurrence points situated in very unsuitable areas, and that is why the AUC values level off.

Iman Momeni

Jan 16, 2018, 3:18:54 PM
to Maxent
Dear Carlos,
Did you see figure 4 in Phillips & Dudik, 2008?

Weverton Carlos

Jan 18, 2018, 5:41:07 PM
to Maxent
Yes, I did. It's a graph showing what I said: increasing the number of background points leads to a higher AUC because the entropy becomes higher. I suppose the AUC tends to level off because of occurrence points situated in pixels that represent very unsuitable environmental conditions for the species.

Jamie M. Kass

Feb 8, 2018, 4:19:16 AM
to Maxent
Something perhaps not considered in this conversation is the need to get a good sample of the study extent with the background points. Is a sample of 10,000 good for a raster with 20 million cells? I would say no, and it would mean that each background sample of 10,000 would result in a very different model. The point of the magic number 10,000 was to try and get a representative sample without exhaustively sampling every cell because of concerns about runtime or computing power. However, if the computer can handle it, getting a better sample is important. Please see the paper below for an example of how poor background samples can lead to erroneous predictions for the past.

Guevara, L., Gerstner, B. E., Kass, J. M., & Anderson, R. P. (2017). Toward ecologically realistic predictions of species distributions: A cross‐time example from tropical montane cloud forests. Global Change Biology.
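To put numbers on the sampling point above, here is a small sketch. It computes the fraction of cells that a background sample covers, and how much the mean of one environmental variable varies between repeated random samples of a given size. The helper names, the 20 repeats, and the synthetic cell values are assumptions for illustration only.

```python
import random
import statistics


def sampling_fraction(n_background, n_cells):
    """Share of raster cells covered by the background sample."""
    return n_background / n_cells


def spread_of_sample_means(cell_values, n_background, repeats=20, seed=0):
    """Standard deviation of the sample mean across repeated random draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(cell_values, n_background))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# 10,000 points on a raster with 20 million cells.
print(f"{sampling_fraction(10_000, 20_000_000):.4%}")  # prints 0.0500%

# Synthetic stand-in for one variable (kept small so the sketch runs quickly).
rng = random.Random(1)
cells = [rng.gauss(15.0, 5.0) for _ in range(200_000)]
for n in (100, 1_000, 10_000):
    print(n, round(spread_of_sample_means(cells, n), 4))
```

A large spread between draws suggests that different background samples of that size could give noticeably different models.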

Jamie Kass
PhD Candidate
City College of NY

Nolan Helmstetter

Sep 5, 2019, 10:11:53 AM
to Maxent
Great article! I just saw this post and was reminded to go back and re-read it! 

One thing I am curious about, in regard to Maxent and validating models: if background points include presence locations, how does this affect model validation (such as test AUC)?

Can Maxent discriminate between a background point with no occurrences and a background point with an occurrence? I realize background points aren't considered absences in Maxent, only a sample of environmental conditions (what's available to a species), but I'm curious about the implications of including presences in background selection vs. excluding presences from background selection. Just something that has been on my mind!

Thanks,

Nolan

Adam Smith

Sep 6, 2019, 5:44:10 PM
to Maxent
Hi Nolan,

Presumably Maxent (or any other SDM) that uses background sites, which can land on a presence (known or unknown), would assign a higher prediction to a background site that landed on a presence than to one that landed on a true absence (known or unknown).

Generally, for a well-tuned model the maximum achievable AUC calculated with background sites will be 1 - a/2, where a is the prevalence (mean probability of occurrence) of the species. This means that if you get an AUC of, say, 0.6, you can't say whether it is good or bad in absolute terms. If the species occupies 80% of the study region (mean Pr(occ) = 0.8), the maximum is 1 - 0.8/2 = 0.6, so this would be an excellent AUC. But if it occupied less of the region, 0.6 might be low. However, you should be able to compare AUCs between models for the same species using the same data (pres + bg): higher values should indicate better models because they should correlate positively with AUC calculated with presences and absences (which is what we would normally like to have).
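The 1 - a/2 ceiling is easy to tabulate. A small sketch (the function name is just for illustration):

```python
def max_background_auc(prevalence):
    """Ceiling on AUC when presences are compared with background sites.

    A fraction `prevalence` of background sites is occupied, so a perfect
    model can only tie with them (contributing 0.5), while it ranks every
    presence above the unoccupied rest (contributing 1):
        prevalence * 0.5 + (1 - prevalence) * 1 = 1 - prevalence / 2
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return 1.0 - prevalence / 2.0


for a in (0.05, 0.2, 0.5, 0.8):
    print(f"prevalence {a:.2f} -> max AUC {max_background_auc(a):.3f}")
```

So an AUC of 0.6 sits at the ceiling for a species occupying 80% of the region, but far below it for a rare one.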

However, it's possible to have AUC (pres+bg)  > 1 - a/2 if you overtune your model, in which case there will be  a *negative* relationship between AUC calculated with presences/absences and presences/background sites. So chasing a higher AUC(pres+bg) can actually make a worse model, even if it seems better.

I have no empirical basis for saying this, but I expect that Maxent's use of LASSO regularization tends to guard against overtuning models, so the point in the paragraph above is less of an issue. However, I have seen other algorithms that I really felt were overtuned, and this gave very high AUC (pres+bg) but probably would have given low AUC (pres+abs).

Adam
Assistant Scientist
Missouri Botanical Garden

Terrell Roulston

May 1, 2025, 8:16:51 PM
to Maxent
Reopening this discussion as it is the top result when searching Google for how many background points are needed in Maxent models.

Valavi et al. 2022 (https://doi.org/10.1002/ecm.1486) tested just this (along with many other SDM methods). 

They found 50,000 points was the "gold standard" for large geographic areas (i.e. New Zealand). Although it should be noted that this comes at an increased computational expense, so perhaps an intermediate number of points between 10,000 and 50,000 may be more appropriate for your application. See Figure 1.

As others have mentioned, it is important to note that increasing the number of background points can artificially increase AUC, so other metrics of model performance, such as the Continuous Boyce Index, are more reliable.
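For readers unfamiliar with the Boyce index, here is a simplified sketch of the idea. Slide a window along the suitability axis, compute the ratio of the share of presences to the share of background points in each window, then take the Spearman correlation between that ratio and the window position. The window count and width are assumed defaults, and this is not a replacement for an established implementation.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (var_x * var_y) ** 0.5


def boyce_index(presence_scores, background_scores, n_windows=20, width=0.1):
    """Simplified continuous Boyce index; `width` is a fraction of the range."""
    lo = min(min(presence_scores), min(background_scores))
    hi = max(max(presence_scores), max(background_scores))
    span = hi - lo
    w = width * span
    mids, ratios = [], []
    for k in range(n_windows):
        start = lo + k * (span - w) / (n_windows - 1)
        end = hi if k == n_windows - 1 else start + w
        p = sum(start <= s <= end for s in presence_scores) / len(presence_scores)
        e = sum(start <= s <= end for s in background_scores) / len(background_scores)
        if e > 0:  # skip windows with no background points
            mids.append(start + w / 2)
            ratios.append(p / e)
    return spearman(mids, ratios)


# Toy scores: presences concentrated at high suitability (illustrative only).
presences = [0.7, 0.8, 0.85, 0.9, 0.95]
background = [i / 100 for i in range(101)]
print(round(boyce_index(presences, background), 3))
```

Values near 1 mean presences concentrate where predicted suitability is high, values near 0 mean the model is no better than random, and negative values mean presences fall where suitability is predicted to be low.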