SWD format – Background points – strange results


Melanie

Apr 4, 2014, 11:07:28 AM
to max...@googlegroups.com

Hello everybody,

I am new to the group and have a question regarding the SWD format in Maxent.

Our data are in grid format, with 2900 grid cells in total. Our species occurs in 2000 of them (presence points). We want to include every cell in the model, either as a presence point or as background. We supplied the 2000 presence points with their environmental data as the samples file, and only the remaining 900 cells with their environmental data as the environmental layers file. Is this correct, or should all 2900 cells (presences and absences) go into the background file given as the environmental layers? We are also wondering why the model output is split into two CSV files: species.csv, with predictions at the background points, and species_samplePredictions.csv, with predictions at the presences.

 

We used the following settings:

Command line to repeat this species model: java density.MaxEnt nowarnings noprefixes -E "" -E species responsecurves jackknife outputdirectory=xxx samplesfile=xxx environmentallayers=xxx maximumbackground=10000 nothreshold nohinge noaddsamplestobackground noautofeature

Another point is that the "omission on training samples" curve in the “Analysis of omission/commission” plot looks very strange (see attachment). Could this be due to the small number of background points used for modelling, or is it a problem with the settings? In addition, we checked the AUC value with other software and the results differ.

Can anybody help us with this problem?

Many thanks for your help.

Melanie

Jamie Kass

Apr 6, 2014, 10:58:57 AM
to max...@googlegroups.com
There's a lot going on here, so let's start with background points. MaxEnt, by default, generates 10,000 randomly distributed background points per run, so there is no need to add your own. There are cases where you might want to, for example a suite of runs with one or two varying parameters (beta multiplier, spatially filtered presence points, etc.). In these cases, you'd probably want to keep all other parameters as consistent as possible, and supplying your own background points for each run ensures they stay the same. For most other cases, you can leave the background points to MaxEnt. As a last note, 900 background points may be too few, depending on your extent. Ideally, you'd like one background point for every cell in your predictor raster (some people exclude cells that have presence points).

Second, I think you might be confusing background points with absence points. MaxEnt does not accept absence points as an input, as it is a presence-background model. Other models, such as GARP or GLM, are presence-absence models, and if you have actual absence data (that you trust), you may want to model with one of these algorithms instead. The presence-absence and presence-background methodologies are quite different, and it would be good to read up a bit on what this all means. A good starting point for understanding presence-background models is the Phillips tutorial (attached). It might answer some of your other questions too.

Third, AUC values will differ across implementations because the metric depends on how many samples are taken (the number of thresholds used) to build the ROC curve. The more samples that are taken, the smoother the curve will be, and therefore the AUC will probably be slightly higher. If the results are very different, that is cause for concern. Further, you should not rely on AUC alone as a measure of model quality. Instead, examine testing AUC (not training AUC, as you need an independent dataset to test on) along with other metrics like omission rate. More complex model selection techniques (like AIC or jackknifing) can help you pick the best of your available models. These papers are great guides to model selection:
Radosavljevic, A. and R. P. Anderson. 2014. Making better Maxent models of species distributions: complexity, overfitting, and evaluation. Journal of Biogeography 41:629-643.
Shcheglovitova, M. and R. P. Anderson. 2013. Estimating optimal complexity for ecological niche models: a jackknife approach for species with small sample sizes. Ecological Modelling 269:9-17.
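
To make the point about AUC and thresholds concrete, here is a minimal Python sketch. It is only an illustration and is not how MaxEnt or any particular package computes AUC. The scores are simulated, and the function names (`auc_rank`, `auc_trapezoid`) and the three coarse thresholds are made up for this example. It compares the area under a ROC curve built from every distinct score with the area under a curve built from just a few thresholds.

```python
import random

def auc_rank(pos, neg):
    """AUC as the fraction of (presence, background) pairs in which the
    presence scores higher; ties count one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_trapezoid(pos, neg, thresholds):
    """Trapezoidal area under a ROC curve sampled only at `thresholds`."""
    points = {(0.0, 0.0), (1.0, 1.0)}  # ROC end points
    for t in thresholds:
        tpr = sum(p >= t for p in pos) / len(pos)  # sensitivity
        fpr = sum(n >= t for n in neg) / len(neg)  # 1 - specificity
        points.add((fpr, tpr))
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Simulated suitability scores (purely hypothetical data).
random.seed(0)
pos = [random.gauss(0.6, 0.15) for _ in range(200)]  # presence points
neg = [random.gauss(0.4, 0.15) for _ in range(300)]  # background points

all_thresholds = sorted(set(pos + neg))
auc_full = auc_trapezoid(pos, neg, all_thresholds)
auc_coarse = auc_trapezoid(pos, neg, [0.25, 0.5, 0.75])

print("AUC, every distinct score as threshold:", round(auc_full, 4))
print("AUC, three thresholds only:            ", round(auc_coarse, 4))
```

With every distinct score used as a threshold, the trapezoidal area matches the rank-based value from `auc_rank`. With only a handful of thresholds the curve is cruder and the area generally shifts a little, which is one reason two programs can report slightly different AUC values for the same predictions.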

Hope this helps.
philips_tutorial.pdf

Melanie

Apr 7, 2014, 9:52:07 AM
to max...@googlegroups.com
Dear Jamie,
dear all,

Many thanks for your quick answer, but to be honest it does not resolve our problem. We have already read the tutorial, but it is still not clear to us how to handle the SWD format with regard to the settings.

For the SWD format I need one .csv file for the "Samples" and one .csv file with background data for the "Environmental layers".

Our study area consists of 3000 raster grid cells. The species is present in 2000 of the grid cells and absent in 1000 grid cells. Is it possible to use MaxEnt with 2000 presences and only 1000 absences / background points?
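
For reference, the two SWD files described above could be produced from a single grid table along the lines of this Python sketch. It rests on several assumptions: the cell records, the variable names `bio1` and `bio12`, the species label, and the helpers `write_swd` and `build_swd` are all hypothetical, and the column layout (a label, then x and y coordinates, then one column per environmental variable) follows the SWD examples in the MaxEnt tutorial. The `background_mode` switch shows the two choices discussed in this thread: every cell as background, or only the cells without a presence.

```python
import csv
import io

def write_swd(rows, label, env_names, out):
    """Write rows in SWD layout: label, x, y, then the environmental values."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["species", "longitude", "latitude"] + env_names)
    for row in rows:
        writer.writerow([label, row["x"], row["y"]] + [row[n] for n in env_names])

def build_swd(cells, env_names, species, background_mode="all"):
    """Return (samples_csv, background_csv) as strings."""
    presences = [c for c in cells if c["present"]]
    if background_mode == "all":
        background = list(cells)  # every grid cell
    elif background_mode == "absent_only":
        background = [c for c in cells if not c["present"]]
    else:
        raise ValueError("background_mode must be 'all' or 'absent_only'")
    samples_out, background_out = io.StringIO(), io.StringIO()
    write_swd(presences, species, env_names, samples_out)
    write_swd(background, "background", env_names, background_out)
    return samples_out.getvalue(), background_out.getvalue()

# Tiny made-up grid: 6 cells, species present in 4 of them.
cells = [
    {"x": 10.0, "y": 50.0, "present": True, "bio1": 8.1, "bio12": 700},
    {"x": 10.5, "y": 50.0, "present": True, "bio1": 8.3, "bio12": 690},
    {"x": 11.0, "y": 50.0, "present": False, "bio1": 7.2, "bio12": 820},
    {"x": 10.0, "y": 50.5, "present": True, "bio1": 8.0, "bio12": 710},
    {"x": 10.5, "y": 50.5, "present": False, "bio1": 6.9, "bio12": 850},
    {"x": 11.0, "y": 50.5, "present": True, "bio1": 7.8, "bio12": 730},
]

samples_csv, background_csv = build_swd(cells, ["bio1", "bio12"], "myspecies")
print(samples_csv)
print(background_csv)
```

In a real run the two strings would be written to files and passed as the samples file and the environmental layers file. Which `background_mode` is appropriate is exactly the open question here; the sketch only shows that both variants are easy to generate and compare.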

What exactly do “Add samples to background” and “Add all samples to background” in the advanced settings mean? We want to use only the 1000 absences as background, so we ticked neither of these two options. Is this the right way?

Thanks for your help in advance.

Regards,

Melanie
