Spatial autocorrelation in Maxent


Alastair

Apr 21, 2011, 8:34:16 AM
to Maxent
Good day all,
A reviewer has asked how spatial autocorrelation may affect my
results. I know a little about spatial autocorrelation, but nothing
about how it relates to Maxent (but I have seen some glib comments
without citation stating that Maxent is immune to autocorrelation
issues). I would very much appreciate some help trying to clear up the
issue. There seems to be a huge literature dealing with spatial
autocorrelation problems, but very little of it overlaps with Maxent.
Thus, I think this would be a valuable discussion for many Maxent users.

Background:
An extract from Dormann et al. (2007: Ecography) to start off with:

"Species distributional or trait data based on range map (extent-of-
occurrence) or atlas survey data often display spatial
autocorrelation, i.e. locations close to each other exhibit more
similar values than those further apart. If this pattern remains
present in the residuals of a statistical model based on such data,
one of the key assumptions of standard statistical analyses, that
residuals are independent and identically distributed (i.i.d), is
violated."

The observed occurrence-probability given by Maxent can be considered
the model residuals (at least according to Mateo-Tomás & Olea 2010,
PLoS ONE).

One can test for spatial autocorrelation in model residuals using a
correlogram of Moran's I, as done by Mateo-Tomás & Olea (2010) and
suggested by Dormann et al. (2007).
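
For anyone who wants to try this kind of residual check, here is a minimal sketch in Python with NumPy. It assumes projected x/y coordinates (so Euclidean distance is meaningful), binary distance-class weights, and a one-sided permutation test. The function names `morans_i` and `correlogram` and the synthetic residuals are illustrative only and are not taken from the cited papers.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for n values and an n x n spatial weight matrix."""
    z = values - values.mean()
    w_sum = weights.sum()
    if w_sum == 0:
        return np.nan  # no point pairs fall in this distance class
    return (len(values) / w_sum) * (z @ weights @ z) / (z @ z)

def correlogram(coords, values, bin_edges, n_perm=199, seed=0):
    """Return (lower, upper, Moran's I, permutation p) per distance bin."""
    rng = np.random.default_rng(seed)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rows = []
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        # Binary weights: 1 where a pair's distance lies in (lower, upper].
        w = ((dist > lower) & (dist <= upper)).astype(float)
        observed = morans_i(values, w)
        if np.isnan(observed):
            rows.append((lower, upper, np.nan, np.nan))
            continue
        # One-sided test for positive autocorrelation: shuffle values over sites.
        perms = np.array([morans_i(rng.permutation(values), w)
                          for _ in range(n_perm)])
        p = (1 + np.sum(perms >= observed)) / (n_perm + 1)
        rows.append((lower, upper, observed, p))
    return rows

# Synthetic example (hypothetical data): residuals with a west-east trend.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(150, 2))
residuals = coords[:, 0] + rng.normal(0, 5, size=150)
for lower, upper, i_val, p in correlogram(coords, residuals, [0, 10, 40, 100]):
    print(f"{lower:>5}-{upper:<5} I = {i_val:6.3f}  p = {p:.3f}")
```

With a trend like this, the shortest distance class should show a clearly positive Moran's I that declines in the more distant classes. Dedicated packages (for example spdep in R) offer more complete implementations.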

Questions:

1) I have read in a few places that spatial autocorrelation greatly
affects linear models. Does spatial autocorrelation affect non-linear
models too (such as Maxent)?

2) I have tested the occurrence-probability with a correlogram of
Moran's I. While the upper bins are close to 0 and non-significant,
the lower bins have high and significant Moran's I values - but surely
this is to be expected as the species' distribution is more likely to
be clumped than distributed evenly through space?

3) Mateo-Tomás & Olea generate their Moran's I correlogram and then
apply a Bonferroni correction to the p-values. This changes what I am
sure would have been significant Moran's I bin values to
non-significant. Thus, they stated in their study that their results were
free from autocorrelation issues. I have much higher Moran's I values,
and these too become non-significant when I use this correction. Can
anyone comment as to why a correction was used, and whether the
Bonferroni correction is suitable given its conservativeness?
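
To make the effect of the correction concrete, here is a small sketch in plain Python. It compares uncorrected tests, a standard Bonferroni correction (every distance class tested at alpha/k for k classes), and a progressive Bonferroni correction, assuming the usual description in which the i-th distance class is tested at alpha/i. The p-values are hypothetical and the function name `significant` is made up for illustration.

```python
def significant(p_values, alpha=0.05, method="none"):
    """Flag each distance class as significant under a given correction."""
    k = len(p_values)
    flags = []
    for rank, p in enumerate(p_values, start=1):
        if method == "none":
            threshold = alpha
        elif method == "bonferroni":
            threshold = alpha / k      # same strict threshold for every class
        elif method == "progressive":
            threshold = alpha / rank   # class 1 at alpha, class 2 at alpha/2, ...
        else:
            raise ValueError(f"unknown method: {method}")
        flags.append(p <= threshold)
    return flags

# Hypothetical p-values for five distance classes, nearest class first.
p_values = [0.004, 0.012, 0.030, 0.200, 0.600]
for method in ("none", "bonferroni", "progressive"):
    print(method, significant(p_values, method=method))
```

In this made-up case three classes are significant without correction, two under the progressive correction, and only one under the standard correction, which shows how much the conclusion can depend on the choice.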

Please add other spatial autocorrelation questions to this post if you
have any.

To those with the answers: thanks for your time and for sharing your
knowledge in advance.

Cheers,
Alastair



Colin Driscoll

Apr 22, 2011, 5:19:51 AM
to max...@googlegroups.com


I think a response would be to discuss the impact of correlation between the environmental variables used in the modelling process. Maxent might be tolerant of this, but it doesn't help determine which variables are the main drivers in the model. Take elevation, for example, which could be correlated with several other variables such as temperature, moisture and rainfall. There is also SAC bias in the samples, and there are papers about dealing with this.
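
A quick way to look at correlation among predictors is a pairwise correlation matrix. The sketch below uses Python with NumPy on synthetic data; the 0.7 cut-off is only an assumed rule of thumb, and the variable values and the helper name `correlated_pairs` are hypothetical.

```python
import numpy as np

def correlated_pairs(data, names, threshold=0.7):
    """List variable pairs whose absolute Pearson correlation exceeds threshold."""
    corr = np.corrcoef(data, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic values at 500 hypothetical sites.
rng = np.random.default_rng(1)
elevation = rng.uniform(0, 2000, 500)                             # metres
temperature = 25 - 0.0065 * elevation + rng.normal(0, 0.5, 500)   # cools with height
rainfall = rng.uniform(200, 1200, 500)                            # independent here
data = np.column_stack([elevation, temperature, rainfall])
for a, b, r in correlated_pairs(data, ["elevation", "temperature", "rainfall"]):
    print(f"{a} vs {b}: r = {r:.2f}")
```

Only the elevation and temperature pair should be flagged in this example, since rainfall was generated independently.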

Dormann is talking about spatial bias in the input sample data, not the final model. As you say, the final model would be expected to have SAC.
 





Chloe

Apr 22, 2011, 7:37:45 AM
to Maxent
Hi,

I have been looking into this issue myself because I have found that I
have significant residual SAC in some of my models. The problem with
this is that it may:

1) Inflate accuracy measures like AUC values (e.g. Veloz 2009 -
Spatially autocorrelated sampling falsely inflates measures of
accuracy for presence-only niche models)
2) Inflate the importance of some autocorrelated variables (see
Diniz-Filho 2003 - Spatial autocorrelation and red herrings; but also see
Hawkins et al. 2007 - Red herrings revisited: spatial autocorrelation and
parameter estimation in geographical ecology)

... And these issues do apply to MaxEnt, not just GLMs.

However, it is a tricky issue to deal with. One thing you could do is
add some kind of autocovariate into your model which accounts for
local spatial structure (see Segurado 2006 - Consequences of spatial
autocorrelation for niche-based models), or eigenvector maps to take
spatial structure into account at multiple scales (e.g. Blach-Overgaard 2010 -
Determinants of palm species distributions across Africa: the relative
roles of climate, non-climatic environmental factors, and spatial
constraints) - but these are not useful if you plan to transfer
predictions to new or wider areas, as I am.

What I have settled on is:

1) Subsampling the data so that rSAC disappears/dramatically reduces,
and re-running models to show that variable importance does not change
a great deal
2) Testing my models with geographically independent data to show
predictive success remains high

If you don't have any independent data available, you could consider
separating your training and test data geographically (e.g. Parolo
2008 - Toward improved species niche modelling: Arnica montana in the
Alps as a case study).
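
One very simple way to separate training and test data geographically is to hold out everything on one side of a line. The Python/NumPy sketch below does this along a single coordinate axis; it is an assumed, minimal scheme (not necessarily the one used by Parolo), and `geographic_split` and the random coordinates are hypothetical.

```python
import numpy as np

def geographic_split(coords, test_fraction=0.3, axis=0):
    """Hold out the points with the largest coordinate along one axis."""
    cutoff = np.quantile(coords[:, axis], 1.0 - test_fraction)
    test_mask = coords[:, axis] > cutoff
    return ~test_mask, test_mask

rng = np.random.default_rng(3)
presences = rng.uniform(0, 100, size=(200, 2))  # hypothetical x/y coordinates
train_mask, test_mask = geographic_split(presences, test_fraction=0.3)
print(train_mask.sum(), "training points,", test_mask.sum(), "test points")
```

The two masks can then be used to write separate training and test sample files, so that no test point sits in among the training points.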

I hope that helps. I'd really appreciate anyone else's thoughts on the
matter...

Cheers,

Chloe

Bruce Miller

Apr 22, 2011, 11:57:52 AM
to max...@googlegroups.com
This may be of interest re:spatial auto-correlation.

http://cran.r-project.org/web/packages/spdep/vignettes/CO69.pdf

Eons ago (1989), Daniel Wartenberg released SAAP, a Spatial Autocorrelation
Analysis Program, V. 4.3.

I used to have this on my older machine, but likely long gone now.

I am sure under the R and CRAN projects there are loads of analytical
tools out there.

A fact of evolution and ecology seems to be that critters/plants evolved
where conditions were suitable, hence their locations are autocorrelated:
they would not be there if the conditions were not suitable.

No Tru?

Bruce


Alastair

Apr 26, 2011, 11:02:47 AM
to Maxent
Hi Chloe,
I was really hoping that Maxent somehow magically dealt with
autocorrelation. Oh well.

I've been looking into your suggestion to adding an autocovariate into
the model.
De Marco et al. (2008; Biol. Lett. - Spatial analysis improves
species distribution modelling during range expansion) model
simulated distributions using environmental and environmental plus
SEVM data. They conclude that using SEVM data is most beneficial when
a species is not in equilibrium with its environmental conditions
(i.e. expanding or contracting – expansion is most troubling when
modelling invasive species). However, I am extremely confused by their
Figure 2(c) which demonstrates that the Maxent model residuals remain
autocorrelated whether SEVM is included in the model or not – thus, my
interpretation is that the SEVM has not removed the problem of
autocorrelation. Also, their maxent distributions do not look too
different between the SEVM-present and SEVM-absent models.

It looks like we have a way to go before we understand the ins and
outs of controlling for autocorrelation - e.g. Dormann et al. (2007 -
Methods to account for spatial autocorrelation in the analysis of
species distributional data: a review) followed by Betts et al. (2009
- Comment on “Methods to account for spatial autocorrelation in the
analysis of species distributional data: a review”).

For your two suggestions:
1) Do you sample using distances greater than those where Moran's I
predicts significant SAC?
2) What kind of independent data do you use? (I am struggling to think
of what non-spatial data I could use, unless you are suggesting a
mechanistic model).

Cheers,
Alastair

michelle g

Apr 27, 2011, 4:51:07 AM
to Maxent
Hi Alastair
Perhaps this paper could be of use: Blach-Overgaard, A., Svenning, J.-
C., Dransfield, J., Greve, M. & Balslev, H. (2010) Determinants of
palm species distributions across Africa: the relative roles of
climate, non-climatic environmental factors, and spatial constraints.
Ecography, 33, 380-391
Regards,
Michelle

Chloe

Apr 27, 2011, 5:31:09 AM
to Maxent
Hi Alastair,

Yes, it does seem to me that there is no simple way to account for SAC
in species distribution models. Methods are still developing and
nothing seems to provide a quick fix. I do think that people need to
consider its implications, but that a model with residual SAC can
still be useful and shouldn't be discounted.

I filtered the data so that my presence points were further apart (from
50 m to 500 m), but I could not use the largest significant distance as
indicated by the correlogram because I would have had too few data
points. The most important variables did not change, but others became
less important. My problem is that I can't separate the effect of
reducing SAC from the effects of decreasing the sample size. However,
various papers argue that SAC doesn't affect variable importance and
response curves. Others argue that it does... but most of these
studies seem to have used regression techniques - there's hardly
anything published on SAC + MaxEnt models.

I think a bigger problem is the inflated measures of model
performance. I built my models in a small area and then projected it
to a much larger one. The independent data I used were presence points
I collected from this wider region - I used these to provide a second
test of model performance. As these were far from my training data, I
hope to show that the models are still valid as the AUC did not drop
significantly when I tested the models with these geographically
separated data.
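
For illustration, the comparison described here can be sketched with a rank-based AUC in Python with NumPy. It assumes AUC is computed from model scores at presence points versus background points, with ties counted as one half. The scores below are invented purely to show the calculation, and `auc` is a hypothetical helper rather than Maxent's own routine.

```python
import numpy as np

def auc(presence_scores, background_scores):
    """Probability that a random presence outscores a random background point."""
    pres = np.asarray(presence_scores, dtype=float)
    back = np.asarray(background_scores, dtype=float)
    greater = (pres[:, None] > back[None, :]).sum()
    ties = (pres[:, None] == back[None, :]).sum()
    return (greater + 0.5 * ties) / (pres.size * back.size)

# Invented model scores, for illustration only.
background = [0.10, 0.20, 0.30, 0.40, 0.50]
local_test = [0.90, 0.80, 0.70, 0.45]      # test presences near the training data
distant_test = [0.85, 0.60, 0.35, 0.25]    # test presences from the wider region

print("local AUC:  ", auc(local_test, background))
print("distant AUC:", auc(distant_test, background))
```

A large gap between the two values would suggest that the local test score is inflated by the proximity of test points to training points.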

As Veloz (2009) says:

"... the results from this study suggest that ensuring that clusters
of training data are not excessively clustered around test data will
provide a better assessment of prediction accuracy."

I'm still working on this... I'd be interested in hearing what you
decide to do - and from anyone else with similar problems.

Cheers,

Chloe


epiphyte

Apr 27, 2011, 10:20:10 PM
to Maxent
I have a very basic question about this issue.

Does significant SAC of SDM matter? Since environmental factors show a
certain degree of SAC (some are produced by interpolation), and
dispersal (for plants) is related to distance, what's wrong if SDM
results are spatially autocorrelated?

Please forgive my limited knowledge of statistics, and thanks for any
response.

Kind regards,

Rebecca

Ophelia

Oct 1, 2013, 12:15:11 AM
to max...@googlegroups.com, pot...@gmail.com
Hi Alastair et al.,

Were you able to address your questions 2) and 3) in your final paper? How did you deal with SAC in the end? Can I have the citation of your publication?

I'm facing the same question from the editor and my model residuals also showed the pattern you described in your question 2): significant at lower bins and non-significant at higher bins. So far I've only tried different Moran's I parameter settings using ArcGIS and SAM. 

I would like to learn more about why a Bonferroni correction may be suitable too. What was your adjusted p-value? 

Thanks,

Ophelia  

Ophelia Wang (王用和)

Oct 1, 2013, 3:47:32 AM
to Alastair Potts, max...@googlegroups.com
Hi Alastair,

Thanks so much for sharing the DOI and your response to the comment.
You did a great job summarizing the debates in the literature regarding this
issue; congratulations on the publication! The comment we received
was similar to yours: "Can you clarify how spatial autocorrelation is
dealt with within the statistical model? The analysis won't be
appropriate if you are assuming independent errors if the errors are
correlated in space." I also feel that there has not been a
well-explored method to deal with SAC in Maxent. Perhaps comparing
Maxent models to models built using e.g. GLM might shed some light,
but that's beyond the scope of our manuscript.

Ophelia

On Tue, Oct 1, 2013 at 2:01 PM, Alastair Potts <pot...@gmail.com> wrote:
> Hi Ophelia,
>
> Here is the DOI to the paper (doi:10.1111/j.1365-2699.2012.02788.x); however
> I never did find suitable answers to the autocorrelation question. I had to
> argue my way out of that query.
>
> Below are my responses to the reviewer's query:
>
> [Reviewer]
>
> Perhaps associated with this is the question of spatial autocorrelation and
> how susceptible Maxent’s results are to this. AST cells are distinctly
> clustered in space, as are the background cells used for calculating AUC
> values. Could the way Maxent apportions the likelihood of a cell having
> presence allow explanatory climate variables that are not important to creep
> into the model because of this? If so, projected results might have some
> further limitations.
>
>
> [Response]
>
> Spatial autocorrelation may be a significant problem in all species
> distribution modelling exercises, but as yet, we feel that there has been no
> consistent and well-researched means to implement controls for
> autocorrelation. There have been suggestions on how to deal with this – e.g.
> Dormann et al. (2007) – however, the applicability of these methods and
> their influence on results has not been explored (for example, see Betts et
> al., 2009). How different models are influenced by autocorrelation has also
> not been explored.
>
>
>
> With regard specifically to Maxent: spatial eigenvector maps (SEVM) have
> been included in a handful of studies to ‘control’ for
> autocorrelation. De Marco et al. (2008) model simulated distributions
> using environmental and environmental plus SEVM data. They conclude that
> using SEVM data is most beneficial when a species is not in equilibrium with
> its environmental conditions (i.e. expanding or contracting – expansion is
> most troubling when modelling invasive species). We feel that the AST can be
> considered to be largely in equilibrium with its current environment as
> dramatic expansion and contraction (other than man-induced contraction) have
> not been noted by botanists in the region. Furthermore, we are worried by
> the usage of the SEVM in the study by De Marco et al. (2008): Figure
> 2(c) demonstrates that the Maxent model residuals remain autocorrelated
> whether SEVM is included in the model or not – thus the SEVM has not removed
> the problem of autocorrelation. SEVM has been used inconsistently and rather
> opaquely in other studies (e.g. Reshetnikov & Ficetola, In Press), with no
> explicit discussion on how this influences the results produced by Maxent.
>
>
>
> Other authors have claimed to test for autocorrelation and used overly
> conservative p adjustments to state that they have no autocorrelation in
> their model residuals. For example, Mateo-Tomas & Olea (2010) produce a
> Moran’s I correlogram (Figure S1) that would surely have significant Moran’s
> I values had they not used a progressive Bonferroni correction.
>
>
>
> In summary, we feel that autocorrelation is possibly an important factor
> that may affect our results; however, there is no ‘standard’ and
> well-explored method to deal with it. As pointed out recently by Dormann
> et al. (2007), spatial autocorrelation is a largely unresolved problem in species
> (in our case vegetation) distribution modelling. However, Diniz-Filho et al.
> (2003) state “Claims that analyses that do not take into account spatial
> autocorrelation are flawed are without foundation” primarily because of the
> remaining uncertainty that surrounds the issue. The ‘problem’ of
> autocorrelation is that we do not know how to deal with it (both in this
> paper and in the niche modelling community) – as such we have decided not to
> comment on this issue in an already over-laden paper. If the Editor believes
> this is incorrect, and would like us to deal with the issue, then we will
> gladly do so.
>
>
> Betts, M.G., Ganio, L.M., Huso, M.M.P., Som, N.A., Huettmann, F., Bowman, J.
> & Wintle, B.A. (2009) Comment on “Methods to account for spatial
> autocorrelation in the analysis of species distributional data: a review”.
> Ecography, 32, 374-378.
>
> De Marco, P., Diniz-Filho, J.A.F. & Bini, L.M. (2008) Spatial analysis
> improves species distribution modelling during range expansion. Biology
> Letters, 4, 577-580.
>
> Dormann, C.F., Mcpherson, J.M., Araújo, M.B., Bivand, R., Bolliger, J.,
> Carl, G., Davies, R.G., Hirzel, A., Jetz, W., Kissling, W.D., Kühn, I.,
> Ohlemüller, R., Peres-Neto, P.R., Reineking, B., Schröder, B., Schurr, F.M.
> & Wilson, R. (2007) Methods to account for spatial autocorrelation in the
> analysis of species distributional data: a review. Ecography, 30, 609-628.
>
> Diniz-Filho, J.A.F., Bini, L.M. & Hawkins, B.A. (2003) Spatial
> autocorrelation and red herrings in geographical ecology. Global Ecology and
> Biogeography, 12, 53-64.
>
> Mateo-Tomás, P. & Olea, P.P. (2010) Anticipating Knowledge to Inform Species
> Management: Predicting Spatially Explicit Habitat Suitability of a Colonial
> Vulture Spreading Its Range. PLoS ONE, 5, e12374.
>
> Reshetnikov, A.N. & Ficetola, G.F. (In Press) Potential range of the
> invasive fish rotan (Perccottus glenii) in the Holarctic. Biological
> Invasions.
>
> Robertson, M.P. & Palmer, A.R. (2002) Predicting the extent of
> succulent thicket under current and future climate scenarios. African
> Journal of Range & Forage Science, 19, 21-28.
> --
> Dr. Alastair J. Potts
> Claude Leon Postdoctoral Fellow
> Botany Department
> Nelson Mandela Metropolitan University
>
> Cell #: 082 491-7275
> Office #: 041 504-4375 (Mon, Thurs, Fri)
> Fax #: 086 273-2675
>
> "Research presumes dissatisfaction with existing descriptions of reality and
> explanations of our experience of it – it rests on the desire to do better
> than the current consensus. Research, therefore, requires freedom to
> question received wisdom and some background knowledge of why we think we
> know what we think we know." John F. Allen (2003; Future Med. Chem, 2:15-20)



--
"Thousands of mountains bask in the same moonlight, millions of
households enjoy the same springtime, a thousand rivers reflect a
thousand moons, and the clear, clean sky stretches for millions of
miles".