Autocovariate from residuals to account for spatial autocorrelation


mike worland

Jun 22, 2015, 11:05:12 PM
to unma...@googlegroups.com
Hello Unmarked Users,
 
I'm studying relationships between habitat variables and bird density using temporary emigration models (gmultmix) in Unmarked.  I wanted to look into the effects of spatial autocorrelation in my data and have tried the approach of Crase et al. 2012 (Ecography).  They advocate a residuals autocovariate model (RAC) where the autocovariate term is based on the spatial relationship of the residuals rather than the response variable.  Here's how they put it:
 

"By deriving the autocovariate from model residuals, only the variance unexplained by the explanatory variables is incorporated, and therefore the RAC model better captures the true influence of these explanatory variables, resulting in stronger inferential performance than the autologistic approach."

 

They promote the method as being simple and widely applicable, including in GLMs and decision trees.  I'm wondering if it works in an unmarked model.

 

The approach calls for a 2-stage analysis, first running one set of models, then using the residuals from the best model to calculate the autocovariate, then re-running models using the autocovariate.  Or at least something close to that.
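
A minimal sketch of the two-stage idea, using a plain Poisson GLM on simulated data rather than a gmultmix fit. The helper `autocov_sketch()` is hypothetical (it is not part of unmarked or spdep), the inverse-distance weighted mean mirrors `type="inverse", style="W"` in the script further down, and the use of response-scale residuals is an assumption rather than something taken from Crase et al.

```r
# Hypothetical helper: inverse-distance-weighted autocovariate of z using
# all neighbours within radius nbs (weighted mean by default).
autocov_sketch <- function(z, xy, nbs, weighted_sum = FALSE) {
  d <- as.matrix(dist(xy))                  # pairwise distances between sites
  w <- ifelse(d > 0 & d <= nbs, 1 / d, 0)   # inverse-distance weights, no self-weight
  s <- as.vector(w %*% z)                   # weighted sum of neighbouring values
  if (weighted_sum) return(s)
  tot <- rowSums(w)
  ifelse(tot > 0, s / tot, 0)               # weighted mean; 0 for isolated sites
}

set.seed(1)
n  <- 100
xy <- cbind(runif(n, 0, 3000), runif(n, 0, 3000))  # simulated coordinates (m)
x  <- rnorm(n)                                      # simulated habitat covariate
y  <- rpois(n, exp(0.2 + 0.5 * x))                  # simulated counts

# Stage 1: model with no spatial term
m1 <- glm(y ~ x, family = poisson)

# Autocovariate built from the stage-one residuals, not from the response
rac <- autocov_sketch(residuals(m1, type = "response"), xy, nbs = 700)

# Stage 2: refit with the residual autocovariate added
m2 <- glm(y ~ x + rac, family = poisson)
AIC(m1, m2)
```

With a hierarchical model the open question is what plays the role of `residuals(m1)`, since the latent abundance is not observed directly.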

 

Below I show results for Least Flycatcher. "glbl" is the best model from the initial model run with no autocovariate, then I made 4 more models where I tried the autocovariate term in different ways.  The glbl model has a delta AIC of 113 when compared to the best of the autocovariate models.  This is the most extreme result I found and not typical, but for most species I've tried so far the autocovariate models have been clearly better than the best model with no autocovariate.

 

First, one specific question: I have 4 columns of count data in the unmarkedFrame and so 4 residual values are given for each site.  I just summed across these columns to get a single residual value for each site.  Is that correct?

 

Additionally, I'd really appreciate hearing others' thoughts on this approach: is it accounting for spatial autocorrelation and not biasing parameter estimates in these gmultmix models?

 

I include more script below to give an idea of how I calculated the autocovariate.

Best,

Mike Worland

# the original best model from models with no spatial autocovariate: 

(glbl       <-gmultmix(~site +year +ba +bigBA +scDiv +con +yb +snag,
                ~site +year,
                ~obsvr +sky +windy +time +date +ba, uf))

# adjusting the best model with the spatial autocovariate:               
(glbl.acAll <-gmultmix(~site +year +ba +bigBA +scDiv +con +yb +snag +ac,
                ~site +year +ac,
                ~obsvr +sky +windy +time +date +ba +ac, uf))
               
(glbl.acL   <-gmultmix(~site +year +ba +bigBA +scDiv +con +yb +snag +ac,
                ~site +year,
                ~obsvr +sky +windy +time +date +ba, uf))
               
(glbl.acPhi <-gmultmix(~site +year +ba +bigBA +scDiv +con +yb +snag,
                ~site +year +ac,
                ~obsvr +sky +windy +time +date +ba, uf))
               
(glbl.acD   <-gmultmix(~site +year +ba +bigBA +scDiv +con +yb +snag,
                ~site +year,
                ~obsvr +sky +windy +time +date +ba +ac, uf))
               
AClist <-modSel(fitList (glbl, glbl.acAll, glbl.acL, glbl.acPhi, glbl.acD))

 

> AClist
           nPars     AIC  delta   AICwt cumltvWt
glbl.acPhi    28 1501.90   0.00 8.4e-01     0.84
glbl.acAll    30 1505.16   3.25 1.6e-01     1.00
glbl.acL      28 1516.42  14.52 5.9e-04     1.00
glbl.acD      28 1568.68  66.78 2.6e-15     1.00
glbl          27 1614.94 113.04 2.4e-25     1.00

 

 

> (best <-glbl.acPhi)

Call:
gmultmix(lambdaformula = ~site + year + ba + bigBA + scDiv +
    con + yb + snag, phiformula = ~site + year + ac, pformula = ~obsvr +
    sky + windy + time + date + ba, data = uf)

Abundance:
            Estimate     SE      z  P(>|z|)
(Intercept)  -0.4591 0.2848 -1.612 1.07e-01
siteflam      1.0122 0.3737  2.709 6.75e-03
sitenhal      0.7586 0.3488  2.175 2.96e-02
year2005      0.7793 0.3596  2.167 3.02e-02
year2006      0.6886 0.3831  1.797 7.23e-02
year2007      0.1069 0.3347  0.319 7.49e-01
ba            0.0273 0.0737  0.371 7.10e-01
bigBA        -0.0478 0.0966 -0.495 6.21e-01
scDiv         0.3116 0.1045  2.983 2.86e-03
con          -0.2526 0.1143 -2.210 2.71e-02
yb            0.2047 0.0519  3.944 8.01e-05
snag          0.1371 0.0610  2.248 2.46e-02

Availability:
            Estimate    SE      z  P(>|z|)
(Intercept)   0.3289 0.451  0.729 4.66e-01
siteflam     -1.8888 0.483 -3.911 9.20e-05
sitenhal     -1.7737 0.484 -3.663 2.50e-04
year2005     -0.9833 0.518 -1.900 5.75e-02
year2006     -1.1698 0.525 -2.230 2.58e-02
year2007      0.0843 0.506  0.167 8.68e-01
ac            0.9655 0.100  9.646 5.11e-22

Detection:
             Estimate    SE       z  P(>|z|)
(Intercept)    2.4523 0.463  5.2998 1.16e-07
obsvrother2   -0.8653 0.582 -1.4856 1.37e-01
obsvrworland  -1.0400 0.617 -1.6850 9.20e-02
skypart       -1.4440 0.493 -2.9284 3.41e-03
skycloud      -0.0328 0.629 -0.0522 9.58e-01
windyy         0.1890 0.540  0.3498 7.26e-01
time           0.7755 0.264  2.9354 3.33e-03
date          -0.0635 0.251 -0.2533 8.00e-01
ba            -0.4134 0.230 -1.7941 7.28e-02

AIC: 1501.903

 

 

#example script for making autocovariate

 

#best model from first run with no autocovariate
(mod <-glbl)

 

# add residuals      
r <-residuals(mod)
r <-data.frame(r)
r$resid <- rowSums(r) # <- not sure this is correct
mrg <-cbind(r[,5],xplot[,c(2,10,22,25)])
colnames(mrg)[1] <- "resid"

 

# Make a variogram plot: use the "range" as the neighborhood distance when
# calculating the autocovariate (700 m in this case).

library(geoR)

xplot04 <-mrg[mrg$year =="2004",]
v1 <-variog(coords =xplot04[4:5], data =xplot04$resid, breaks =seq(100,1500,100))
plot(v1,xlim=c(100,1500))
lines(v1) 

 

# Creates the spatial autocovariate for 2004 surveys.
xy<-cbind(xplot04$Easting,xplot04$Northing)
library(spdep)
ac04 <-autocov_dist(xplot04$resid,xy,type="inverse",style="W",
nbs = 700,longlat=NULL,zero.policy=TRUE)

 

# add autocovariate to dataset
# (ac05, ac06 and ac07 are built the same way as ac04, one per survey year)
ac<-c(ac04,ac05,ac06,ac07)
xplot2<-cbind(xplot,ac)

Richard Schuster

Jun 23, 2015, 12:34:42 AM
to unma...@googlegroups.com
Hi Mike,

Others might have different ideas, but in my opinion you can only calculate residuals for most unmarked models (I usually use occu) using a Bayesian framework, as you can't estimate the latent variable (true occupancy or abundance) in a maximum likelihood framework.

That being said I do have R code (and an example for this). The general approach I used (others like Zipkin have done this as well) is described in Appendix S1 of this paper: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0099292
In there we describe how to test and subsequently account for spatial autocorrelation similar to what Crase suggested.
We also test for model goodness of fit using AUC.

I put up a repository on GitHub with the species dark-eyed junco as an example.
An illustrative example, in the form of a figure, of how residual spatial autocorrelation can be accounted for is also included (https://github.com/yeronimo/BUGS_spatial_autocorrelation/blob/master/Spatial_Autocovariate.jpeg)

Have a look and let me know if you have any questions (I didn't annotate very much as I just put this together quickly).

Cheers,
Richard

Giancarlo Sadoti

Jun 23, 2015, 2:30:01 AM
to unma...@googlegroups.com
Mike,

Richard started a thread a few years back on this topic (https://groups.google.com/forum/#!msg/unmarked/WwZ918J2p60/MsZJ7gbO_AMJ) and it remains controversial.

I haven't read the Crase et al. paper, but a popular method for calculating a spatial autocovariate, generalizable to many unmarked models, is based on Augustin et al. (1996. Journal of Applied Ecology 33:339-347) and was proposed (for single-season occupancy models) by Moore and Swihart (2005. Journal of Wildlife Management 69:933-949). In it, a Moran's I correlogram or semivariogram is employed (e.g. via rules/suggestions in Legendre and Legendre, 1998) to choose the radius used for the autocovariate calculation.

The main sticking point for many (as Richard noted) is the validity of the method of residual calculation. Because the latent state is partially hidden (e.g. when y = 0), Moore and Swihart proposed site- (or site-year-) level residuals as the difference between the naive state (e.g. at least one detection vs. no detections) and the predicted probability of at least one detection. The latter can be thought of as analogous to psi*p (employed, for example in goodness-of-fit tests when compared to y [0 or 1]; e.g. https://groups.google.com/d/msg/unmarked/LvLJj2xoe1s/QnZVfjT9QtEJ). I've used this in my own (likelihood-based) work, but there are good counter-arguments, and (as Richard noted) more robust Bayesian approaches.
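
A minimal sketch of the site-level residual described above, for a single-season occupancy setting. All numbers are made up for illustration; `psi` and `p` stand in for fitted values that, with an `occu` fit, could presumably be taken from `predict(fm, type = "state")` and `getP(fm)`.

```r
# Made-up detection histories: 4 sites (rows) x 3 visits (columns)
y <- matrix(c(0, 0, 0,
              1, 0, 1,
              0, 1, 0,
              0, 0, 0), ncol = 3, byrow = TRUE)

psi <- c(0.30, 0.80, 0.60, 0.50)        # assumed fitted occupancy probability per site
p   <- matrix(0.4, nrow = 4, ncol = 3)  # assumed fitted detection probability per visit

naive  <- as.numeric(rowSums(y) > 0)    # naive state: 1 if at least one detection
p_star <- 1 - apply(1 - p, 1, prod)     # P(at least one detection | site occupied)
fitted_det <- psi * p_star              # P(at least one detection), the "psi*p" analogue

site_resid <- naive - fitted_det        # one residual per site
round(cbind(naive, fitted_det, site_resid), 4)
```

These residuals describe detection/non-detection rather than the latent state itself, which is the source of the counter-arguments mentioned above.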

Giancarlo

mike worland

Jun 23, 2015, 9:48:41 PM
to unma...@googlegroups.com
Thank you so much for the helpful responses, Richard and Giancarlo.  It sounds like this is yet another reason for me to go the Bayesian/Bugs route, about which I know very little (if I go that route your script will be invaluable, Richard, thank you).  I'm curious about what kind of results I would get with the regular autologistic regression where the autocovariate is calculated using the response (density in my case).  No estimation of residuals is required here as I understand it.  Its simplicity is appealing.  Its downfall, apparently, is that it could lead to biased coefficients, but there also seems to be a consensus in the literature that it improves prediction accuracy.  At this point in my analysis, accuracy of abundance estimates may be more important than coefficients, so perhaps this approach is at least better than ignoring spatial autocorrelation. 
I'm not necessarily looking for an answer to a question here but if either of you or anyone else in the group have thoughts on this, I would appreciate hearing them.  I will probably at least try the autologistic method if there aren't any strenuous warnings.
 
Mike

John Clare

Jun 24, 2015, 8:06:07 AM
to unma...@googlegroups.com
Mike,

It looks like coefficient bias using auto-logistic models may result from improper weighting: this pending article provides some insight into appropriate weighting schemes for different response structures using the 'spdep' package.

John

 

 

Richard Schuster

Jun 24, 2015, 10:21:45 AM
to unma...@googlegroups.com
Hi Mike,

You are welcome.
I completely agree with John, and it's great to see another paper on the subject by Brendan Wintle's group. I haven't read the paper John mentioned in detail, but I talked to Crase about this a while ago. We both agreed that biases are likely, and by looking at her paper it's easy to see that coefficient values change quite a bit when one adds an autocovariate term to the model. There were also changes to coefficients when she included an autocovariate term modeled on the residuals, because (I believe) she included it at the same time as the other covariates.  I am not sure if it's the proper way, but what I did in the BUGS code was to fix the coefficient estimates generated by unmarked, put them in the model, and only investigate the autocovariate effect afterwards. It often took quite a bit of work to figure out the appropriate autocorrelation distance per bird species, but in 6 of the 7 species for which I found residual spatial autocorrelation (out of a total of 47 species) I was able to remove the autocorrelation with a single autocovariate term. For the last species I thought about using 2 terms via a non-linear specification such as a quadratic relationship, but I never pursued this approach any further, as that one species was not included in any downstream analysis anyway.

For your analysis it would also be important to know if you would like to eventually produce abundance predictions (i.e. maps). If so, I think it would be important to try to account for residual spatial autocorrelation.

In case you want to get started with BUGS and Bayesian analysis, I would highly recommend the following 3 books in order.

Great intro and I highly recommend this book, not just for the Bayesian part but also as a great way to understand the progression from t-test over ANOVA to GLMM's.
Marc Kery (2010) Introduction to WinBUGS for Ecologists.

Pretty much starts where the previous book ended.
Kery & Schaub (2011) Bayesian Population Analysis using WinBUGS.

This is more for the advanced reader I found.
Royle & Dorazio (2008) Hierarchical Modeling and Inference in Ecology.

Cheers,
Richard

mike worland

Jun 29, 2015, 7:26:54 PM
to unma...@googlegroups.com
John, thank you so much for bringing that paper to my attention.  Bardos et al's basic recommendation I believe is to use a weighted sum rather than a weighted mean to calculate the autocovariate.  I can't digest all of their math but some of their explanation makes sense to me.  A very simple fix that seems almost too good to be true, but my results so far suggest it's an improvement.  

I believe that since I'm using density (rather than presence-absence) to calculate this autocovariate, I'm actually developing "auto-normal" models rather than autologistic?  Whichever you call it, just as in the residual autocovariate approach I mention above, I continued to use a 2-stage approach: I first applied an unmarked model with no spatial autocovariate to estimate densities across sites, and then used those densities (as opposed to raw counts) to calculate the autocovariate.  Then I ran another set of models with the autocovariate.  I show some script below for making this autocovariate using gmultmix models.  It can probably be tweaked pretty easily for other unmarked models.  If anyone sees problems here I would sure appreciate hearing them.

Thank you Richard for the Bayesian resources.  I have borrowed Kery's first book because of my basic interest, even if I don't use a Bayesian approach for this particular analysis.  To answer your question, yes, I want to produce spatial abundance predictions from this analysis, so I appreciate your confirmation of the need to account for spatial autocorrelation--I've been feeling pretty unsure about it. 


Mike


#best model from first model run with no autocovariate
(mod <-glbl)
# estimate density for each site, incorporating detection & availability (phi in gmultmix models)
ab <-bup(ranef(mod))
ab<- cbind(ab,xplot)
phi <-predict (mod, xplot, type ="phi", append=T)
names(ab)[names(ab) =="ab"] <-"abund"
names(phi)[names(phi) =="Predicted"] <-"phi"
mrg<-merge (phi[,c(1,6,14,15)],ab, by=c("plot","treat","year"),
            all.x=T,all.y=T)
mrg$dens <-(mrg$abund*mrg$phi)/0.07854
 
# Make variogram plots and calculate spatial autocovariate
#
# 2004
# Use the range on the variogram plot as the neighborhood distance for
# calculating autocovariate.

library(geoR)
xplot04 <-mrg[mrg$year =="2004",]
v1 <-variog(coords =xplot04[c(24,27)], data =xplot04$dens, breaks =seq(100,1500,100))
plot(v1,xlim=c(100,1500))
lines(v1)

# Creates the spatial autocorrelation covariate.
xy<-cbind(xplot04$Easting,xplot04$Northing)
library(spdep)
# style ="B" means weighted sum, as recommended by Bardos et al 2015, MEE
ac04 <-autocov_dist(xplot04$dens,xy,type="inverse",style="B",
nbs = 700,longlat=NULL,zero.policy=TRUE)

David Bardos

Jun 30, 2015, 10:15:37 PM
to unma...@googlegroups.com
Hi Mike, Richard, Giancarlo, John,

my collaborator Gurutzeta Guillera-Arroita alerted me to this chat & I thought I should try to summarize the work that led to the MEE paper you've mentioned and also to a second paper (an arXiv preprint) that we cite in the Discussion of the MEE paper:

http://arxiv.org/pdf/1501.06530


The blunt version of the situation is this:  the ecology literature discussing "bias" in auto-model covariate estimates is completely wrong.

There are two separate problems with the literature, which makes things more confusing, but here goes:

Problem 1:
about half of the auto-model papers use incorrect weightings (weighted mean autocovariate), resulting in incorrect parameter estimates.  Sometimes, using invalid weightings only causes small errors, but in other cases, such as the Dormann et al 'snouter' examples, the errors are huge.

So the "bias" reported by Dormann et al was really just an implementation error. This is discussed in great detail in the MEE paper.

The good thing about this problem is that it is easily fixed, it is just a technical error that needs to be corrected. It doesn't require a new way of thinking about the models.

The second problem is much more fundamental. Fixing it requires thinking about model comparisons in a completely different way.

Problem 2:
the idea of directly comparing covariate parameter values between structurally different models is just flat-out wrong.  For example, suppose we have a logistic model and a corresponding auto-logistic model, where both models include the same covariates. The covariate parameters play very different roles in the two models.

- When these models are fitted to the same data, the covariate parameters should not necessarily have similar values.  

- directly comparing the two parameter values is not meaningful.

- to find out what the parameter values actually do mean, we must explore predictive simulations drawn from the estimated models.

- instead of comparing parameter values, we evaluate a new quantity, called the "covariate effect", which is calculated from the effect on model predictions of "tuning off" a particular covariate.

The second paper (http://arxiv.org/pdf/1501.06530) illustrates this for the fairly extreme case of Hydrocotyle vulgaris in Germany, using data derived from Carl, G. & Kühn, I. (2007). ["Analyzing spatial autocorrelation in species distributions using Gaussian and logit models." Ecological Modelling, 207, 159-170.]

In this case, the auto-logistic covariate parameter was about 5 times smaller than the GLM estimate.  Yet when predictive simulations from the estimated models are examined, the "covariate effects" for the two models are similar and the auto-logistic model has better predictions according to the AUC.

So the overall conclusion is that the "bias" problem, for the most part, doesn't actually exist. Parameter values can be very different between models that give similar predictions. Generally we expect the auto-models to give better predictions than GLMs, since they are much more flexible models. The new quantity called "covariate effect" can be derived for each model using predictive simulations (eg using Gibbs sampling) and these quantities can be meaningfully compared between models.


David

David Bardos

Jun 30, 2015, 10:39:40 PM
to unma...@googlegroups.com
Oops, I guess it's obvious but instead of "tuning off" I meant to say "turning off" a particular covariate ...

David


mike worland

Jul 1, 2015, 2:26:38 PM
to unma...@googlegroups.com
David, thank you for the explanation.  And thank you for this extremely helpful idea that as far as I can tell allows me to account for spatial autocorrelation in an unmarked model. 

Your Problem 2 makes sense to me simply because whenever one introduces a very important covariate one shouldn't be surprised to see coefficients change. 
Problem 1 is a little fuzzier to me, but from your paper I understood that the basic problem boils down to edge points--correct me if I'm wrong.  A weighted mean puts those edge points on even ground with interior points, when in fact those edge points may have a smaller adjacent population.  We don't know what's going on outside of the sampled area.  So a weighted sum is a better way to calculate the autocovariate.  That's probably way too simple, but please let me know if that's close.
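
A toy illustration of the edge-site point, in base R only. Five sites sit on a line with identical response values; the inverse-distance weights and the 150 m radius are arbitrary choices for the example. The weighted sum and weighted mean are meant to correspond to `style="B"` and `style="W"` as described earlier in this thread.

```r
# Five sites on a line, 100 m apart, all with the same response value
xy <- cbind(x = seq(0, 400, by = 100), y = 0)
z  <- rep(2, 5)

d <- as.matrix(dist(xy))
w <- ifelse(d > 0 & d <= 150, 1 / d, 0)   # inverse-distance weights, neighbours within 150 m

ac_sum  <- as.vector(w %*% z)             # weighted sum of neighbouring values
ac_mean <- ac_sum / rowSums(w)            # weighted mean (weights rescaled to sum to 1 per site)

round(cbind(ac_sum, ac_mean), 3)
```

The two end sites have one neighbour each and the interior sites have two, so the weighted sum is half as large at the ends, while the weighted mean is identical at every site and hides that difference.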

I haven't had a chance to look at your 2nd paper but certainly will.  Thank you again David.


Mike

Kery Marc

Jul 2, 2015, 3:18:51 AM
to unma...@googlegroups.com
Dear Mike,

you may call me pedantic but: there are NO "unmarked models" ! Nor were there ever any "MARK models" (this term I even read in published papers and it makes me cringe). These are two software packages that fit perhaps 20 and 100 different types of models. So not only is the term "{software_name} model" totally uninformative, but, more importantly and more subtly, it also takes away the emphasis from the model and puts it on the software. The first and most important thing that we always have to understand is the model that we fit, regardless of whether we fit it in MARK, unmarked, PRESENCE, or using Bayesian or likelihood inference.

OK, back now to my "WORD book". And I hope you are not offended by my "DELL email".

Kind regards  -  Marc



Chris Smith

Dec 11, 2017, 2:03:20 PM
to unmarked
Hi Mike, I am a biologist trying to publish a paper on otter habitat selection using the functions occu and pcount.  My data consist of 4-7 back-to-back segments in a row along each of 10 rivers, each segment being 450 m long.  I have sub-divided each segment into 150 m "sub-segments" (which have commonly been used in the literature as 3 separate "visits" within each 450 m segment (site)).  I have created 2 detection covariates (track substrates and scat substrates) and my response variable (total sign) at the 150 m "sub-segment" level and am using them as observation-level covariates, along with a few site-level habitat covariates.  Having consecutive segments means they are probably spatially autocorrelated, and indeed, when I ran a correlogram of Moran's I values on them, a number of the habitat covariates were spatially autocorrelated.  I have been trying to figure out how to incorporate an "autocovariate" into my models using your script below, but I am struggling at a few points and was wondering if you could help.  My level of coding isn't as high as yours, but I gave it my best shot.  I have attached my 2 data files and script.  I ran an initial model set and used the top model to predict values.  The part that is confusing me is that when I try to make a variogram from the predicted values to find my "nbs" value, the variogram has no sill that levels off (I don't know what range to use).  I am also just trying to clarify that, as soon as I get this variable, the proper technique is to simply add it into the model set and re-run the models with the new covariate.

Thanks in advance for any help you can give,
Chris
Autocovariate, Pcount Script.R
Spatial Autocorrelation Analysis.csv
150 Analysis data.csv

John Clare

Dec 11, 2017, 2:20:42 PM
to unmarked
Hey Chris,

Have you looked into using one of the models Jim Hines has worked on (I think the original paper is "Tigers on Trails: occupancy models for cluster sampling", in Eco Apps and written in 2010)? Sounds like it's designed for exactly your sampling protocol (as of now, I think the only packaged software that can fit it is Presence, but could be wrong).

John

mike worland

Dec 12, 2017, 4:31:05 PM
to unmarked
Chris,

I looked at your script and I think you're close.  Bear in mind that a variogram isn't necessary to calculate the autocovariate.  I think it's fine to just calculate a series of autocovariate values based on different neighborhood distances, plug them into your model/s, and then settle on the distance that results in the lowest AIC. 
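
A minimal sketch of that distance search, using a Poisson GLM on simulated data as a stand-in for the pcount models. The helper `autocov_sum()` is hypothetical (a base-R inverse-distance weighted sum, not the spdep function), and the candidate distances are arbitrary.

```r
# Hypothetical helper: inverse-distance weighted sum of z over neighbours within nbs
autocov_sum <- function(z, xy, nbs) {
  d <- as.matrix(dist(xy))
  w <- ifelse(d > 0 & d <= nbs, 1 / d, 0)
  as.vector(w %*% z)
}

set.seed(42)
n  <- 80
xy <- cbind(runif(n, 0, 3000), runif(n, 0, 3000))  # simulated coordinates (m)
x  <- rnorm(n)                                      # simulated habitat covariate
y  <- rpois(n, exp(0.3 + 0.4 * x))                  # simulated counts

# Stand-in for the stage-one abundance estimates
z <- fitted(glm(y ~ x, family = poisson))

# Refit with an autocovariate at each candidate distance and record the AIC
dists <- c(450, 800, 1250, 1700, 2150, 2500)
aic <- sapply(dists, function(nbs) {
  ac <- autocov_sum(z, xy, nbs)
  AIC(glm(y ~ x + ac, family = poisson))
})
names(aic) <- dists

aic
dists[which.min(aic)]   # distance giving the lowest AIC
```

Because the simulated counts here have no spatial structure, the AIC values should differ little between distances; with real data the no-autocovariate model belongs in the same comparison.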

I don't recommend using a neighborhood distance that encompasses different sites/blocks (rivers in your case).  Then your autocovariate values will be influenced by differences between sites and not by autocorrelation.  So if you have 4-7 segments in a river, each 450 m long, I wouldn't consider anything more than around 3200 m.

Here's some of your script I played with.  You could also justify trying the autocovariate as a detection variable.

library(spdep)
library(unmarked)
library(AICcmodavg)  # provides aictab()

# Autocovariate with a 3200 m neighborhood (ab and coord come from your script);
# style="B" gives a weighted sum
ac3200 <-autocov_dist(z=ab, xy=coord, nbs=3200, type="inverse",style="B",longlat=FALSE, zero.policy=TRUE)
siteCovsAC <- cbind(siteCovs,ac3200)
siteCovsAC[1,]  # check that the new column was added
UMF <- unmarkedFramePCount(y=ydata, siteCovs=siteCovsAC, obsCovs=obsCovs)

## Run a simple version of 4 models without the autocovariate
mforest100m <- pcount(~1 ~forest100m,  data= UMF)
mslope <- pcount(~1 ~slope, data=UMF)
mforest100mdet<- pcount(~scatsub+tracksub ~forest100m, data=UMF)
mslopedet<- pcount(~scatsub+tracksub ~slope, data=UMF)

## The same 4 models with the autocovariate added as an abundance covariate
mforest100mAC <- pcount(~1 ~forest100m +ac3200,  data= UMF)
mslopeAC <- pcount(~1 ~slope +ac3200 , data=UMF)
mforest100mdetAC<- pcount(~scatsub+tracksub ~forest100m +ac3200, data=UMF)
mslopedetAC<- pcount(~scatsub+tracksub ~slope +ac3200, data=UMF)

## Compare all 8 models
Cand.mod <- list( mforest100m, mslope, mforest100mdet, mslopedet,
mforest100mAC, mslopeAC, mforest100mdetAC, mslopedetAC)
Modnames <- c( "forest100m", "slope", "forest100m+Sub", "slope+Sub",
"forest100mAC", "slopeAC", "forest100m+SubAC", "slope+SubAC")
print(aictab(cand.set = Cand.mod, modnames = Modnames), digits = 4)

## Compare estimates with and without the autocovariate
mforest100mdetAC
mforest100mdet


Mike Worland



Chris Smith

Dec 15, 2017, 6:31:50 PM
to unmarked
Hey Mike, thanks again for the quick and straightforward response!  I ended up running the initial model set with just habitat and detection covariates and coming up with a top model.  I then fit this top model plus an autocovariate at each of 450, 800, 1250, 1700, 2150 and 2500 m (2500 m was the longest straight-line distance of any river I used) and put all of these in a model set, which I pasted below.  I was surprised none of the autocovariate models came out on top, but maybe that is from spatial autocorrelation in the top model without autocovariates?  I wanted to check whether a reasonable way to proceed is to re-run the few top models from the initial model set with the top autocovariate (ac1250) in all of them, and use that as my final model set.  Is there any straightforward way to check if this alleviated my spatial autocorrelation problem?

         K     AICc Delta_AICc AICcWt Cum.Wt       LL
nullfarm 5 172.7779     0.0000 0.2977 0.2977 -80.7507
ac1250   6 174.5103     1.7324 0.1252 0.4228 -80.3421
ac2500   6 174.5109     1.7329 0.1251 0.5480 -80.3424
ac2150   6 174.5457     1.7678 0.1230 0.6710 -80.3598
ac1700   6 174.5970     1.8191 0.1199 0.7908 -80.3855
ac800    6 174.6245     1.8466 0.1182 0.9091 -80.3992
ac450    6 175.1492     2.3713 0.0909 1.0000 -80.6616

One last question as well (if you have time). I had a colleague mention that he thought my problem was with pseudoreplication and not spatial autocorrelation. From the papers I could find, it looked like spatial autocorrelation can be seen as the reason for my pseudoreplication problem... does that seem right to you?

Thanks again for all your help,
Chris

Kery Marc

Dec 16, 2017, 6:14:13 AM
to unma...@googlegroups.com
Dear Chris,

about your last line: yes, spatial autocorrelation is exactly one of the factors that can lead to pseudoreplication, one very broad, heuristic definition of which could be "treating things as independent which are not".

(I think Hurlbert's own heuristic was "using the wrong error term for F-tests in an ANOVA table", but since nowadays everybody is a mixed or a hierarchical modeller and nobody runs good old ANOVAs anymore, this is a little obsolete.)

Best regards --  Marc




mike worland

Dec 18, 2017, 5:02:26 PM
to unmarked
Chris, if the spatial autocovariate alleviated spatial autocorrelation, then I think at a minimum it should be an important variable--models that contain it should have a lower AIC than models that don't.  So if this isn't happening, I'm not sure what's going on, except apparently the autocovariate isn't capturing the spatial autocorrelation. 
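
One possible check, sketched in base R, is to compute Moran's I on site-level residuals before and after adding the autocovariate and compare it with a permutation distribution. The function `morans_i()` is a hypothetical helper written for this example, the residuals are simulated stand-ins, and the 800 m binary neighbourhood is an arbitrary choice. What counts as a valid residual for these models is itself debated earlier in this thread.

```r
# Hypothetical helper: Moran's I for values r and a spatial weights matrix w
morans_i <- function(r, w) {
  rc <- r - mean(r)
  (length(r) / sum(w)) * sum(w * outer(rc, rc)) / sum(rc^2)
}

set.seed(7)
n  <- 60
xy <- cbind(runif(n, 0, 2500), runif(n, 0, 2500))  # simulated coordinates (m)
d  <- as.matrix(dist(xy))
w  <- ifelse(d > 0 & d <= 800, 1, 0)               # binary neighbours within 800 m

r <- rnorm(n)   # stand-in for site-level residuals from a fitted model

obs  <- morans_i(r, w)
perm <- replicate(999, morans_i(sample(r), w))     # shuffle residuals across sites

# One-sided permutation p-value for positive spatial autocorrelation
p_val <- (sum(perm >= obs) + 1) / (length(perm) + 1)
c(moran_I = obs, p_value = p_val)
```

If the autocovariate is doing its job, Moran's I for the residuals of the model that includes it should be closer to zero than for the model without it.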

Another option for you is to run a mixed model.  I believe you would use river as a random effect and the habitat variables you're interested in as fixed effects.  This should account for spatial autocorrelation/pseudoreplication within rivers.  The disadvantage is that mixed effect models are not run through unmarked (unless something has changed recently).  I understand that there's a way to combine a mixed effect model and an n-mixture model using Bayesian methods in WinBUGS, but I've never gone that route and so can't advise any further on that.


Mike  