Addressing reviewer comments about autocorrelation in data


Daniel Palacios

Apr 18, 2019, 11:48:15 AM
to HyperNiche and NPMR
I am responding to reviewer comments on a manuscript using NPMR currently under consideration, using a data set of whale satellite tracks (daily positions). The response variable is behavior mode (either resident or transient, derived from movement parameters) at each position, and the predictors are several environmental and topographic variables.

The primary criticism is whether the spatial autocorrelation present in animal tracks has been sufficiently accounted for in my NPMR model results. In other modeling frameworks like GLM and GAM, autocorrelation structures can be explicitly specified to account for this source of bias. Although autocorrelation is a feature of animal tracking data (and therefore analysts take steps to address it), I presume that spatial autocorrelation is a common concern in any habitat modeling study, as most data sets are sampled in a spatial pattern or grid. So one initial question is whether autocorrelation is a concern in NPMR and, if so, what its impacts are and how they can be mitigated.

My way of addressing this in the manuscript was to take a random subsample from the tracks to reduce the inherent autocorrelation. But the reviewers want to see more formal evidence that the autocorrelation has been addressed, while at the same time lamenting the loss of data due to the subsampling/decimation. Is subsampling/decimating a densely sampled data set a common practice in habitat models? Is there a reference I could cite for this approach?

One reviewer suggested fitting a purely spatial model (longitude and latitude as the only predictors) and somehow comparing it to the model fitted on environmental predictors. Somewhere in the HyperNiche documentation I read something about this at some point, but now I cannot find it.

Thanks for your thoughts,

Daniel




Bruce McCune

Apr 18, 2019, 12:29:09 PM
to hyper...@googlegroups.com
Daniel,

The reviewer's suggestion of making a model based on lat/long would be quite powerful and flexible, and a great model of spatial autocorrelation. The flexibility comes because it is not based on a linear model or on other assumptions about the nature of the autocorrelation. If you wanted to remove the autocorrelation you could fit a lat/long model and use the residuals for further analysis, but that risks throwing out the baby (or parts of the baby? -- sorry the analogy is not so good) with the bath water, since some of your predictors are likely autocorrelated too.

Another approach is to go ahead with the models you have but measure the degree of autocorrelation in the residuals. And you could do a randomization test of that to see if significant autocorrelation remains.
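
A minimal Python sketch of such a residual check, outside HyperNiche: Moran's I computed on the model residuals, followed by a randomization (permutation) test. The function names, the Gaussian distance weighting, and the `bandwidth` parameter are assumptions for illustration only. For a binary response the residual would be the observed 0/1 value minus the fitted probability, and the coordinates are assumed to be projected so that Euclidean distance is meaningful.

```python
import numpy as np

def gaussian_weights(coords, bandwidth):
    """Pairwise Gaussian distance weights with a zero diagonal (assumed weighting)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist_sq = (diff ** 2).sum(axis=-1)
    weights = np.exp(-0.5 * dist_sq / bandwidth ** 2)
    np.fill_diagonal(weights, 0.0)  # a point is not its own neighbor
    return weights

def morans_i(values, weights):
    """Moran's I: weighted cross-product of deviations over their variance."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / weights.sum()) * (z @ weights @ z) / (z @ z)

def permutation_test(values, weights, n_perm=999, seed=0):
    """One-sided randomization test: shuffle the values across positions."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    permuted = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(n_perm)]
    )
    # The observed value counts as one of the permutations.
    p_value = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p_value
```

Shuffling residuals across all positions ignores which track each position belongs to; shuffling within tracks would be a stricter variant of the same test.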

One thought about all of this, however, is that the biggest problem with autocorrelation is biasing formal hypothesis tests. I think you are relatively safe if your purposes are more descriptive.

Btw, here is a recent paper with caribou locations in Alaska that does a nice job, I think, of explaining the advantages of NPMR for this kind of data. It also has a component on determining the spatial scale of the caribou-habitat relationships: Nelson, P. R., K. Joly, C. Roland and B. McCune. 2018. Evaluating relocation extent versus covariate resolution in habitat selection models across spatiotemporal scales. Ecological Informatics 48: 245-256.  Peter might want to correct me if I'm wrong, but I don't think it addresses the autocorrelation issue directly.

Bruce McCune

--
You received this message because you are subscribed to the Google Groups "HyperNiche and NPMR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hyperniche+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Daniel Palacios

Apr 18, 2019, 1:52:36 PM
to HyperNiche and NPMR
Thanks for your helpful thoughts, Bruce. The recent paper by Nelson et al. will be a useful reference to cite as an example of NPMR applied to this kind of data, as editors and reviewers seem fixated on GLMs/GAMs with mixed effects and spatial covariance structures as the only valid modeling framework for animal tracking data. In this regard, your comment about autocorrelation mainly impacting hypothesis testing is worth pointing out to the editors and reviewers, since this is indeed a descriptive study.

A separate but somewhat related question that the reviewer raises is in relation to addressing multicollinearity in the predictors. This is how I addressed it in the original submission: "Multicollinearity among the predictors was assessed with the pairwise Pearson correlation coefficient (r) and graphically with scatterplot matrices. Redundant predictor pairs (i.e., those with correlations |r| ≥ 0.7) were considered for exclusion from multivariate models based on their relative performance in univariate models." However, the reviewer is requesting a more formal measure of multicollinearity like VIFs (variance inflation factors), or alternatively, a reference that shows that my choice of correlation threshold and comparison of univariate models is effective. Do you have thoughts on how to respond or address this? If I recall correctly, collinearity can bias parameter estimates in parametric modeling frameworks, so I could remind the reviewer that this does not apply to NPMR. Or am I incorrect here?
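
For reference, the screening step quoted above can be sketched in a few lines of Python. This is only an illustration of the |r| ≥ 0.7 rule, not the code used for the manuscript; the function name and the layout of `X` (rows are positions, columns are predictors) are assumptions.

```python
import numpy as np

def flag_correlated_pairs(X, names, threshold=0.7):
    """Return (name_i, name_j, r) for predictor pairs with |r| >= threshold."""
    r = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                flagged.append((names[i], names[j], float(r[i, j])))
    return flagged
```

Each flagged pair would then be resolved by comparing the two predictors' univariate model performance, as described in the quoted text.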

Finally, as a follow-up question to your suggestion of fitting a long x lat model as a great model for spatial autocorrelation, I don't think I want to use the residuals of such a model for further analysis because of the issue you mention, but I still think there would be value in using the predictions from the spatial model for addressing the reviewer comments. But I'm still a little bit unsure as to the metric(s) of autocorrelation that I would be computing on the long x lat model fit. It occurs to me that I could take slices along longitude and along latitude of the predicted surface, and compute the ACF for them. Or I could compute the ACF on the predicted response for each animal track. Or do you have better suggestions about how to actually report autocorrelation metrics?
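
The per-track option could look roughly like the following Python sketch, which computes the sample ACF of a series (for example, the predicted response) separately for each track. It assumes positions are evenly spaced in time (daily) and already sorted within each track; the function names and the `track_ids` array are hypothetical.

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag for an evenly spaced series."""
    x = np.asarray(series, dtype=float)
    if len(x) <= max_lag:
        raise ValueError("series must be longer than max_lag")
    x = x - x.mean()
    denom = x @ x  # zero for a constant series, which yields NaN values
    return np.array([(x[: len(x) - k] @ x[k:]) / denom for k in range(max_lag + 1)])

def acf_by_track(track_ids, values, max_lag):
    """Compute the ACF separately for each track, keeping rows in their given order."""
    track_ids = np.asarray(track_ids)
    values = np.asarray(values, dtype=float)
    return {
        tid: acf(values[track_ids == tid], max_lag)
        for tid in np.unique(track_ids).tolist()
    }
```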

Thanks again!

Daniel

Bruce McCune

Apr 19, 2019, 12:11:11 AM
to hyper...@googlegroups.com
Daniel, these seem like a couple of tough questions to me, because they are based on concepts from linear regression, and I'm not sure how they apply to nonparametric regression. Adding to this, the xR2 estimates given by HyperNiche are all cross-validated, so the diagonal of the "hat" matrix contains zeros (its trace is zero) and the model contains zero parameters, according to AIC concepts.

I think your approach to the collinearity issue is completely reasonable. If you have few predictors and you have enforced a modest limit on the correlation between predictors, then the variance inflation factor should be no big deal. Maybe you could calculate a variance inflation factor manually by doing the regressions of the predictors on each other, but that's in a linear parametric framework, so how does it apply to NPMR? I don't know.
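
For what it's worth, the manual calculation would look something like this Python sketch: each predictor is regressed on all the others by ordinary least squares, and VIF_j = 1 / (1 - R2_j). The function name and the layout of `X` (rows are observations, columns are predictors) are assumptions, and this stays entirely within the linear parametric framework, so the caveat about its relevance to NPMR still applies.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X from an OLS regression on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Design matrix: intercept plus every predictor except column j.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs
```

A VIF near 1 means a predictor is nearly uncorrelated with the rest; a pair correlated at |r| = 0.7 alone gives a VIF of about 2 (1 / (1 - 0.49)).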

On the autocorrelation function -- There is no single autocorrelation function in nonparametric regression -- it could be different in one part of the response surface than another or different in one direction than another. I suppose you could force the issue by using the fitted response surface to lat/long to calculate an autocorrelation function -- but then you are not measuring the autocorrelation in relation to the overall variance, because you have filtered out everything but the autocorrelation.

As for a metric for the autocorrelation, I think the cross-validated R2 for the nonparametric regression of Y on lat and long would be sufficient.
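
To make that concrete, here is a rough Python sketch of a leave-one-out local-mean regression of Y on lat/long with a product-Gaussian kernel, and the resulting cross-validated R2 (1 - RSS/TSS). It is an approximation written for illustration, not HyperNiche's implementation: the function names and the `tolerances` argument (kernel standard deviations per coordinate) are assumptions, and there is no minimum neighborhood size or tolerance optimization.

```python
import numpy as np

def loo_local_mean(y, coords, tolerances):
    """Leave-one-out local-mean estimates with a product-Gaussian kernel."""
    y = np.asarray(y, dtype=float)
    coords = np.asarray(coords, dtype=float)
    scaled = coords / np.asarray(tolerances, dtype=float)
    diff = scaled[:, None, :] - scaled[None, :, :]
    # Product of per-coordinate Gaussian weights = exp of summed squared distances.
    weights = np.exp(-0.5 * (diff ** 2).sum(axis=-1))
    np.fill_diagonal(weights, 0.0)  # each point is left out of its own estimate
    return (weights @ y) / weights.sum(axis=1)  # NaN if a point has no neighbors

def cross_validated_r2(y, y_hat):
    """xR2 = 1 - RSS/TSS from leave-one-out estimates; it can be negative."""
    y = np.asarray(y, dtype=float)
    rss = ((y - y_hat) ** 2).sum()
    tss = ((y - y.mean()) ** 2).sum()
    return 1.0 - rss / tss
```

An xR2 near zero (or below) for the lat/long model would suggest little purely spatial structure in Y at the chosen tolerances, while a high value would indicate strong spatial structure.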

Others please feel free to chime in here. So far I have been more concerned with getting the response shapes and interactions right than these other issues, so I'm sure that others have thought about this much more than I have.

Bruce



Daniel Palacios

Apr 19, 2019, 12:34:47 AM
to hyper...@googlegroups.com
Thanks for your further thoughts, Bruce. Just a clarification that since my response is binary, I'm reporting the logB and AveB metrics rather than xR2. Is it reasonable to assume that your comments equally apply to binary responses and logB? Or are you referring to the xR² (xR2?) value reported in the results of the model evaluation table that also includes SST, SSE, COR, etc.?

Daniel 



--
Daniel M. Palacios, Ph.D.
Endowed Assistant Professor in Whale Habitats
Whale Telemetry Group
Marine Mammal Institute and Dept. of Fisheries & Wildlife
Oregon State University
Hatfield Marine Science Center
2030 SE Marine Science Drive
Newport, OR 97365, USA

Office: HMSC 227 West Wing
Phone: 541-990-2750
Fax: 541-867-0128
Email: daniel....@oregonstate.edu