Only 1 level for factor covariate (when stratified)


Fay Frost

Apr 26, 2022, 5:55:18 AM
to distance-sampling
Hi all, 

Background:
I have been trying to run a distance sampling analysis on data from yearly surveys. I have many different covariates, such as habitat and weather. 

The aim is to get density estimates for each year so I am currently stratifying by year with the resolution of all my estimates at the stratum (year) level. For CDS, this is no problem. 

However, when I try to include covariate information and stratify by year, particular covariate factor levels become sparse, and in many cases I have no observations at all for some levels. 
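One way to see how bad the sparsity is before fitting anything is to tabulate detections by year and factor level. The following is only a minimal sketch: the `detections` list, the habitat labels, and the `sparse_cells` helper are invented for illustration and are not part of any distance sampling software.

```python
from collections import Counter

# Hypothetical detections: one (year, habitat) pair per observation.
detections = [
    (2019, "forest"), (2019, "forest"), (2019, "grass"),
    (2020, "forest"), (2020, "forest"),
    (2021, "grass"), (2021, "forest"), (2021, "grass"),
]

def sparse_cells(records, min_count=1):
    """Return the (year, level) cells with fewer than min_count detections."""
    counts = Counter(records)
    years = sorted({year for year, _ in records})
    levels = sorted({level for _, level in records})
    # Counter returns 0 for combinations that never occur in the data.
    return [(year, level)
            for year in years
            for level in levels
            if counts[(year, level)] < min_count]

# Cells with no detections at all, then cells with fewer than 2 detections.
empty = sparse_cells(detections)
thin = sparse_cells(detections, min_count=2)
print(empty)
print(thin)
```

Any year that appears in the list of empty cells cannot support that factor covariate on its own data.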

In contrast, if I do not stratify by year and instead pool all my data, the analysis runs fine but does not make sense ecologically, as the data span multiple years. 

Question: When you have a lack of data for some factor levels in particular strata, what can you do? Is there a way to share information from other strata (years)?

Thanks, 
Fay 

Eric Rexstad

Apr 26, 2022, 8:45:55 AM
to Fay Frost, distance-sampling
Greetings Fay

You have a couple of choices: if your data set is sufficiently rich that you can analyse years individually, then you can apply covariates in years where you have sufficient covariate information and don't apply covariates in years where covariate combinations are insufficient.

Alternatively, you could make year a covariate rather than a stratification criterion, but I suspect this won't be particularly helpful: you would treat year as a factor covariate, which is likely to compound the problem.

My recommendation is that you ignore your habitat and weather covariates and rely upon the unique property of distance sampling by which unbiased estimates of abundance are produced even if sources of variation in detection probability are ignored.  If you want more insight into pooling robustness, consult Section 11.12 of Advanced Distance Sampling by Buckland et al. (2004).


Fay Frost

Apr 27, 2022, 9:44:53 AM
to distance-sampling
Hi Eric, 

Thanks for the quick response, as always. 
This brings me nicely onto my next question. I had recently read that estimates from distance sampling should be largely unaffected if there is unmodelled heterogeneity in probability of detection. If that is the case, why should we collect covariate information and try to include this at all? 

Best wishes, 
Fay

Stephen Buckland

Apr 27, 2022, 10:58:22 AM
to Fay Frost, distance-sampling

If you want just an overall density or abundance estimate, pooling robustness works.  If however you want to estimate abundance in each of several geographic strata, and you fit a common detection function model, then the individual stratum estimates will be biased if you do not model important covariates.  The same is true if you have several years’ data, and wish to estimate abundance in each year, but with a common model across years.  This applies in your case.  You might try Eric’s suggestion of excluding weather and habitat covariates, but including year as a factor.  That way, pooling robustness will work for each individual year.  You could also try a stepwise approach to covariates:  start with none, add each one at a time, select the best model (using AIC), keep that covariate, then again add each of the remaining covariates one at a time.  Stop when AIC gets worse, or when model fitting fails due to inadequate data.
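The stepwise approach described above can be sketched in a few lines. This is a hedged illustration, not code from any distance sampling package: `forward_select`, `toy_fit` and the AIC values in `toy_aic` are all hypothetical, and in a real analysis `fit_aic` would wrap whatever routine fits the detection function and reports its AIC. As one possible reading of "model fitting fails", the sketch skips a covariate whose fit fails and stops once no remaining covariate can be fitted or AIC no longer improves.

```python
def forward_select(candidates, fit_aic):
    """Forward stepwise covariate selection by AIC.

    fit_aic(covariates) must return the AIC of the model using that tuple
    of covariates, or raise an exception if the model cannot be fitted.
    """
    selected = ()
    best_aic = fit_aic(selected)           # start with no covariates
    remaining = list(candidates)
    while remaining:
        trials = {}
        for cov in remaining:
            try:
                trials[cov] = fit_aic(selected + (cov,))
            except Exception:
                continue                    # fit failed (e.g. inadequate data)
        if not trials:
            break                           # nothing left that can be fitted
        cov, aic = min(trials.items(), key=lambda item: item[1])
        if aic >= best_aic:
            break                           # AIC did not improve: stop
        selected += (cov,)
        best_aic = aic
        remaining.remove(cov)
    return selected, best_aic


# Made-up AIC values; a missing entry stands in for a model that fails to fit.
toy_aic = {
    frozenset(): 510.0,
    frozenset({"year"}): 502.0,
    frozenset({"habitat"}): 506.0,
    frozenset({"weather"}): 509.0,
    frozenset({"year", "weather"}): 501.5,
}

def toy_fit(covariates):
    # Raises KeyError for combinations absent from the table.
    return toy_aic[frozenset(covariates)]

result = forward_select(["habitat", "weather", "year"], toy_fit)
print(result)
```

With these invented numbers the search adds year first, then weather, and stops because every model that also includes habitat fails to fit.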

 

Steve Buckland

Eric Rexstad

Apr 27, 2022, 11:52:44 AM
to Fay Frost, distance-sampling, Stephen Buckland
Thanks Steve.  Fay, just to amplify what Steve has said, covariates can help solve problems where no other approach will (e.g. stratum- or species-specific abundance estimates where some strata/species are data poor; or reducing size bias when the detections are of groups).

Other problems can be solved without covariates (e.g. pooling robustness can cope with unmodelled sources of heterogeneity).  In my experience, biological inferences about animal abundance are virtually identical for models with and without covariates for reasonable data sets.  Angst over model selection with a large candidate set of covariates is often better directed toward other facets of the survey (e.g. is spatial replication sufficient to produce reliable estimates of encounter rate variance and to ensure uniformity of animal distribution with respect to transects).
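The contrast drawn in this thread between overall and per-year estimates can be made concrete with toy arithmetic. Everything below is invented for illustration: the animal numbers, the detection probabilities, and the idealisation that a pooled detection function with no year effect recovers exactly the average detection probability across all animals.

```python
# Hypothetical truth: 100 animals in the covered area each year,
# but detectability differs between the two years.
true_n = {"year1": 100, "year2": 100}
true_p = {"year1": 0.6, "year2": 0.3}

# Expected number of detections in each year.
detected = {year: true_n[year] * true_p[year] for year in true_n}

# Idealised pooled fit (no year effect): average detection probability.
pooled_p = sum(detected.values()) / sum(true_n.values())

# The overall estimate is unaffected by ignoring the year effect ...
overall_estimate = sum(detected.values()) / pooled_p

# ... but per-year estimates that reuse the common probability are biased,
# whereas year-specific probabilities recover the truth.
per_year_common = {year: detected[year] / pooled_p for year in detected}
per_year_own = {year: detected[year] / true_p[year] for year in detected}

print(round(overall_estimate, 1))
print({year: round(n, 1) for year, n in per_year_common.items()})
print({year: round(n, 1) for year, n in per_year_own.items()})
```

Under these assumptions the total is recovered, while the common-model estimates overstate the high-detectability year and understate the low-detectability year.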
