Getting detection probabilities greater than 1.0

James Baldwin

Nov 16, 2014, 1:19:53 AM
to distance...@googlegroups.com
I'm getting detection probabilities greater than 1.0 for some distances (sometimes as high as 1.5) when using a covariate, yet the AICc is much smaller than without the covariate.

I'm running DISTANCE 6.2 and using a variable called Beaufort as a factor.  Typically, there are just 3 levels for this factor.  The DISTANCE commands are as follows:

Data /Structure=Flat;
  Fields=STR_LABEL, STR_AREA, SMP_LABEL, SMP_EFFORT, DISTANCE, SIZE, Beaufort;
Factor /Name=Beaufort /Levels=3 /Labels=0, 1, 2;
  Infile='c:\users\jbaldwin\Desktop\Example\dist.dat';
End;
Estimate;
  Distance /Nclass=7 /Width=194 /Left=0;
  Density=All;
  Density=Stratum /Design=Strata /Weight=Area;
  Encounter=Stratum;
  Detection=All;
  Size=All;
  Estimator /Key=HN /Adjust=CO /Criterion=AIC /Covariates=Beaufort;
  Pick=AIC;
  GOF;
  Cluster /Bias=GXLOG /Test=0.15;
  VarN=Empirical;
End;

The manual says that I can't request a strictly decreasing detection function when using covariates (although I don't see why not when the covariate is a factor).  How do I remedy this situation?

Any suggestions would be greatly appreciated.

Thanks,

Jim

Stephen Buckland

Nov 17, 2014, 4:33:07 AM
to James Baldwin, distance...@googlegroups.com

Jim, I would re-run your model without any adjustment terms (manual selection of adjustments, then specify zero).  This ensures that the fitted curves are non-increasing.  If AIC gets appreciably worse, that would suggest an assumption failure.  For example, you may have some avoidance of the line (in which case the above should ‘average out’ the shortage of detections near the line with the excess away from the line), or animals close to the line are being missed (as occurs for example with aerial surveys; left-truncation may work in this case).
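In the MCDS command language, the change would look something like the line below (replacing the Estimator line in Jim's Estimate section). The /NAP=0 switch, setting the number of adjustment parameters to zero, is my reading of "manual selection of adjustments, then specify zero"; check the MCDS command reference for your version before relying on it:

```
Estimator /Key=HN /Adjust=CO /NAP=0 /Criterion=AIC /Covariates=Beaufort;
```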

Steve Buckland

--
You received this message because you are subscribed to the Google Groups "distance-sampling" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distance-sampl...@googlegroups.com.
To post to this group, send email to distance...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/distance-sampling/40940b25-b404-4cc0-bacf-a7493fe0fd80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jim Baldwin

Nov 17, 2014, 7:04:50 PM
to Stephen Buckland, distance...@googlegroups.com
Thanks, Steve.  Yes, removing the cosine adjustment (/Adjust=CO) keeps things below 1.0.  I'll probably fit the series of data sets I have with the cosine adjustment and without the cosine adjustment and pick the smallest AICc except when any of the estimates exceed 1.0.  Does that sound too convoluted?

I'm assuming that while the estimation procedure does force the probability to be 1.0 at zero distance, the likelihood (and therefore the AICc) is based on a general curve-fitting procedure rather than one that restricts the maximum estimate to be no greater than 1.0 (especially when using the cosine adjustment).

Jim
--
Jim Baldwin
Station Statistician
Pacific Southwest Research Station
USDA Forest Service

Stephen Buckland

Nov 18, 2014, 4:00:47 AM
to Jim Baldwin, distance...@googlegroups.com

> Thanks, Steve.  Yes, removing the cosine adjustment (/Adjust=CO) keeps things below 1.0.  I'll probably fit the series of data sets I have with the cosine adjustment and without the cosine adjustment and pick the smallest AICc except when any of the estimates exceed 1.0.  Does that sound too convoluted?

That should be OK.  Personally, I’m not a fan of using adjustment terms in mcds.  They are there to put ‘wiggles’ in the detection function, if one of the key functions cannot fit the data well on its own, so giving greater flexibility.  Modelling covariates in the detection function also gives greater flexibility, and it’s not often that you get a big improvement in fit by including adjustments in addition to covariates, unless there are problems with the data (e.g. substantial rounding, responsive movement, etc).

> I'm assuming that while the estimation procedure does force the probability to be 1.0 at zero distance, the likelihood (and therefore the AICc) is based on a general curve-fitting procedure rather than restricting the maximum estimate to be no greater than 1.0 (especially when using the cosine adjustment).

If you have no adjustment, this cannot occur, because the models for the detection function are non-increasing.  However, once you include adjustments, there is no guarantee that the fitted probability density function is non-increasing.  Because we obtain the fitted detection function by scaling the density function so that it is one at zero distance, that detection function might then increase above one.  The cds engine of Distance deals with this by incorporating a penalty term in the likelihood that results in the fitted function being (almost) non-increasing.  The mcds engine does not have this feature.
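A quick numerical sketch of the scaling Steve describes (a half-normal key times a single cosine adjustment term, rescaled so that g(0) = 1; all parameter values here are invented purely to show the effect):

```python
import math

w = 1.0          # truncation distance (hypothetical)
sigma = 1.0      # half-normal scale (hypothetical)
a2 = -0.5        # cosine adjustment coefficient (hypothetical)

def unscaled(x):
    # half-normal key times one cosine adjustment term
    key = math.exp(-x * x / (2.0 * sigma * sigma))
    return key * (1.0 + a2 * math.cos(2.0 * math.pi * x / w))

def g(x):
    # divide by the value at zero distance so that g(0) is exactly 1
    return unscaled(x) / unscaled(0.0)

xs = [i * w / 200.0 for i in range(201)]
peak = max(g(x) for x in xs)
print(peak > 1.0)   # the rescaled curve exceeds one away from the line
```

With these values the curve peaks well above one near x = w/2, even though it is exactly one at zero distance, which is the behaviour Jim saw.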

Steve

jeff....@noaa.gov

Nov 18, 2014, 5:49:11 PM
to distance...@googlegroups.com, jbal...@fs.fed.us, st...@st-andrews.ac.uk
The cds engine restricts probabilities to at most 1 by creating a grid across 0 to W and then enforcing g(x) <= 1 on that grid.  You can do that with adjustments, but if you then add covariates that scale the detection function, you would have to set up the grid over x for each covariate value.  That might be doable for a single factor covariate, but imagine what would be needed with several factor covariates or, worse, a numeric variable.  Even if you then enforced the constraint for each unique set of covariates in the data, that would not prevent a prediction of g(x) > 1 for covariate values not included in your observed data.
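A sketch of why the grid check would have to be repeated per covariate value, assuming the covariate acts on the half-normal scale parameter (one hypothetical sigma per Beaufort level; all numbers are made up):

```python
import math

w = 1.0                               # truncation distance (hypothetical)
a2 = -0.3                             # shared cosine adjustment coefficient
grid = [i * w / 100.0 for i in range(101)]

def g(x, sigma):
    # rescaled detection function, so g(0, sigma) = 1 for every level
    key = math.exp(-x * x / (2.0 * sigma * sigma))
    series = 1.0 + a2 * math.cos(2.0 * math.pi * x / w)
    return key * series / (1.0 + a2)

def exceeds_one(sigma):
    # the constraint g(x) <= 1 has to be checked over the whole grid,
    # separately for every covariate level's scale parameter
    return max(g(x, sigma) for x in grid) > 1.0

for level, sigma in {0: 0.9, 1: 0.6, 2: 0.4}.items():
    print(level, exceeds_one(sigma))
```

Whether the constraint is violated depends on the level's scale parameter, so a single shared grid check is not enough, and a level not seen in the data could still violate it.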

I very much agree with Steve.  If you are using covariates, then you are trying to explain what affects g(x).  When you use adjustment functions, you are simply fitting for prediction without any attempt to explain.  I'd go further than Steve and say that I do not think the paradigms should be mixed.

--jeff

David Lawrence Miller

Nov 18, 2014, 7:24:19 PM
to distance...@googlegroups.com
If anyone is interested, Len Thomas and I have developed a method for
building detection functions with mixtures of half-normals. This means
that one can have a flexible model with covariates that is monotonically
decreasing by construction (rather than by constraint).

Some slides from a talk I gave recently are available on my website
(http://dill.github.io/talks/ncsu-mixtures/talk.html) and there is an R
package (mmds) on CRAN. Folks are welcome to e-mail me directly if they
would like a pre-print (paper currently in revision).
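A sketch of the idea, with invented weights and scales: each half-normal component is non-increasing, so any convex combination of them is too, and monotonicity holds by construction rather than by constraint:

```python
import math

# Hypothetical mixture: weights sum to one, one scale per component.
weights = [0.7, 0.3]
sigmas = [1.2, 0.3]

def g(x):
    # mixture of half-normal detection functions; g(0) = sum(weights) = 1
    return sum(p * math.exp(-x * x / (2.0 * s * s))
               for p, s in zip(weights, sigmas))

xs = [i * 0.02 for i in range(101)]          # distances 0 .. 2
vals = [g(x) for x in xs]
assert vals == sorted(vals, reverse=True)    # never increases with distance
```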


best,
--dave

Sarah Anderson

Jan 25, 2017, 2:09:09 PM
to distance-sampling, jbal...@fs.fed.us, st...@st-andrews.ac.uk
Stephen,

Sorry to jump in on a two-year-old thread, but I am having a similar problem to Jim's and can't quite seem to find an answer.  All of my models have a detection probability > 1 despite different truncation and binning techniques.  If I am understanding everything correctly, I never added any adjustment terms to my models.  My current best model is a half-normal cosine with right and left truncation.

Logically, it doesn't make sense to me that detection probability could be >1 so I'm trying to understand this.  I appreciate any insight you might have for me!

Sarah  

Stephen Buckland

Jan 25, 2017, 2:50:03 PM
to Sarah Anderson, distance-sampling, jbal...@fs.fed.us

Sarah, I suspect you’re just mis-interpreting the plots.  The detection function is the smooth curve, and should never exceed 1.  It should be exactly one at zero distance.  The histogram bars are NOT estimates of detection probability, but are there to help judge whether the fitted detection function fits the data well.

(If you add adjustment terms and don't impose a monotonicity constraint, it is possible for the curve to go above one, which would probably indicate a problem with the data.)

Steve
