Dear list,
I have stumbled at a simple hurdle while attempting to estimate and select a suitable detection function for my line transect data (CDS, via R).
The figure below shows detection functions (solid black line) and observed distances (histogram) for our aerial survey (binned data, left truncated).
The half-normal detection function (left) is much preferred by AIC relative to the hazard rate detection function (right).
I wish to understand two aspects better:
1. To me, the fitted line of the hazard rate model seems to ‘track’ the histogram much better than the fitted line of the half-normal model. Yet this model receives no AIC support at all, relative to the half-normal model. Why may this be?
2. The scale of the histograms differs between the two figures. In particular, should I be concerned that the half-normal histogram does not reach a detection probability of 1 in the first bin (g(0))?
Thanks in advance for any feedback, and I hope my ignorance doesn't offend anyone!
Chris
Hi Chris, list
I'll start with your second question, which is easier to address. You should not be worried that the "scale" of the histograms is different. The y-axis on the plot should be used only to read the detection probability given by the fitted line, so by definition the line intercepts the y-axis at one (since we assume g(0)=1 in conventional distance sampling). The histogram bars are simply rescaled so that the bar area above the line is the same as the area below the line that the bars leave uncovered; in other words, the total area of the bars matches the area under the line. Something else that often raises questions (e.g. I have been explicitly asked by a reviewer about this, and it might look odd to folks not familiar with distance sampling data) is the fact that in your right plot one of the bars is above 1, yet a probability is bounded between 0 and 1. Again, this is not something to worry about, because the bar heights can't be interpreted on that y-axis scale.
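To make the rescaling concrete, here is a minimal Python sketch of one way to scale histogram bars so that their total area equals the area under a fitted detection function. It is an illustration only, not the code the Distance software uses: the bin cutpoints, counts, and half-normal scale parameter are all invented, and the helper names (`half_normal`, `area_under`, `rescaled_bar_heights`) are hypothetical.

```python
import math


def half_normal(x, sigma):
    """Half-normal detection function, with g(0) = 1 by construction."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def area_under(g, lower, upper, steps=10_000):
    """Trapezoidal approximation of the integral of g over [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (g(lower) + g(upper))
    for i in range(1, steps):
        total += g(lower + i * width)
    return total * width


def rescaled_bar_heights(edges, counts, g):
    """Scale bars so that the sum of (height * bin width) equals the curve area."""
    curve_area = area_under(g, edges[0], edges[-1])
    n = sum(counts)
    heights = []
    for lo, hi, count in zip(edges[:-1], edges[1:], counts):
        # Each bar gets a share of the curve area proportional to its count.
        heights.append((count / n) * curve_area / (hi - lo))
    return heights


# Invented example: left truncation at 25 m, four 50 m bins, made-up counts.
edges = [25.0, 75.0, 125.0, 175.0, 225.0]
counts = [75, 18, 5, 2]
sigma = 80.0  # assumed half-normal scale, not estimated from any real data


def g(x):
    return half_normal(x, sigma)


heights = rescaled_bar_heights(edges, counts, g)
bar_area = sum(h * (hi - lo) for h, lo, hi in zip(heights, edges[:-1], edges[1:]))
curve_area = area_under(g, edges[0], edges[-1])

print("bar heights:", [round(h, 3) for h in heights])
print("bar area matches curve area:", abs(bar_area - curve_area) < 1e-9)
```

Nothing in this construction caps a bar at 1. With these invented counts the first bar ends up above 1 even though the curve itself never does, which is why the bar heights should not be read as probabilities.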
Regarding your first question, sometimes a question just raises more questions ;) It is a bit difficult to know what might be going on without looking at the data, especially since you are using left truncation, which can raise additional difficulties since to some extent one is asking the software to extrapolate a function to where there are no data. You do not really say much about sample size or the AIC values you are looking at. I know this is not your question, but why are you using left truncation? Is that due to the blind spot under the helicopter/airplane?
I note that the HR model seems to fit worse in the tail, but in general it is not the tail that you should be worried about. I would say that the HN model seems to overshoot in an unexpected way, so it might lead to an overestimation of density, and since that is based on extrapolating the fit into an area where you have no data, it might be questionable. As a side note, with unbinned data that are not left truncated, often the opposite happens, with the HN model tending to average across spikes at small distances and the HR model leading to overestimation.
How different are your density estimates from the two models? What are your sample sizes? Have you considered adjustment terms? Have you considered just subtracting 25 meters (or whatever distance you are left-truncating the data at) from all the distances and fitting a detection function to the shifted data? I wonder whether in such a case you would still get the HN model as the best fit; if you do not, that might hint at problems with the current left truncation.
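The shift suggested above could be sketched as follows. This is a minimal Python illustration with invented distances; the 25 m truncation point and the helper name `shift_distances` are assumptions for the example, and the detection function fit itself is not shown. Note that fitting to shifted distances amounts to treating the truncation point as the new zero, i.e. assuming detection is certain there.

```python
LEFT_TRUNCATION_M = 25.0  # assumed truncation distance; use your own value


def shift_distances(distances, left_truncation):
    """Drop observations inside the truncated strip, then re-zero the rest."""
    kept = [d for d in distances if d >= left_truncation]
    return [d - left_truncation for d in kept]


# Invented perpendicular distances in meters (the first lies in the blind strip).
observed = [10, 25, 40.5, 130]
shifted = shift_distances(observed, LEFT_TRUNCATION_M)
print(shifted)

# For binned data, the bin cutpoints would need the same shift.
cutpoints = [25.0, 75.0, 125.0, 175.0, 225.0]  # invented cutpoints
shifted_cutpoints = [c - LEFT_TRUNCATION_M for c in cutpoints]
print(shifted_cutpoints)
```

The shifted distances and cutpoints would then be passed to the detection function fit with no left truncation specified.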
hope this helps, maybe others have additional comments re your first point
cheers
Tiago
--
You received this message because you are subscribed to the Google Groups "distance-sampling" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distance-sampl...@googlegroups.com.
To post to this group, send email to distance...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/distance-sampling/7148b90b-b6a3-40b0-857a-12385772b7b1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Chris
Thanks for the additional detail.
Your keen observation that the model reporting SE=203 has not converged is undoubtedly correct. If the model has not converged, then the likelihood reported for the model is wrong; consequently the AIC value is also wrong and not to be believed.
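The link between a wrong likelihood and a wrong AIC follows from the definition AIC = 2k - 2 log(L), where k is the number of estimated parameters. The short Python sketch below uses entirely invented log-likelihood values (they are not from this thread's data) to show how a hazard-rate fit whose optimiser stalled short of the maximum can look far worse by AIC than the same model would if it had converged.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# All log-likelihoods below are invented for illustration only.
fits = {
    "half-normal": {"loglik": -310.2, "n_params": 1},
    "hazard-rate (converged)": {"loglik": -308.9, "n_params": 2},
    # Same model, but the optimiser stopped before reaching the maximum,
    # so the reported log-likelihood is too low.
    "hazard-rate (stalled optimiser)": {"loglik": -325.0, "n_params": 2},
}

scores = {name: aic(f["loglik"], f["n_params"]) for name, f in fits.items()}
best = min(scores.values())

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}, delta AIC = {score - best:.1f}")
```

In this made-up case the stalled fit appears to have no support at all, while the converged fit of the same model would have been marginally preferred. So the AIC comparison only means something once convergence has been checked.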
I cannot make a general statement that applies to your situation, but the optimiser behind the 'ds()' function in R does sometimes struggle to converge, particularly when incorporating covariates, and most specifically when those covariates are factors rather than continuous.
--
Eric Rexstad
Research Unit for Wildlife Population Assessment
Centre for Research into Ecological and Environmental Modelling
University of St. Andrews
St. Andrews Scotland KY16 9LZ
+44 (0)1334 461833
The University of St Andrews is a charity registered in Scotland : No SC013532