exact distances vs. bad distance bins


Vaughn Bodden

Mar 30, 2020, 11:20:05 AM
to distance-sampling
Hi all,

My colleagues and I recently completed a point transect survey of a bromeliad plant. The population occurs at high densities in a 17.6-hectare forest over difficult karst terrain. For the majority of observations we recorded exact distances (777 obs.), but we had to use distance bins for plants we could not reach safely (307 obs.). After truncation at 15 meters, there are 1084 observations in total across 51 points. Unfortunately, the distance bins we used were too wide for the study organism, and models fitted to the binned data perform poorly on the goodness-of-fit tests. The choice of whether to use binned or exact distances has a noticeable effect on the estimated encounter rate of clusters (21.25 vs. 15.23, respectively) and on the final density estimates.

See the attached detection functions for comparison; the exact-distance figure uses a 1-meter bin width for the histogram. The lack of observations in the first meter of the exact-distance plot is likely due to us choosing clear patches to stand in while making observations, to avoid trampling the plants.

Do the 'Distance' or 'mrds' R packages have the functionality to estimate the encounter rate and the detectability separately? Or is it possible to do this with a custom model? Ideally, I would like to use all observations to estimate the encounter rate and only the exact distances to estimate detectability.


In addition, I am having trouble fitting any hazard-rate models to these data. I keep getting the error message:

Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent

Any suggestions? The data I am using appears to be fine.

Kind regards,
Vaughn Bodden
OG_binned_distances_5m.png
OG_exact_distances_1mBins.png

Vaughn Bodden

Mar 30, 2020, 11:29:54 AM
to distance-sampling
The detection functions attached to the previous post were fitted with a half-normal key function, no adjustments, no covariates.

Eric Rexstad

Mar 30, 2020, 11:35:12 AM
to Vaughn Bodden, distance-sampling
Vaughn

Quite a list of questions in your message: goodness of fit, exact vs binned, encounter rate matters, custom modelling, hazard rate warnings.

Before dealing with the others, it seems the central issue revolves around what to do with the detections beyond 15m. I would want to know more about matters before making a recommendation, but my first thought would be to ignore them completely. Detections at large distances from the transect contribute little information about the shape of the detection function, particularly the shape near 0. The fits of your half normal to the exact distance data don't look alarming.

I wouldn't be so excited about the difference in encounter rate for inclusion/exclusion of detections beyond 15m; there is compensation for that given that truncation distance (w) appears in the denominator of the density estimator.  I wouldn't think that there would be vast differences in estimates resulting from choice of truncation distance.  But happy to explore further with you off-list.
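To make the compensation concrete, here is a sketch assuming the standard point transect estimator for individually detected objects. The symbols n, k, and P_a are notation introduced here for illustration; they do not appear in the messages above.

$$\hat{D} = \frac{n}{k \, \pi w^{2} \, \hat{P}_a}$$

Here n is the number of detections within the truncation distance w, k is the number of points, and \hat{P}_a is the estimated probability of detecting an object that lies within w of a point. A larger w admits more detections into n, but it also enlarges the effective area k \pi w^{2} \hat{P}_a in the denominator, so the two changes partly offset each other. When detections are clusters, the same expression gives cluster density, which is then multiplied by the estimated mean cluster size.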

Eric Rexstad
Senior Research Fellow, CREEM, Univ. of St Andrews, Charity SC013532
--
You received this message because you are subscribed to the Google Groups "distance-sampling" group.
To unsubscribe from this group and stop receiving emails from it, send an email to distance-sampl...@googlegroups.com.

Tiago Marques

Mar 30, 2020, 12:30:07 PM
to Vaughn Bodden, distance-sampling
Hi Vaughn,

I was a bit worried when I read your wording "The choice of whether to use binned or exact distances has a noticeable effect on the encounter rate" because, in the way most people refer to it, binning should not have an impact on encounter rate. Hence, let me restate your problem to see if I got it right.

You have a total of 1084 observations, but for a subset of these, namely 307, you could not get exact distances. You are worried because you would like to use the full information in the other 777 distances, while for the 307 binned distances you know only the 5-meter class each falls in.

If that is the case, then I suspect that in practice the loss of precision from using the 3 bins is not that big (your detection function fits seem similar), but you could also try:

1. Use all the data for the encounter rate, but estimate detectability from just the 777 exact distances, and then use the estimated detection probability as a multiplier for the 1084 observations.
2. (Harder, because it requires bespoke coding.) Code your own likelihood, with the two types of data coming in but informing the same likelihood.
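Option 1 can be sketched in base R as follows. This is an illustration under stated assumptions, not the method used by the 'Distance' or 'mrds' packages: it assumes a half-normal detection function with no covariates, and it assumes the exact-distance subset is representative of all detections. The distances are simulated with a hypothetical scale parameter (`sigma_true`), so the resulting density is not an estimate for the survey; only the counts 1084, 777, and 51 and the 15 m truncation come from the thread.

```r
# Simulated illustration only: the distances below are NOT the survey data.
set.seed(1)
w <- 15           # truncation distance in meters (from the thread)
k <- 51           # number of points (from the thread)
n_all <- 1084     # all detections, exact plus binned (from the thread)
sigma_true <- 6   # hypothetical half-normal scale, used only to simulate

# Draw radial distances from the half-normal point transect density,
# f(r) proportional to r * exp(-r^2 / (2 sigma^2)) on [0, w], by inverse CDF.
r_half_normal <- function(n, sigma, w) {
  u <- runif(n)
  sqrt(-2 * sigma^2 * log(1 - u * (1 - exp(-w^2 / (2 * sigma^2)))))
}
exact <- r_half_normal(777, sigma_true, w)

# Negative log-likelihood of the exact distances under a half-normal key.
nll_exact <- function(sigma, r, w) {
  -sum(log(r) - r^2 / (2 * sigma^2) - log(sigma^2) -
         log(1 - exp(-w^2 / (2 * sigma^2))))
}

# Step 1: detectability from the exact distances only.
fit <- optimize(nll_exact, interval = c(0.5, 50), r = exact, w = w)
sigma_hat <- fit$minimum
p_a <- 2 * sigma_hat^2 * (1 - exp(-w^2 / (2 * sigma_hat^2))) / w^2

# Step 2: encounter rate from all detections (1084 / 51, about 21.25).
encounter_rate <- n_all / k

# Step 3: combine, using p_a as a multiplier on the covered area.
density_per_m2 <- encounter_rate / (pi * w^2 * p_a)
density_per_ha <- density_per_m2 * 1e4
```

The variance of such an estimate would need to combine the encounter rate variance with the variance of `p_a`, for example by the delta method or a bootstrap; that step is omitted here.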

Did that help?

Cheers

T



Vaughn Bodden

Mar 30, 2020, 12:52:04 PM
to distance-sampling
Hi Tiago, 

Thanks for the advice. 
I was speaking with Eric off list about the issue too. Yes, it seems that the fit is not noticeably different between the two plots. 
The final model selected by AIC scores uses two categorical covariates so I will not be able to use the goodness-of-fit test with the binned data but can assess fit visually. 

I'll look into writing a custom model but for now, I will proceed using the binned data and a visual assessment of fit. 
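A custom model of this kind could be sketched in base R as a single likelihood that takes both data types. This is a minimal illustration, assuming a half-normal detection function with no covariates, cutpoints at 0, 5, 10, and 15 m, and that whether a plant was measured exactly or binned does not depend on its distance. The data are simulated with a hypothetical scale parameter (`sigma_true`); the function names are made up for this example.

```r
# Simulated illustration only: the distances below are NOT the survey data.
set.seed(2)
cuts <- c(0, 5, 10, 15)   # bin cutpoints in meters; last value is w
w <- cuts[length(cuts)]
sigma_true <- 6           # hypothetical half-normal scale, used to simulate

# Inverse-CDF draws from the half-normal point transect density on [0, w].
r_half_normal <- function(n, sigma, w) {
  u <- runif(n)
  sqrt(-2 * sigma^2 * log(1 - u * (1 - exp(-w^2 / (2 * sigma^2)))))
}
exact <- r_half_normal(777, sigma_true, w)
binned_counts <- as.vector(table(cut(r_half_normal(307, sigma_true, w),
                                     breaks = cuts, include.lowest = TRUE)))

# Probability that a detected object falls in each bin, given sigma.
bin_probs <- function(sigma, cuts) {
  s <- exp(-cuts^2 / (2 * sigma^2))
  -diff(s) / (1 - s[length(s)])
}

# Joint negative log-likelihood: a density term for each exact distance
# plus a multinomial term for the binned counts, sharing one sigma.
nll_joint <- function(sigma, r, counts, cuts) {
  w <- cuts[length(cuts)]
  ll_exact <- sum(log(r) - r^2 / (2 * sigma^2) - log(sigma^2) -
                    log(1 - exp(-w^2 / (2 * sigma^2))))
  ll_binned <- sum(counts * log(bin_probs(sigma, cuts)))
  -(ll_exact + ll_binned)
}

fit <- optimize(nll_joint, interval = c(0.5, 50),
                r = exact, counts = binned_counts, cuts = cuts)
sigma_hat <- fit$minimum
```

Extending the sketch to covariates or a hazard-rate key would need a multi-parameter optimizer such as `optim` in place of `optimize`.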

That helps, thank you! 

Any suggestion on the error message (see below) I was receiving when fitting hazard rate models?

Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent

Nothing obviously wrong with the data.


 

Vaughn Bodden

Mar 30, 2020, 3:09:40 PM
to distance-sampling


I am still unsure of what could be causing the error I received when trying to fit hazard-rate models. 

The error message:
Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent


and it continues on printing that error message out numerous times. There is nothing unusual about my data that I can find that should cause that error.

Here are the first few rows of the data:
head(dat)
  Sample.Label Region.Label     Area Effort size distance distbegin distend position age flower habitat_type visibility karst_index
1         OG22           IW 17.60383      1   NA       NA        NA      NA     <NA>              dry_forest          4           1
2         OG23           IW 17.60383      1    5     5.38         5      10        g   a      n   dry_forest          4           2
3         OG23           IW 17.60383      1    3     4.39         0       5        g   a      n   dry_forest          4           2
4         OG23           IW 17.60383      1    4    12.00        10      15        g   a      n   dry_forest          4           2
5         OG23           IW 17.60383      1    2     9.98         5      10        g   a      n   dry_forest          4           2
6         OG23           IW 17.60383      1    1    14.00        10      15        g   a      n   dry_forest          4           2


Any ideas on what may be causing this?

Best,
 Vaughn

Vaughn Bodden

Mar 30, 2020, 3:36:19 PM
to distance-sampling
So it seems that restarting R and reloading my data fixed the problem with fitting hazard-rate models. I am not sure what the actual cause of the error message was.

Thanks for your time, Eric and Tiago, it is appreciated.

All the best,
Vaughn  


Valeria Valeria

Apr 5, 2020, 8:47:39 AM
to Vaughn Bodden, distance-sampling
Good morning, I hope this email finds you well. I am looking for a copy of the GEBCO Digital Atlas (GDA), from the General Bathymetric Chart of the Oceans. The GDA was a two-volume DVD and CD-ROM set that contained GEBCO's 30 arc-second interval global gridded bathymetric data set and the GEBCO Centenary release collection of bathymetric contours.
Does anyone have a copy of this, or do you know where I can find it? I really need a copy. Thanks for your help. Best regards, Valeria


leob...@gmail.com

Aug 11, 2021, 2:10:06 PM
to distance-sampling
Hi Vaughn, or anyone who might like to answer,

Can you please explain why you cannot perform a goodness-of-fit test with categorical covariates and binned data?

Thanks,

Eric Rexstad

Aug 12, 2021, 3:10:19 AM
to leob...@gmail.com, distance-sampling

Leo

The issue is degrees of freedom.  With binned data, degrees of freedom is the number of bins minus 1.  For example, distances recorded in 4 distance categories leaves the analyst with 3 degrees of freedom.

If a half normal key function is fitted with a factor covariate with 2 levels, that model has 2 parameters (intercept and offset for the second level). Fitting that model to the 4 bin data would leave 1 degree of freedom for evaluating the chi-square goodness of fit.

However, if a hazard rate key function with the same two level factor covariate were fitted to the same data, this model would have 3 parameters, the same number as degrees of freedom, leaving no degrees of freedom available for the chi-square goodness of fit test.

With multiple factor covariates in the detection function model, the number of parameters in such models quickly exceeds the number of degrees of freedom from binned data.
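The arithmetic above can be written as a small helper. This is only a sketch of the counting rule described in this message; `gof_df` is a name made up for the example and is not a function in 'Distance' or 'mrds'.

```r
# Degrees of freedom left for the chi-square goodness-of-fit test:
# (number of bins - 1) minus the number of estimated parameters.
gof_df <- function(n_bins, n_params) {
  (n_bins - 1) - n_params
}

gof_df(n_bins = 4, n_params = 2)  # half normal + 2-level factor: 1
gof_df(n_bins = 4, n_params = 3)  # hazard rate + 2-level factor: 0

# The test can only be evaluated when at least 1 degree of freedom remains.
gof_df(n_bins = 4, n_params = 3) >= 1  # FALSE
```

Each additional factor level adds one parameter, so with only a handful of bins the result reaches zero or goes negative very quickly.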

To view this discussion on the web visit https://groups.google.com/d/msgid/distance-sampling/cc3930b1-31ea-45ed-98f3-1f74cd8b8fdan%40googlegroups.com.
-- 
Eric Rexstad
Centre for Ecological and Environmental Modelling
University of St Andrews
St Andrews is a charity registered in Scotland SC013532