Potential overestimation of density/abundance for a common species using camera trap distance sampling?

Anna Staudenmaier

Jul 22, 2021, 10:44:10 PM
to distance-sampling
Hello all,

I am new to distance sampling and have found this group and the distancesampling.org materials extremely helpful in teaching myself how to implement this methodology to a camera trapping study, so thank you all so much for that!

I am, however, wondering if I did everything correctly in my study. Going through the steps outlined in this example (Analysis of camera trapping data (distancesampling.org)), all models ran successfully and yielded density estimates with a decent CV (0.24) after 1000 bootstraps, but my density estimate itself was ~275 animals/km2. I am studying a non-native ungulate on an island where it is abundant and likely overpopulated, but I am concerned that I have unknowingly done something in my modelling or data collection to cause bias and overestimation as this is a very high density estimate. This population has never been studied before so I have nothing to directly compare my results to, though similar systems have reported densities that are not statistically significantly different from mine.

I am happy to provide more detailed information about the study design, results, etc. off list but here is a brief description.

Cameras were deployed at 23 sites for ~900 active trap days (full 24-hr days, no malfunctions or researcher visits) resulting in ~7000 videos of the target species. Due to this high number of videos I limited analysis to peak activity times (2 hours total) as determined by a histogram of video start times (~1000 videos remained). After excluding videos with obvious reactivity to the camera I was left with ~500 videos and pulled ~5000 distance measures from those videos. Distances to each individual's midpoint in the FOV at the snapshot moment (t=2) were recorded. These data resulted in the attached histogram of detection distances, which I thought looked ok when compared to other histograms in similar published studies (Howe et al. 2017, Bessone et al. 2020). Those data were best fit to a hazard rate model without adjustments as seen in the detection probability graph yielding the PDF seen in the neighboring graph. I overlooked the extremely high first bin in the detection probability histogram, but I have a feeling something is wrong with that. 

If anyone sees any red flags that would explain an overestimation or has suggestions I would appreciate their insights.

Thanks!
Anna
ARSDistanceListQ.pdf

Eric Rexstad

Jul 23, 2021, 2:22:37 AM
to Anna Staudenmaier, distance-sampling

Anna

Glad you are finding our materials useful.  At first glance, I'm not too concerned about the radial distance histogram you provided nor the hazard rate fit to them.  The pdf plot (bottom right) is more informative than the usual detection function plot (bottom left).  Note the fitted hazard rate function doesn't try to fit that peak you describe.

A more likely culprit of a miscalculation could be the calculation of effort.  Revisit the number of snapshot moments in your two hours of sampling effort per day.  We could discuss further off line, if needed.

-- 
Eric Rexstad
Centre for Ecological and Environmental Modelling
University of St Andrews
St Andrews is a charity registered in Scotland SC013532

Anna Staudenmaier

Jul 25, 2021, 7:14:37 PM
to distance-sampling
Hello Eric, 

Thanks for the quick reply. It's good to know the graphs aren't concerning. 

For my calculation of effort to be entered into the flat file I did the following.
1. Calculated the number of complete 24-hour days in which cameras did not malfunction and were not visited by researchers
2. Multiplied that number of days by the number of seconds in 2 hours (the amount of time encompassed in my peak activity times)
3. Plugged that value into the following equation to get the value for the effort column for effort expended at each camera site (ek): 

ek = (camera FOV in radians × number of seconds from step 2, which differed for each camera) / (2 × π × t), where t = 2 for my study

I got this equation from Equation 1 in the Howe et al. (2017) paper. This was however followed by Equation 2 in the paper which substituted ek for Tk. I wasn't sure if I was correct in using the formula for ek to calculate effort for the flat file or if I should have just calculated effort as (Tk/t) or just used the value of Tk.

I did account for the corrected FOV separately in my density estimation in R using the camera's FOV in radians as the 'sample_fraction' value in the 'dht2' function as outlined in the analysis of camera trapping data example.

If you could clarify the correct way to calculate the values for the effort column for a camera trap data flat file I would greatly appreciate it. 

Thanks!
Anna

Eric Rexstad

Jul 26, 2021, 2:48:38 PM
to Anna Staudenmaier, distance-sampling

Anna

I've walked through the calculation of effort in the "peak activity" analysis of the Maxwell's duiker data presented in Howe et al. (2017), which is the data set shipped with the Distance package.  Follow this description of Howe et al. (2017:1560, bottom of column 2):

"Maxwell’s duikers were sampled from 28 June through 21 September 2014... Second, we assumed that all animals were available only during apparent times of peak activity (6.30.00–8.59.59 h and 16.00.00–17.59.59 h) and recalculated temporal effort and censored distance observations accordingly (Tk/t per day = 8098)"

Converted into R code (using the `hms` package for time calculations)

library(hms)  # provides as_hms()

snapshot.interval <- 2  # seconds between snapshot moments (t); this value reproduces the 8098 below
startp1 <- as_hms("06:30:00")
endp1 <- as_hms("08:59:59")
startp2 <- as_hms("16:00:00")
endp2 <- as_hms("17:59:59")
dur.p1 <- difftime(endp1, startp1, units="secs")
dur.p2 <- difftime(endp2, startp2, units="secs")
moments.m2 <- ((as.numeric(dur.p1) + as.numeric(dur.p2)) / snapshot.interval) - 1
print(moments.m2)

[1] 8098

daysrunning <- 84  # days of camera operation; inferred from 680232 / 8098 below
effort.m2 <- floor(moments.m2) * as.numeric(daysrunning)
print(effort.m2)

[1] 680232

> print(DuikerCameraTraps$Effort[1])
[1] 680232

That's how the effort value in the data file was derived.  Perhaps you can duplicate this for your situation.  The camera angle adjustment is handled as the `sample_fraction` argument in `dht2` as you suggest.
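
For readers without the `hms` package, the same arithmetic can be sketched in base R. This is only an illustration: the helper names `to_secs` and `snapshot_effort` are hypothetical, the 2-second snapshot interval and the 84 days of operation are inferred from the printed values above rather than stated in the thread, and the `- 1` simply mirrors the calculation shown.

```r
# Convert "HH:MM:SS" to seconds since midnight (hypothetical helper).
to_secs <- function(x) {
  parts <- as.numeric(strsplit(x, ":", fixed = TRUE)[[1]])
  sum(parts * c(3600, 60, 1))
}

# Snapshot moments per day across one or more activity windows,
# multiplied by the number of days the camera operated.
snapshot_effort <- function(starts, ends, interval, days) {
  window_secs <- sum(mapply(function(s, e) to_secs(e) - to_secs(s), starts, ends))
  moments_per_day <- floor(window_secs / interval - 1)
  list(moments_per_day = moments_per_day, effort = moments_per_day * days)
}

duiker <- snapshot_effort(
  starts   = c("06:30:00", "16:00:00"),
  ends     = c("08:59:59", "17:59:59"),
  interval = 2,   # seconds between snapshot moments (inferred)
  days     = 84   # days of operation (inferred from 680232 / 8098)
)
print(duiker$moments_per_day)
print(duiker$effort)
```

For a study with a single two-hour window, the same function would be called with one start time, one end time, and that study's own snapshot interval and per-camera days.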

Anna Staudenmaier

Jul 27, 2021, 9:38:33 PM
to distance-sampling
Hi Eric,

Thanks for the code, it worked great and I've now rerun my models with the corrected effort values. 

On the surface, the biggest effect of calculating effort with the ek equation instead of (Tk/t) * days the camera was active was an artificial inflation of the density estimate: the camera angle adjustment was applied twice (once within my original ek effort value and once by defining it as the 'sample_fraction' in the model). There's probably more to it than that, but I thought it was an interesting, if frustratingly simple, explanation for the error.
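
A rough way to see the size of this effect, assuming density is inversely proportional to effort multiplied by the sampling fraction and that nothing else in the analysis changes (a simplification): if the view-angle fraction is already folded into effort via ek and is then supplied again as 'sample_fraction', the estimate is inflated by one over that fraction. The variable names below are illustrative only.

```r
viewangle_deg <- 43.7                               # camera field of view in degrees
frac <- viewangle_deg / 360                         # fraction of the circle covered
frac_rad <- (viewangle_deg * pi / 180) / (2 * pi)   # same fraction, starting from radians

# Effort entered as Tk/t (angle handled only by sample_fraction)
# versus effort entered as ek = frac * Tk/t (angle applied a second time).
Tk_over_t <- 1                                      # arbitrary unit of effort; cancels in the ratio
area_correct <- Tk_over_t * frac
area_double  <- (frac * Tk_over_t) * frac

inflation <- area_correct / area_double             # equals 1 / frac
print(round(frac, 4))
print(round(inflation, 2))
```

Under these assumptions the view-angle factor alone would account for an inflation of roughly eightfold; the actual change in a refitted analysis may differ.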

Ironically, while rerunning the 'bootdht' command in the R Distance package with my updated flat file, I found that something appears to be newly wrong with how 'bootdht' reads the camera angle adjustment supplied through 'sample_fraction'. Namely, it does not appear to use any value entered through 'sample_fraction' when running the bootstraps.

Here is my code for the singular 'dht2' function (which is still using the value defined as the 'sample_fraction'):

viewangle.new <- 43.7 # degrees
samfrac.new <- viewangle.new / 360
conversion.new <- convert_units("meter", NULL, "square kilometer")
peak.hr.density.CORRECTED <- dht2(hr0.CORRECTED, flatfile=RotaDeerSet1Forest.df.CORRECTED,
                     sample_fraction = samfrac.new, strat_formula = ~1, er_est = "P2", convert_units = conversion.new)

However, for the bootstrapping function 'bootdht' it is not reading in the 'sample_fraction' value though it is listed identically in the code:

viewangle.new <- 43.7 # degrees
samfrac.new <- viewangle.new / 360
conversion.new <- convert_units("meter", NULL, "square kilometer")
mysummary.new <- function(ests, fit){
  return(data.frame(Dhat = ests$individuals$D$Estimate))
}
RotaDeer.boot.peak.hr.density.TEST <- bootdht(model=hr0.OGeffort, flatfile=RotaDeerSet1Forest.df, resample_transects = TRUE,
                       nboot=100, summary_fun=mysummary.new,
                       convert.units = conversion.new)

The 'bootdht' function was working correctly not long ago, so this might be a new bug I thought I should bring attention to.

Thanks for all the help!

-Anna

Eric Rexstad

Jul 28, 2021, 2:28:21 AM
to Anna Staudenmaier, distance-sampling

Anna

Happy to hear your effort calculations are sorted out.

With regard to `bootdht` and `sample_fraction`: the code you provided does not include `sample_fraction` in your call to `bootdht`.  Specify it as an argument:

> args(bootdht)
function (model, flatfile, resample_strata = FALSE, resample_obs = FALSE,
    resample_transects = TRUE, nboot = 100, summary_fun = bootdht_Nhat_summarize,
    convert.units = 1, select_adjustments = FALSE, sample_fraction = 1,
    multipliers = NULL)

If you provide the sampling fraction to `bootdht`, your results should work out better.  Let us know how you get on.

Anna Staudenmaier

Jul 28, 2021, 2:48:59 AM
to distance-sampling
Apologies, I pasted the wrong code. The line of example code I pasted above for 'bootdht' was meant to be this:

RotaDeer.boot.peak.hr.density.TEST <- bootdht(model=hr0.CORRECTED, flatfile=RotaDeerSet1Forest.df.CORRECTED, resample_transects = TRUE, nboot=100, summary_fun=mysummary.new, sample_fraction = samfrac.new,
                       convert.units = conversion.new)

I wish the fix were as easy as a forgotten line of code, but running 'bootdht' with or without the sample_fraction yields the same result, which is similar to the value 'dht2' outputs when you do not provide that function with the sampling fraction. This is what led me to believe that, for some reason, 'bootdht' is not using the sampling fraction in its calculations.

Eric Rexstad

Jul 28, 2021, 4:13:36 AM
to Anna Staudenmaier, distance-sampling

Anna

I did a quick check of the effect of altering the `sample_fraction` argument in `bootdht`.  Indeed, changing the value of that argument *does* have an impact upon the density estimates reported by `bootdht`:

> summary(daytime.boot.hr) # sample_fraction = 0.117
Bootstrap results

Boostraps          : 50
Successes          : 48
Failures           : 2

     median  mean   se  lcl   ucl   cv
Dhat  19.94 18.74 7.48 7.13 31.37 0.38
> summary(nofrac.boot.hr) # sample_fraction = 1.0

Bootstrap results

Boostraps          : 50
Successes          : 50
Failures           : 0

     median mean   se  lcl  ucl   cv
Dhat    2.8 2.91 1.09 1.29 5.68 0.39

So I think `bootdht` is making use of the `sample_fraction` argument.
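
One quick sanity check on output like this, assuming the density estimate scales as 1 / `sample_fraction` when everything else is held fixed: the two summaries should differ by a factor of roughly 1 / 0.117. The numbers below are copied from the summaries above; because each run used only 50 random replicates, the ratios are not expected to match exactly.

```r
sample_fraction <- 0.117
expected_ratio <- 1 / sample_fraction

# Values copied from the two bootstrap summaries above
with_frac <- c(median = 19.94, mean = 18.74)
no_frac   <- c(median = 2.8,   mean = 2.91)

observed_ratio <- with_frac / no_frac
print(round(expected_ratio, 2))
print(round(observed_ratio, 2))
```

If `sample_fraction` were being ignored, the observed ratios would be close to 1 rather than several-fold.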

Anna Staudenmaier

Jul 29, 2021, 12:43:49 AM
to distance-sampling
Hi Eric, 

Sorry, very different time zones, but I tried to get the development version of the Distance package to run on my Windows machine and was not successful. I've never tried to install a development version of a package before, though, so I likely did something wrong.

I have however been trying to run bootdht numerous different ways in Distance version 1.0.3. I have changed the values provided to sample_fraction, provided no value at all (similar to what you did above), and for some reason for me it continues to yield similar values regardless of the sampling fraction provided.

I did install the old version of Distance 1.0.2 to rerun bootdht and it worked as expected, yay! So, either there is something wrong with my computer specifically and how the new version of Distance (1.0.3) runs on it or perhaps something was changed in the package update that impacts how bootdht utilizes the sampling fraction?

What version of the Distance package did you run your quick check in?

-Anna

Eric Rexstad

Jul 29, 2021, 2:19:17 AM
to Anna Staudenmaier, distance-sampling

Anna

The results I sent to you yesterday were from the development version, from a Github branch.  I'm happy to walk you through the installation of the package from the development version off-line.  Then you can give it a test and report back to the list with your findings.

Anna Staudenmaier

Aug 2, 2021, 2:10:09 AM
to distance-sampling
Hello all, 

After some back and forth off-list I wanted to report back to the list with updates. The 'bootdht' function and bootstrapping are running as expected with the current development version of the Distance package and with version 1.0.2 of Distance, but were not running correctly with version 1.0.3 of Distance as of July 30th, 2021.

We also chatted a bit about bootstrapping in general with an emphasis on understanding variation between density estimates produced by 'dht2' and 'bootdht'. Paraphrased from Eric, "The reason to conduct bootstraps is to assess the precision of the point estimates.  The point estimate produced by `dht2` is the best estimate of abundance.  The inconsistency between the estimate produced by `dht2` and the central tendency of the bootstrap distribution can be attributable to skewness in the distribution of the bootstrap replicate estimates. Skew can arise when there is great variation in encounters (detections) between camera stations--some stations have none, other stations have large numbers of detections.  Resampling by station can result in replicates with high encounter stations dominating the resample--resulting in high estimates of density. The skewness potentially derives from two possible sources: large estimates deriving from bootstrap resamples that by chance exclude camera stations with few detections, or random resamples that consist largely of detections close to the cameras (few detections at distance) leading to fitting models that estimate detection probability is low (hence abundance is high)."

For my study and often for many camera trap studies there is a large amount of encounter rate variation between camera stations, which is responsible for much of the uncertainty and lack of precision in my study and in camera studies generally.
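
The effect of uneven encounter rates on the bootstrap distribution can be illustrated with a small base-R simulation. The station counts below are invented purely for illustration (they are not data from this study), and the sketch resamples only the encounter rate, ignoring the detection function.

```r
set.seed(1)

# Hypothetical detections at 10 camera stations with equal effort:
# several stations with none or few, one with very many.
detections <- c(0, 0, 0, 1, 2, 3, 5, 8, 40, 120)
point_estimate <- mean(detections)

# Resample stations with replacement and recompute the mean encounter rate.
nboot <- 2000
boot_means <- replicate(nboot, mean(sample(detections, replace = TRUE)))

# Sample skewness of the bootstrap distribution (positive = long right tail).
skew <- mean((boot_means - mean(boot_means))^3) / sd(boot_means)^3

print(point_estimate)
print(summary(boot_means))
print(skew)
```

Replicates that happen to draw the busiest station several times stretch the upper tail, which is the pattern described in the paraphrase above.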

Thanks again for all the help Eric!

-Anna