DSM with camera trap distance sampling data

Benjamin Debetencourt

Sep 16, 2021, 10:25:59 AM

to distance-sampling

Hello everyone,

I am working with camera trap distance sampling (CTDS), and I want to use the distance data from the cameras to fit a density surface model (DSM). As I have not yet seen any studies that use CTDS data in a DSM, my first question is:

1) Is it OK to use camera trap (CT) data, which are highly non-independent, in a GAM, and specifically in the dsm function of the R package dsm as it currently stands?

I went ahead and fitted DSMs to my CT data anyway, without being sure that I was not outrageously violating some modelling requirement. I think I managed to control well for effort, and I obtained sensible results. My predictors are a few spatial covariates, such as distance to the river, a human population density weighted by the distance cost from the village to my points, and a couple of others. There is strong concurvity between some of these spatial predictors and the latitude and longitude that are in my model. I first removed all the predictors whose “worst” concurvity with latitude and longitude was above 0.8 and kept my s(X, Y) term. But then I cannot explain biologically why I see those spatial variations in abundance… So my second question is:

2) Can I instead remove s(X, Y) from my model? In all the examples I have read and worked through, latitude and longitude are always in the model. If I remove s(X, Y), do I have to add a specific term to account for what I think s(X, Y) does, namely that if there are many individuals at a given point, there are likely to be more individuals close to that point?

Finally, I am struggling with something a bit more specific, linked to the area we are sampling. I am working in a region with a highly mosaic habitat, mostly bushy and woody savanna with a little gallery forest. We are studying chimpanzees, and they mostly use the gallery forest. In the design, we stratified by habitat, placing more cameras in gallery forest, somewhat fewer in woody savanna, and fewest in bushy savanna. For the DSM I pooled all my distance data to fit a single detection function. The issue I am encountering is that the number of chimpanzee captures (distances) at each camera is mostly driven by the habitat the camera was placed in. Yet my model does not include this local habitat, because all of my predictors are on a larger scale, such as the percentage of gallery forest in a 2000 m buffer around my point (roughly the size of a small chimpanzee territory). So I am not capturing this very local variation between my cameras. If a camera is set locally in savanna but has a lot of gallery forest around it, I get points with zero captures and a high gallery forest value, and vice versa. I ended up with forest having no effect on my chimpanzee counts, which I know is not true. My predictors as they stand do not explain why I have more captures at one point and fewer at another. So here is my third question:

3) How could I account for the habitat my cameras are set in when formulating the model for dsm? My problem is that I do not see how I could turn this very local habitat into a predictor value for my prediction cells: the habitat map I have is at 10 m x 10 m resolution, whereas the prediction cells I have used so far are 1 km². (I could make the cells finer, but I do not think it makes sense to predict on a 10 m x 10 m grid?)

I hope my questions are clear enough, and I would greatly appreciate any input on any of my three questions!

Best regards,

Benjamin Debetencourt

David Lawrence Miller

Sep 17, 2021, 11:15:45 AM

to Benjamin Debetencourt, distance-sampling

Hi Benjamin, hi listfolk,

I can try to answer some parts of your questions and perhaps those who
know more about camera trapping can correct me where I'm wrong.

1. I think it's totally fine to put CT data into a DSM, in theory. The
issue I can see at the moment is that I haven't written a very
sophisticated way to deal with temporal availability of animals. You can
currently make a per-segment correction via the availability= argument,
but this doesn't account for the uncertainty in the data (cf. how this is
handled in bootdht in the Distance package,
https://rdrr.io/cran/Distance/man/bootdht.html). It seems to me like
that aspect probably needs more investigation.

2. You don't have to have any particular covariates in the model. The
usual reason to have s(x,y) in a model is that it tends to provide
good results within the surveyed area (though it can be problematic when
extrapolating). You should certainly try any options that make sense to you.

3. One option for this kind of situation I have seen is using
"proportion of habitat type X" for a given segment or prediction cell
(e.g.,
https://esajournals.onlinelibrary.wiley.com/doi/full/10.1890/11-1400.1).
That would allow you to use larger cells while retaining some
information and also give you a continuous variable to potentially
smooth over.
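
As a rough illustration of the "proportion of habitat type X" idea in point 3, here is a minimal Python sketch that aggregates a fine categorical habitat grid into coarser cells. The function name `habitat_proportion`, the habitat codes and the toy map are all hypothetical; a real analysis would work on the actual habitat raster with GIS tooling, and this is not part of the dsm package.

```python
def habitat_proportion(grid, block, target):
    """Proportion of fine pixels equal to `target` in each block x block cell.

    `grid` is a list of rows of habitat codes (hypothetical toy data here).
    Edge cells that are smaller than `block` use only the pixels they contain.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, n_rows, block):
        row_out = []
        for c0 in range(0, n_cols, block):
            # Collect every fine pixel that falls inside this coarse cell.
            pixels = [grid[r][c]
                      for r in range(r0, min(r0 + block, n_rows))
                      for c in range(c0, min(c0 + block, n_cols))]
            row_out.append(sum(p == target for p in pixels) / len(pixels))
        out.append(row_out)
    return out


# Toy 4 x 4 map: "F" = gallery forest, "W" = woody savanna, "B" = bushy savanna.
toy_map = [
    ["F", "F", "W", "B"],
    ["F", "W", "B", "B"],
    ["W", "W", "F", "F"],
    ["B", "B", "F", "F"],
]

# Aggregate into 2 x 2 blocks of fine pixels.
forest_share = habitat_proportion(toy_map, block=2, target="F")
print(forest_share)
```

The resulting value per cell is continuous between 0 and 1, so it could be attached to both the segment (camera) data and the prediction grid and used as a smooth term. The choice of `block` controls how local the covariate is.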

Hope that helps,
--dave

Benjamin Debetencourt

Sep 22, 2021, 8:29:32 AM

to David Lawrence Miller, distance-sampling

Dear David,

Thank you so much for your answer, and sorry for the late reply; I needed a bit of time to check everything you mentioned!

1) I am glad that CTDS data can, a priori, be used in dsm. I have to admit, though, that I did not fully understand the concern about the availability bias. So far I have used a temporal availability computed with the Rowcliffe method from the camera trap data, and I apply it as a correction to the effort in my segment data and observation data. I also correct the effort for the camera's field of view (FOV). Indeed, I do not take into account any uncertainty in this temporal availability value (I just use the estimate directly), but I am not sure that is the concern you are raising. I will look into it once I have a satisfactory model, so that I can estimate the variance as well as possible.

2) Perfect, thanks for the input!

3) Thanks for the suggestion and the paper. I had actually tried that but ran into trouble, as the cells were too large and so did not capture the local habitat where my cameras were placed. I therefore reduced the cell size to obtain a percentage of forest per cell that reflects the local habitat, and it seems to work well! I now just need to see how the prediction goes, and build a new prediction grid at this finer resolution.

Thank you very much again for your quick reply!

Best regards,

Benjamin Debetencourt

David Lawrence Miller

Sep 23, 2021, 4:54:47 AM

to Benjamin Debetencourt, distance-sampling

Hi Benjamin, hi listfolk,

Re: 1), I think that sounds right to me. If you're calculating an
overall temporal availability (i.e., it does not vary in space) then I
think that you can just add the squared CV from the availability to the
squared CV from the abundance estimate to get a total squared CV (and
root that to get the total CV). I think that's reasonable, as there's
nothing there to be correlated with space (since the estimate is
"flat"), so the independence assumption is okay. If you're doing
something more complicated (estimating temporal availability per
stratum?) you might be in for a headache. As I said before, perhaps
other folks who know more about camera trapping will wade in
here if I'm talking nonsense.
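
The CV combination described above (add the squared CVs, then take the square root) can be sketched in a few lines of Python. The function name `combine_cv` and the numbers are made up for illustration, and the calculation assumes, as stated above, that the availability estimate is independent of the abundance estimate.

```python
import math


def combine_cv(cv_abundance, cv_availability):
    """Total CV from two independent components: sqrt(cv1^2 + cv2^2)."""
    return math.sqrt(cv_abundance ** 2 + cv_availability ** 2)


# Illustrative (made-up) values: 30% CV from the DSM abundance estimate,
# 40% CV from the overall temporal availability estimate.
cv_total = combine_cv(0.30, 0.40)
print(round(cv_total, 3))
```

The total CV is always at least as large as the larger of the two components, so ignoring the availability uncertainty understates the overall uncertainty.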

Hope that helps,
--dave

Benjamin Debetencourt

Sep 23, 2021, 9:46:36 AM

to David Lawrence Miller, distance-sampling

Hi David,

Thanks for your answer on how to include the CV of my temporal availability in the total!

Yes, I calculated a single overall temporal availability, so this should be OK!

Thanks again, and I wish you a nice end of the week,

Benjamin Debetencourt

Eric Howe

Sep 23, 2021, 2:33:57 PM

to distance-sampling

Hi Benjamin,

I don't have experience with distance sampling DSMs. However, I agree that it should be no problem to correct (inflate) your abundance estimates after the fact to account for temporally limited availability for detection. If it's also possible to include the uncertainty in your CV of abundance, so much the better.

Outside the DSM framework, availability and its uncertainty can be accounted for by including both the proportion of time active/available and its SE as a multiplier.

If you're estimating availability over a 24-hour day (the default), this assumes that you included detections from all times of day in your data and that "daily Tk" was 24 hours (expressed as the number of seconds in 24 hours if t, the time interval between snapshot moments when distances were recorded, is measured in seconds). If you only recorded distances for part of each day, then daily Tk is < 24 hours and you should estimate availability only within those times (using a "bounded" distribution rather than the full circular distribution in the activity package).
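
To make the role of Tk and t concrete, here is a minimal Python sketch of per-camera effort. It assumes the usual CTDS formulation, in which effort is the fraction of a full circle covered by the field of view multiplied by the number of snapshot moments, Tk / t. The function name `ctds_effort` and all the numbers are hypothetical, not taken from this study.

```python
import math


def ctds_effort(fov_degrees, daily_tk_seconds, n_days, t_seconds):
    """Per-camera effort, assuming e_k = (theta / (2 * pi)) * (T_k / t).

    theta is the field of view in radians, T_k the total sampled time in
    seconds, and t the interval between snapshot moments in seconds.
    """
    theta = math.radians(fov_degrees)
    total_tk = daily_tk_seconds * n_days
    return (theta / (2 * math.pi)) * (total_tk / t_seconds)


# Made-up deployment: 42 degree field of view, 10 days, snapshots every 2 s.
full_day = ctds_effort(42, daily_tk_seconds=24 * 3600, n_days=10, t_seconds=2)
# The same camera with distances recorded during only 12 hours of each day.
half_day = ctds_effort(42, daily_tk_seconds=12 * 3600, n_days=10, t_seconds=2)
print(round(full_day), round(half_day))
```

Halving the daily Tk halves the effort, which is why the availability estimate has to refer to the same part of the day as the recorded distances.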