Discretizing camera data for occupancy analysis


Robin Steenweg

Jan 16, 2015, 12:35:10 PM
to unma...@googlegroups.com
Hi All,

Does anyone know of a published recommendation on how to discretize camera data into separate sampling sessions before performing occupancy analysis? I see lots of papers use one day, one week, or some other seemingly arbitrary temporal window...

Thanks!
Robin


Robin Steenweg
Ph.D. Candidate

***************************
Wildlife Biology Program
University of Montana

Kery Marc

Jan 17, 2015, 5:22:12 AM
to unma...@googlegroups.com
Dear Robin,

if you discretize and analyse using a standard occupancy model, then you typically lose information whenever more than a single detection per occasion gets subsumed into a single '1' in the resulting history. Viewed from this angle, it would be best to discretize finely, so that you never have more than a single detection per occasion. On the other hand, this may lead to very large data sets (lots of occasions) and, moreover, there may be serial dependence, i.e., neighbouring occasions may not be independent samples from the underlying presence/absence process. This is a problem that you would have to deal with by adopting more complex occupancy models; see Hines et al. (the 'tigers on trails' paper, 2010) and a recent follow-up paper for dynamic occupancy. These models have been implemented in PRESENCE (not sure about MARK). Note that independence, or the lack thereof, in space (as for transect-based sampling) is very similar or identical to that in time (as for your camera-trap data).

Some would argue that discretisation of measurements made on a continuous process is always a bad idea and that you should directly model a continuous detection process, in space or in time. This typically leads to Poisson process models, which in the context of occupancy estimation have been introduced by Gurutzeta Guillera-Arroita (GGA) (papers in JABES, MEE and elsewhere). The simplest such model assumes independence in space or time. The individual detections can be aggregated into a detection frequency per transect or per camera and some total time interval, say C_i for the number of detections at trap or transect i.

Then, you can specify the following variant of occupancy model, which has a Poisson instead of a Binomial description of the observation process:

z_i ~ Bernoulli(psi)              # Presence/absence (z) is like a coin flip
C_i ~ Poisson(z_i * lambda_i) # lambda is detection rate, i.e., the expected number of detections given that a site is occupied.

You can model pattern in occupancy probability by logit(psi_i) = some linear model and those in the detection frequency by log(lambda_i) = some linear model, exactly as in a GLM (this model is the combination of a logistic regression for occupancy and a Poisson regression for the number of detections).

This model can easily be fitted in BUGS. Note that for this model, a single replicate is enough (there is no j). Also note that one would usually have an offset in the Poisson part of the model, to account for unequal length (of time or transect) of observation.
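Marc's model is sometimes called a zero-inflated Poisson. As a minimal sketch (not from the thread; parameter values, the 30-day rate scaling, and the use of scipy are all illustrative assumptions), it can be simulated and fitted by maximum likelihood outside BUGS too, with the unequal observation length entering as an offset:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

rng = np.random.default_rng(1)
n_sites, psi_true, lam_true = 200, 0.6, 2.0        # illustrative values
effort = rng.uniform(30.0, 60.0, n_sites)          # camera-days per site (unequal effort)
z = rng.binomial(1, psi_true, n_sites)             # latent presence/absence: z_i ~ Bernoulli(psi)
C = rng.poisson(z * lam_true * effort / 30.0)      # C_i ~ Poisson(z_i * lambda * offset)

def negloglik(par):
    psi = 1.0 / (1.0 + np.exp(-par[0]))            # inverse-logit link for occupancy
    lam = np.exp(par[1])                           # log link for detection rate, as in a GLM
    mu = lam * effort / 30.0                       # effort enters as an offset
    # marginal likelihood over z: occupied sites give Poisson counts,
    # unoccupied sites can only produce C = 0
    lik = psi * poisson.pmf(C, mu) + (1.0 - psi) * (C == 0)
    return -np.sum(np.log(lik))

fit = minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
psi_hat = 1.0 / (1.0 + np.exp(-fit.x[0]))
lam_hat = np.exp(fit.x[1])
```

With 200 sites, the estimates land close to the generating values, since occupied sites with zero detections are rare at these rates.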

If you can assume independence, then this seems to be a very useful model. If you don't have independence (for instance, some aggregation of detections in time or space), then this autocorrelation ought to be modelled and you must model the individual detections. One conceptually simple way of doing so is by imagining two latent processes underlying the measured detections, between which there are switches with some probability. GGA has lately developed such a 2MMPP model (a 2-state Markov-modulated point process model --- try to say this a couple of times quickly ....). She has code in Matlab, but I think not (yet) for any language that is widely spoken in ecology, such as R or BUGS.
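For intuition, a two-state MMPP is easy to simulate even though fitting one is harder. A minimal Gillespie-style sketch (all rates are made-up illustrative values, not from any of the papers mentioned):

```python
import numpy as np

def sim_2mmpp(T, lam, switch, rng):
    """Simulate detection times of a 2-state Markov-modulated Poisson
    process on [0, T]. lam[s] is the detection rate in latent state s;
    switch[s] is the rate of leaving state s. Uses competing exponential
    clocks: the next event is either a detection or a state switch."""
    t, s, events = 0.0, 0, []
    while True:
        total = lam[s] + switch[s]
        t += rng.exponential(1.0 / total)
        if t > T:
            return events
        if rng.random() < lam[s] / total:
            events.append(t)                 # a detection in the current state
        else:
            s = 1 - s                        # the latent state flips

rng = np.random.default_rng(0)
# illustrative rates: a quiet state (0.1 detections/day, left at rate 0.2/day)
# and a burst state (3 detections/day, left at rate 1/day)
events = sim_2mmpp(T=1000.0, lam=[0.1, 3.0], switch=[0.2, 1.0], rng=rng)
```

Detections cluster while the process sits in the high-rate state, which is exactly the serial dependence a plain Poisson model cannot capture.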

Kind regards  --  Marc

______________________________________________________________
 
Marc Kéry
Tel. ++41 41 462 97 93
marc...@vogelwarte.ch
www.vogelwarte.ch
 
Swiss Ornithological Institute | Seerose 1 | CH-6204 Sempach | Switzerland
______________________________________________________________
 
*** Introduction to Bayesian statistical modeling: Kéry (2010), Introduction to WinBUGS for Ecologists, Academic Press; see www.mbr-pwrc.usgs.gov/pubanalysis/kerybook 
*** Book on Bayesian statistical modeling: Kéry & Schaub (2012), Bayesian Population Analysis using WinBUGS, Academic Press; see
www.vogelwarte.ch/bpa
*** Upcoming workshops: www.phidot.org/forum/viewforum.php?f=8



Kery Marc

Jan 17, 2015, 5:33:05 AM
to unma...@googlegroups.com
btw: the need for a (much) more complex model for un-discretized camera-trap detections in the case of serial dependence may provide one motivation for discretisation and modelling of aggregated data. In that way you may get rid of the dependence and hence be able to use a simpler model (or do estimation at all, if you don't know how to fit the more complex model). Stats is like real life, so full of trade-offs ......

Regards  -- Marc




Murray Efford

Jan 17, 2015, 11:27:49 AM
to unma...@googlegroups.com
Marc
Thanks for a superb overview of the issues. Almost a publishable unit!
I wonder if you didn't shoot yourself in the foot with your addendum - doesn't aggregation just shift the problem from 'independence' to 'overdispersion'?
On 2MMPP - I described Gurutzeta's original paper as an 'instant classic'. I adapted her (privately provided) code for R and C at the time and have it somewhere. I became a bit disillusioned with 2MMPP as a way of modelling cue production (singing) by birds - it seemed to have limited ability to match the distribution of total cue number when birds sing intermittently, because birds sing in non-Poisson bouts. The same limitation may apply elsewhere.
Murray



Robin Steenweg

Jan 19, 2015, 12:08:59 PM
to unma...@googlegroups.com
Thanks, Murray, for sharing your experience with the 2MMPP model, and thanks, Marc, for distilling the Gurutzeta papers (and for your touch of enlightening life philosophy :)

I gather from this mailing list (and from a cross-post on the Yahoo camera-trapping list) that the succinct answer to my question of whether there are published recommendations is: no, not much published... except for the general recommendation NOT to discretize continuous data, both in the occupancy literature (Gurutzeta) and in the S(E)CR literature (Borchers et al. 2014 MEE). I suppose the best way to take this advice while still chopping up continuous data is to chop it up as finely as possible (while still considering the autocorrelation issues if possible). Your point, Marc, about issues with actually being able to estimate occupancy when data are chopped up very finely is well taken; I have had issues in unmarked (the usual Hessian error) when camera data were discretized a little too finely, but that goes away when collapsing the data, even just a little bit. Switching to BUGS may avoid this error, but then my simulations become too unwieldy.

Thanks for your thoughts!
Robin
 


Jeffrey Royle

Jan 19, 2015, 1:25:39 PM
to unma...@googlegroups.com
hi Robin, Marc, Murray, et al:
 
 This is a really good discussion of an important issue. 

 I wanted to add my 2 cents worth on the issue of "to aggregate or not" -- I think for most camera-trapping studies of carnivores it is pointless to try to model the continuous-time data, because the very short-term "visits" are often almost completely dependent, and those detections are not informative about the important parameters of the model. E.g., a tiger will go back and forth along a trail segment 10 times in an hour or two, sometimes only a minute or two apart. This is why I think using a nightly sample interval makes sense (it is also a natural period of activity for most species). Of course there is still potentially dependence from one night to the next, but this is well modeled in many cases with a behavioral response.

  I make the same argument in attempting to model scat detections of some species. Many carnivores use latrines or similar high density areas and the actual rate of scatting on latrines is essentially irrelevant unless you're studying behavior.

regards
andy

Murray Efford

Jan 19, 2015, 2:32:19 PM
to unma...@googlegroups.com
I agree strongly with Andy on this - the key point being to pick a time unit that we expect, for biological reasons, to yield plausibly independent data. For the remaining dependence, presumably you mean a Markovian (transient) behavioural response rather than the conventional permanent one of Otis et al.'s Mb.

Also, re-stating what may be already obvious and well documented: If your (nightly) observations are independent (i.e. Bernoulli trials) and you don't want to model changing detection probability over time then it is mathematically equivalent to model their aggregate as binomial (i.e. to model the sum across occasions). Can save time.
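Murray's equivalence is easy to check numerically: with constant p and independent occasions, the occasion-by-occasion Bernoulli likelihood and the binomial likelihood of the summed history differ only by a constant (the binomial coefficient), so they share the same MLE. A quick sketch with a made-up 10-night history:

```python
from math import comb, log

hist = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]   # one site's nightly 0/1 detection history
K, s = len(hist), sum(hist)

def ll_bernoulli(p):                     # occasion-by-occasion likelihood
    return sum(h * log(p) + (1 - h) * log(1 - p) for h in hist)

def ll_binomial(p):                      # likelihood of the aggregated count s out of K
    return log(comb(K, s)) + s * log(p) + (K - s) * log(1 - p)

grid = [i / 1000 for i in range(1, 1000)]
mle_bern = max(grid, key=ll_bernoulli)
mle_binom = max(grid, key=ll_binomial)
# both maximize at s / K, here 4/10
```

The constant term log(comb(K, s)) shifts the whole likelihood surface but not its maximizer, which is why summing across occasions "can save time" without changing the estimate.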

Since I mentioned 2-MMPP: I have found a 3/4-documented R package I wrote that fits this to continuous data from multiple sites for occupancy (0/1), Poisson and negative binomial N (cf Royle and Nichols). If anyone wants this they can contact me (I won't be putting it on CRAN).

Murray


Mathias Tobler

Jan 20, 2015, 4:59:11 PM
to unma...@googlegroups.com
I think the issue has been summarized really well so far, but I would like to add a few things based on my experience with camera-trap data and occupancy models. While correlation can be an issue on a short time scale (1-2 days), overdispersion of the data is often a bigger issue, especially for abundant species and species with small home ranges. Camera traps are often run over long periods of time (e.g. 60 to 90 days), so you can potentially end up with a large number of detections. For some species in our dataset, the number of days a species was detected at a camera ranges from 1 to 35 out of 60 sampling days. This heterogeneity among cameras can lead to poor model fit. Generally we found that the Royle-Nichols model improves fit, but even then there is too much overdispersion when using one day as a sampling period. Pooling the data over 5 or 6 days helps bring those large numbers down while hardly affecting the low numbers. So losing information can be helpful. So far I have had little success with modeling the overdispersion using a negative binomial or a Poisson-lognormal model. Using a Poisson detection model instead of a binomial, as suggested by Marc, did not affect model fit either.

Overdispersion is less of an issue for carnivores with large home ranges. Those species cover large areas and spend little time near a single camera. Ungulates or smaller mammals, on the other hand, can pass by a camera almost daily in some cases. Some of the differences among cameras can be addressed by covariates such as on/off trail or habitat type, but we found that a lot of the variation cannot be explained. It is easy to imagine that if a camera is close to an individual's home-range center we get more detections, while if the camera is at the edge of a home range we get fewer detections, which is exactly what the SECR models assume. Cameras can also unintentionally be placed close to an important food source, den, travel path, etc., which increases the number of detections compared to other cameras. These detections are not necessarily correlated in time.

So my suggestion would be to try to model your data with a Royle-Nichols model with 1 day as a sampling occasion and then assess model fit. If the model does not fit, I would pool the data over several days.
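The pooling step described above (collapsing daily records into multi-day occasions) can be sketched with a small hypothetical helper; here a trailing partial window is simply dropped, which is one of several reasonable conventions:

```python
import numpy as np

def pool_history(daily, k):
    """Collapse a sites-by-days 0/1 detection matrix into k-day occasions:
    an occasion scores 1 if any detection fell inside its window."""
    n_sites, n_days = daily.shape
    n_occ = n_days // k                      # drop any trailing partial window
    trimmed = daily[:, :n_occ * k]
    return trimmed.reshape(n_sites, n_occ, k).max(axis=2)

daily = np.array([[0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
pooled = pool_history(daily, 6)
# site 1 has detections in both 6-day windows -> [1, 1]; site 2 -> [0, 0]
```

Note how pooling shrinks a run of many detection days into a few 1s while a site with a single detection keeps its single 1, which is exactly the asymmetry that tames the large counts.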

Regards,

Mathias


Robert Long

Jan 22, 2015, 4:54:09 PM
to unma...@googlegroups.com
Hi folks,

I'm also really pleased to see this being discussed. For years I've wondered about the ramifications of collapsing camera data arbitrarily for occupancy estimation, especially with fairly sparse carnivore datasets. After discussing this with many statisticians and researchers, my take is that--within reason--the results you get should be the same regardless of encounter-interval length, as long as you 1) interpret estimates using the correct interval length (e.g., a 1-day probability of detection for daily encounter histories, 1-week if collapsed over a week), 2) don't create histories that are too sparse or too dense (i.e., all 1s), and 3) model any obvious heterogeneity in detection probability (e.g., if you were analyzing with a 1-day interval but baiting a site every two weeks).
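The first point above, interpreting p on the interval's own time scale, has a simple arithmetic counterpart: assuming the nights within a week are independent trials, a daily detection probability converts to a weekly one as p_week = 1 - (1 - p_day)^7 (the 0.1 below is purely illustrative):

```python
# convert a daily detection probability to the equivalent weekly one,
# assuming the seven nights are independent Bernoulli trials
p_day = 0.1                      # illustrative value
p_week = 1 - (1 - p_day) ** 7
print(round(p_week, 4))          # -> 0.5217
```

So a "low" daily p of 0.1 corresponds to a weekly p above 0.5, which is why estimates from differently pooled histories look very different unless the interval is kept in mind.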

I currently have a student looking at this with a black bear dataset--trying various interval lengths for encounter histories and seeing what kind of estimates he gets for p and psi. 

Best,

Robert

Sun, Ge

May 25, 2015, 5:06:07 AM
to unma...@googlegroups.com
Thanks to everyone here,
I would like to add a few points :) This problem has also puzzled me for a long time. I asked some field scientists who use occupancy analysis; some suggested pooling the daily camera-trap data until the detection probability exceeded 0.1, while others recommended pooling the data until the mean autocorrelation coefficient of the resulting dataset reached its lowest level.
Then I used nine simulated datasets (with different degrees of autocorrelation, psi and p) to test their suggestions, and found:
1. The AIC difference between the best-fitting model and the true model was not correlated with the dataset's mean autocorrelation coefficient, nor with the detection probability.
2. When autocorrelation was low, the best-fitting model was very similar to the true model no matter how the data were pooled.
3. When autocorrelation was high, the AIC difference between the best-fitting model and the true model was also high no matter how the data were pooled, and there was no significant trend or pattern between that AIC difference and mean autocorrelation or detection probability.
Thus my conclusion was: when the daily camera-trap data modelled well, the pooled data also modelled well; when the daily data modelled poorly, pooling alone could not generate better results.
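A minimal sketch of the kind of simulation described here -- daily histories with first-order serial dependence at occupied sites (function name and all parameter values are illustrative assumptions, not Sun Ge's actual code):

```python
import numpy as np

def sim_correlated_history(n_sites, n_days, psi, p, rho, rng):
    """Daily 0/1 histories: occupied sites detect with probability p,
    inflated to rho on the day after a detection (rho > p gives the
    clustered, autocorrelated detections discussed in this thread)."""
    z = rng.binomial(1, psi, n_sites)            # latent occupancy
    y = np.zeros((n_sites, n_days), dtype=int)
    y[:, 0] = rng.binomial(1, p * z)
    for t in range(1, n_days):
        p_t = np.where(y[:, t - 1] == 1, rho, p) * z
        y[:, t] = rng.binomial(1, p_t)
    return y

rng = np.random.default_rng(7)
y = sim_correlated_history(100, 60, psi=0.6, p=0.1, rho=0.5, rng=rng)
```

Varying rho against p controls the degree of autocorrelation, so histories generated this way can be pooled at different window lengths to probe the rules of thumb above.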
However, my simulation was very simple and did not incorporate replication, so I wonder whether there has been any progress on this issue. And how did Robert's bear-data modelling turn out? I am using the R-N model to reanalyse my camera-trap data in the hope that it will do better than the single-season occupancy model I used previously.
Best wishes and thanks a lot for all of the folks here,
Sun, Ge
