
detection probabilities by collection method for presence/absence data


Christy Meredith

Feb 2, 2018, 11:40:11 AM
to unmarked

Hello-
 I am hoping someone can point me in the right direction regarding the right unmarked approach for my study.

I have collected zooplankton at 30 locations in a bay of Lake Superior, with 3 sampling occasions at each location.  For each occasion, species in the sample were identified using traditional taxonomy and DNA metabarcoding. Very simply, I want to determine the detection probability for each species and identification method with confidence intervals.  I want a species to be considered present if it was present using either method.

Later, I may want to try something more complex such as considering false presences.  For now, I just want to simply estimate detection probability by method, but with presence depending on the combined data (it was present if it was detected using either method). 

It seems pretty basic, but I am a novice at this type of analysis. Is there a method in unmarked that will work for this? Or should I try to write my own code in WinBUGS?

 I was considering the gmultmix approach (double observer), but does that only work for count data?

Thanks,

Christy Meredith


Brittany Mosher

Feb 5, 2018, 9:28:54 AM
to unma...@googlegroups.com
Hi Christy,
It sounds to me like you could use the standard occupancy model (single species, single season -- function occu() in unmarked) to proceed. You would have 30 sites, each with 6 total surveys (one traditional and one molecular survey for each of 3 occasions). You can use a binary survey-level covariate to differentiate the traditional surveys from the molecular surveys, and can use AICc to see if there is support for a method-level effect on detection.
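
To make the layout above concrete, here is a minimal sketch of how the 30 x 6 detection matrix and the matching survey-level method covariate could be arranged. It is plain Python rather than unmarked code, the detections are simulated placeholders, and names such as `traditional`, `molecular` and `interleave` are invented for illustration.

```python
import random

random.seed(1)
N_SITES, N_OCC = 30, 3

# Placeholder 0/1 detections: one row per site, one column per occasion.
traditional = [[random.randint(0, 1) for _ in range(N_OCC)] for _ in range(N_SITES)]
molecular = [[random.randint(0, 1) for _ in range(N_OCC)] for _ in range(N_SITES)]


def interleave(trad_row, mol_row):
    """Return one site's 6-survey history: trad1, mol1, trad2, mol2, trad3, mol3."""
    history = []
    for t, m in zip(trad_row, mol_row):
        history.extend([t, m])
    return history


# Detection matrix: 30 sites x 6 surveys.
y = [interleave(t, m) for t, m in zip(traditional, molecular)]

# Survey-level covariate with the same shape as y, labelling each column's method.
method = [["traditional", "molecular"] * N_OCC for _ in range(N_SITES)]

# A site counts as "detected" if either method recorded the species in any survey.
detected_either = [int(any(row)) for row in y]
```

In this arrangement a site's history is non-zero whenever either method detected the species, which matches the "present if detected by either method" requirement, while the method covariate lets detection probability differ between the two columns of each occasion.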

You might also look into the "multi-method" or "multi-scale" model by Nichols et al. (2008). This model would be preferred if you thought the occupancy state changed from occasion 1 to 2 to 3, so depending on the temporal scale of sampling you may or may not be interested in this. While this model isn't implemented in unmarked yet, it can be implemented in MARK and Presence. It can also be implemented in R via package eDNAoccupancy (Dorazio and Erickson, 2017). Ideally you'd have some additional survey replicates at each occasion, but I wanted to mention it in case it becomes useful to you in the future. The eDNA/aquatic detection field seems to be embracing this model for data similar to yours.

Dorazio, Robert M., and Richard A. Erickson. "eDNAoccupancy: An R package for multiscale occupancy modelling of environmental DNA data." Molecular Ecology Resources (2017).

Nichols, James D., et al. "Multi‐scale occupancy estimation and modelling using multiple detection methods." Journal of Applied Ecology 45.5 (2008): 1321-1329.

Best,
Brittany
-- 
Brittany A. Mosher, Ph.D.
The Pennsylvania State University
USGS Patuxent Wildlife Research Center ARMI Program
Turners Falls, MA
Twitter: @BAMdoesscience





--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christy Meredith

Feb 5, 2018, 10:14:18 AM
to unmarked
Thanks Brittany,
I will give this a try! Also, it is good to know about the eDNAoccupancy package for the future.

Katie Davis

Sep 2, 2024, 3:49:21 PM
to unmarked
Hi,

I have a similar study design to Christy in terms of combining visual surveys and eDNA at each sampling occasion, and I'm really struggling to wrap my head around what the appropriate methodological approach is.

I visited ~200 wetland sites over 2 years (at least half of the sites were visited in both years, but some were only visited in one or the other, mostly due to drying of wetlands in one year or the other). Each year, the site was visited once. During the site visit, we conducted two visual encounter surveys (independent observers, 15 minutes apart) and collected 1 eDNA sample (later used targeted qPCR for species of interest). It is not biologically reasonable to assume closure between years (due, again, to wetland drying), but I don't want to do a multi-season model, as my understanding is that with only two seasons I do not have enough power to estimate the additional extinction/colonization parameters.

So a detection history looks like (for example):
Year 1             Year 2
VS1  VS2  eDNA     VS1  VS2  eDNA
0    1    0        0    0    1

I'm interested in evaluating the difference in detection rate between visual surveys (and have some survey covariates such as observer identity, start time, first or second survey) and eDNA (covs include water chemistry and amount of water filtered), but my primary research questions are actually focused on the drivers of occupancy itself (e.g., wetland area, depth, vegetation type, etc.). I'm unsure if a multi-scale approach (Nichols 2008) is appropriate, and if so, what "theta" should represent in my dataset, as I see several ways of structuring it (see images below):



In the first scenario, I wonder if it would be appropriate to interpret theta as year-specific occupancy, incorporating my occupancy variables at that scale, and almost ignore psi, given that closure cannot be assumed between years? I also wonder if I can get separate detection rates for eDNA and Visual surveys in this approach? Or if method would simply be included as a survey covariate (in which case, is it possible to make it so water chemistry is only included as a covariate for surveys where the method was eDNA)?

Somewhere I saw the suggestion outlined in the second photo, which would make theta more comparable to a method-specific detection rate, but seriously complicates including covariates of occupancy, since I have some that vary by years. The third photo shows this same scenario, but where I would model each year separately (although I then wonder what this means for eDNA, which would only have one observation).

Is my best approach to simply not try to include both years in the same model (in which case I would only have three observations per year/model per site)?

And finally, I had been looking at implementing the multi-scale/multi-method models in RPresence, as I couldn't find a clear answer as to whether it can be implemented in unmarked - has that changed in the last few years?

Thanks,

Katie





Jeffrey Royle

Sep 2, 2024, 9:26:28 PM
to unma...@googlegroups.com
hi Katie,
 for your problem I would just stack the 2 years of data into 1 data set and then fit single-season occupancy models.  ("stacking" is what we often recommend with multi-year data sets where you don't care about the dynamics... lots of discussion of this on the email list).
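
As a rough illustration of what "stacking" means for the data layout, the sketch below turns each visited site-year into its own row of a single-season detection matrix. It is plain Python, not unmarked code. The site IDs and the `stack` helper are hypothetical, and the two W001 histories reuse the 010 / 001 example from the question.

```python
# Hypothetical records: one per site-year that was actually surveyed.
# Each history is ordered VS1, VS2, eDNA.
visits = [
    {"site": "W001", "year": 1, "history": [0, 1, 0]},
    {"site": "W001", "year": 2, "history": [0, 0, 1]},
    {"site": "W002", "year": 1, "history": [1, 1, 1]},
    # W002 was dry in year 2, so it simply contributes no year-2 row.
    {"site": "W003", "year": 2, "history": [0, 0, 0]},
]


def stack(records):
    """Return (y, site_covs) with one row per site-year."""
    y, site_covs = [], []
    for rec in records:
        y.append(list(rec["history"]))
        # Keep site ID and year so year (or year-varying covariates)
        # can enter the occupancy model as ordinary site covariates.
        site_covs.append({"site": rec["site"], "year": rec["year"]})
    return y, site_covs


y, site_covs = stack(visits)
```

One design consequence worth keeping in mind: a stacked analysis treats the two rows from the same wetland as separate "sites", so covariates that change between years fit naturally, but the repeated visits to one wetland are not modeled as dependent.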
 You can code "Method" as having 2 distinct values (value of "1" for the first 2 visual surveys and "2" for the eDNA survey). Then you can test that effect directly in the model. The only potential problem I see here is having different observation covariates for the different methods... in this case you might have to code up some dummy variables for the 2 methods and then interact them with covariates. At any rate I don't think any of this is too difficult (famous last words....).  If you take the dummy variable approach make sure you remove the intercept from the detection model with a "-1".
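
The dummy-variable idea can be sketched as a no-intercept design matrix in which each method gets its own indicator column and each covariate is multiplied by the indicator of the method it belongs to. This is plain Python for illustration only. The covariate names (`start_time`, `volume`), the `design_row` and `detection_prob` helpers, and the coefficient values are all made up.

```python
import math


def design_row(method, start_time=0.0, volume=0.0):
    """One survey's row: [visual, edna, visual*start_time, edna*volume]."""
    visual = 1.0 if method == "visual" else 0.0
    edna = 1.0 if method == "edna" else 0.0
    # Multiplying by the indicator zeroes out a covariate for the other method.
    return [visual, edna, visual * start_time, edna * volume]


def detection_prob(row, beta):
    """Logit-link detection probability for one design row."""
    eta = sum(x * b for x, b in zip(row, beta))
    return 1.0 / (1.0 + math.exp(-eta))


# Made-up coefficients on the logit scale, in the same column order as design_row.
beta = [0.5, -0.2, -0.3, 0.8]

surveys = [
    design_row("visual", start_time=-1.0),  # VS1
    design_row("visual", start_time=0.4),   # VS2
    design_row("edna", volume=1.2),         # eDNA
]
p = [detection_prob(row, beta) for row in surveys]
```

Because there is no shared intercept column, the first two coefficients act as method-specific intercepts, and a covariate such as water volume only affects the eDNA surveys. In practice the covariate that does not apply to a survey would likely need a non-missing filler value (for example 0), since the interaction with the indicator removes it from the linear predictor anyway.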

  I think Ken was developing a multi-method occupancy function for unmarked and I don't know if that's in the github dev version of unmarked yet or not, perhaps he can say a word or two about that.

 Happy to discuss further but note I'm offline a lot this week so responses will be slow.
regards
andy


--
*** Three hierarchical modeling email lists ***
(1) unmarked (this list): for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology: for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)

Ken Kellner

Sep 3, 2024, 7:19:20 AM
to unmarked
Hi Katie,

I agree with everything Andy said. unmarked does have a
multi-scale/multi-method occupancy model now (the function is called
goccu) but I do not think it would be useful in your situation, if I have
understood your design correctly. You need to be able to independently
estimate detection probability from each method (or scale) and in your
case you appear to only have at most 1 eDNA sample per site which I think
would make this impossible. So I think the covariate method Andy
describes would be the best option.
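
One way to see the problem with a single eDNA sample, assuming the usual multi-scale parameterisation (theta = probability the method's unit is "available" at an occupied site, p = detection given availability): with only one sample, the two parameters enter the likelihood only through their product. The function name below is invented for this sketch and is not part of unmarked.

```python
def single_sample_probs(theta, p):
    """P(y = 1) and P(y = 0) for one sample at an occupied site."""
    detect = theta * p
    return {1: detect, 0: 1.0 - detect}


# Two different (theta, p) pairs with the same product give identical probabilities,
# so a single sample cannot tell them apart.
a = single_sample_probs(theta=0.8, p=0.5)
b = single_sample_probs(theta=0.5, p=0.8)
```

With two or more samples per unit, histories such as 1,0 versus 0,0 carry information that separates theta from p, which is why replication within each method matters for this model.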

Ken

Jordan Heiman

Sep 3, 2024, 10:19:44 AM
to unmarked
Hi Ken, 

As someone just lurking here to learn things, are there any vignettes or anything that talk more about the goccu function? I'm curious about how that can be used. Also does it happen to use a Bayesian framework? I am wondering if it might be a good option for some data that I am planning to be working with in the coming months. 

Thank you for entertaining my curiosity, 

Jordan

Ken Kellner

Sep 3, 2024, 11:05:56 AM
to unmarked
Hi Jordan,

Unfortunately there is no vignette yet, that is something I would like to do when I find the time. The model is based on this paper:


The general idea is you have multiple primary sampling periods for the same sites within a single closure period, which allows estimation of an availability probability in addition to detection. This could take several forms. For example you could have multiple primary sampling periods each consisting of 2 or more secondary sampling periods in a temporal sequence (as with the way gpcount/gdistsamp/gmultmix are typically used). Or you could have the 'site' divided into multiple spatial sub-units and have a series of secondary sampling periods in each sub-unit. Or you could have multiple independent detection methods acting as the "primary periods" (e.g. repeated observations + repeated eDNA samples).
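
For readers who want to see the structure described above written out, here is a minimal sketch of a commonly used multi-scale occupancy likelihood for one site, with constant psi (occupancy), theta (availability in each primary period) and p (detection in each secondary survey). It is plain Python and an assumption about the general form of the model, not a description of how goccu is implemented; `site_likelihood` is a made-up name.

```python
def site_likelihood(history, psi, theta, p):
    """Likelihood of one site's data.

    history: list of primary periods, each a list of 0/1 secondary surveys.
    """
    occupied = psi
    for primary in history:
        # Probability of this primary period's surveys given the species is available.
        given_available = 1.0
        for y in primary:
            given_available *= p if y == 1 else (1.0 - p)
        # An all-zero primary period can also arise because the species was unavailable.
        unavailable = (1.0 - theta) if not any(primary) else 0.0
        occupied *= theta * given_available + unavailable
    # A site with no detections at all may simply be unoccupied.
    never_detected = not any(any(primary) for primary in history)
    return occupied + ((1.0 - psi) if never_detected else 0.0)


# Example: two primary periods (e.g. two methods), two secondary surveys each.
lik = site_likelihood([[1, 0], [0, 0]], psi=0.6, theta=0.5, p=0.4)
```

The primary periods here can stand for repeated time blocks, spatial sub-units, or detection methods, matching the three use cases listed above. Maximum likelihood fitting would multiply this quantity across sites and optimise over the parameters.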

You can certainly fit this model in a Bayesian framework but unmarked uses maximum likelihood.

Ken

Jordan Heiman

Sep 3, 2024, 3:36:22 PM
to unmarked
Thank you Ken that is really helpful information!

Jordan

gcsadoti

Sep 3, 2024, 9:27:25 PM
to unmarked
As an aside, it's great to hear you've developed this model for unmarked, Ken. I've been using the RMark version, which does the trick, but I prefer the unmarked syntax/framework =)

Any chance you might also be working on the multi-season version of this model? Thanks for all your contributions.

Giancarlo

Ken Kellner

Sep 4, 2024, 9:57:23 AM
to unmarked
Hi Giancarlo,

Glad you find it useful. I think it is unlikely I will implement the multi-season version in unmarked, but if I do, I'll try to do a better job publicizing it.

Ken