stratified sampling


jcl...@utk.edu

Jun 14, 2018, 10:32:03 AM
to secr
I am interested in estimating the density of an elk population based on fecal DNA in an area where elk are heavily concentrated around open fields and occur at much lower densities in the surrounding forest. I am considering a stratified sampling design whereby I use a higher density of transects for detecting scats in and adjacent to fields and a much lower density of transects in forests. My hope is to combine all the scat samples into 1 sampling occasion/year (all collected within a period of a few months), with sampling over 3 years. The 2 strata would be treated as sessions within secr (or the new open version).

One concern I have is the fact that an individual elk might occasionally be sampled in both strata (sessions) during a given year.  That would violate the assumption that the strata have to be independent I suppose.  How serious a violation is that and can I simply delete one of the occasional cross-strata samples?  Some elk occasionally wander between distant fields so it will be hard to find a place in the forest with no chance of detecting elk that use the fields.  It is not known how many elk use forest exclusively but I suspect it is very few (if any).

Thanks for any guidance you may have!
Joe

Murray Efford

Jun 17, 2018, 8:42:10 AM
to jcl...@utk.edu, secr
Hi Joe
I don't think strata have to be independent in that sense. I would work on other issues - like how to jointly model the strata when you have deliberately confounded sampling intensity and density (can be done, but be careful) and avoiding risks from linear detector arrays (bad news when home ranges may be aligned, so randomize direction and consider non-linear alternatives that yield more recaptures). Low sampling intensity in low-density areas also can yield negligible data, so may not be worth it.
Murray


jcl...@utk.edu

Jun 18, 2018, 9:33:16 AM
to secr
Thanks Murray.  I assume p0 will not be affected by a variable number of traps in the 2 strata because the probability of an individual being sampled if the trap was at its activity center would not be affected by how many other traps were in the vicinity.  I understand there would be more samples in the high density stratum because there are more traps.  How would the confounding cause problems and what should I watch out for?
Joe.

Murray Efford

Jun 18, 2018, 12:41:57 PM
to jcl...@utk.edu, secr
Suggest you try simulating that scenario...
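
A minimal base-R sketch of one thing such a simulation might look at; it is not a substitute for a full simulation and model fit in secr. It assumes a half-normal detection function, binary proximity detectors, a single occasion, and square detector grids standing in for transects. The values of `g0`, `sigma`, `extent` and the grid spacings are hypothetical, as are the helper names `p_detect`, `make_grid` and `mean_p`. The point is that identical `g0` and `sigma` in both strata still give very different overall detection probabilities per animal when detector spacing differs.

```r
# Probability that an animal centred at (x, y) is detected at least once,
# given detector coordinates and a half-normal detection function.
p_detect <- function(x, y, traps, g0, sigma) {
  d2 <- (traps$x - x)^2 + (traps$y - y)^2
  pk <- g0 * exp(-d2 / (2 * sigma^2))  # per-detector detection probability
  1 - prod(1 - pk)                     # probability of at least one detection
}

# Square grid of detectors covering [0, extent] in both directions.
make_grid <- function(spacing, extent) {
  s <- seq(0, extent, by = spacing)
  expand.grid(x = s, y = s)
}

g0     <- 0.1    # hypothetical
sigma  <- 500    # hypothetical, metres
extent <- 4000   # hypothetical, metres

dense  <- make_grid(spacing = 500,  extent = extent)  # stand-in for the field stratum
sparse <- make_grid(spacing = 2000, extent = extent)  # stand-in for the forest stratum

# Average detection probability over random activity centres within the extent
set.seed(1)
centres <- data.frame(x = runif(2000, 0, extent), y = runif(2000, 0, extent))
mean_p <- function(traps) {
  mean(mapply(p_detect, centres$x, centres$y,
              MoreArgs = list(traps = traps, g0 = g0, sigma = sigma)))
}

p_dense  <- mean_p(dense)
p_sparse <- mean_p(sparse)
print(c(dense = p_dense, sparse = p_sparse))
```

Under these assumptions the sparse stratum detects a much smaller fraction of its animals, so a difference in sample counts between strata reflects sampling intensity as well as density, and the sparse stratum may contribute few recaptures.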

jcl...@utk.edu

Jun 18, 2018, 1:51:31 PM
to secr
Will do Murray.  Thanks.

Samundra Subba

Jul 5, 2018, 5:30:08 AM
to Murray Efford, secr, Kanchan Thapa, Sabita Malla

Hi Murray,

Can you help us out urgently? We would like to know whether realized N (function region.N(x)), using the habitat mask of a particular ETA, is calculated from the derived density or from the real parameters (sigma and g0).

In either case, is the calculated N the N of the whole habitat mask or of the MCP of the trap array?

We would be happy to elaborate if anything is unclear.

Best wishes,

Samundra

Murray Efford

Jul 5, 2018, 8:47:11 AM
to Samundra Subba, secr, Kanchan Thapa, Sabita Malla
I cannot make sense of your question. MCP has nothing to do with it. Nor does ETA.

Samundra Subba

Jul 6, 2018, 12:08:26 AM
to Murray Efford, secr, Kanchan Thapa, Sabita Malla

Hi,

Thank you so much for your response.

Perhaps I wasn't clear; sorry for the confusion. We are estimating Nepal's tiger population. We have the trap file from the camera-trap detectors, the animal file, and the habfile, which is the habitat mask with values '0' (non-habitat) and '1' (habitat), similar to that of SPACECAP. The habfile is the ETA, which is a buffer of a certain width (half MMDM/4xsigma) around the MCP (camera trap area).

We need to estimate the tiger population in each National Park. So, after running all the models (secr.fit) and selecting suitable models for the detection function (HN, HZ and EX), g0 and sigma, we used the call

region.N(Bestmodel, region = habitatmask)

to estimate the population. We were wondering what the function region.N uses to calculate N.

  1. Does it use the derived density to calculate N, or the real parameters (sigma and g0)? Since everything happens inside the function, we need to know what it does internally when we call region.N.
  2. Is N the estimated population of the whole habfile (ETA) or of the MCP (camera trap area)? We only need it for the MCP, by the way.

I hope this is clearer,

Thank you,

John

Jul 6, 2018, 12:27:19 PM
to secr
Hi Samundra,

The package documentation (https://www.rdocumentation.org/packages/secr/versions/3.1.6/topics/region.N) should help with q1 (for a more thorough description, see Efford and Fewster 2012).

The region.N function has an argument ("region=") that allows you to select the specific space (a mask object) within which N is estimated. 


Cheers,

John


 

Murray Efford

Jul 7, 2018, 9:51:42 AM
to secr
Hi Samundra

John is right, and maybe that has answered your query (which should not have been posted in a 'stratified sampling' thread, but it's not easy to move now).

Also note -
1. It is wrong to refer to the habitat mask as ETA (effective trapping area). That concept (see Otis et al. 1978) is not relevant to spatially explicit capture-recapture.
2. The MCP of the cameras is also not relevant.
3. Although region.N may report a 'realised N' for a region smaller than the habitat mask, that number is almost meaningless as explained in Efford & Fewster: it is not reliable when some of the n detected animals may have been centred outside the region. 'expected N' is OK.
4. Whether region.N uses a derived density estimate (which, incidentally, uses g0 and sigma) or the parameter estimate D-hat from the fitted model, depends only on whether you fitted a model including D (i.e., CL = FALSE) or not (CL = TRUE).
5. In most cases the expected N is just the estimated density multiplied by the area of the region, so you can check the calculation by hand (the advantage of using region.N is that it gives a confidence interval).
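
The hand check in point 5 might look like the sketch below. The density, cell size and cell count are made-up placeholders, not values from any fitted model. It assumes a homogeneous density, secr's default units (animals per hectare, areas in hectares), and a region made up of equal square mask cells.

```r
# All three inputs are hypothetical placeholders
D_hat           <- 0.0004  # density estimate, animals per hectare
cell_side_m     <- 500     # side of one square mask cell, metres
n_habitat_cells <- 6000    # number of habitat cells in the region

cell_area_ha   <- cell_side_m^2 / 10000          # 1 ha = 10,000 m^2
region_area_ha <- n_habitat_cells * cell_area_ha # habitat area of the region
expected_N     <- D_hat * region_area_ha         # expected N = density x area
print(expected_N)
```

Only habitat cells count towards the area, so a region's nominal extent can overstate the area that enters the calculation.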

I hope this helps.
Murray

Samundra Subba

Jul 10, 2018, 4:37:40 AM
to Murray Efford, secr, Sabita Malla

Hi Murray and John,

Thank you so much for your responses. They have really helped us figure out how N (realized and expected) is calculated.

Regarding point 3 in Murray's email: our sampling effort has covered almost all tiger-habitable areas, so we believe that the activity centers lie within our sampling area. Therefore we are going with realized N.

We are trying to estimate N for each protected area in the tiger-habitable regions of Nepal. To do that, we run all possible secr models with mask = buffer region, and then estimate N for both the buffer and our sampling area. We have come across a problem and hope you can give us some insight.

When we ran the models (secr.fit) with mask = buffer region, we got estimates of realized and expected N with their SEs and CIs (Table 1 below). But when we passed our effective sampling area to the region.N function, no SE or CI was generated for realized N (Table 2 below).

Table 1: N estimates with mask = buffer region

     estimate  SE.estimate  lcl       ucl       n
E.N  120.5562  13.28566     97.20029  149.5242  85
R.N  120.5562  7.480136     108.6472  138.4628  85


Table 2: N estimates with mask = our sampling area

     estimate  SE.estimate  lcl       ucl       n
E.N  74.51633  8.211922     60.07993  92.42159  85
R.N  91.36505  NA           NA        NA        85


The functions we used were:

library(secr)

# Final selected model based on AIC
# (Protected_Area is the capthist object; Buffer_region and our_sampling_area are masks)
Protected_area_model <- secr.fit(Protected_Area, detectfn = 'EX', model = g0 ~ bk, mask = Buffer_region, CL = TRUE, trace = TRUE, biasLimit = NA, verify = FALSE)

# N estimates for the buffer and for our sampling area
N_Buffer <- region.N(Protected_area_model, region = Buffer_region)
N_our_sampling_area <- region.N(CNP2018_coreEXgbk, region = our_sampling_area)


Is this because of the data in the mask for that region? We need the full R.N estimate (SE, lcl and ucl). The buffer region, which we used to fit all the models and to estimate N for the buffer, covers 3634.5 sq. km; the sampling area that we used to estimate N covers 3189.7 sq. km. Each mask codes cells as "0" (non-habitat) or "1" (habitat), and the whole area was divided into a mesh of 580 m x 580 m cells.

Another reason we would like to use the sampling area for estimating R.N is that our PA (protected area) is contiguous with protected areas in India as well as in Nepal, and we do not want to overestimate the population by including the overlapping region (please see the attached map for your information). We do think it would be better to analyse all the contiguous protected areas together, but from the management perspective of the government's PAs we need to do it this way.

Thank you for your patience; we appreciate your help with this,

Best wishes,

Samundra

 

 

 
