Expected and Realized N differ


Eric H

Feb 17, 2015, 12:36:36 PM
to secr...@googlegroups.com
Good day group members
I'm hoping someone can help me to better understand "Realized N" from the region.N function.
In a recent analysis, expected N and realized N differed (281 and 248 respectively).  I understand that expected N is  the volume under the density surface over the area defined by the mask, but I'm not completely clear on how realized N is estimated or why the estimates might differ.  I suspect that densities are lower in (or animals are absent from) some of the area defined by the mask, which extends beyond the area sampled, but I don't understand realized N well enough to be sure that this is what is causing the discrepancy.

Thank you,
Eric

Murray Efford

Feb 17, 2015, 2:23:42 PM
to secr...@googlegroups.com
Hi Eric

It's disconcerting, isn't it! It's a phenomenon that deserves attention, but I don't think anyone has given it much thought.

Realised N is simply the number of animals actually detected (n) plus a model-based estimate of the number of animals in the region of interest that remain undetected. The probability of remaining undetected is a function of location (high at the edges, low near detectors). The 'model-based estimate' is the integral (sum) over the mask of the product of (estimated) local density and the local probability of remaining undetected.
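
The two quantities can be written out as follows. This is only a sketch in notation chosen for this note, not taken from the secr documentation: A is the region of interest (the mask), D(x) the fitted density at point x, p.(x) the probability that an animal centred at x is detected at least once, a the area of one mask cell, and a_eff the integral of p.(x) over A.

```latex
\begin{align*}
E(N) &= \int_A D(x)\,dx \;\approx\; \sum_j D(x_j)\,a \\
R(N) &= n + \int_A D(x)\,\bigl[1 - p_\cdot(x)\bigr]\,dx
      \;\approx\; n + \sum_j D(x_j)\,\bigl[1 - p_\cdot(x_j)\bigr]\,a \\
\intertext{With homogeneous density and a single detection function shared by all animals,
$\hat{D} = n / a_{\mathrm{eff}}$ where $a_{\mathrm{eff}} = \int_A p_\cdot(x)\,dx$, so}
R(N) &= n + \hat{D}\,\bigl(|A| - a_{\mathrm{eff}}\bigr)
      = n + \hat{D}\,|A| - n
      = \hat{D}\,|A| = E(N).
\end{align*}
```

Under those assumptions, and when the region is the mask used to fit the model, the two estimates coincide exactly. The last step depends on every animal sharing the same p.(x), so it does not carry over directly to models in which detection differs between individuals.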

In a model with homogeneous density and detection we do not see the effect you hypothesise. Simulations may convince you (takes just a minute):
library(secr)
grid <- make.grid() ## default 6 x 6
pop <- sim.popn(D = 10, grid, buffer = 100)
CH <- sim.capthist(grid, pop, detectpar = list(g0 = 0.2, sigma = 25))
## estimate population across regions of increasing size
## outer zone (>100m) has zero density
buffer <- seq(100, 250, 50)
out <- vector('list')
for (b in 1:4) {
    fit <- secr.fit(CH, buffer = buffer[b], biasLimit = NA)
    out[[b]] <- region.N(fit)
}
EN <- sapply(out, '[[', 'E.N', 'estimate')
RN <- sapply(out, '[[', 'R.N', 'estimate')
RN/EN
## [1] 1 1 1 1
plot (buffer, RN/EN, ylim=c(0.8,1.2))
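
The definition of realised N given above can also be checked by hand for a single homogeneous fit. The following is a minimal sketch, assuming the default half-normal detection function and a fitted mask that is also the region of interest; the seed and the object names (est, pd, a) are arbitrary choices for illustration. It rebuilds E.N and R.N from the fitted density and from pdot(), which gives the probability of detection at least once for an animal centred at each mask point.

```r
library(secr)

set.seed(123)                          # arbitrary seed, for repeatability only
grid <- make.grid()                    # default 6 x 6 grid
pop  <- sim.popn(D = 10, grid, buffer = 100)
CH   <- sim.capthist(grid, pop, detectpar = list(g0 = 0.2, sigma = 25))
fit  <- secr.fit(CH, buffer = 100, biasLimit = NA, trace = FALSE)

est  <- predict(fit)                   # rows D, g0, sigma on the natural scale
mask <- fit$mask                       # mask used in fitting = region of interest
a    <- attr(mask, 'area')             # area of one mask cell (ha)
n    <- nrow(CH)                       # number of animals actually detected

## probability that an animal centred at each mask point is detected at least once
pd <- pdot(mask, traps(CH), detectfn = 0,
           detectpar = list(g0    = est['g0', 'estimate'],
                            sigma = est['sigma', 'estimate']),
           noccasions = ncol(CH))

D  <- est['D', 'estimate']             # animals per ha, constant over the mask here
EN <- D * a * nrow(mask)               # sum of density x cell area over the mask
RN <- n + sum(D * (1 - pd) * a)        # detected + expected number undetected

c(E.N = EN, R.N = RN)
region.N(fit)[, 'estimate']            # for comparison with the hand calculation
```

The hand-built values should agree closely with each other and with the region.N() output for this homogeneous model. Small numerical differences are possible, and the sketch does not attempt the standard errors or intervals.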

Models with inhomogeneous density are not so well behaved, but in the few trials I've run I do not see a difference between E.N and R.N when detection is homogeneous. Perhaps the difference happens when factors in the model that control detection (i.e. lambda0/g0/sigma) are confounded with those controlling local density (D). Could this apply in your case?

Murray

Eric H

Feb 17, 2015, 3:27:08 PM
to secr...@googlegroups.com
Thank you Murray,

It is a bit!  I modeled homogeneous density (actually, I maximized the conditional likelihood, so I didn't model density at all) so I'm not sure if the explanation you proposed applies.  There are a few ways the data and analysis don't quite match the models as they were intended to be used.  The data come from area searches for feces, but specific search areas were not defined, areas searched on different days would have differed but partially overlapped in some cases, and some spatial subsets of the study area were searched more often than others. I used a 500 m buffer around the collected samples to define a single search area polygon (inappropriate, I know), and collapsed all the data into a single occasion.  The genetic information allowed us to distinguish between different "sub-populations" that occupy distinct spatial subsets of the greater study area.  I used sub-population membership as an individual covariate for g0 in an attempt to account for the spatially variable search effort (imperfect again, I know).  However, I don't see why these (ahem) "adjustments" would cause realized and expected N to differ.
Also, I performed duplicate analyses, one where the entire fragmented study area was considered available habitat, and one where we used spatial data describing forest cover to exclude cleared areas from the mask (the latter mask had closer spacing to ensure it accurately depicted forest cover). Nearly all the samples were collected within or very close to forest patches.  As expected, density within forested areas was higher than across the mosaic, but the associated population estimates were very similar: realized N = 251 where the mask had uniform point spacing across the whole study area (extending 3 km beyond the search area polygon), and realized N = 248 where points in cleared areas were excluded (same buffer size).

Could an inappropriate buffer size cause expected and realized N to differ?  E.g. realized N would be less than expected N if the buffer was too large, because expected N includes more animals at the periphery of the masked area?  Or does your example demonstrate that buffer size has no effect?  In our case, sigma (half normal) was about 1 km, and I used a 3 km buffer around the search polygon (and therefore a 3500 m buffer around sample locations) when defining the masks.  I don't think I would want to use a smaller buffer when fitting the model, and according to the help files, the definition of realized N doesn't really hold when the mask used with region.N is smaller than the one used when fitting the model...

Does any of the above give any clues to why expected and realized N differ in this case?

Thanks again,
Eric

Murray Efford

Feb 17, 2015, 4:58:37 PM
to secr...@googlegroups.com
"...according to the help files, the definition of realized N doesn't really hold when the mask used with region.N is smaller than the one used when fitting the model..."

I can't put my finger on this reference, but it's fairly obvious: realised N is based on the absolute assumption that all of the n observed animals come from the region of interest. If you make that assumption when the region of interest is small and animals are actually being detected from outside that region then you're in trouble.

Eric H

Feb 17, 2015, 5:19:39 PM
to secr...@googlegroups.com
Yes, and the masks don't seem to be the issue in any case... I will experiment a bit.
Unfortunately, it's another one of these data sets where each model takes a day to fit.
Thanks again,
Eric

Murray Efford

Feb 17, 2015, 5:54:58 PM
to secr...@googlegroups.com
I don't have the patience to work with models like that - there's usually a way to simplify things (coarser mask, collapsing to counts over occasions, discretizing polygons, dropping unnecessary covariates) that will fit in minutes rather than hours, especially for exploratory work.
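
As one illustration of the first of those simplifications, a coarser mask can be compared with a finer one on simulated data. This is a minimal sketch, assuming the same simulated grid design as earlier in the thread; the spacings of 5 m and 15 m, the seed and the object names are arbitrary choices for illustration, not recommendations for any real data set.

```r
library(secr)

set.seed(321)                          # arbitrary seed, for repeatability only
grid <- make.grid()                    # default 6 x 6 grid
pop  <- sim.popn(D = 10, grid, buffer = 100)
CH   <- sim.capthist(grid, pop, detectpar = list(g0 = 0.2, sigma = 25))

## two masks over the same area, differing only in point spacing
fine   <- make.mask(grid, buffer = 100, spacing = 5)
coarse <- make.mask(grid, buffer = 100, spacing = 15)
c(fine = nrow(fine), coarse = nrow(coarse))       # number of mask points

## time each fit
t.fine   <- system.time(fit.fine   <- secr.fit(CH, mask = fine,   trace = FALSE))
t.coarse <- system.time(fit.coarse <- secr.fit(CH, mask = coarse, trace = FALSE))
c(fine = t.fine[['elapsed']], coarse = t.coarse[['elapsed']])

## compare the realised N estimates from the two fits
sapply(list(fine = fit.fine, coarse = fit.coarse),
       function(f) region.N(f)['R.N', 'estimate'])
```

If the estimates from the coarse mask are close to those from the fine mask, the coarse mask is probably adequate for exploratory model comparison. A habitat mask that must follow a fragmented boundary may tolerate less coarsening than this rectangular example.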
Murray

Eric H

Feb 23, 2015, 10:29:10 AM
to secr...@googlegroups.com
Good day,

The difference between Expected and Realized N appears to be related to how I modeled variation in g0.  EN and RN only differ when I allow g0 to vary among genetically recognizable "communities" of individuals, which occupy different spatial subsets of the greater study area.  I modeled this variation in g0 as an individual covariate in a CL model in an attempt to account for variable search effort among the different spatial subsets (not really an appropriate way to model the variation in search effort).  Differences in g0 among communities were strongly supported by AIC.  Example results below.

# a null model
> null.fc4 = secr.fit(ch, mask = fc.mask4, CL = TRUE, verify = FALSE)
> region.N(null.fc4)
    estimate SE.estimate      lcl      ucl   n
E.N 259.3761    19.43791 223.9903 300.3520 182
R.N 259.3761    10.88377 240.8115 283.8008 182       # estimates identical

# sigma varies between sexes
> sig.sex.fc4 = secr.fit(ch, model = list(g0~1, sigma~sex), mask = fc.mask4, CL = TRUE, start = null.fc4, verify = FALSE)
> region.N(sig.sex.fc4)
    estimate SE.estimate      lcl      ucl   n
E.N 260.3173    19.50589 224.8075 301.4362 182
R.N 261.9360    10.96186 243.1727 286.4544 182       # estimates similar 

# g0 varies among communities
> g0comm.fc4 = secr.fit(ch, model = list(g0~comm, sigma~1), mask = fc.mask4, CL = TRUE, start = g0comm.fc, verify = FALSE)
> region.N(g0comm.fc4)
    estimate SE.estimate      lcl      ucl   n
E.N 283.0851    23.63826 240.4159 333.3273 182
R.N 249.0358    16.60368 223.5536 290.1447 182       # estimates differ

# g0 ~ community and sigma ~ sex
> g0comm.sig.sex.fc4 = secr.fit(ch, model = list(g0~comm, sigma~sex), mask = fc.mask4, CL = TRUE, start = g0comm.fc, verify = FALSE)
> region.N(g0comm.sig.sex.fc4)
    estimate SE.estimate      lcl      ucl   n
E.N 284.9932    23.93293 241.8124 335.8849 182
R.N 253.2671    16.96443 226.9836 294.9078 182       # estimates differ
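
For a side-by-side view, the four pairs of estimates above can be reduced to the same RN/EN ratio used in the earlier simulation. This is just arithmetic on the numbers already reported; the data frame and its column names are made up for the illustration.

```r
## estimates copied from the region.N output above
res <- data.frame(
    model = c('null', 'sigma~sex', 'g0~comm', 'g0~comm + sigma~sex'),
    E.N   = c(259.3761, 260.3173, 283.0851, 284.9932),
    R.N   = c(259.3761, 261.9360, 249.0358, 253.2671)
)
res$ratio <- round(res$R.N / res$E.N, 3)   # realised / expected
res$diff  <- res$R.N - res$E.N             # realised - expected
res
```

The ratio is 1 for the null model and about 1.006 when only sigma varies by sex, but drops to roughly 0.88 to 0.89 in both models where g0 varies among communities.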

Murray, you had asked whether factors controlling detection could be confounded with those controlling density, and yes, the different communities may occur at different densities. I haven't yet come up with a good way to model this variation, because we don't have good information describing the territories of the different communities, and some communities were likely only sampled within a subset of their territory.  However, we are compiling spatial information describing search effort as a grid.  We hope to use that to better describe spatially variable search effort in the secr model.  I'm not sure whether or not this will help if we continue to model density as homogeneous and variation in g0 is confounded with variation in density.

Any suggestions or comments welcome.

Incidentally, I found that I was able to reduce the resolution of the mask somewhat to reduce processing time from days to hours, but with further decreases in resolution the actual distribution of habitat (forest) was not well-represented, and the associated area of habitat within the masked region changed.

Eric
