Choosing the right detector for study design


cbpozz...@alaska.edu

Apr 3, 2015, 1:38:28 PM
to secr...@googlegroups.com
Hello secr users,
I have genetic data from coyote and red fox scats collected over 2 winters. The scats were collected in an opportunistic fashion while doing snow tracking surveys: we did not define strict sampling areas prior to beginning collections, but basically, while moving around our general study region counting tracks, we would collect any scats that we came across. My question is about choosing the correct detector type for this kind of data. Currently, I am experimenting with the polygon detector type. I have drawn several polygons around the main clusters of scat locations, and have done so in such a way that I am confident that sampling intensity was consistent within each polygon. Is this type of post hoc detector selection appropriate? If not, can you suggest an alternative? Thanks

Casey

Christopher Sutherland

Apr 3, 2015, 1:42:28 PM
to cbpozz...@alaska.edu, secr...@googlegroups.com

Hi Casey,

It sounds like you didn't keep a GPS track log for the searches – just wanted to confirm that that is correct?

Chris

Chris Sutherland PhD
Postdoctoral Research Associate
Department of Natural Resources
New York Cooperative Fish and Wildlife Research Unit
Cornell University
E: cs...@cornell.edu
T: (607) 255-4654


cbpozz...@alaska.edu

Apr 3, 2015, 1:47:55 PM
to secr...@googlegroups.com, cbpozz...@alaska.edu, cs...@cornell.edu
That is correct. We had defined track logs for our snow tracking surveys; however, the majority of our scats were collected away from those designated survey routes. This is why I decided that a transect detector type would not work.

Christopher Sutherland

Apr 3, 2015, 1:55:27 PM
to cbpozz...@alaska.edu, secr...@googlegroups.com

Okay, that's what I thought.

I don't really have any experience with doing this, but I remember some discussion about it in an earlier thread, where it seems somebody (Eric H) had done something similar, i.e. defined a buffer around the scat clusters and analysed the data the same way you are (it looks like!).

Here is the link to the thread: https://groups.google.com/forum/#!searchin/secrgroup/area$20search/secrgroup/FcB8f4rB1Qg/NmCKO-ikJ9QJ

Perhaps Eric H has something to add about that – good luck!

Chris

Murray Efford

Apr 3, 2015, 3:40:40 PM
to cbpozz...@alaska.edu, secr...@googlegroups.com, Christopher Sutherland
Hi Casey

A purist might say "these data lack the spatial control needed for analysis, don't do it", but that might waste some good, if imperfect, data. I think the key problem is not what you should do, but how you are going to convince a sceptical reader or reviewer. There is obvious potential for bias if you narrow the nominal search polygons down to places you found scat, as then no such polygon can be empty. This risk lessens as you zoom out and blur the boundaries, so I would tend toward delineating the largest polygons in which you felt scats might have been found (if you had wandered differently on the day). My comment about convincing sceptics leads on to the idea that you might need to simulate the effect of various polygon-construction rules on the estimates. That's more work, but at least you'll be warm and dry in the office, and perhaps it's the price you pay... It might be hard to design convincing simulations, as you would have to capture the possible bias in your sampling.
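
One way to run that kind of check is with the simulators built into secr; everything below (grid layout, density, detection parameters) is an invented placeholder, and you would repeat the fit for each candidate polygon-construction rule:

    library(secr)

    ## stand-in for one candidate delineation: searched area discretized
    ## as proximity detectors on a 500-m grid (layout is made up)
    grid <- make.grid(nx = 10, ny = 10, spacing = 500, detector = "proximity")

    ## simulate a population at known density (animals/ha), then detections
    set.seed(1)
    pop <- sim.popn(D = 0.05, core = grid, buffer = 4000)
    ch  <- sim.capthist(grid, popn = pop, noccasions = 5,
                        detectpar = list(g0 = 0.2, sigma = 800))

    ## fit and compare the estimate with the known truth; bias under a
    ## given rule shows up as systematic departure from D = 0.05
    fit <- secr.fit(ch, buffer = 4000, trace = FALSE)
    predict(fit)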

As you might have noticed from other posts, I'm not so sure the strict 'polygon' detector types are the best solution because they require some special integration code in secr that I have not used much myself. The alternative is to discretize the searched area as a set of pixels (grid squares) and treat each as a point (proximity) detector located at its centre. If you use no more than a few hundred pixels this is likely to be faster and give essentially the same answer.
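
In code, the discretization might look something like this (a single hand-drawn search polygon; the vertices and the 500-m cell size are invented):

    library(secr)

    ## hypothetical searched polygon (vertices in metres, closed ring)
    searched <- data.frame(x = c(0, 4000, 4000, 0, 0),
                           y = c(0, 0, 3000, 3000, 0))

    ## candidate cell centres on a 500-m grid over the bounding box
    centres <- expand.grid(x = seq(250, 3750, 500),
                           y = seq(250, 2750, 500))

    ## keep the centres inside the polygon and treat each as a
    ## proximity detector at the pixel midpoint
    inside <- pointsInPolygon(centres, searched)
    trps   <- read.traps(data = centres[inside, ], detector = "proximity")
    summary(trps)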

I'll be interested to hear what Eric has to say on this.

Murray

cbpozz...@alaska.edu

Apr 3, 2015, 4:47:23 PM
to secr...@googlegroups.com, cbpozz...@alaska.edu, cs...@cornell.edu
Thanks for the responses, Murray and Chris. I did indeed foresee that the biggest challenge of adapting my data to this approach would be justifying the method to skeptical reviewers. I had originally considered using a proximity detector as Murray has suggested, but my biggest concern relates to the 2 fairly separate sampling locations we had. We basically had 2 sampling sites (which I currently have drawn polygons around) with a fair amount of open, unsurveyed space in between them. It doesn't make sense to put one large grid over the entire region because there would be too much area where sampling effort was 0. Could I use 2 grids (one over each of the 2 distinct sampling sites) and still obtain a single density estimate?

Murray Efford

Apr 3, 2015, 5:23:05 PM
to cbpozz...@alaska.edu, secr...@googlegroups.com, Christopher Sutherland
No problem - just as an array can have any configuration, your collection of 'point detectors' (grid cells) can cluster in different parts of the landscape. For speed it helps to define the mask by buffering around the detectors, possibly excluding space between the clusters.
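
A sketch of that two-cluster setup, assuming trps1 and trps2 are proximity-detector grids built over the two sites as above, and ch is the corresponding capture history:

    library(secr)

    ## combine the two site grids into a single detector array
    trps <- rbind(trps1, trps2)

    ## 'trapbuffer' keeps only mask points within 'buffer' of some
    ## detector, so the unsearched space between the clusters drops out
    msk <- make.mask(trps, buffer = 4000, type = "trapbuffer")

    ## one model, one density estimate spanning both sites
    fit <- secr.fit(ch, mask = msk, trace = FALSE)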

I'd emphasise that this question of whether to discretize or not is separate from your original question.
Murray

Eric H

Apr 7, 2015, 12:31:56 PM
to secr...@googlegroups.com
Hi Casey et al.

I was happy to see this post, as we have been dealing with a similar situation. Scats were collected opportunistically throughout a large (> 600 km²) study area. Searching for scats was the main focus of field work, but a systematic survey design was not established prior to collecting the data. GPS track logs were not used to delineate areas searched; however, records were kept describing whether, and how many times, any part of each 1 km × 1 km cell in a grid covering the entire study area was visited while searching for scat. The majority of cells were never searched, and there were many cells that were searched but where no scat was found. We defined post hoc search area polygons (polygon detector type, with detections modeled as counts during a single survey rather than as binary data within sampling occasions) as the perimeters of searched grid cells (where no adjacent cell was searched), and as the perimeters of aggregations of cells where more than one adjacent cell was searched at least once. This is different from Murray's suggestion of treating the centroid of each searched cell as a proximity detector, but might result in a very similar detection model if cells were sized appropriately (?). In short, because field staff recorded the cells that they searched within on each day that they searched, we were able to describe (approximately) the spatial distribution of search effort across the greater study area in the model. Might there be any way you can recover similar information?
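
For anyone wanting to try this, a rough sketch of the data structures involved; the two polygons and three detections below are entirely made up, and I'd check the secr data-input vignette for the exact polygon input format:

    library(secr)

    ## vertices of two search area polygons; polyID groups vertices
    verts <- data.frame(polyID = rep(1:2, each = 5),
                        x = c(0, 1000, 1000, 0, 0,
                              5000, 7000, 7000, 5000, 5000),
                        y = c(0, 0, 1000, 1000, 0,
                              0, 0, 2000, 2000, 0))
    polys <- read.traps(data = verts, detector = "polygon")

    ## with detector type 'polygon', detections are counts, so a single
    ## 'occasion' can span the whole collection period; captures are x-y
    caps <- data.frame(session = 1, ID = c("a", "a", "b"), occasion = 1,
                       x = c(200, 600, 5500), y = c(300, 700, 900))
    ch <- make.capthist(caps, polys, fmt = "XY", noccasions = 1)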

We also looked at merged buffers around individual collected samples, and at MCPs around all, and around clusters of, collected samples as ways to approximate the area searched in an SECR model with polygon detectors. Buffered areas were smaller than the search area polygons defined as described above because they excluded areas that were searched but where no scat was found. MCPs were generally larger, because they included large areas between sample collection locations that were never actually searched. We suspect that using a buffer or MCP around locations where samples were actually collected as the search area polygons could have introduced bias: in the first case (buffers – similar to your initial approach, I think) because it would appear that samples were found everywhere they were searched for, and in the second case (MCPs) because it would appear as though large areas that were searched yielded no detections when in fact they were never searched. We therefore preferred to use the perimeter of searched grid cells to define the search area polygons, even though this is only an approximation because entire cells were not necessarily searched. We were fortunate to have the gridded data describing search effort. Casey, if you go with the polygon detector type, it might be useful to try to discriminate between areas where you might have found scat but didn't vs. those where you couldn't have found scat even if it was there. Include all areas where you might have found scat if it was there (because you were walking and not driving? or because you were walking and not too busy to notice and collect scat?) as (part of) a search area polygon, and exclude other areas from search area polygons. This would ensure your polygon detectors allow for the fact that some areas yielded no samples even though you "searched" there, whereas others yielded no samples because you didn't search there.

A second issue we had to deal with was the fact that (parts of) some cells were searched a large number of times, whereas other cells were searched rarely or only once.  We did have the data describing the number of times each cell was visited, so we were able to model variation in search effort among search area polygons using the method described in Efford, Borchers and Mowat (2013) and implemented in the secr package as the 'effort' attribute of the traps object.  We defined effort as the number of times a cell was visited for single-cell polygons, and as the mean number of times each cell in multi-cell polygons was visited.  We obtained lower AICc values and different density estimates when we modeled the variation in search effort across search area polygons vs. when we ignored it, and will present only estimates where we modeled the variation.
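
Continuing the sketch above, the effort data might be attached like this (values invented; note Murray's correction below that the traps attribute is called 'usage'):

    library(secr)

    ## one row per polygon, one column per occasion (single occasion
    ## here); values = (mean) number of times the polygon was searched
    usage(traps(ch)) <- matrix(c(4, 1), ncol = 1)

    ## secr.fit picks usage up automatically and scales expected counts
    ## by it, so no extra detection covariate is needed
    fit.eff <- secr.fit(ch, buffer = 4000, trace = FALSE)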

A third issue was severe habitat heterogeneity. Forests comprised about 17% of the greater study area, distributed in fairly small fragments throughout it. Animals relied on the forest fragments and all samples were collected in forests. Therefore, (1) although we defined areas searched as (aggregations of) square 1 km × 1 km grid cells, only the forest fragments within these cells would have been searched; and (2) if areas searched contained a higher proportion of forest than areas not searched, we would overestimate population size on the greater study area if we ignored habitat heterogeneity and extrapolated densities estimated from the areas searched across the greater study area. We therefore used spatial data describing forest cover to build the habitat masks used for density and population size estimation. Density estimates were therefore specific to forested areas within and in the vicinity of the areas searched. We assumed that density in forest fragments outside the areas searched was similar, and extrapolated across forested habitat within the greater study area to estimate population size (using the region.N function).

Because it was of interest for comparing with results of other studies, we also estimated density "across the fragmented landscape", i.e. assuming homogeneous habitat, using a mask with uniform point spacing throughout. The associated density estimate was, as expected, much lower. For my own interest, I also estimated population size assuming habitat homogeneity, and the estimate was considerably higher than where we used the forest cover data to define areas where activity centers could occur. I suspect this was because areas searched included a higher proportion of forest than the greater study area on average. These differences may not be relevant to your study if you sampled within a large expanse of contiguous habitat, but it's always a good idea to carefully consider the area across which you extrapolate your density estimates if you estimate population size using region.N. Areas where activity centers could not occur (e.g. large bodies of water, or cleared, built-up or urban areas) should usually be excluded from masks used for population size estimation, for example if you only searched for scat in areas used by foxes but the greater study area includes areas not used by them.
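
Roughly, the two masks involved might be built like this, assuming a hypothetical 'forest' polygon object (e.g. SpatialPolygons) and the polys/ch objects from the earlier sketch:

    library(secr)

    ## mask clipped to forest within a buffer of the searched polygons;
    ## density then refers to forest near the searched areas
    msk.forest <- make.mask(polys, buffer = 4000, type = "trapbuffer",
                            poly = forest)
    fit <- secr.fit(ch, mask = msk.forest, trace = FALSE)

    ## extrapolate to all forested habitat in the greater study area,
    ## assuming density in unsearched fragments is similar
    bigmask <- make.mask(polys, buffer = 20000, type = "trapbuffer",
                         poly = forest)
    region.N(fit, region = bigmask)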

Casey, I realize that this may not be very useful to you if you really have no data describing areas that you searched but found no samples, or have no information about areas/habitat types that foxes use vs those they don't.  Maybe it will help to see how we tried to cope with a somewhat similar situation ("opportunistic" fecal sampling with no a priori survey design), using the data available.

Our manuscript has not been submitted yet, but there's potential for reviewers to poke holes in how we used our data to fit an SECR model when really samples were not collected following a specific design required by the models.  I'd be grateful to hear any thoughts or comments about the approach we took, or about the general feasibility of "mashing" data from "opportunistically" or "haphazardly" collected scat into an SECR model.

All the best,
Eric

Eric H

Apr 7, 2015, 12:53:29 PM
to secr...@googlegroups.com
Following up on the fact that Chris included a link to my previous post (Expected and Realized N differ): when we modeled the variation in search effort across search area polygons using the gridded data and the methods of Efford, Borchers, and Mowat (2013), instead of as an effect of community on g0, expected and realized N were virtually identical.
Eric

Murray Efford

Apr 7, 2015, 1:21:38 PM
to Eric H, secr...@googlegroups.com
Great stuff, Eric. I don't have your knowledge based on detailed examination of a real example, but I'll emphasise a couple of points from your main post. Your third issue relates to an Achilles heel of the SECR approach when the real concern is total number: what you get depends very much on the extent of suitable habitat and how you have sampled this. I don't rule out that non-spatial CR may be better in some cases.

And re-stating the obvious: if you (Casey) are going to make an arbitrary decision in the analysis to make up for weak data collection, then you really need to present a spectrum of analyses to show how the decision impacts (or ideally does not impact) the results. Appendices and online supplements were made for this!

To avoid confusion: the relevant attribute in secr is called 'usage' not 'effort'. Also, 'mash' is a technical term in secr with a meaning different to Eric's 'mashing' ;-)  I ran out of words one day...

Murray
