Hi Casey et al.
I was happy to see this post, as we have been dealing with a similar situation. Scats were collected opportunistically throughout a large (> 600 sq km) study area. Searching for scats was the main focus of field work, but a systematic survey design was not established prior to collecting the data. GPS track logs were not used to delineate areas searched; however, records were kept describing whether, and how many times, any part of each 1 km x 1 km cell in a grid covering the entire study area was visited while searching for scat. The majority of cells were never searched, and there were many cells that were searched but where no scat was found. We defined post hoc search area polygons (polygon detector type, with detections modeled as counts during a single survey rather than as binary data within sampling occasions) as the perimeters of single searched grid cells (where no adjacent cell was searched), and as the perimeters of aggregations of cells where two or more adjacent cells were each searched at least once. This is different from Murray's suggestion of treating the centroid of each searched cell as a proximity detector, but might result in a very similar detection model if cells were sized appropriately (?). In short, because field staff recorded the cells that they searched within on each day that they searched, we were able to describe (approximately) the spatial distribution of search effort across the greater study area in the model. Might there be any way you can recover similar information?
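As an illustration only (written in Python, not taken from the analysis described here), the aggregation step can be sketched as a connected-components search over the searched cells. The sketch assumes that "adjacent" means cells sharing an edge; the cell indices and the function name group_searched_cells are hypothetical.

```python
from collections import deque

def group_searched_cells(searched):
    """Group searched grid cells into aggregations of edge-adjacent cells.

    `searched` is a set of (col, row) integer indices of grid cells that
    were visited at least once. Returns a list of sets, one per candidate
    search-area polygon. Edge adjacency is an assumption of this sketch.
    """
    remaining = set(searched)
    groups = []
    while remaining:
        start = remaining.pop()
        group = {start}
        queue = deque([start])
        while queue:
            col, row = queue.popleft()
            neighbours = ((col + 1, row), (col - 1, row),
                          (col, row + 1), (col, row - 1))
            for nb in neighbours:
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    queue.append(nb)
        groups.append(group)
    return groups

# Hypothetical searched cells: one three-cell cluster, one isolated cell,
# and one two-cell cluster.
searched = {(0, 0), (1, 0), (1, 1), (5, 5), (9, 2), (9, 3)}
polygons = sorted(group_searched_cells(searched), key=lambda g: sorted(g))
for cells in polygons:
    print(sorted(cells))
```

The perimeter of each group would then be traced to give the vertices of one polygon detector; that step is not shown.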
We also looked at merged buffers around individual samples collected, and at MCPs around all collected samples and around clusters of them, as ways to approximate the area searched in an SECR model with polygon detectors. Buffered areas were smaller than the search area polygons defined as described above because they excluded areas that were searched but where no scat was found. MCPs were generally larger, because they included large areas between sample collection locations that were never actually searched. We suspect that using a buffer or MCP around locations where samples were actually collected as the search area polygons could have introduced bias: in the first case (buffers, similar to your initial approach I think) because it would appear that samples were found everywhere they were searched for, and in the second case (MCPs) because it would appear as though large areas that were searched yielded no detections when in fact they were never searched. We therefore preferred to use the perimeter of searched grid cells to define the search area polygons, even though this is only an approximation because entire cells were not necessarily searched. We were fortunate to have the gridded data describing search effort.
Casey, if you go with the polygon detector type, it might be useful to try to discriminate between areas where you might have found scat but didn't and those where you couldn't have found scat even if it was there. Include all areas where you might have found scat if it was there (because you were walking and not driving? or because you were walking and not too busy to notice and collect scat?) as (part of) a search area polygon, and exclude other areas from search area polygons. This would ensure your polygon detectors allow for the fact that some areas yielded no samples even though you "searched" there, whereas others yielded no samples because you didn't search there.
A second issue we had to deal with was the fact that (parts of) some cells were searched a large number of times, whereas other cells were searched rarely or only once. We did have the data describing the number of times each cell was visited, so we were able to model variation in search effort among search area polygons using the method described in Efford, Borchers and Mowat (2013) and implemented in the secr package through the 'usage' attribute of the traps object, which holds the effort data. We defined effort as the number of times a cell was visited for single-cell polygons, and as the mean number of times each cell in multi-cell polygons was visited. We obtained lower AICc values and different density estimates when we modeled the variation in search effort across search area polygons than when we ignored it, and will present only estimates where we modeled the variation.
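The effort definition above amounts to a per-polygon mean of cell visit counts (which for a single-cell polygon is just that cell's count). A minimal sketch in Python, with hypothetical visit counts, polygon names and function name, and making no secr calls:

```python
def polygon_effort(cells, visits):
    """Effort for one search-area polygon.

    Mean number of visits over the polygon's cells; for a single-cell
    polygon this equals that cell's visit count.
    """
    counts = [visits[cell] for cell in cells]
    return sum(counts) / len(counts)

# Hypothetical visit counts per searched cell, keyed by (col, row).
visits = {(0, 0): 6, (1, 0): 2, (1, 1): 1, (5, 5): 1}

# Hypothetical polygons: "A" aggregates three adjacent cells, "B" is one cell.
polygons = {"A": [(0, 0), (1, 0), (1, 1)], "B": [(5, 5)]}

effort = {name: polygon_effort(cells, visits)
          for name, cells in polygons.items()}
print(effort)
```

Values like these, one per polygon, are what would then be supplied to the model as relative search effort.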
A third issue was severe habitat heterogeneity. Forests comprised about 17% of the greater study area, distributed in fairly small fragments throughout it. Animals relied on the forest fragments and all samples were collected in forests. Therefore, (1) although we defined areas searched as (aggregations of) square 1 km x 1 km grid cells, only the forest fragments within these cells would have been searched; and (2) if areas searched contained a higher proportion of forest than areas not searched, we would overestimate population size on the greater study area if we ignored habitat heterogeneity and extrapolated densities estimated from the areas searched across the greater study area. We therefore used spatial data describing forest cover to build the habitat masks used for density and population size estimation, so density estimates were specific to forested areas within and in the vicinity of the areas searched. We assumed that density in forest fragments outside the areas searched was similar, and extrapolated across forested habitat within the greater study area to estimate population size (using the region.N function). Because it was of interest for comparing with results of other studies, we also estimated density "across the fragmented landscape", i.e. assuming homogeneous habitat, using a mask with uniform point spacing throughout. The associated density estimate was, as expected, much lower. For my own interest, I estimated population size assuming habitat homogeneity, and the estimate was considerably higher than where we used the forest cover data to define areas where activity centers could occur. I suspect this was because areas searched included a higher proportion of forest than the greater study area on average.
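The direction of these differences can be illustrated with back-of-envelope arithmetic. The sketch below is a deliberate simplification, not an SECR fit: it assumes animals occur only in forest at a constant density, and that a homogeneous-mask density is roughly the forest density diluted by the forest fraction near the searched areas. Only the study area (taken as 600 sq km) and the 17% forest cover come from the description above; the forest density and the forest fraction in searched areas are hypothetical.

```python
study_area_km2 = 600.0        # greater study area (described above as > 600 sq km)
forest_frac_region = 0.17     # forest share of the greater study area (from above)
forest_frac_searched = 0.40   # HYPOTHETICAL forest share near the areas searched
density_forest = 2.0          # HYPOTHETICAL animals per sq km of forest

def extrapolated_n(density, area_km2):
    """Population size obtained by extrapolating a density over an area."""
    return density * area_km2

# Forest-only mask: forest density extrapolated over forested habitat only.
forest_area_km2 = forest_frac_region * study_area_km2
n_forest_mask = extrapolated_n(density_forest, forest_area_km2)

# Homogeneous mask: density is diluted by non-forest near the searched areas,
# then extrapolated over the whole study area.
density_homogeneous = density_forest * forest_frac_searched
n_homogeneous_mask = extrapolated_n(density_homogeneous, study_area_km2)

# Under these assumptions the inflation equals the ratio of forest fractions.
inflation = n_homogeneous_mask / n_forest_mask

print(f"density, homogeneous mask: {density_homogeneous:.2f} per sq km")
print(f"N, forest mask:            {n_forest_mask:.0f}")
print(f"N, homogeneous mask:       {n_homogeneous_mask:.0f}")
print(f"inflation factor:          {inflation:.2f}")
```

Under these assumptions the homogeneous-mask density is lower than the forest density, yet the extrapolated population size is higher, by the ratio of the forest fraction in searched areas to that of the whole study area.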
These differences may not be relevant to your study if you sampled within a large expanse of contiguous habitat, but it's always a good idea to carefully consider the area across which you extrapolate your density estimates if you estimate population size using "region.N". Areas where activity centers could not occur (e.g. large bodies of water, cleared, built-up or urban areas) should usually be excluded from masks used for population size estimation, for example if you only searched for scat in areas used by foxes, but the greater study area includes areas not used by them.
Casey, I realize that this may not be very useful to you if you really have no data describing areas that you searched but found no samples, or have no information about areas/habitat types that foxes use vs those they don't. Maybe it will help to see how we tried to cope with a somewhat similar situation ("opportunistic" fecal sampling with no a priori survey design), using the data available.
Our manuscript has not been submitted yet, and there's potential for reviewers to poke holes in how we used our data to fit an SECR model when the samples were not really collected following a specific design required by the models. I'd be grateful to hear any thoughts or comments about the approach we took, or about the general feasibility of "mashing" data from "opportunistically" or "haphazardly" collected scat into an SECR model.
All the best,
Eric