Correct use of the x,y smooth in a DSM and a soap film


Rachel Richardson

Aug 17, 2022, 1:06:10 PM
to distance-sampling

Thank you in advance for the excellent information you all share with this group! 

Brief background on my study: I have two years of line-transect data on a bird species with over 2,000 detections combined. My survey area includes two islands. I pooled data for both years and my top model included the covariates cluster size, observer, and habitat; therefore, I used ‘abundance.est’ in the DSMs because the covariates change within and between segments. 

I fit DSMs for each year separately because I want to get overall abundance estimates for each year and create two maps highlighting changes in bird distribution between years and on both islands. An example of a DSM for one of the years looks like this:

mod1 <- dsm(abundance.est ~ s(x, y) + s(Elevation) + s(Slope) + s(NDVI) +
              s(DisttoCoast) + s(RockTundra),
            hr.bestfit.both, segs, obs2003,
            family = tw(), select = TRUE, method = "REML")

Here’s where I am getting stuck. I am concerned that the x,y smooth is smoothing from the northern half of one island to the southern half of the other island (across the open water in between), but I am not sure whether I need to be concerned, because the islands are two separate polygons. I also tried using s(x,y, by=Island) in the DSM to fit a separate x,y smooth for each island.

I also tried using a soap film smoother, but choosing the internal knots for the soap film seems somewhat arbitrary, and I do not feel experienced enough with this approach to create the grid of knots correctly. Depending on the number and placement of the internal knots, the spatial patterns are vastly different from those of the other two approaches, and I am not sure how to choose knots that will capture the true distribution of the species. Using a soap film is appealing, but it changes the patterns in distribution (and the overall abundance estimates) quite a bit from my original DSM, making it difficult to know what is correct.

(1) Does my original DSM suffice for making predictions across both islands? (2) Should I use the ‘by=Island’ argument instead? (3) Is there a good recommendation for how to correctly choose the number of internal knots for a soap film smoother?    

Thank you for any advice you can provide!

P.S. I have extensively reviewed Dave Miller's available R code and resources for creating a soap film smooth and have properly applied it to my study area. I am most interested in hearing if anyone has figured out how best to choose a grid of knots (i.e., spacing between knots, knot distance from the border of the polygons, choosing knots based on number of detections, etc.).
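For concreteness, a regular knot grid of the kind asked about here might be built as in the sketch below. Everything in it is an assumption for illustration: the two rectangular "islands", the spacing and buffer values, and the helper names `in_bnd` and `make_knots` are made up. The buffer rule (a knot is kept only if it and four points offset by `buffer` all fall inside a polygon) is a simple heuristic, not a method taken from the dsm or mgcv documentation.

```r
library(mgcv)

# Two made-up rectangular "islands", each a closed loop
# (first vertex repeated at the end).
bnd <- list(
  list(x = c(0, 4, 4, 0, 0), y = c(0, 0, 3, 3, 0)),
  list(x = c(6, 9, 9, 6, 6), y = c(1, 1, 5, 5, 1))
)

# mgcv::inSide() matches the names of its x/y arguments against the
# names used in the boundary, so wrap it with arguments called x and y.
in_bnd <- function(bnd, x, y) inSide(bnd, x, y)

# Hypothetical helper: regular grid with the given spacing, keeping only
# knots that sit at least roughly `buffer` away from the polygon edges.
make_knots <- function(bnd, spacing, buffer) {
  xr <- range(unlist(lapply(bnd, `[[`, "x")))
  yr <- range(unlist(lapply(bnd, `[[`, "y")))
  g <- expand.grid(x = seq(xr[1] + spacing / 2, xr[2], by = spacing),
                   y = seq(yr[1] + spacing / 2, yr[2], by = spacing))
  keep <- in_bnd(bnd, g$x, g$y) &
    in_bnd(bnd, g$x - buffer, g$y) & in_bnd(bnd, g$x + buffer, g$y) &
    in_bnd(bnd, g$x, g$y - buffer) & in_bnd(bnd, g$x, g$y + buffer)
  g[keep, ]
}

knots_coarse <- make_knots(bnd, spacing = 1,   buffer = 0.3)
knots_fine   <- make_knots(bnd, spacing = 0.5, buffer = 0.3)
print(c(coarse = nrow(knots_coarse), fine = nrow(knots_fine)))
```

A data frame like `knots_fine` has the shape that mgcv's soap film smoother takes through the `knots` argument of `gam()`. Refitting with a few different spacings, while holding the buffer fixed, is one way to see how sensitive the fitted surface is to the grid.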

David Lawrence Miller

Aug 18, 2022, 4:08:22 AM
to Rachel Richardson, distance-sampling
Hi Rachel, hi listfolk,

Couple of things to think about here...

The main use of the soap film smoother will be to deal with "leakage":
when, due to the complex shape of the study area, density leaks from one
part to another. Depending on the shape of your islands you might be
worried about this happening within an island (if there is, say, a fjord
with different densities on either side) or between islands (as you
describe). Using by= would separate the smooths between islands but
would not deal with within-island effects (which you may or may not care
about depending on the shape of your islands).
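A minimal sketch of the by= approach, using mgcv directly on simulated data, is below. The data frame `segs_sim`, the island extents and the Poisson response are all assumptions for illustration; the same smooth terms would go into the dsm formula. Two mgcv details matter here: the by variable must be a factor, and because each per-level smooth is centred, the factor usually also needs to enter as a parametric term.

```r
library(mgcv)

set.seed(42)
n <- 300

# Simulated segment midpoints on two made-up islands.
segs_sim <- rbind(
  data.frame(Island = "A", x = runif(n, 0, 4), y = runif(n, 0, 3)),
  data.frame(Island = "B", x = runif(n, 6, 9), y = runif(n, 1, 5))
)
# by= only gives one smooth per level if the variable is a factor.
segs_sim$Island <- factor(segs_sim$Island)

# A different spatial trend on each island (illustrative only).
mu <- ifelse(segs_sim$Island == "A",
             exp(0.5 + 0.3 * segs_sim$x),
             exp(1.5 - 0.3 * segs_sim$y))
segs_sim$count <- rpois(nrow(segs_sim), mu)

# Island as a main effect plus one centred s(x, y) per island.
mod_by <- gam(count ~ Island + s(x, y, by = Island),
              data = segs_sim, family = poisson(), method = "REML")

smooth_labels <- rownames(summary(mod_by)$s.table)
print(smooth_labels)
```

Each island gets its own smooth and its own smoothing parameter, so nothing is shared across the water between them, but within an island the smooth still ignores the coastline.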

The soap film should be no different from other smooths when it comes to
selecting the number of knots; the big difference is (as you say) where
to put them. I think that using a regular grid (as described in my
examples) is as good a method as any. What you should find is that, as
you increase the density of the knots, you eventually reach a point
where the EDF is well below k (as you would want for a 1D smoother).
For other smooths we recommend doubling k each time and seeing what the
EDF is (see https://arxiv.org/abs/1602.06696). Since you're increasing
the number of knots in 2 dimensions (N-S and E-W, say), you probably
don't want to double both at the same time, as you'll soon run out of
degrees of freedom, but increasing them should still lead to a stable
EDF.
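The doubling check for an ordinary (non-soap) smooth can be sketched with mgcv alone. The simulated data, the Poisson family and the particular k values are assumptions for illustration; in a DSM the same loop would refit the model with a different k in the s(x,y) term each time.

```r
library(mgcv)

set.seed(1)
n <- 400

# Simulated counts with a smooth spatial pattern (illustrative only).
dat <- data.frame(x = runif(n), y = runif(n))
dat$count <- rpois(n, exp(1 + sin(2 * pi * dat$x) * cos(2 * pi * dat$y)))

# Refit with k doubled each time and record the EDF of the spatial smooth.
ks <- c(10, 20, 40, 80)
edf <- sapply(ks, function(k) {
  m <- gam(count ~ s(x, y, k = k), data = dat,
           family = poisson(), method = "REML")
  summary(m)$s.table[1, "edf"]
})

k_check <- data.frame(k = ks, edf = round(edf, 1))
print(k_check)
```

If the EDF keeps climbing in step with k, the basis is probably still too small; once it levels off well below k, adding more knots is unlikely to change the fit much.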

Back to your first question: it's hard to say whether your current model
is "good enough". I would recommend looking at things like the obs_exp
function in the dsm package to see how well things fit. You should also
use your biological knowledge to see if there are "hotspots" in places
where there shouldn't be. Other than that, normal GAM diagnostics should
also help.
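The idea behind an observed-versus-expected check can be shown by hand with mgcv on simulated data. This is a hand-rolled analogue, not the dsm function itself; the data frame `sim`, the Elevation bins and the Poisson family are assumptions for illustration.

```r
library(mgcv)

set.seed(7)
n <- 500

# Simulated segments with one covariate that drives the counts.
sim <- data.frame(x = runif(n), y = runif(n),
                  Elevation = runif(n, 0, 300))
sim$count <- rpois(n, exp(0.5 + 0.004 * sim$Elevation))

m <- gam(count ~ s(x, y) + s(Elevation), data = sim,
         family = poisson(), method = "REML")

# Sum observed counts and fitted values within bins of the covariate.
bins <- cut(sim$Elevation, breaks = c(0, 100, 200, 300),
            include.lowest = TRUE)
oe <- rbind(Observed = tapply(sim$count, bins, sum),
            Expected = tapply(fitted(m), bins, sum))
print(round(oe, 1))
```

Large gaps between the two rows in a particular bin suggest the model is missing structure there. The same comparison can be made across islands, years or observers by changing the grouping variable.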


Hope this is useful, feel free to contact me off-list if you require
further help and we can report back here.

cheers,
--dave

--
I am a member of the University and College Union and am currently
participating in industrial action to improve UK higher education
staff pension, pay, equality and working conditions. For more
information, please see
https://www.ucu.org.uk/article/11896/Why-were-taking-action.