Different data/obs_table for ds() and dht()

Catarina T. Fonseca

May 8, 2024, 8:52:39 AM
to distance-sampling

Hello,


I am using distance sampling to estimate the abundance of several cetacean species using a line transect survey dataset. My goal is to obtain abundance estimates of each species for three regions (North, Centre and South).

However, I have very few sightings, so I am mostly using pooled detection functions.


I pooled the following data:

- sightings from another survey that was carried out under almost the same conditions (same protocol, study area and boat; a few observers participated in both surveys)

- incidental and off-transect sightings (given that the effort off-transect was the same as when we were on-transect)

- species that are expected to have similar detectability (small dolphins, beaked whales,…)


I also tested multiple covariates that may affect detectability and included them only if they improved model fit:

- environmental factors (sea state, cloud cover,…)

- observer

- cluster size

- species

- region (N, C or S)


My current approach (example below) is to fit a detection function to the pooled dataset with ds() and then apply it with dht(), using an observation table that contains only the on-transect sightings of a single species.


df_hn <- ds(data=jointdata, key="hn", truncation = 1.1, adjustment=NULL, convert_units = conversion)

mb_trunc <- subset(mb, distance <= 1.1) # drop sightings beyond the truncation distance

N_df_hn_l <- dht(model=df_hn$ddf,
                 region.table=AreaDf,
                 sample.table=lifeEffDf,
                 obs.table=mb_trunc)


where jointdata is the pooled data of all sightings across surveys, regions and beaked whale species, and mb_trunc contains only the on-transect sightings of a single species at distances no greater than the truncation distance.
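
A minimal base R sketch of how such an observation table could be built from the pooled data. The toy data frame and the `species` and `on_transect` column names are hypothetical; the `object`, `Region.Label` and `Sample.Label` columns are the ones dht() expects in obs.table, with `object` linking each row back to a record in the data used to fit the detection function.

```r
# Toy stand-in for the pooled data; all values are made up for illustration
jointdata <- data.frame(
  object       = 1:6,
  species      = c("mb", "zc", "mb", "mb", "zc", "mb"),  # hypothetical column
  on_transect  = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE), # hypothetical column
  distance     = c(0.2, 0.5, 0.4, 1.3, 0.9, 0.8),
  Region.Label = c("C", "C", "N", "N", "N", "N"),
  Sample.Label = c("t1", "t1", "t2", "t2", "t3", "t3")
)

w <- 1.1  # truncation distance used when fitting the detection function

# Keep only on-transect sightings of the target species within truncation
keep <- with(jointdata, species == "mb" & on_transect & distance <= w)

# Observation table: one row per retained sighting, identified by `object`
obs_mb <- jointdata[keep, c("object", "Region.Label", "Sample.Label")]
print(obs_mb)
```

Because the subset is taken from the same data frame that would be passed to ds(), every `object` value in the observation table has a matching record in the fitted model.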


dht OUTPUT:

Abundance and density estimates from distance sampling
Variance       : R2, N/L

Summary statistics

  Region      Area CoveredArea   Effort n  k          ER       se.ER     cv.ER
1      C  7990.444   2357.1526 1071.433 2  5 0.001866659 0.001656072 0.8871851
2      N 13596.382   1991.9152  905.416 3  5 0.003313394 0.001328045 0.4008110
3      S  2800.754    427.0376  194.108 0  3 0.000000000 0.000000000 0.0000000
4  Total 24387.580   4776.1054 2170.957 5 13 0.002458858 0.000000000 0.0000000

Summary for clusters

Abundance:
  Region Estimate       se        cv       lcl       ucl       df
1      C 11.91980 10.62617 0.8914721  1.448883  98.06285 4.077857
2      N 36.00223 14.76858 0.4102128 12.500859 103.68572 4.388206
3      S  0.00000  0.00000 0.0000000  0.000000   0.00000 0.000000
4  Total 47.92203 18.37310 0.3833957 20.459492 112.24720 8.156874

Density:
  Region    Estimate           se        cv          lcl         ucl       df
1      C 0.001491756 0.0013298593 0.8914721 0.0001813269 0.012272516 4.077857
2      N 0.002647927 0.0010862137 0.4102128 0.0009194254 0.007625979 4.388206
3      S 0.000000000 0.0000000000 0.0000000 0.0000000000 0.000000000 0.000000
4  Total 0.001965018 0.0007533793 0.3833957 0.0008389308 0.004602638 8.156874

Summary for individuals

Abundance:
  Region  Estimate       se        cv       lcl      ucl       df
1      C  53.63909 47.81775 0.8914721  6.519971 441.2828 4.077857
2      N 144.00892 64.65848 0.4489894 45.333788 457.4639 4.320323
3      S   0.00000  0.00000 0.0000000  0.000000   0.0000 0.000000
4  Total 197.64801 81.14837 0.4105701 79.759109 489.7840 8.137832

Density:
  Region    Estimate          se        cv          lcl        ucl       df
1      C 0.006712904 0.005984367 0.8914721 0.0008159711 0.05522632 4.077857
2      N 0.010591709 0.004755565 0.4489894 0.0033342538 0.03364600 4.320323
3      S 0.000000000 0.000000000 0.0000000 0.0000000000 0.00000000 0.000000
4  Total 0.008104453 0.003327446 0.4105701 0.0032704806 0.02008334 8.137832

Expected cluster size
  Region Expected.S se.Expected.S cv.Expected.S
1      C   4.500000      0.000000     0.0000000
2      N   4.000000      1.054093     0.2635231
3      S   0.000000      0.000000     0.0000000
4  Total   4.124367      0.803449     0.1948054


However, I saw in other posts that, in your experience, the data used in ds() should be the same as the data used in dht2().

Therefore, my question is whether I can use this approach and, if not, what my alternatives are given my limited number of sightings.


Thank you in advance for your time!

Eric Rexstad

May 8, 2024, 11:01:35 AM
to Catarina T. Fonseca, distance-sampling
Catarina

Thanks for joining the list. You are faced with a challenging situation. From the output you have shared, you are hoping to make inference about the population size in three regions based on five detections of the "mb" species. Based solely upon information from five sightings, you are going to struggle to produce defensible estimates.

Carrying out the analysis you describe, the 95% confidence interval around the number of animals in the central region is (6, 441) and for the northern region the interval is (45, 457). The coefficients of variation (0.89 and 0.45, respectively) also tell you there is little information from those 5 sightings to help you estimate the number of "mb" in your study area.
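
As a rough illustration of where intervals this wide come from, the sketch below recomputes them from the point estimate, CV and df in the "Summary for individuals" table. It assumes a log-normal interval with a t quantile on the reported df, which is the usual form in distance sampling software; the function name `lognormal_ci` is hypothetical.

```r
# Log-normal confidence interval from a point estimate and its CV.
# Assumption: the t quantile uses the df reported in the dht output.
lognormal_ci <- function(estimate, cv, df, level = 0.95) {
  alpha <- 1 - level
  # Multiplicative half-width; the interval is (estimate / C, estimate * C)
  C <- exp(qt(1 - alpha / 2, df) * sqrt(log(1 + cv^2)))
  c(lcl = estimate / C, ucl = estimate * C)
}

# Individuals in regions C and N, values copied from the dht output
ci_C <- lognormal_ci(53.63909, 0.8914721, 4.077857)
ci_N <- lognormal_ci(144.00892, 0.4489894, 4.320323)
print(round(rbind(C = ci_C, N = ci_N), 1))
```

With a CV near 0.9 and about 4 degrees of freedom, the upper limit ends up roughly eight times the point estimate, which is why the interval for the central region spans nearly two orders of magnitude.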

On to the finer points of your question: you are correct that I have cautioned against using the dht2 function in the manner you have used dht. This is because other writers to the list have encountered errors when doing so, and I suspect there may be something in the depths of the dht2 code that is different from that in the dht code.

In summary, if you are intent on estimating abundance of "mb" from this survey, I suggest you present the confidence intervals along with the point estimates. Those intervals will indicate to the consumers of your report that there is extreme uncertainty regarding the number of individuals of that species in your study area.

Catarina T. Fonseca

May 9, 2024, 12:04:43 PM
to distance-sampling
Hello Eric,

First, thank you for your response and advice.

I am thinking that maybe I should be a little less "greedy" and simplify my analysis.
I could obtain estimates per group of species instead (e.g., small dolphins) by pooling the on-transect sightings of the same species I used for the detection functions. I have already tried this, and it significantly lowers the coefficients of variation. I could also still present the results per species, while stating that these are less reliable.

Another option I may test is to drop the stratification and produce a single estimate for the whole study area. However, I think that even with this approach I still won't have enough sightings to produce defensible estimates for some species.
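
To make the stratified and unstratified options concrete, here is a small base R sketch using the Area, Effort and n values from the summary statistics table in the first message. The variable names are hypothetical. The area-weighted combination reproduces the Total encounter rate in that table, while the pooled value (total n over total effort) is what treating the study area as a single stratum would give for the same five sightings.

```r
# Values copied from the "Summary statistics" table in the dht output
summ <- data.frame(
  Region = c("C", "N", "S"),
  Area   = c(7990.444, 13596.382, 2800.754),
  Effort = c(1071.433, 905.416, 194.108),
  n      = c(2, 3, 0)
)

# Per-region encounter rate: detections per unit of effort
summ$ER <- summ$n / summ$Effort

# Stratified total: regional rates weighted by region area
er_weighted <- sum(summ$Area * summ$ER) / sum(summ$Area)

# Unstratified: all detections over all effort
er_pooled <- sum(summ$n) / sum(summ$Effort)

print(summ)
print(c(weighted = er_weighted, pooled = er_pooled))
```

Either way the estimate rests on the same five detections, so dropping the stratification changes how they are combined but does not add information.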
