Hello,
We are analyzing a multi-species marine bird dataset, where data are collected in 4 seasons over 3 years. We are using dht2 to estimate abundance of a subset of the observed species in each season. Our data are formatted as a flatfile, and segments without detections are represented by rows that include Sample.Label and Effort, with NA for Species, size and distance.
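To make the layout concrete, here is a minimal sketch of what such a flatfile might look like. The column names Sample.Label, Effort, Year, Species, size and distance follow the description above; the segment labels, the species code "MAMU" and all values are made up for illustration and are not taken from our data.

```r
# Hypothetical flatfile: one row per detection, plus one row for each
# surveyed segment that had no detections (NA for Species, size, distance).
qrt_example <- data.frame(
  Sample.Label = c("seg01", "seg01", "seg02", "seg03"),
  Effort       = c(2.5, 2.5, 3.0, 1.8),   # made-up effort values
  Year         = c(2021, 2021, 2021, 2020),
  Species      = c("MAMU", "MAMU", NA, NA), # "MAMU" is an assumed code
  size         = c(2, 1, NA, NA),
  distance     = c(35, 80, NA, NA)
)

# Segments that were surveyed but produced no detections
empty_segments <- qrt_example[is.na(qrt_example$distance), ]
print(empty_segments$Sample.Label)
```

The rows with NA in distance still carry Sample.Label and Effort, so the survey effort for those segments is retained.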
This is the call to dht2:
qrt_dht2 <- dht2(ddf = df,                 # detection function using data for all seasons
                 flatfile = qrt_data,      # filtered for species and season
                 strat_formula = ~Year,
                 stratification = "replicate",
                 convert_units = conversion.factor,
                 er_est = er_est_choice,   # S2
                 sample_fraction = 0.5,
                 innes = FALSE)
We are not having problems with common species. Our issue occurs when trying to estimate abundance per season for less common species, specifically when that species has zero detections in one of the years. For example, in the “FebMar” season, we did not have any observations for Marbled Murrelets in 2020, with 34 observations in 2021, and 42 in 2022. We recognize that these sample sizes are low.
This is the error we are getting:
Error in `$<-`:
! Assigned data `diag(dm$variance)` must be compatible with existing data.
✖ Existing data has 3 rows.
✖ Assigned data has 2 rows.
ℹ Only vectors of size 1 are recycled.
Caused by error in `vectbl_recycle_rhs_rows()`:
! Can't recycle input of size 2 to size 3.
Backtrace:
1. Distance::dht2(...)
5. Distance::dht2(...)
6. base::lapply(ddf, varNhat, data = res)
7. Distance (local) FUN(X[[i]], ...)
9. tibble:::`$<-.tbl_df`(`*tmp*`, "df_var", value = `<dbl>`)
10. tibble:::tbl_subassign(...)
11. tibble:::vectbl_recycle_rhs_rows(value, fast_nrow(xo), i_arg = NULL, value_arg, call)
It seems that dht2 is expecting 3 replicate years (2020, 2021, 2022), but there are no observations in 2020. There are still rows in the dataset for 2020, representing the segments (samples) surveyed.
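As a quick way to confirm this pattern (a year with surveyed segments but zero detections), a tabulation along the following lines could be used. This is only a sketch in base R: qrt_check is a hypothetical stand-in for the filtered flatfile, and its labels and values are invented.

```r
# Hypothetical stand-in for the filtered flatfile (one species, one season)
qrt_check <- data.frame(
  Sample.Label = c("a1", "a2", "b1", "b1", "b2", "c1", "c1"),
  Year         = c(2020, 2020, 2021, 2021, 2021, 2022, 2022),
  distance     = c(NA, NA, 40, 65, NA, 20, 90)
)

# Number of distinct segments surveyed in each year
segments_per_year <- tapply(qrt_check$Sample.Label, qrt_check$Year,
                            function(x) length(unique(x)))

# Number of detections (rows with a non-missing distance) in each year
detections_per_year <- tapply(!is.na(qrt_check$distance), qrt_check$Year, sum)

year_summary <- data.frame(
  Year       = names(segments_per_year),
  segments   = as.vector(segments_per_year),
  detections = as.vector(detections_per_year)
)
print(year_summary)

# Years that were surveyed but have no detections
years_without_detections <- year_summary$Year[year_summary$detections == 0]
print(years_without_detections)
```

In this toy example 2020 has segments but no detections, which mirrors the situation described for the FebMar season.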
An important component of this issue is that we successfully ran dht2 on this same data in early 2022, when only 2 replicate years were completed (2020 and 2021). At that time, dht2 was able to compute average abundance over the 2 years in that season (with high uncertainty) without throwing an error. It therefore seems that something changed with the update to the Distance package in late 2022.
Thanks in advance for your input. We are happy to share the data off-list for troubleshooting if needed!
- Shanti
Using dht2 to perform estimation with a portion of the data used in fitting the detection function does not seem correct to me. Nevertheless, you are the second person to describe this use case. I've re-read the documentation for dht2 and I do not see this use case described. In my use of dht2, the same data frame that was sent to ds is the data frame specified by the flatfile argument in dht2. When the same data frame is not used in both cases, that gives rise to the errors you describe.

[...] season as a covariate in the detection function model. Duplicate the season field as the Region.Label field before sending the data frame to ds; this way, the ds output will provide you with season-specific estimates (without the use of dht2). If you want an average abundance across seasons, do that bit manually. The average is a weighted average, using season-specific effort as the weighting factor. Variability between surveys is calculated per formula below:
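Before the variance calculation, the two manual steps described above (duplicating season as Region.Label, then taking an effort-weighted average) might be sketched roughly as follows. This covers only the point estimate, not the between-survey variability. The data frames, the season names other than "FebMar", and the Nhat values are all invented for illustration; in practice the season-specific estimates would come from the ds output.

```r
# Hypothetical survey rows; only "FebMar" is a season name from this thread
survey <- data.frame(
  season = c("FebMar", "FebMar", "MayJun", "AugSep"),
  Effort = c(10, 15, 20, 5)
)

# Step 1: duplicate the season field as Region.Label before calling ds
survey$Region.Label <- survey$season

# Step 2: season-specific abundance estimates (made-up numbers standing in
# for the values ds would report per Region.Label)
season_est <- data.frame(
  Region.Label = c("FebMar", "MayJun", "AugSep"),
  Nhat         = c(120, 80, 200)
)

# Total effort in each season, matched to the estimates by name
effort_by_season <- tapply(survey$Effort, survey$Region.Label, sum)
season_est$effort <- as.vector(effort_by_season[season_est$Region.Label])

# Effort-weighted average abundance across seasons
avg_abundance <- weighted.mean(season_est$Nhat, w = season_est$effort)
print(avg_abundance)
```

Seasons with more effort pull the average toward their estimate, which is the intent of using effort as the weighting factor.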