bootdht returns data.frame with 0 columns and 0 rows

Milou Groenenberg

Feb 20, 2023, 10:10:18 PM

to distance-sampling

Dear group

I am trying out the code to derive the sampling distribution of pairwise differences in estimated density using the bootstrap as in this vignette. Unfortunately, I seem to be having a problem with the “bootdht” function when applying it to my data and model (it works fine using the example datasets of the vignette). My data is for one species and one area across 4 different survey years. I have used the different years as Region.Label to obtain stratified results. I am using formula = ~size to accommodate the cluster size bias that was clearly observed for this species. My model:

BSDPPWS_hn_t81_size <- ds(data=BSDPPWS,
                          truncation=81,
                          key="hn",
                          adjustment=NULL,
                          convert_units=conversion.factor,
                          formula=~size)

The problem arises when using the bootdht function, for example:

est.bootn <- bootdht(model=BSDPPWS_hn_t81_size,
                     flatfile=BSDPPWS,
                     summary_fun=bootdht_Nhat_summarize,
                     convert_units=conversion.factor,
                     sample_fraction=1,
                     nboot=10, cores=1)

*Note: nboot was set low for trying out the code

est.bootn is then returned as a data.frame with 0 columns and 0 rows.

I don't have any problems obtaining the results from my model using the conventional summary() function.

I can see a difference between my model object and those of the example datasets: my model$dht is a list[3] (S3: dht) and includes individuals, as well as clusters and Expected.S. The example datasets only include individuals. Could something be going wrong there when calling the bootdht function?

Finally, for another dataset where estimation was derived via multiple calls to ds(), I would also like to compare density differences using bootstrapping. In the limitations section of the vignette it is stated “However, based upon the provided code, it should be clear how to produce replicate density estimates via bootdht() and then difference them with a single line of code.” Would this be achieved by merging the bootdht() results from the two different models into one outcome dataframe and then running the remaining code as is?

Many thanks in advance for your support!

Milou

Eric Rexstad

Feb 21, 2023, 5:29:32 AM

to Milou Groenenberg, distance-sampling

Milou

I'm trying to duplicate the output you report (data.frame with 0 columns and 0 rows). Thus far, I've not been able to produce that output when performing a bootstrap on a data set with multiple strata and group size (namely the data set ClusterExercise that ships with the Distance package).

To help pinpoint the issue, can I ask:

- What version of the Distance package are you using (version 1.0.7 is the current CRAN version)?
- Does your BSDPPWS data frame contain a column for area of the strata? If area is not provided, an abundance estimate cannot be produced (only a density estimate). Because you are harvesting the abundance estimates from the bootstrap via the bootdht_Nhat_summarize function, if those abundances do not exist, the code might return a data.frame with 0 columns and 0 rows. Recognize there is a companion summary function, bootdht_Dhat_summarize, for use when inference is based around density estimates.
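A quick way to check the second point before bootstrapping is sketched below. This is only an illustration in base R: the `flatfile` data frame and its values are made up, and the rule "Area must be present, non-missing and positive" is my assumption about what counts as a usable area, not something taken from the package source.

```r
# Hypothetical stand-in for a flatfile; values are invented for illustration.
flatfile <- data.frame(
  Region.Label = c("2016", "2016", "2018"),
  Area         = c(250, 250, 250),
  Sample.Label = c("T1", "T2", "T1"),
  Effort       = c(2.0, 1.5, 2.0),
  distance     = c(12, 40, NA),
  size         = c(3, 1, NA)
)

# Assumed rule: abundance is only available when every row has a usable Area.
has_area <- "Area" %in% names(flatfile) &&
  !anyNA(flatfile$Area) &&
  all(flatfile$Area > 0)

# Name of the summary function to pass to bootdht() as summary_fun.
summary_choice <- if (has_area) "bootdht_Nhat_summarize" else "bootdht_Dhat_summarize"
summary_choice
```

If `has_area` is FALSE, passing bootdht_Dhat_summarize as summary_fun is the safer choice.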

In answer to your second question, if you have separate bootstrap outputs for the comparison you wish to make, I think the remedy is only slightly more complicated than merging the two vectors of replicate estimates. From the merged data frame, compute the difference for each pair of estimates, make a histogram (and compute quantiles associated with the alpha level of interest). Finally you can compute the proportion of differences that are greater than or less than zero to produce your significance value. If that's not clear, we can follow up off line if you wish.
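The steps just described might look roughly like the sketch below. It is not output from bootdht(): the two sets of replicates are simulated with rnorm() purely as placeholders, and the names `boot_a`, `boot_b`, `rep_id` and `Dhat` are assumptions for illustration. In practice each data frame would hold the replicate density estimates from one of the separate bootstrap runs.

```r
set.seed(1)
nboot <- 999

# Placeholder replicate estimates (simulated, not real survey results).
boot_a <- data.frame(rep_id = seq_len(nboot), Dhat = rnorm(nboot, mean = 10, sd = 2))
boot_b <- data.frame(rep_id = seq_len(nboot), Dhat = rnorm(nboot, mean = 8, sd = 2))

# Merge the two sets of replicates, pairing them by replicate index.
merged <- merge(boot_a, boot_b, by = "rep_id", suffixes = c("_a", "_b"))

# Difference for each pair of estimates.
merged$diff <- merged$Dhat_a - merged$Dhat_b

# Histogram and quantiles for the chosen alpha level.
alpha <- 0.05
hist(merged$diff, main = "Bootstrap differences", xlab = "Dhat_a - Dhat_b")
ci <- quantile(merged$diff, probs = c(alpha / 2, 1 - alpha / 2))

# Proportion of differences above and below zero.
p_gt_zero <- mean(merged$diff > 0)
p_lt_zero <- mean(merged$diff < 0)
ci
```

Because the two bootstraps are run independently, the pairing by replicate index is arbitrary; both runs simply need the same number of usable replicates.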


Milou Groenenberg

Feb 21, 2023, 9:12:43 PM

to Eric Rexstad, distance-sampling

Hi Eric

Thanks for the quick response.

To answer your questions:

- I am using Distance_1.0.7 (and mrds_2.2.8). I recently updated using code that you posted in another email chain with Esperanza:

By updating the Distance package, I mean updating from GitHub, rather than updating from CRAN. For a GitHub update, use these two lines of code:

install.packages("remotes")
remotes::install_github("DistanceDevelopment/Distance")

- My data includes a column for Area (in sq km) - see attached data extract

I did notice that Area was of type integer rather than numeric. I changed this, and I still get the zero dataframe response.

Using the summary() function on the ds model object, I get abundance with CVs etc. just fine

Thank you for the answer to my second question. I can move ahead with this and if I have any further questions, I will email you off-group.

Best wishes

Milou

BSDPPWS_Extract.csv

Eric Rexstad

Mar 15, 2023, 4:33:40 AM

to Milou Groenenberg, distance-sampling

Milou/list

There has been a long journey to sort out this issue with bootstrapping with stratified surveys.

We have sorted most of the problems. Most recently we uncovered an exotic bug in the code (read https://github.com/DistanceDevelopment/Distance/issues/158 if you want all the details).

Issue #158 (DistanceDevelopment/Distance): Stratum names which come after "Total" alphabetically cause NA's in bootstrap dht output

The resulting advice is this. If intending to bootstrap, give names to your Region.Label that begin with letters that appear in the alphabet before T. We hope to get the bug sorted, but meanwhile follow this advice when naming your strata for analysis.
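One way to apply this advice to an existing flatfile is sketched below in base R. The data frame, the stratum names and the "R_" prefix are all made up for illustration; any prefix beginning with a letter before T would do.

```r
# Hypothetical flatfile; only Region.Label matters for this sketch.
flatfile <- data.frame(
  Region.Label = c("West", "East", "Upland", "East"),
  stringsAsFactors = FALSE
)

# Flag labels whose first letter is T or later in the alphabet.
first_letter <- toupper(substr(flatfile$Region.Label, 1, 1))
risky <- first_letter %in% LETTERS[20:26]  # LETTERS[20] is "T"

# Prefix the flagged labels with "R_" (arbitrary choice) so they start before T.
flatfile$Region.Label[risky] <- paste0("R_", flatfile$Region.Label[risky])

unique(flatfile$Region.Label)
```

The renaming presumably needs to happen before the model is fitted with ds(), so that the model object and the flatfile passed to bootdht() use the same stratum names.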

