Error in covariate modeling functions in R package Distance


Arrow Myers

Dec 10, 2024, 1:50:00 AM
to distance-sampling
Hi all,
I am an undergraduate student working on a project measuring species-specific relationships between bird populations and bark-beetle-caused tree mortality in a wilderness area of Colorado. I have a fairly large dataset of 160 points at which distance-sampled point counts were conducted this past summer, along with several field-collected and remotely sensed covariates corresponding to each point (and each observation in the dataset).

So far I have had generally good success working through the analysis with the Distance package in R, but I keep encountering the following error when I use the "formula=" argument on data subsetted by species to model relationships with the covariates:

"Error in checkdata(data, region_table, sample_table, obs_table, formula) : Variable(s): obsRCKI are in the model formula but not in the data."

This is just one example; I get the same error for all other covariates being modeled (ndviclass, ndvi, temp, wind, elevation, timecategory, mortalityseverity, etc.) and for all other species-subsetted datasets. I do not, however, get the error with the non-subsetted dataset (see screenshot) or when the analysis is run on the subsetted data without any covariates (see screenshot). I have ensured there are no missing values anywhere in the subsetted datasets, and I have checked the data types so that numeric values are in fact numeric (as.numeric function) and character values are read as such (as.character function).

I have also ensured that I am using the most current versions of R, RStudio, and the Distance package for the analysis. I have included screenshots of my dataset, my code for subsetting the data, and my subsequent attempts at analysis at the species level. To rule out errors in subsetting, I also tried exporting the subsetted data to a CSV, checking it over in Excel, and re-importing it so that the code ran on the newly imported CSV, with the same error. Potentially this is an easy fix, as I am only in my first few months of using the Distance package, but I have spent substantial time attempting to resolve it with no luck so far. Any help or suggestions would be greatly appreciated!

Thanks so much,
Arrow Myers
Western Colorado University
NoCovariateHETHModels.png
HETHCovariateModelsERROR.png
HETHDataFrame.png
SubsettingHETH.png
NamingCovariatesHETHModels.png
NamingCovariatesAllSpeciesModels.png
AllSpeciesModels.png

Eric Rexstad

Dec 10, 2024, 2:57:50 AM
to Arrow Myers, distance-sampling
Morning Arrow

Welcome to the list and thanks for your question and supporting screen shots.

I think the solution to your error is simple as I was able to duplicate your error with a "tame" dataset provided with the Distance package:

> library(Distance)
> data("ETP_Dolphin")
> cu <- convert_units("nautical mile", "nautical mile", "square nautical mile")
> hr <- ds(ETP_Dolphin, key="hr", convert_units = cu)
Starting AIC adjustment term selection.
Fitting hazard-rate key function
AIC= 3365.915
Fitting hazard-rate key function with cosine(2) adjustments
AIC= 3367.912
Hazard-rate key function selected.
> names(ETP_Dolphin)
 [1] "Region.Label"  "Area"          "Sample.Label"  "Effort"
 [5] "object"        "distance"      "LnCluster"     "Month"
 [9] "Beauf.class"   "Cue.type"      "Search.method" "size"
[13] "Study.Area"
> fred <- ~ETP_Dolphin$Beauf.class
> fred
~ETP_Dolphin$Beauf.class
> class(fred)
[1] "formula"
> unusual <- ds(ETP_Dolphin, key="hr", convert_units = cu, formula=fred)
Error in checkdata(data, region_table, sample_table, obs_table, formula) :
> justfine <- ds(ETP_Dolphin, key="hr", convert_units = cu, formula=~Beauf.class)
Model contains covariate term(s): no adjustment terms will be included.
Fitting hazard-rate key function
AIC= 3367.768

The error is caused by your conversion of a field in the data frame to a formula. The "checkdata" function looks to see that the fields named in the "formula" argument are in the data frame passed to "ds". With your conversion of fields in the data frame to intermediate objects of type "formula" (as with "fred" above, which names the field as ETP_Dolphin$Beauf.class rather than Beauf.class), the "checkdata" check fails.
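
To make the name mismatch concrete, here is a rough illustration in base R. It is not the actual "checkdata" source, only a sketch of the kind of comparison described above, and the toy data frame and its values are invented for the example (the name obsRCKI is borrowed from the error message in the original post).

```r
# Toy data frame standing in for a species subset (values are made up)
obsRCKI <- data.frame(distance = c(12, 30, 45),
                      ndvi     = c(0.41, 0.55, 0.38))

# A formula that refers to the column through the data frame,
# and one that uses the bare column name
f_dollar <- ~obsRCKI$ndvi
f_bare   <- ~ndvi

# all.vars() lists every variable name a formula mentions
vars_dollar <- all.vars(f_dollar)
vars_bare   <- all.vars(f_bare)

# Names mentioned in the formula that are not columns of the data frame
missing_dollar <- setdiff(vars_dollar, names(obsRCKI))
missing_bare   <- setdiff(vars_bare, names(obsRCKI))

print(missing_dollar)
print(missing_bare)
```

In the first formula the data frame's own name counts as one of the variables, and it is of course not a column of itself, which is consistent with the name reported in the error message. The bare-name formula leaves nothing unmatched.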

The easy solution is demonstrated in my code above: simply use the tilde operator within the call to the "ds" function; then the "checkdata" examination passes. No need to create lots of intermediate variables; just pass the relevant covariates directly to "ds" using the tilde.
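
For species subsets, that pattern might look roughly like the sketch below. The data frame, species codes, and covariate values are all invented, and the "ds" call is left commented out because it needs the Distance package and real survey data; transect = "point" is my assumption based on the point-count design described in the question.

```r
# Toy stand-in for the full dataset; all names and values are invented
birds <- data.frame(
  species  = c("HETH", "RCKI", "HETH", "RCKI"),
  distance = c(12, 30, 45, 8),
  ndvi     = c(0.41, 0.55, 0.38, 0.60),
  stringsAsFactors = FALSE
)

# Subset by species; the covariate columns carry over unchanged
heth <- birds[birds$species == "HETH", ]

# A formula written with the bare column name matches a column of
# whichever data frame is passed alongside it
covar_formula <- ~ndvi
names_ok <- all(all.vars(covar_formula) %in% names(heth))
print(names_ok)

# With real data, the corresponding call would look like this (not run):
# heth_model <- ds(heth, transect = "point", key = "hr", formula = ~ndvi)
```

Because the formula names only the column, the same formula can be reused for each species subset without creating a separate intermediate object per species.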

On a personal note, I spent several cold winter weeks in the San Luis Valley back in the late 1980s banding overwintering mallards on the refuge.


Arrow Myers

Dec 12, 2024, 2:14:35 AM
to distance-sampling
Dr. Rexstad,

Thanks so much for looking into the issue. The fix appears to be working throughout my code now, with models being produced at the species level. I am looking forward to continuing to work through the analysis.

p.s. Very cool that you worked at the MVNWR! - I actually grew up in the San Luis Valley and know the refuge very well.

Thanks so much!
Arrow Myers
Western Colorado University
