Errors in confidence intervals when generating plots in Unmarked using top models and the predict function

89 views
Skip to first unread message

Sarah Grimes

unread,
Jul 22, 2024, 2:10:18 PM (7 days ago) Jul 22
to unmarked
Hi everyone,
I am creating single season single species occupancy models and have ran into problems when I try to create plots with occupancy probability on the y-axis and a covariate of interest that is contained in my model on the x-axis. I had no issues graphing detection covariates, but when I moved to occupancy covariates whenever I create a plot (using the code below) the lower bound confidence interval is essentially 0, and the upper bound confidence interval is essentially 1.

For context, my top model has date for the detection covariate, and habitat type (a categorical variable with 6 levels), number of fires, high development percentage, and open water percentage for the occupancy covariates.

Code:
# Calculate the mean Julian date for detection covariates (ignoring NA values)
mean_date <- 140.1

# Generate a sequence of TotalFires values from 0 to 25 (example range)
total_fires <- seq(0, 12, length.out = 100)

# Calculate the mean values for other covariates (ignoring NA values)
mean_HighDevelopment_800m <- mean(data1$HighDevelopment_800m, na.rm = TRUE)
mean_OpenWater_800m <- mean(data1$OpenWater_800m, na.rm = TRUE)

# Get the unique levels of HabitatType
habitat_levels <- unique(data1$HabitatType)

# Initialize storage for predictions
all_predictions <- data.frame(
  TotalFires = total_fires,
  Predicted = rep(0, length(total_fires)),
  lower = rep(0, length(total_fires)),
  upper = rep(0, length(total_fires))
)

# Loop over each habitat type to generate predictions and accumulate them
for (habitat in habitat_levels) {
  new_data <- data.frame(
    date = mean(mean_date),  # Using the overall mean of mean_date
    HabitatType = factor(rep(habitat, length(total_fires)), levels = habitat_levels),
    HighDevelopment_800m = mean_HighDevelopment_800m,
    OpenWater_800m = mean_OpenWater_800m,
    TotalFires = total_fires
  )
 
  pred_prob <- predict(bestmodel, type = "state", newdata = new_data)
 
  all_predictions$Predicted <- all_predictions$Predicted + pred_prob$Predicted
  all_predictions$lower <- all_predictions$lower + pred_prob$lower
  all_predictions$upper <- all_predictions$upper + pred_prob$upper
}

# Average the predictions across all habitat types
all_predictions$Predicted <- all_predictions$Predicted / length(habitat_levels)
all_predictions$lower <- all_predictions$lower / length(habitat_levels)
all_predictions$upper <- all_predictions$upper / length(habitat_levels)

# Plot the predicted occupancy probabilities against TotalFires
plot(total_fires, all_predictions$Predicted, type = "l",
     xlim = c(0, 12), ylim = c(0, 1),
     xlab = "Total Fires", ylab = "Occupancy Probability")

# Add confidence interval lines
lines(total_fires, all_predictions$lower, lty = 2, lwd = 1)
lines(total_fires, all_predictions$upper, lty = 2, lwd = 1)



This is my first time doing anything in R so I am definitely a beginner and have been relying on many tutorials to get to the point, so any help is greatly appreciated. Thank you!

Jeffrey Royle

unread,
Jul 22, 2024, 2:28:10 PM (7 days ago) Jul 22
to unma...@googlegroups.com
hi Sarah,
 Can you show the output for your top model?  And also what is the naive occupancy rate for your data set and also the number of sites? 
regards
andy


--
*** Three hierarchical modeling email lists ***
(1) unmarked (this list): for questions specific to the R package unmarked
(2) SCR: for design and Bayesian or non-bayesian analysis of spatial capture-recapture
(3) HMecology: for everything else, especially material covered in the books by Royle & Dorazio (2008), Kéry & Schaub (2012), Kéry & Royle (2016, 2021) and Schaub & Kéry (2022)
---
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/unmarked/469a484d-2866-4c62-84dc-141f13c563fen%40googlegroups.com.

Sarah Grimes

unread,
Jul 22, 2024, 2:36:53 PM (7 days ago) Jul 22
to unmarked
Hi,
The naive occupancy rate for my dataset was 0.89 and I had 93 sites with 12 replicates at each site. Below is a screenshot of my top model. Thanks!
Screenshot 2024-07-22 143610.png

Jeffrey Royle

unread,
Jul 22, 2024, 4:52:03 PM (6 days ago) Jul 22
to unma...@googlegroups.com
hi Sarah,
 I think you have a lot of occupancy parameters for only 93 sites -- too many. Furthermore, for most of the habitat classes you have estimated occupancy of 1.0 although those are not well estimated (extremely high values and high SEs).  The MLEs are very unstable, and I would recommend getting rid of the habitat classes or else combine them into 2 classes only.  At any rate, it makes sense that the SE for psi-hat is extremely large, since this is just a function of the SEs for the parameters themselves. So you have to simplify the model, or collect more data.
regards
andy


Sarah Grimes

unread,
Jul 23, 2024, 11:01:27 AM (6 days ago) Jul 23
to unmarked
Hi again,
Thanks so much for your help, I really appreciate it! I did as you suggested and simplified the model by removing habitat type as a covariate. After going through the same model selection procedure based on AICc score, I was left with my top model including date for detection, and elevation, %forest, and total fires for occupancy. I've attached a picture of the summary below. This model estimates occupancy at 0.89, which is almost identical to my naive occupancy. When I use the predict function, the CI's are still almost always spanning 0-1, and the model almost always estimates occupancy=1 when graphing. Is this an issue with the model fitting, my dataset, or how I am predicting/graphing? I had 93 sites with 12 replicates each. model.png

Ken Kellner

unread,
Jul 23, 2024, 9:39:30 PM (5 days ago) Jul 23
to unmarked
Hi Sarah,

It's hard to tell just from looking at the summary what the problem might be. If you email me your code/data separately I will take a look.

Ken
Reply all
Reply to author
Forward
0 new messages