One tMsPGOcc vs. Many msPGOcc: Model Choice and Sample Gaps

Alexandria Shockney

May 7, 2026, 4:51:01 PM
to spOccupancy and spAbundance users
Hi All,

I am interested in learning more about choosing between a single multi-season, non-spatial occupancy model (tMsPGOcc) and multiple single-season, non-spatial occupancy models (msPGOcc) whose results are then compared. In both cases, multiple species would be considered. I want to learn how species occupancy changes within my survey area by season. Of note, some species have uncertain seasonal presence across a year, so I want to be careful in how I interpret occupancy model results and better understand which method is most appropriate for my system. I tried to design the survey to increase certainty in population stability within a survey season by limiting each survey to ~4-6 weeks; however, there is still some inherent uncertainty about stable presence between seasons.

I sampled 8 “transects”, which were divided into multiple 1 km sites (n=201). These sites were sampled across 4 seasons (Summer, Fall, Winter, Spring) and two years (2024-2025, 2025-2026). Each season had 2 replicates per site in each year, totaling 4 replicates per season across the two years of sampling. However, there are some notable gaps where not all sites were sampled each season or year, because additional sites were added as our sampling capabilities grew.

8 Sampled Transects: 1A, 1B, 2A, 2B, 3A, 3B, 4A, and 4B. Each transect contains various quantities of sites (avg. ~25 sites per transect). Sampling gaps:

# 3A not sampled in Summer 2024
# 3B not sampled in Summer 2024, Winter 2025
# 4A/4B not sampled in Summer 2024, Fall 2024, Winter 2025, or Spring 2025 (Year 1)
# 2B only sampled for one replicate of Fall 2025

Given the data gaps, I feel compelled to analyze the data together in a single multi-species, multi-season, non-spatial occupancy model (tMsPGOcc) to increase the sample size and model performance. However, given the potential uncertainty in species presence across an annual cycle, I also feel the need to be cautious and fit 4 independent multi-species, single-season, non-spatial occupancy models (msPGOcc). Would fitting 4 independent msPGOcc models yield results similar to those of a tMsPGOcc model? What cautions should be taken when comparing seasons that were each modeled independently with msPGOcc? I saw a previous thread with a similar question about accounting for seasonal trends in occupancy, where the poster was encouraged to consider “year” and “site_year” as random effects in the model (e.g., "occ.formula = ~ (1 | year) + (1 | site_year) + ..."). Would that not be appropriate in my case because of potential fluctuations in species presence between seasons?

Lastly, does anyone have recommendations on managing an occupancy dataset with a notable number of missed repetitions? I had thought of treating "Season" as a set of repetitions (i.e., Summer = rep 1, 2; Fall = rep 3, 4; Winter = rep 5, 6; Spring = rep 7, 8) and condensing data across years, but I would appreciate any insight on whether there is a foundational issue with that approach in occupancy modeling.

I am quite new to occupancy modeling and am still trying to make sense of the best step forward to handle my dataset. Any help is much appreciated! If anyone needs any more information to help, please do not hesitate to ask!

With sincere gratitude,

Alexandria

Jeffrey Doser

May 19, 2026, 8:55:47 AM
to Alexandria Shockney, spOccupancy and spAbundance users
Hi Alexandria, 

Apologies for the delay. I'll try to give you some thoughts related to your different questions. 
  • If you fit an msPGOcc model separately for each season, all regression coefficients (i.e., intercepts and slopes) would be estimated using only data from that season. If you fit a full multi-season, multi-species model, all of those parameter estimates are informed by data from all seasons, and unless you specify otherwise, the effects of any covariates are assumed to be the same across seasons. So, how similar the results of multiple msPGOcc models are to those of a single tMsPGOcc model depends on what you include in the tMsPGOcc model. If you estimated all regression coefficients in tMsPGOcc as constant across years, your results will likely be a bit different from msPGOcc (unless all parameters are truly exactly the same across years). In general, fitting multiple msPGOcc models would require more bookkeeping, and I would generally recommend fitting a tMsPGOcc model in this case. 
  • Regarding the random effects, I would recommend including those same sorts of random effects in the tMsPGOcc model. The idea is that you want to account for dependencies in the data structure that arise from the repeated measures at each site (e.g., sampling a site across multiple seasons). If you include a fixed or random effect of year, this also would allow you to have a different intercept for each species. If you are worried about covariate effects being different across seasons, you could stratify the effect of the covariate so that each season has a different estimated regression coefficient (you can do this by including an interaction in the model between the covariate and something like factor(season)). Overall, fitting a single tMsPGOcc model gives you more data, avoids some of the problems that can arise when fitting multiple models, and leverages information from across the entire extent of sampling to help inform parameter estimates. 
  • If you do fit separate msPGOcc models and compare covariate effects, you'll have to keep in mind the values of the covariates that are sampled in each season. If different ranges of the covariates are sampled in each season (e.g., due to different sites being sampled), this could lead to differences in the covariate effects depending on how/if you standardized the covariates. 
  • In regards to having uneven data, the impacts of uneven sampling over time on parameter estimates are often negligible unless there are very strong patterns in the missingness. For example, if sites that weren't sampled also happened to be sites that in the given year had very high occupancy for some reason, this could potentially lead to bias. 
  • In terms of what is used as a "replicate", this all depends on the closure assumption and how it applies to the specific species you're working with. Violating the closure assumption is not completely detrimental, but it does influence how you interpret the results. See this paper for a bit of a discussion on that topic. 
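The season-specific covariate effects and site-level random intercept described in the bullets above could be set up roughly as follows. This is only a sketch under assumptions: the covariate names (`height`, `season`, `site.id`), the toy dimensions, and the numeric season coding are all hypothetical, and the model itself is not fit here. The `model.matrix()` call is plain base R, used only to show which fixed-effect coefficients the interaction produces.

```r
# Hypothetical toy setup: 3 sites x 8 primary time periods (season/year combos).
n.sites <- 3
n.periods <- 8

# Season code per site and period (1 = Summer, 2 = Fall, 3 = Winter, 4 = Spring),
# repeated for the two survey years. Stored as a sites x periods matrix.
season <- matrix(rep(1:4, times = 2), nrow = n.sites, ncol = n.periods,
                 byrow = TRUE)

# Hypothetical standardized site-level covariate and a site index that a
# random intercept can refer to.
height <- c(-1, 0, 1)
site.id <- seq_len(n.sites)

# Covariates collected in a named list (names are assumptions).
occ.covs <- list(height = height, season = season, site.id = site.id)

# Season-specific slopes via an interaction, plus a random site intercept
# to account for repeated measures at each site.
occ.formula <- ~ height * factor(season) + (1 | site.id)

# Base R view of the fixed-effect part: one height slope for the reference
# season plus one slope adjustment for each other season.
long <- data.frame(height = rep(height, times = n.periods),
                   season = as.vector(season))
X <- model.matrix(~ height * factor(season), data = long)
colnames(X)
```

With this coding the reference season is Summer, so the `height` coefficient is the Summer slope and each `height:factor(season)` term is the difference from that slope in another season.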
Hope that helps, 

Jeff

--
Jeffrey W. Doser, Ph.D.
Assistant Professor
Department of Forestry and Environmental Resources
North Carolina State University
Pronouns: he/him/his

Alexandria Shockney

May 20, 2026, 1:33:41 PM
to spOccupancy and spAbundance users

Hi Jeff,

Thank you for your response. Your explanation of the choice between multiple msPGOcc models and one tMsPGOcc model has helped a lot, along with the attached paper. I have proceeded with tMsPGOcc, and the model is performing much better than when I previously tried individual msPGOcc models. However, I have a final question about the 4D data structure of a tMsPGOcc model:

For my survey periods (n=8) between Summer 2024 and Spring 2026, each of the four seasons (Summer, Fall, Winter, Spring) was sampled a total of four times (twice per survey year). I am currently setting “Year 1” of sampling as Rep 1 and 2 for each respective season, and “Year 2” as Rep 3 and 4 (e.g., Summer 2024: survey 1 = rep 1, survey 2 = rep 2; Summer 2025: survey 1 = rep 3, survey 2 = rep 4, so Summer is given 4 reps). My intention is for all 4 reps in a given season to be collapsed, producing 4 time periods and hopefully permitting a comparison of seasonal occupancy (e.g., all summer surveys vs. all fall surveys). Is this how the tMsPGOcc function would process my data as I currently have it prepared [2 species, 201 sites, 4 time periods, 4 reps], or would I need to define the time periods to reflect the true 8 surveys for tMsPGOcc to accurately compute a multi-season occupancy model [2 species, 201 sites, 8 time periods, 2 reps]?
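The two layouts described above index the same visits in different ways. A minimal base R sketch, using simulated toy detections for a single species and 5 sites instead of 201, shows how a [sites, 4 seasons, 4 reps] array maps onto a [sites, 8 time periods, 2 reps] array. The period ordering (Summer, Fall, Winter, Spring of Year 1, then the same for Year 2) and the unsampled site are assumptions made for illustration; to my understanding, spOccupancy expects visits that did not happen to be stored as NA.

```r
# Toy dimensions (the thread has 201 sites; 5 keeps the example small).
n.sites <- 5
n.seasons <- 4   # Summer, Fall, Winter, Spring
n.years <- 2
n.reps <- 2      # visits per season within one year

set.seed(1)
# Layout A: sites x 4 seasons x 4 reps (reps 1-2 = Year 1, reps 3-4 = Year 2).
# Values are simulated 0/1 detections, not real data.
y.season <- array(rbinom(n.sites * n.seasons * n.years * n.reps, 1, 0.4),
                  dim = c(n.sites, n.seasons, n.years * n.reps))

# Layout B: sites x 8 primary periods x 2 reps.
y.period <- array(NA, dim = c(n.sites, n.seasons * n.years, n.reps))
for (yr in seq_len(n.years)) {
  for (s in seq_len(n.seasons)) {
    period <- (yr - 1) * n.seasons + s          # 1..8 in chronological order
    reps <- (yr - 1) * n.reps + seq_len(n.reps) # reps 1-2 or 3-4 in layout A
    y.period[, period, ] <- y.season[, s, reps]
  }
}

# Hypothetical gap: pretend site 5 sits on a transect added in Year 2,
# so all of its Year 1 visits are missing.
y.period[5, 1:4, ] <- NA

dim(y.period)
```

A multi-species version would add a leading species dimension, giving the [species, sites, time periods, reps] shape mentioned in the question.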

Lastly, do you have any recommendations on additional readings for better understanding how tMsPGOcc works behind the scenes aside from what is provided in the associated R Documentation?

Please let me know if any additional information about my model set up would assist in answering this question. I sincerely appreciate the guidance!

Kind regards,

Alexandria

Alexandria Shockney

May 27, 2026, 11:25:48 AM
to spOccupancy and spAbundance users
Hi Again!

I ran a multi-season, multi-species model for my two years of data while collapsing by season (n=4), where each season had 2 surveys per year ("Year 1" = reps 1 and 2 for each respective season, "Year 2" = reps 3 and 4 for each respective season). The community-level results struggled a bit (likely because there are only two species in the community), but the species-level results really excel! My MCMC convergence plots and Freeman-Tukey results look pretty good overall (Rhat ~1.01 and ESS >400). As such, my intention is to only discuss the species-level variables.

I then have a question regarding species-level result interpretation. In the attached plot, I have several variables: "Building Height" appears to have a positive impact on occupancy, "Season: Winter" appears to have a negative impact, and the interaction of "Building Height * Winter" appears to have a negative impact on occupancy. My inclination is to interpret these items individually as stated. 

(1) However, when interactions are involved (e.g., Building Height * Winter), are individual interpretations (e.g., of Building Height or Season: Winter) not possible? That is, can I not make inferences about "Building Height" and "Season: Winter" on their own because those results are also bound up in an interaction? 
(2) In the attached example, Building Height has a positive impact on occupancy, Season: Winter had a negative impact, and Building Height*Winter had a negative impact. Is this interpreted as Season:Winter (negative impact) more heavily driving the Building Height*Winter results (also a negative impact), or is the interpretation "In winter, the occupancy of this bat species is negatively impacted by building height"?

I appreciate any guidance/resource recommendations on result interpretation for the outputs of spOccupancy, thank you!

Best,
Alexandria

MCMC_BetaExample.png

Jeffrey Doser

May 29, 2026, 5:57:19 AM
to Alexandria Shockney, spOccupancy and spAbundance users
Hi Alexandria, 

Here are some thoughts below on your questions and some additional comments. 
  • From my understanding of your data, you have 8 primary time periods (i.e., combinations of season and year) and then 2 secondary time periods (i.e., replicates within a given season/year combination). If you are assuming the population is closed within a given season/year but can change from one season to the next, then your format should be [2 species, 201 sites, 8 time periods, 2 reps]. I don't think the other formulation makes sense. 
  • I would strongly recommend against using any multi-species occupancy model in spOccupancy with only 2 species. In these models, species-specific effects are treated as random effects, and so with only two species you would be estimating random effects variances with only two values, which is very difficult (if not impossible). Thus, I would switch to fitting a model with tPGOcc separately for each species. 
  • There is better documentation for tPGOcc than for tMsPGOcc, so you can take a look at this vignette for more details on that model.
  • The interpretation of interaction effects in a spOccupancy model is exactly analogous to the interpretation of interactions in any other standard modeling approach (e.g., regression). You can interpret the main effects, but you have to do so in the context of the interaction. For example, the positive main effect of building height means there is a positive effect of building height on occupancy when winter has a value of 0. The negative interaction means that the effect of building height will be smaller when the value of the winter variable is larger. I am not sure if winter is categorical or continuous, which also slightly affects how you would think about the interaction. All that to say, I would suggest looking into some general resources on how to interpret interactions in regression models, which will then help you interpret the results from your model. 
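The interaction arithmetic in the last bullet can be made concrete with a small base R sketch. The coefficient values below are invented for illustration and are not the estimates from the attached plot. Winter is assumed to be a 0/1 indicator, building height is assumed to be standardized, and coefficients are on the logit scale.

```r
# Hypothetical logit-scale coefficients (NOT values from the thread).
beta <- c(intercept = 0.5, height = 0.8, winter = -1.0, height.winter = -0.5)

# Effect (slope) of building height when winter = 0 and when winter = 1.
slope.other <- beta[["height"]]
slope.winter <- beta[["height"]] + beta[["height.winter"]]

# Occupancy probability for given height and winter values.
psi <- function(height, winter) {
  plogis(beta[["intercept"]] +
         beta[["height"]] * height +
         beta[["winter"]] * winter +
         beta[["height.winter"]] * height * winter)
}

# Compare occupancy across low, average, and high buildings by season type.
psi(height = c(-1, 0, 1), winter = 0)
psi(height = c(-1, 0, 1), winter = 1)

c(slope.other = slope.other, slope.winter = slope.winter)
```

With these made-up numbers the winter slope is still positive, only weaker, so a negative interaction alone does not imply that building height reduces occupancy in winter. That depends on whether the interaction outweighs the main effect. With a fitted model, the same sum could be computed draw by draw on the posterior samples of the two coefficients to obtain an interval for the winter slope.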
Hope that helps!

Jeff
