Convergence issues with sfMsAbund and questions about spatial settings

Gabriel Andrade

Nov 13, 2025, 3:32:37 PM

to spOccupancy and spAbundance users

Hi all,

I’m working on a project modeling native bumblebee abundance, and I’m running into some convergence issues with sfMsAbund. I wanted to ask for advice from people with more experience fitting the spatial version of the model.

We’re modeling counts of five Bombus species with very unbalanced abundances (one species with >6000 captures and the rarest with only 13).
We have 196 sites, each operating an average of ~14 days. Sites were not active at the same time, and distances between them range from 0.17 m to >60 km.

Our goal is to predict species abundances using six environmental variables.

I first fit an msAbund model with a negative binomial distribution and included month as a random intercept. This model converged well and passed posterior predictive checks. After that, I fitted an lfMsAbund model with 2 latent factors, which had better predictive performance based on WAIC. Again, convergence looked good.

As a final step, I fit an sfMsAbund model to account for possible spatial dependence due to the sampling design. In this model, most parameters converge, but a few do not (e.g., kappa for one species has Rhat ~3.1).

I tried reordering species and inspecting single-chain diagnostics, but nothing seems to fix the issue.

Since every attempt takes several hours to run, I would really appreciate some guidance:

1. Is this a common issue when species abundances are extremely unbalanced (one dominant species plus several rare ones)? Should I consider removing the species with only 13 captures?
2. Could the covariance function affect convergence? I used the default (exponential), but is there any practical reason to prefer another function if I do not expect long-distance spatial correlation?
3. About the phi prior (phi.unif): if I don’t expect long-range spatial dependence, should I tighten the prior? For example, would something like this be appropriate?

   phi.unif = list(4 / max.dist, 3 / min.dist)

   Or am I misunderstanding how to set reasonable priors for the spatial decay parameter?
4. Any general tips for improving convergence in sfMsAbund? I’m still fairly new to spatial msAbund models, so any guidance or recommended readings would be very helpful.

I can share the full summary, code, or data structure if needed.
Thanks in advance for any help!!

Best,
Gabriel

Jeffrey Doser

Nov 19, 2025, 5:22:04 AM

to Gabriel Andrade, spOccupancy and spAbundance users

Hi Gabriel,

Thanks for the questions. The sfMsAbund model can be very difficult to get to converge with highly overdispersed data like yours. I'll try to provide some insights for each of your individual questions below:

Yes, this model in particular can struggle with extreme overdispersion (e.g., if most counts are in the single digits but there are a few very large counts). Species with too few detections can also cause problems. For these spatial models, the ability to get an estimate for rare species is tied more to the number of distinct spatial locations where the species was observed than to the total number of observations. Were the 13 observations for the rare species only at 1 or 2 sites? If so, you may find more success after removing that species from the model, but I would guess you'll still find problems due to the large amount of overdispersion. However, I will note that 5 species is a pretty small number to use with the factor modeling approach (particularly the spatial factor modeling approach), and I would hesitate to use these models for anything less than that. The spatial factor model is harder to estimate than the nonspatial factor model, so you may just not have enough information in your data set to reliably fit a spatial factor model.
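One quick way to answer the "1 or 2 sites?" question is to tabulate, for each species, the total count alongside the number of distinct sites with at least one capture. The sketch below uses a small made-up count array (species x sites x replicates); the object name `y`, the species labels, and all values are hypothetical and only illustrate the bookkeeping.

```r
# Hypothetical counts: 3 species x 6 sites x 2 replicate surveys.
y <- array(0, dim = c(3, 6, 2),
           dimnames = list(c("common", "moderate", "rare"), NULL, NULL))
y["common",   , 1] <- c(40, 12, 0, 7, 95, 3)
y["common",   , 2] <- c(22,  0, 1, 9, 60, 0)
y["moderate", , 1] <- c( 0,  2, 0, 1,  0, 0)
y["moderate", , 2] <- c( 1,  0, 0, 0,  3, 0)
y["rare",     , 1] <- c( 0,  0, 0, 0,  9, 0)
y["rare",     , 2] <- c( 0,  0, 0, 0,  4, 0)

# Total captures per species (NA = survey not run, ignored here).
total_counts <- apply(y, 1, sum, na.rm = TRUE)

# Species x site totals, then the number of sites with any capture.
site_totals      <- apply(y, c(1, 2), sum, na.rm = TRUE)
n_sites_detected <- rowSums(site_totals > 0)

data.frame(total_counts, n_sites_detected)
```

In this toy example the rare species has 13 captures but all of them come from a single site, which is the situation described above as hard for a spatial model to estimate.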
In general, I would not expect the covariance function to have much of an effect on convergence relative to other potential issues. The exception would be if you use the Matern covariance function, which is harder to estimate because it has three parameters as opposed to two parameters like the other three functions that are supported in spAbundance. If you're using the exponential function and finding convergence problems, it is likely you would find convergence problems with a spherical or Gaussian covariance function as well. There of course could be exceptions, but I think it's highly unlikely that this is the cause of your issues.
If you have an a priori reason to believe that there is not long-range spatial autocorrelation in your system (e.g., based on dispersal patterns of the bees) then yes it is completely reasonable to restrict the prior of phi to only be able to explain fine-scale spatial variation. To set a more restrictive prior on phi, see my suggestions here (which are in the context of a different spatial model, but still apply to the spatial factor abundance models).
It seems like you have already seen the guidance on the spOccupancy website for exploring convergence of spatial models, so beyond that I don't have too many broad recommendations for obtaining convergence in these models. As I mentioned above, your situation with only 5 species is on the low end of how many species you might be able to fit in the sfMsAbund model. One thing you may explore is to estimate the beta parameters for each species independently from each other. Instead of assuming the species-specific effects come from a common distribution, you can set them to be estimated independently by setting "independent.betas = TRUE" in the prior list. This gives each species-specific effect a prior whose mean and variance are the initial values that you set for beta.comm (the mean) and tau.sq.beta (the variance). See this thread in the HMEcology group from a while back on how to do that. That could potentially help with convergence problems if there are very large differences between the different species. If that doesn't work, you may simply not have enough information to estimate a spatial model in this case. I will mention that if you expect any spatial autocorrelation to be very fine scale, then estimates from the spatial model would not be all that different from estimates in the latent factor model.
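As a rough illustration of the two prior-related suggestions above, the sketch below builds a priors list with a restricted phi.unif and with independent.betas = TRUE. It assumes an exponential covariance, for which the correlation drops to about 0.05 at distance 3 / phi (the effective range). On that reading, a lower bound of 4 / max.dist would still allow effective ranges up to 0.75 x max.dist, so restricting to fine-scale dependence means raising the lower bound further. The coordinates, the 2 km cap (`max_range`), and the initial values are placeholders chosen for illustration, not recommendations.

```r
# Hypothetical site coordinates in metres (projected CRS); use your own.
set.seed(1)
coords <- cbind(x = runif(196, 0, 60000), y = runif(196, 0, 60000))

d        <- dist(coords)
min_dist <- min(d)
max_dist <- max(d)

# Assumption for illustration: no spatial dependence expected beyond ~2 km.
max_range <- 2000

# Exponential covariance: correlation = exp(-phi * distance),
# so the effective range (correlation ~ 0.05) is 3 / phi.
phi_lower <- 3 / max_range  # smallest phi = longest range allowed
phi_upper <- 3 / min_dist   # largest phi = shortest range allowed

priors <- list(
  phi.unif          = list(phi_lower, phi_upper),
  independent.betas = TRUE  # tag named in the reply above
)

# Per the reply above, with independent.betas = TRUE the initial values of
# beta.comm and tau.sq.beta act as the prior mean and variance of each
# species-specific effect. The values below are placeholders.
inits <- list(beta.comm = 0, tau.sq.beta = 1)

# These lists would then be passed on to the model call, e.g.
# sfMsAbund(..., inits = inits, priors = priors, cov.model = "exponential")
```

Comparing `3 / phi_lower` with `max_dist` shows how much of the study extent the prior still allows the spatial correlation to span.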

Hope that helps!

Jeff

--

Jeffrey W. Doser, Ph.D.

Assistant Professor

Department of Forestry and Environmental Resources

North Carolina State University

Statistical Ecology and Forest Science Lab

Pronouns: he/him/his

Gabriel Andrade

Nov 21, 2025, 3:35:56 PM

to Jeffrey Doser, spOccupancy and spAbundance users

Hi Jeff,

Thank you very much for your previous response.

Before continuing with the spatial models, I decided to explicitly explore the spatial scale of the residual dependence, instead of assuming it based only on trap spacing. My question for you is whether the way I defined the residuals and the reasoning I am following are appropriate for lfMsAbund models.

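Since the model is a multivariate negative-binomial GLMM, I defined Pearson residuals using the usual NB formula:

$$ r_{ij} = \frac{y_{ij} - \hat{\mu}_{ij}}{\sqrt{\hat{\mu}_{ij} + \hat{\mu}_{ij}^{2} / \hat{\kappa}_{i}}} $$

where $\hat{\mu}_{ij}$ and $\hat{\kappa}_{i}$ (for species i at site j) come from the posterior samples.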

Below is the R code I used:


# Observed counts: [species, site]
y_obs <- drop(lf_fit$y)

# Posterior means of mu and kappa
mu_hat    <- apply(lf_fit$mu.samples,    c(2, 3), mean)   # [species, site]
kappa_hat <- apply(lf_fit$kappa.samples, 2,     mean)     # [species]

# Negative-binomial variance: mu + mu^2 / kappa.
# kappa_hat (one value per species) is recycled down the rows of mu_hat,
# so each species row is divided by its own kappa.
var_hat <- mu_hat + mu_hat^2 / kappa_hat

# Pearson residuals
pearson_res <- (y_obs - mu_hat) / sqrt(var_hat)



Using these Pearson residuals, I calculated spline-based correlograms (bootstrap = 999) for each species. Four species show essentially no residual spatial structure. 
One species (B. pensylvanicus) shows a small peak of correlation (~0.5) at distances < 100 m, but it decays immediately and bands are wide.
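For readers who want a quick cross-check without extra packages, the sketch below computes a simple distance-binned, Moran-type correlogram in base R. This is a cruder stand-in, not the spline correlogram with bootstrap bands described above. The coordinates and residuals are simulated placeholders, and `binned_correlogram` and `breaks_m` are hypothetical names introduced only for this example.

```r
# Simulated stand-ins: site coordinates (metres) and one species' residuals.
set.seed(1)
n_sites <- 196
coords  <- cbind(x = runif(n_sites, 0, 60000), y = runif(n_sites, 0, 60000))
res_one_species <- rnorm(n_sites)  # spatially independent by construction

binned_correlogram <- function(coords, z, breaks) {
  d     <- as.matrix(dist(coords))  # pairwise distances between sites
  zc    <- z - mean(z)              # centred residuals
  cross <- outer(zc, zc)            # cross-products for every site pair
  pairs <- upper.tri(d)             # use each pair of sites once
  bin   <- cut(d[pairs], breaks = breaks, include.lowest = TRUE)
  data.frame(
    bin     = levels(bin),
    n_pairs = as.vector(table(bin)),
    # mean cross-product within the bin, scaled by the overall variance
    correlation = as.vector(tapply(cross[pairs], bin, mean)) / mean(zc^2)
  )
}

# Distance classes in metres (placeholder choice).
breaks_m    <- c(0, 2000, 5000, 10000, 20000, 40000, 90000)
corr_by_bin <- binned_correlogram(coords, res_one_species, breaks_m)
print(corr_by_bin)
```

With real residuals, `res_one_species` would be one row of the Pearson residual matrix, and the first distance class would need to be narrow enough to resolve the sub-100 m scale mentioned above. Bins with few site pairs give noisy estimates, so `n_pairs` is worth checking before interpreting a peak.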





Given these results, my interpretation is that the remaining spatial dependence is minimal and confined to a scale so small that the latent factor model likely absorbs the relevant shared structure, and therefore a spatial factor model would probably not change the parameter estimates in any meaningful way (consistent with what you mentioned).
Before moving forward, I wanted to check with you if my procedure (defining Pearson residuals in this way) and my reasoning are appropriate for lfMsAbund.

Thank you again for your time. Your guidance has been incredibly helpful.

Regards,

Gabriel.

--

----

Gabriel P. Andrade-Ponce, Ph.D

Postdoctoral Research Associate - Arthur Temple College of Forestry and Agriculture, Stephen F. Austin State University, U.S.

M.Sc.and Ph.D - Instituto de Ecología A.C., Mexico

Biologist - Universidad Nacional de Colombia, Colombia

Nacogdoches, Texas

Associate Editor, THERYA - Journal of the Mexican Association of Mammalogy (AMMAC)


Jeffrey Doser

Nov 24, 2025, 6:50:15 AM

to Gabriel Andrade, spOccupancy and spAbundance users

Hi Gabriel,

Thanks for the clear description of your approach. Your procedure and the conclusions you draw from it seem very reasonable to me. Based on those results, lfMsAbund and sfMsAbund will likely give very similar results.