INLA_MODELING ABUNDANCE OF MOSQUITOES


Shirin Taheri

Feb 10, 2026, 2:23:13 PM
to R-inla discussion group
Dear all, 

I am modeling the abundance of Culex mosquitoes (vectors of West Nile virus) in southern Spain using INLA. The dataset covers two years of sampling, with repeated mosquito counts at spatial locations. The response variable is the number of mosquitoes captured per sampling event.

The data are highly overdispersed and zero-heavy: many locations have zero counts (especially outside rice-growing areas), while in some sites mosquito abundance reaches values above 4,000. Ecologically, mosquito presence and abundance are strongly linked to rice fields, which explains the large spatial heterogeneity and structural zeros. I am fitting a spatio-temporal model using an SPDE spatial random field and seasonal effects, with a formulation such as:

  • ns(DOY, knots = c(120, 200, 280)) for seasonality

  • f(spatial, model = spde) for spatial structure

  • Zero-inflated negative binomial (zeroinflatednbinomial1) as the response distribution

Although the model converges, I still observe very strong overdispersion in the fitted results, and the estimated zero-inflation parameter suggests that only ~6% of the zeros are explained by the zero-inflated component.

My questions are:

  1. Is this behavior expected in highly heterogeneous ecological abundance data like this?

  2. How should I interpret a small estimated zero-inflation probability in the presence of many observed zeros?

  3. Would alternative strategies (e.g. standard negative binomial, hurdle models, additional random effects, or different seasonal structures) be more appropriate in this context?

Any advice or suggestions would be greatly appreciated.

Thank you.

Helpdesk (Haavard Rue)

Feb 11, 2026, 3:54:50 AM
to Shirin Taheri, R-inla discussion group
Hi there,

It's hard to tell in general. If the overdispersion is 'too high', it might be a
sign that the model does not fit very well and has chosen to absorb model error
into this term.

In general, I would consider this model

inla.doc("0poisson")

as it allows for its own model for the overdispersion. I know this is not the
negative binomial, but the negative binomial is essentially just a Poisson plus
an iid term for each observation, so ...
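
For readers who want the "Poisson plus an iid term" remark spelled out, here is one standard way to write it. The notation (\mu_i for the mean, \theta for the size parameter, \varepsilon_i for the per-observation term) is assumed for illustration and is not taken from the thread; the per-observation term is Gamma-distributed on the rate scale, whereas a Gaussian iid term on the log scale would give the Poisson log-normal instead.

```latex
\begin{aligned}
y_i \mid \varepsilon_i &\sim \text{Poisson}(\mu_i \varepsilon_i), \qquad
\varepsilon_i \sim \text{Gamma}(\text{shape} = \theta,\ \text{rate} = \theta), \quad
\mathbb{E}[\varepsilon_i] = 1 \\
\Rightarrow\quad y_i &\sim \text{NegBin}(\text{mean} = \mu_i,\ \text{size} = \theta), \qquad
\operatorname{Var}(y_i) = \mu_i + \frac{\mu_i^2}{\theta}, \qquad
P(y_i = 0) = \left(\frac{\theta}{\theta + \mu_i}\right)^{\theta}
\end{aligned}
```

As \theta becomes small, the variance grows and the zero probability rises even when \mu_i is large, which is why strong overdispersion alone can account for many zeros.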

It seems that if you let prob(zero) depend on covariates you might make a
little progress; let us know how this goes. If you still have issues, let us
know and we will take it from there.

Best
Havard





--
Håvard Rue
he...@r-inla.org

Bob O'Hara

Feb 11, 2026, 7:27:02 AM
to Shirin Taheri, R-inla discussion group
To answer your questions:
1. It is usual to have horribly high overdispersion in ecological data, to the point where a Gamma distribution with a shape parameter close to zero is typical, and mosquitoes have exactly the right life history to do this.
2. A small inflation probability with lots of zeroes is because you have typical ecological data: a Gamma with a shape parameter below 1 produces a lot of zeroes, so it’s difficult to distinguish between this and a zero-inflated model. Both zero inflation and overdispersion give rise to more zeroes, so it’s often not worth having both in the model.
3. The alternative strategy of just using a negative binomial is the one I would start with (or a Poisson log-normal: essentially a Poisson with a random effect on each observation): it’s simpler. You can then test whether you are correctly predicting the number of zeroes: if you aren’t predicting enough, then you should follow Håvard’s advice and make sure the Poisson part has random effects, so you can add overdispersion there.
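
A minimal base-R sketch of the zero check described in point 3, under stated assumptions. The values `mu_hat` and `size_hat` are hypothetical stand-ins for the fitted means and the estimated negative binomial size from a real model, and `y` is simulated here rather than observed. It also illustrates point 2: with a mean of 50 and a size of 0.1, a plain negative binomial already puts more than half of its mass on zero.

```r
set.seed(1)

# Point 2: a small size (shape) parameter yields many zeros even with a large mean.
p0_example <- dnbinom(0, size = 0.1, mu = 50)

# Hypothetical stand-ins: in practice mu_hat and size_hat would come from the
# fitted negative binomial model, and y would be the observed counts.
n        <- 2000
mu_hat   <- exp(rnorm(n, mean = 2, sd = 1.5))
size_hat <- 0.3
y        <- rnbinom(n, size = size_hat, mu = mu_hat)

# Expected number of zeros implied by the negative binomial model.
expected_zeros <- sum(dnbinom(0, size = size_hat, mu = mu_hat))
observed_zeros <- sum(y == 0)

# Simulation-based 95% interval for the number of zeros under the model.
sim_zeros <- replicate(1000, sum(rnbinom(n, size = size_hat, mu = mu_hat) == 0))
zero_interval <- quantile(sim_zeros, c(0.025, 0.975))

cat("P(0) for size 0.1, mean 50:", round(p0_example, 3), "\n")
cat("observed zeros:", observed_zeros,
    " expected zeros:", round(expected_zeros, 1), "\n")
print(zero_interval)
```

If the observed number of zeros sits well above the simulated interval, the model is under-predicting zeros, and an explicit zero component may be worth adding; if it sits inside the interval, the negative binomial alone may be enough.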

Good luck. 

Bob


