I am modeling the abundance of Culex mosquitoes (vectors of West Nile virus) in southern Spain using INLA. The dataset covers two years of sampling, with repeated mosquito counts at spatial locations. The response variable is the number of mosquitoes captured per sampling event.
The data are highly overdispersed and zero-heavy: many locations have zero counts (especially outside rice-growing areas), while at some sites counts exceed 4,000 per sampling event. Ecologically, mosquito presence and abundance are strongly linked to rice fields, which explains the large spatial heterogeneity and the structural zeros. I am fitting a spatio-temporal model with an SPDE spatial random field and a seasonal effect, using the following components:
- ns(DOY, knots = c(120, 200, 280)) for seasonality
- f(spatial, model = spde) for spatial structure
- Zero-inflated negative binomial (zeroinflatednbinomial1) as the response distribution
Although the model converges, the fitted results still show very strong overdispersion, and the estimated zero-inflation parameter is small: it suggests that only ~6% of the zeros are attributed to the zero-inflation component.
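For context, my understanding of the type-1 parameterization (an assumption on my part, to be checked against the INLA family documentation) is that the zero-inflation probability \pi is a per-observation mixing weight, not a share of the observed zeros. With negative binomial mean \mu and size n, the probability of a zero count would then be:

```latex
\[
P(y = 0) \;=\; \pi + (1 - \pi)\left(\frac{n}{n + \mu}\right)^{n},
\qquad
\operatorname{Var}_{\mathrm{NB}}(y) \;=\; \mu + \frac{\mu^{2}}{n}.
\]
```

As a purely hypothetical illustration (values not taken from my fit): with \mu = 50 and n = 0.1, the negative binomial term alone gives about 0.54, so a small size parameter could absorb most of the zeros even when \pi is close to zero.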
My questions are:
1. Is this behavior expected in highly heterogeneous ecological abundance data like this?
2. How should I interpret a small estimated zero-inflation probability in the presence of many observed zeros?
3. Would alternative strategies (e.g. standard negative binomial, hurdle models, additional random effects, or different seasonal structures) be more appropriate in this context?
Any advice or suggestions would be greatly appreciated.
Thank you.