Colext - High extinction probability and suspect C-hat

Skip to first unread message

Mitch H

Aug 19, 2022, 7:20:16 AMAug 19
to unmarked

Hello all, 

I am new to both occupancy modelling, and the unmarked package. I would like to run several questions past those more knowledgeable for your feedback.

Firstly, the study question:

I am interested in investigating how the occupancy of a snake species varies in response to burn intensity post bushfires. 

The data:

I have a detection history across two distinct seasons ( ~ 1 year post fire, and ~2 year post fire) for 70 sites that are high intensity burn, low intensity burn or unburnt. These sites are across four separate national parks. Detection histories were constructed from standardised active searches to locate the snakes. In the first year due to logistical constraints (both site access post-fire and covid), we only had 3-4 four surveys per site, the second season we sampled sites between 6-8 times. For one park (9 sites in total) we only have occupancy for the second season as we were not able to access it the first year.

To account for the different sampling lengths within years and across years I’ve added ‘NA’ values to the detection history so that all sites had equal secondary sampling of 8 (maximum number of visits that occurred at a single site in a season) instances, for both seasons.


The code:

Given we sampled across years I ran a dynamic model using the colext() function of unmarked. The data frame was the detection history combined with site covariates of burn intensity (factor) and park/reserve(factor). Year was included as a yearly covariate.

dynamic_occ_mustard <- colext(

  psiformula= ~ intensity + park ,

  gammaformula = intensity,

  epsilonformula = intensity,

  pformula = ~ intensity + park + year,

  data = mustarddata,

  method = "BFGS")

Model selection indicated to drop all factors, expect park and year from the detection formula.

 I checked model fit with mb.gof.test and the chi squared statistic was >0.05. Chat was = 0 though.

The concerns:

1)      Is the inclusion of NAs to account for different secondary sampling lengths appropriate?

2)      Can I run an colext model given for one park (nine sites) there was no sampling in the first year?

3)         Why is chat = 0?

3)          Are there any general recommendations or further reading you could suggest?


The reason I am seeking feedback is that when I account for different detection across parks and between years, I see a significant drop in the mean among site occupancy between year 1 and 2. From ~0.50 to 0.26. I also see a very high among site extinction probability 0.74. I should note I determined occupancy for each year using the smoothed function, and extracted among site probabilities using predict.

I just want to confirm that I have not made an error in the model as these results, while possible, seem extreme.

Thanks for your time.




Marc J. Mazerolle

Aug 19, 2022, 3:41:46 PMAug 19
Hi Mitch,

To answer your questions:

1) Your detection data and observation covariates must have the same
number of columns for each year. So yes, you should include NA's to pad
for the missing visits in the year with the least number of visits.

2) Yes, you can include NA's in your sites that were not sampled. They
won't contribute much to extinction or colonization, but will inform
detection probability.

3) Regarding the cause for the c-hat of 0 for this model with your
data, it is difficult to say without seeing the model output. Do you
have any indications that there are issues in estimating certain
parameters in your model? For example, do you get warning messages
about convergence, the Hessian matrix, or do you have parameters that
have very large SE's relative to the estimate? You can send me the
model output saved as an R object offlist so I can take a look at it
i.e., save(your.model.object, file = "your.model.object.Rdata").



Marc J. Mazerolle
Professeur agrégé
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
Tel: (418) 656-2131 ext. 407120

Veuillez noter que je suis en télétravail. Le meilleur moyen de me
rejoindre est par courriel.

-------- Message initial --------
De: Mitch H <>
Répondre à:
À: unmarked <>
Objet: [unmarked] Colext - High extinction probability and suspect C-
Date: Thu, 18 Aug 2022 22:29:53 -0700

Mitch H

Aug 21, 2022, 10:14:33 PMAug 21
to unmarked
Hi Marc, 

Thanks for the quick reply. It is apprecaited. I will email you the model object now. 

No warning messages appeared when running the models. However, there are several parameters where the SE's are high relative to the estimate. This is likely the source of the issue. 

All the best, 


Marc J. Mazerolle

Aug 24, 2022, 8:47:20 AMAug 24

Here was my response offlist to Mitch regarding the issues he was
having with the fit of a dynamic model, which might be useful to

I took a look at the data and model output. Here are a few

1) First off, there are few detections, with roughly 12% of sites
having at least 1 detection in a given year. To help summarize this
info, you can look at the detHist( ) function in the AICcmodavg
package. It will summarize the data in terms of the different detection
histories observed and will also tabulate the number of sites sampled
each year and the number of these sites with a detection.

2) I would not use the dynamic model you sent for inferences, given
that all parameters for psi in the first year and extinction have very
large SE's, which indicate problems in estimating these parameters.
This is a consequence of fitting a model that is too complex for the
data at hand. Keep in mind that the null model has 4 parameters
estimated. In your case, the null model could be an option (no issues
of estimation), although might not help test your hypotheses regarding
park or intensity. You might want to add the effect of intensity or
park on a single parameter (psi, gamma, or epsilon), but not on all
three in the same model. Even doing so, you have to check if this
introduces estimation problems.

3) Given that you only have 2 years of data, there is no great gain in
fitting a dynamic occupancy model, especially if you have a low number
of detections. I suggest combining the two years and running a single-
season occupancy model with year as an explanatory variable. To do so,
you can simply stack the data from the first year and the second year.
So if you have 70 sites sampled during 2 seasons, the new stacked data
set would have 140 rows. Note that because 10 of your sites were not
sampled in 2021, then the stacked data set in your case would have 130
rows. You would then use occu( ) to fit this single-season model.

4) You might want to include a random effect of park on occupancy,
because you have multiple sites within a given park. You can do this by
using a syntax similar to lme4:

m3 <- occu(~ year ~ intensity + (1 | park), data = newOccu)

5) If you have other variables that could affect detection probability,
you should consider adding them (e.g., temperature, month, trapping

6) My suggestion is not to use dredging, but to specify a limited
number of candidate models that are meaningful to you and determine
which one of these models have the most support, based on biological
hypotheses you have.

7) You can find great advice for these types of models in Marc Kéry and
Andy Royles two volume set Applied hierarchical modeling in ecology:
analysis of distribution, abundance and species richness in R and BUGS.

I hope this helps,

Marc J. Mazerolle
Professeur agrégé
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
Tel: (418) 656-2131 ext. 407120

Veuillez noter que je suis en télétravail. Le meilleur moyen de me
rejoindre est par courriel.

-------- Message initial --------
De: Mitch H <>
Répondre à:
À: unmarked <>
Objet: Re: [unmarked] Colext - High extinction probability and suspect
Date: Sun, 21 Aug 2022 19:14:33 -0700
Reply all
Reply to author
0 new messages