occu and goodness of fit (mb.gof.test)


Danielle Rappaport

Oct 24, 2018, 2:52:27 PM
to unmarked
Dear all:

I'm having trouble interpreting my goodness of fit and c-hat scores.

I have a set of top-ranked single-season occupancy models fit in unmarked with occu(), selected using AIC-based model ranking (ΔAIC ≤ 2) following a dredge procedure that fit all combinations of biologically viable covariates (package MuMIn). I am looking to calculate and compare the goodness of fit of the top-ranked models. As suggested elsewhere in this forum, I have applied the goodness-of-fit procedure (mb.gof.test) to the subset of top-ranked models, because the global model has too many covariates for the GOF test to be evaluated. All of the models yielded GOF p-values less than 0.05, indicating a rejection of H0 (H0 = the model fits the data). However, my understanding is that a c-hat of less than 3 indicates adequate fit with minor overdispersion. My questions are as follows: For those models with a c-hat of less than 3, is it justifiable to move forward with inferences once I inflate the SEs by a factor of c-hat? Or does the p-value of 0 render the c-hat irrelevant, removing any grounds for advancing with inferences for a model with a c-hat of 2.4 over a model with a c-hat of 6?
Thanks much in advance, 
Danielle
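
For readers who want to reproduce this workflow, here is a minimal sketch in R, assuming hypothetical data objects (det_hist, site_covs, obs_covs) and placeholder covariate names (forest, elev, effort); mb.gof.test() comes from the AICcmodavg package:

# A minimal sketch of the workflow described above (hypothetical objects and
# covariate names; use a large nsim in practice)
library(unmarked)
library(AICcmodavg)

# det_hist: sites x visits matrix of 0/1 detections (user-supplied data)
umf <- unmarkedFrameOccu(y = det_hist, siteCovs = site_covs, obsCovs = obs_covs)

# one of the top-ranked single-season models (placeholder covariates)
fm <- occu(~ effort ~ forest + elev, data = umf)

# MacKenzie-Bailey goodness-of-fit test via parametric bootstrap
gof <- mb.gof.test(fm, nsim = 1000)
gof   # prints the chi-square statistic, bootstrap p-value, and c-hat estimate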

Marc J. Mazerolle

Oct 24, 2018, 8:55:03 PM
to unmarked

Hi Danielle,


If your goodness-of-fit test indicates lack of fit (P < alpha), it means that the data and the model "disagree". The difficult part is determining the source of the lack of fit. If your c-hat estimate is not too large (say < 3, but hopefully closer to 1), the disagreement is probably due to the data varying slightly more than expected for the distribution at hand. In that case, you could use your c-hat estimate to inflate the variances and conduct (conservative) inferences. However, the disagreement could also be due to having a lousy model to begin with (lack of fit due to the wrong model structure), and this lousy model structure would yield a large c-hat. In your case, I would proceed with the c-hat of 2.4. Knowing how large a c-hat is too large would require a small simulation study.


Best,


Marc


____________________
Marc J. Mazerolle
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
Tel: (418) 656-2131 ext. 7120
Email: marc.ma...@sbf.ulaval.ca
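
A minimal sketch of the variance inflation Marc describes, assuming a fitted occu() model fm (as in the earlier sketch) and the c-hat estimate from mb.gof.test(); the sqrt(c-hat) form of the adjustment is confirmed later in the thread:

c_hat <- 2.4                                    # example value from mb.gof.test()
beta  <- coef(fm, type = "state")               # occupancy (state) coefficients
se    <- sqrt(diag(vcov(fm, type = "state")))   # naive standard errors
se_adj <- sqrt(c_hat) * se                      # quasi-likelihood inflation
round(cbind(estimate = beta, naive_SE = se, adj_SE = se_adj,
            lower_95 = beta - 1.96 * se_adj,
            upper_95 = beta + 1.96 * se_adj), 3)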


Danielle Rappaport

Oct 25, 2018, 3:48:51 PM
to unmarked
Marc, 
Thanks for your speedy reply and help.
There is a small subset of top-ranked models (ΔAIC < 2 and c-hat < 3). In selecting the one on which to base my inferences, do you advise choosing the model with the lowest AIC or the lowest c-hat (they don't always correspond)? I'm not sure whether the differences in c-hat are negligible (e.g., 2.4 vs. 2.5 vs. 2.6).
Best,
Danielle

Marc J. Mazerolle

Oct 25, 2018, 8:20:58 PM
to unmarked

Danielle,


In their book, Burnham and Anderson (2002: Model selection and multimodel inference: a practical information-theoretic approach. Springer, section 6.5.1, p. 306) recommend that, if you cannot run a single global model, you estimate the c-hat of different "sub-global" models and use the smallest estimate. Is this what you have been doing? In your case, I would use the smallest c-hat among your models and conduct multimodel inference. Regarding your question on the influence of the c-hat, you can check it easily by changing the c-hat and seeing how the model ranking changes with increasing values of c-hat.


Best,


Marc


____________________
Marc J. Mazerolle
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
Tel: (418) 656-2131 ext. 7120
Email: marc.ma...@sbf.ulaval.ca
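
A sketch of the sensitivity check Marc suggests, assuming fm1, fm2, and fm3 are hypothetical top-ranked occu() fits with made-up names; aictab() is from AICcmodavg and switches to quasi-AIC(c) when c.hat > 1:

library(AICcmodavg)

# hypothetical candidate set (the fitted occu() models and labels are placeholders)
cands <- list(fm1, fm2, fm3)
mod_names <- c("psi(forest)p(effort)", "psi(forest+elev)p(effort)", "psi(elev)p(effort)")

# ranking without an overdispersion adjustment
aictab(cand.set = cands, modnames = mod_names)

# re-rank with increasing c-hat to see how sensitive the ranking is
aictab(cand.set = cands, modnames = mod_names, c.hat = 1.5)
aictab(cand.set = cands, modnames = mod_names, c.hat = 2.4)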

Danielle Rappaport

Oct 26, 2018, 8:18:46 AM
to unma...@googlegroups.com
Marc, 
I'm grateful for your assistance (and patience in providing multiple responses). 
That all makes sense. However, to be sure I understand your suggestion regarding the influence of the c-hat: you are saying that whenever I can't evaluate the goodness of fit of the global model, I estimate the c-hat of the "sub-global" models and see how the corresponding AIC, and thus the model ranking, changes with the value of c-hat.

1) Must I calculate the c-hat for ALL of the sub-global models? This would be computationally unviable for me, considering that I arrived at my model set through a dredging procedure that yields thousands of possible models. Is it justifiable to reassess the model ranking/AIC in light of c-hat values just for the best-ranking models (ΔAIC < 2)? Also, to confirm, I am to inflate the standard error by sqrt(c-hat), correct?

2) Is there a built-in function within unmarked or AICcmodavg to perform the process described above programmatically, i.e., inflate the variance of the top-ranking sub-global models using their c-hat scores and then re-evaluate the model ranking? Apologies if this is in the literature; I just couldn't find it.

3) Lastly, it just came on my radar that I may want to opt for AICc instead of AIC; however, I don't reckon my n (> 1536) to be small and therefore don't think I need a small-sample bias correction. That said, I read that regardless of sample size, AICc is more general and is thus often used in place of AIC. Do you have any recommendations? I already calculated AIC and would prefer to stick with that criterion if there is no added benefit of a sample-size bias correction in my case.
Thanks a million, and I will try my best not to pester you with further questions,
Danielle


John Clare

Oct 26, 2018, 10:37:57 AM
to unmarked
Maybe just to quickly clarify this:

> 1) Must I calculate the c-hat for ALL of the sub-global models? This would be computationally unviable for me considering that I have arrived at my model set through a dredging procedure that yields thousands of possible models. Is it justifiable to reassess the model ranking/AIC in light of c-hat values just for the best-ranking models (ΔAIC < 2)?
 
Imagine the full model has 4 terms associated with predictors, and say predictors 1 and 2 are strongly collinear, so including both in the same model is problematic. The global sub-models are then a) coefficients for predictors 1, 3, 4 plus an intercept, and b) coefficients for predictors 2, 3, 4 plus an intercept, and you could operationally treat whichever sub-model exhibits less overdispersion as the global model. A model with, say, predictor 1 and an intercept might be a candidate model, but it is not a "global" or "sub-global" model. The main point is that the global model (or the sub-global models) is used to determine whether the candidate model pool includes something that will fit "well enough". If the most saturated model exhibits poor fit, this induces adjustments to the SEs and to the model selection process (QAIC vs. AIC). So rather than use AIC to determine which models to use to test for overdispersion, the measure of overdispersion or lack of fit is used to determine how to rank the various models.

Regarding your statement ("the global model has too many covariates for gof to be evaluated"): the global model is almost always overfit. For some capture-recapture models, the global model is sometimes thought of as a full interaction between group and time--many, many more terms than are likely to be practical. I don't quite know what you mean specifically here, but there are several things that might be considered undesirable in a selected model (e.g., uninformative parameters, or even an inability to estimate standard errors for all coefficients) that are less problematic in this context.
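
To make the distinction concrete, a sketch with two hypothetical collinear predictors (x1, x2) plus x3 and x4, reusing the umf object from the earlier sketch: fit the two sub-global models, run mb.gof.test() on each, and (following the Burnham and Anderson recommendation Marc cites) carry forward the smaller c-hat.

# hypothetical sub-global models (x1 and x2 are the collinear pair)
sub1 <- occu(~ effort ~ x1 + x3 + x4, data = umf)   # sub-global model a)
sub2 <- occu(~ effort ~ x2 + x3 + x4, data = umf)   # sub-global model b)

gof1 <- mb.gof.test(sub1, nsim = 1000)
gof2 <- mb.gof.test(sub2, nsim = 1000)

gof1   # inspect the p-value and c-hat for sub-global model a)
gof2   # same for sub-global model b)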

Marc J. Mazerolle

Oct 26, 2018, 11:01:12 AM
to unma...@googlegroups.com
Danielle (sorry in advance for the long-winded answer),

In my previous post, I mentioned one way of deciding which c-hat to use when you check the c-hat of each sub-global model, when the global model is too complex to be fit, à la Burnham and Anderson. When you mentioned that you had several c-hat values, I assumed that you had obtained them from such sub-global models (sorry if I misread). Yes, it is difficult (computer-intensive) to implement if you have several "sub-global" models. Another approach is to use the c-hat of the top-ranking model.

If you used a "dredging approach", did you get these multiple c-hat values from each model considered, or did you dredge several times (i.e., different sets of candidate models)? Fitting a limited number of models supported by biological hypotheses is harder to do than using an automated approach such as dredging, but in my opinion the former is superior (others may disagree).

Sensitivity of model selection to c-hat:
Regarding the sensitivity of model selection to c-hat, this is a general exercise you can do as soon as you have overdispersion. If the ranking of the models changes substantially when you increase c-hat from 1 to the observed value, this suggests that there is a lot of model selection uncertainty.

AIC vs AICc:
If you treat the number of sites as the number of observations (> 1536), then in your case both will yield very similar values.

Adjusting model selection and inference for overdispersion:
Yes, in each model the adjustment is sqrt(c-hat) * SE. You can do it manually using the equations in Burnham and Anderson (2002), or use different functions in AICcmodavg with the c.hat argument (i.e., ?aictab, ?modavgPred). I suggest you check Burnham and Anderson (2002) or Anderson (2008: Model-based inference in the life sciences: a primer on evidence. Springer, New York, NY, USA.), which address these topics.

Marc
--
____________________________________
Marc J. Mazerolle
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
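
A sketch of the AICcmodavg functions mentioned above (aictab and modavgPred), reusing the hypothetical candidate list (cands, mod_names) and c-hat value from the earlier sketches; with c.hat > 1, the ranking uses QAIC(c) and the model-averaged predictions get overdispersion-inflated SEs:

library(AICcmodavg)

# quasi-AIC ranking under the chosen c-hat
aictab(cand.set = cands, modnames = mod_names, c.hat = 2.4)

# model-averaged occupancy predictions with overdispersion-adjusted SEs
# (newdat holds hypothetical values of the occupancy covariates used above)
newdat <- data.frame(forest = seq(0, 1, by = 0.1), elev = 0)
modavgPred(cand.set = cands, modnames = mod_names, newdata = newdat,
           parm.type = "psi", c.hat = 2.4)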

Danielle Rappaport

Oct 26, 2018, 9:12:33 PM
to unma...@googlegroups.com
Marc and John, thank you for your very helpful replies. I will proceed with your suggestions.

I now understand what constitutes a "sub-global" model, and realize that I had indeed been using the c-hat score for the top-ranking model, not the sub-global models (I got mixed up with the vernacular). (When I mentioned that I had several c-hat values, I was clumsily referring to scores from different sets of candidate models.)

Best,
Danielle
