assessing model fit with parametric bootstrapping

Matt Giovanni

Feb 26, 2010, 1:41:32 PM
to unmarked
Hi everyone-

To begin with, I'd like to thank Andy, Ian, Richard, and Elise for
developing this modeling package for R. It's a fantastic resource for
ecologists like me who do not have a lot of programming skills.
I also want to thank Ian for his input thus far and for facilitating
discussion on unmarked and the models therein.

I'm trying to assess goodness of fit for a set of N-mixture abundance
models, and I have followed the parametric bootstrapping example in
the unmarked overview/tutorial. My understanding is that each MC
simulation produces a mean squared error value for comparison with the
MSE value from the original model. Is this accurate? The MSEs are
then plotted in a histogram, and -- this is where I am less clear --
is the original MSE compared to the distribution of the simulated
MSEs with a t-test to produce a p-value?

Thanks for your input, everyone. Have a great weekend!

Matt

Ian Fiske

Feb 28, 2010, 11:00:54 PM
to unmarked

On Feb 26, 1:41 pm, Matt Giovanni <matthewgiova...@gmail.com> wrote:
> Hi everyone-
>
> To begin with, I'd like to thank Andy, Ian, Richard, Elise for
> developing this modeling package for R.  It's a fantastic resource for
> ecologists like myself that do not have a lot of programming skills.
> I also want to thank Ian for his input thus far and for facilitating
> discussion on Unmarked and the models therein.
>

Glad you're enjoying unmarked!

> I'm trying to assess goodness-of-fit for a set of n-mix abundance
> models, and have followed the example of parametric bootstrapping in
> the Unmarked overview/tutorial.  My understanding is that each MC
> simulation produces a mean squared error value for comparison with the
> MSE value from the original model.  Is this accurate?  

This is nearly correct -- the comparison actually uses root-mean-squared
error (RMSE), though the difference shouldn't matter.

> The MSE's are
> then plotted into a histogram, and, this is where I am more unclear,
> the original MSE is compared to the distribution of the simulated
> MSE's with a t-test to produce a p-value?
>

The p-value is just a bootstrap estimate of the probability of
obtaining an RMSE larger than the observed one under the null
hypothesis that the fitted model is correct. So it can be interpreted
similarly to a chi-square GOF p-value, but no parametric distribution
(chi-square, t, normal) is used to compute it; it comes directly from
the bootstrap distribution.
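The arithmetic is simple enough to sketch in a few lines of base R.
Everything below is illustrative -- `t.obs` and `t.sim` are made-up
numbers standing in for the observed and bootstrap fit statistics, not
output from unmarked:

```r
# Minimal sketch of a parametric-bootstrap p-value (base R only).
# t.obs stands in for the RMSE from the original fit; t.sim for the
# RMSEs from the nsim simulated-and-refitted datasets.
set.seed(123)
t.obs <- 1.6
t.sim <- rnorm(1000, mean = 1.5, sd = 0.2)  # placeholder bootstrap distribution

# Proportion of simulated statistics at least as large as the observed one:
p.val <- mean(t.sim >= t.obs)

# The histogram Matt mentions, with the observed value marked:
hist(t.sim)
abline(v = t.obs, lwd = 2)
```

A small p.val means the observed statistic sits far out in the right
tail of the simulated distribution, i.e. the model fits poorly.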

Hope that helps,
Ian

Richard Schuster

Apr 6, 2010, 3:08:21 PM
to unmarked
Hello everyone,

I am currently trying to assess model fit (GOF) for some occu models,
following Burnham and Anderson's recommendation to test GOF on the
global model, since the nested models selected by AIC should then also
fit the data.

The results from the parboot function seem fine, and the models fit
the data (H0 is not rejected). To check that I am using the routine
correctly, and that my assumption is right (H0 = the model fits the
data), I tried some "empty" models with "occu(~1 ~1, data)". Since
those do not reject H0 either, I was wondering whether this could be
correct, or whether I am overlooking a problem with my models or
misinterpreting the parboot results.

Since the GOF test does not say anything about the 'power' of the
models, I was also wondering what measures you use to quantify model
strength or explanatory power. So far I have been using ROC, but I am
not sure that is really the way to go with occupancy models, so I was
hoping you might have suggestions or could comment on the methods you
use.

Thanks again for your help,
Richard

rcha...@nrc.umass.edu

Apr 6, 2010, 5:29:32 PM
to unma...@googlegroups.com
Hi Richard,

The documentation I wrote for parboot isn't very good in v0.8-5. It
will be better in the next version, which should be released soon. In
the meantime, I will try to clarify.

parboot simulates nsim datasets from a fitted model. It then refits
the model to each dataset and calculates a fit statistic for each
simulation. The distribution of these values approximates the fit
statistic's sampling distribution for the given model. The null
hypothesis is that the observed fit statistic is a random outcome
from this sampling distribution. If the fit statistic is something
like the sum of squared residuals, H0 will be rejected if the data
are under- or overdispersed. In the next release, you will be able to
specify any fit statistic you want.
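For concreteness, a call might look like the sketch below. It assumes
the unmarked package is loaded and that `fm` is an already-fitted model
(e.g. from occu or pcount); the sum-of-squared-residuals statistic is
just one possible choice for the user-supplied `statistic` argument
described above:

```r
# Sketch only -- assumes unmarked is installed and fm is a fitted model.
library(unmarked)

# A fit statistic: sum of squared residuals of the fitted model.
sse <- function(fit) sum(residuals(fit)^2, na.rm = TRUE)

pb <- parboot(fm, statistic = sse, nsim = 100)
pb        # observed statistic, bootstrap summary, and p-value
plot(pb)  # histogram of simulated statistics with the observed value marked
```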

There is no reason to be concerned if both your global model and your
null model fit the data well. In some cases, the null model might be a
good one. The formula ~1 ~1 specifies a model in which neither psi nor
p is affected by covariates. The estimates from this model are the
logit-scale estimates of overall psi and p.
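Since those estimates are on the logit scale, base R's plogis() maps
them back to probabilities (unmarked's backTransform() does the same
for a fitted model). The numbers below are purely illustrative, not
from a real fit:

```r
# Logit-scale estimates map back to probabilities via the inverse logit.
psi.logit <- 0.0    # hypothetical logit-scale occupancy intercept
p.logit   <- -1.2   # hypothetical logit-scale detection intercept

plogis(psi.logit)   # 0.5
plogis(p.logit)     # about 0.23

# With a fitted unmarked model fm, the equivalent is:
# backTransform(fm, type = "state")  # psi
# backTransform(fm, type = "det")    # p
```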

As for your last question, I think you are asking about measures of
variation explained, not true statistical power. The one metric
currently implemented in unmarked is Nagelkerke's R-squared index; see
the example on the modSel help page. We might drop it, however,
because it is not clear how to determine the effective sample size
(ESS) for these models. Currently we define ESS as the number of sites
surveyed.
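As a sketch of that help-page example (the model objects here are
hypothetical; the functions are unmarked's), the R-squared appears in
the model-selection table when a null model is named:

```r
# Sketch -- assumes unmarked and two already-fitted occu models:
# fm0 is the intercept-only (~1 ~1) model, fm1 the global model.
library(unmarked)

fl <- fitList(null = fm0, global = fm1)
modSel(fl, nullmod = "null")   # table includes Nagelkerke's R-squared
```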

Richard


Richard Schuster

Apr 6, 2010, 5:58:43 PM
to unmarked
Hi Richard,

Thanks for your reply.

Great to hear that I don't need to be concerned when both the null and
the global model fit. Being able to specify the fit statistic in the
next release sounds great.

Yes, I meant variation explained; sorry for the unclear wording.
Thanks for pointing out the Nagelkerke R-squared example. Are there
any plans to implement a different metric in the future? And would you
recommend against using the R-squared calculated by the current
package for publications?

Thanks,
Richard

rcha...@nrc.umass.edu

Apr 7, 2010, 8:20:58 AM
to unma...@googlegroups.com
As far as I know, there is no generally accepted R-squared analogue
for these models. You could use explained deviance, but see this
discussion: http://finzi.psych.upenn.edu/R/Rhelp02/archive/75502.html.
I would just make sure that you clearly describe the metric you use
when you go to publish. To see the calculation in unmarked, look at

unmarked:::nagR2


Richard

UMass Amherst
Natural Resources Conservation
nrc.umass.edu/index.php/people/graduate-students/chandler-richard/

Richard Schuster

Apr 7, 2010, 12:57:08 PM
to unmarked
Thanks for your suggestions and the discussion link Richard.

Richard

On Apr 7, 5:20 am, rchand...@nrc.umass.edu wrote:
> As far as I know, there is no generally accepted R-squared analogue  
> for these models. You could use explained deviance, but see this  
> discussion: http://finzi.psych.upenn.edu/R/Rhelp02/archive/75502.html.
> I would just make sure that you clearly describe the metric you use  
> when you go to publish. To see the calculation in unmarked, look at
>
> unmarked:::nagR2
>
> Richard
>
