Hello,
I ran parboot() on a dynamic occupancy model after using mb.gof.test(), because the latter never gave a result for p.global, only a single P-value equal to 0. My data frame, model, and parboot code are below, showing strong P-values for all 3 fit statistics. Should I have run the test on each season separately to get P-values? Or does anyone have insight into why I didn't get p.global values, which would have shown P-values for all 4 seasons? Thanks
> coyoteMS.umf <- unmarkedMultFrame(y = coyoteMS.data[, 2:329],
    siteCovs = coyoteMS.data[, c("natveg", "dev", "psize", "dist", "hp")],
    obsCovs = NULL,
    yearlySiteCovs = NULL,
    numPrimary = 4)
> coyoteMS_global <- colext(psiformula = ~1, gammaformula = ~1,
    epsilonformula = ~1, pformula = ~natveg + dev + psize + dist + hp,
    data = coyoteMS.umf, se = TRUE)
> mb.gof.test(coyoteMS_global, nsim = 1000)
Goodness-of-fit for dynamic occupancy model
Number of seasons: 4
Chi-square statistic:
Season 1 Season 2 Season 3 Season 4
7.619132e+09 2.132495e+44 1.193411e+48 2.267485e+22
Total chi-square = 1.193624e+48
Number of bootstrap samples = 1000
P-value = 0
Quantiles of bootstrapped statistics:
0% 25% 50% 75% 100%
6.3e+13 4.7e+17 3.9e+18 4.6e+19 1.8e+24
Estimate of c-hat = 1.577709e+26
> fitstats <- function(coyoteMS_global) {
+   observed <- getY(coyoteMS_global@data)
+   expected <- fitted(coyoteMS_global)
+   resids <- residuals(coyoteMS_global)
+   sse <- sum(resids^2, na.rm = TRUE)
+   chisq <- sum((observed - expected)^2 / expected, na.rm = TRUE)
+   freeTuke <- sum((sqrt(observed) - sqrt(expected))^2, na.rm = TRUE)
+   out <- c(SSE = sse, Chisq = chisq, freemanTukey = freeTuke)
+   return(out)
+ }
> parboot(coyoteMS_global, fitstats, nsim = 1000, report = 1)
t0 = 378.6994 8585.066 674.4327
Running in parallel on 7 cores. Bootstrapped statistics not reported.
Call: parboot(object = coyoteMS_global, statistic = fitstats, nsim = 1000, report = 1)
Parametric Bootstrap Statistics:
t0 mean(t0 - t_B) StdDev(t0 - t_B) Pr(t_B > t0)
SSE 379 -49.1 48.2 0.836
Chisq 8585 -44.0 408.8 0.513
freemanTukey 674 -16.5 64.8 0.608
t_B quantiles:
0% 2.5% 25% 50% 75% 97.5% 100%
SSE 288 333 395 428 458 519 586
Chisq 7583 7866 8343 8609 8886 9482 10188
freemanTukey 504 566 651 691 734 815 936
Hi Christina,
When you fit a dynamic occupancy model and run mb.gof.test(), the function returns the global chi-square statistic and corresponding P-value. The global chi-square is computed by summing the chi-squares obtained for each season. This is exactly what you have in the output you included in your original message. If you look into the object a little further, you can also get the summary of observed vs. predicted frequencies for each season separately. For example, if you stored the result of mb.gof.test() with nsim = 5000 in an object called outGOF, then you can access these season-specific chi-square tables with:
> outGOF$chisq.table$tables
These tables can help identify the seasons associated with the largest chi-square values. That being said, the huge values you are getting for each season suggest strong lack of fit. Also note that the second part of your code, with the different test statistics you tried, is not appropriate for a model based on detections/non-detections. For the chi-square to work with detection/non-detection data, you need to summarize the information (e.g., over detection histories, rows, or columns).
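To illustrate the idea of summarizing before computing a chi-square, here is a minimal sketch assuming `coyoteMS_global` is the fitted colext model from above; the helper names `chisqSite` and `fitstatsSite` are invented for illustration, and mb.gof.test() remains the properly implemented MacKenzie-Bailey test:

```r
# Sketch only: sum detections per site (row sums) before forming the
# chi-square, rather than computing it cell-by-cell on 0/1 data.
chisqSite <- function(observed, expected) {
  obs.site <- rowSums(observed, na.rm = TRUE)  # total detections per site
  exp.site <- rowSums(expected, na.rm = TRUE)  # expected detections per site
  sum((obs.site - exp.site)^2 / exp.site, na.rm = TRUE)
}

# Wrapper in the form parboot() expects (hypothetical name)
fitstatsSite <- function(fm) {
  observed <- unmarked::getY(fm@data)
  expected <- fitted(fm)
  c(ChisqSite = chisqSite(observed, expected))
}

# parboot(coyoteMS_global, fitstatsSite, nsim = 1000)
```

Whether row sums, column sums, or full detection-history frequencies are the right summary depends on the design; the cohort-based tables that mb.gof.test() builds are generally preferable to a hand-rolled statistic.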
Good luck,
Marc
Chi-square statistic = 2.70343e+152
Number of bootstrap samples = 1000
P-value = 0
Quantiles of bootstrapped statistics:
0% 25% 50% 75% 100%
6.8e+37 1.3e+46 3.3e+48 6.0e+50 1.2e+61
Estimate of c-hat = 1.944875e+94
Thank you,
Christina,
The chi-square is still huge and so is the c-hat estimate. I would not use models with such lack of fit for inference. I think you should back up a bit and do some reading on the approaches you are using; there are many resources available online and in print.
Best,
Marc