GOF test with contrasting results for the same Royle-Nichols model

102 views
Skip to first unread message

Nielson Pasqualotto

unread,
Jun 29, 2021, 11:46:55 AM6/29/21
to unmarked
Hi all, 

I've been performing a goodness-of-fit test for Royle-Nichols occupancy models using the same approach suggested by Dr. Jeffrey Royle a few years ago (https://groups.google.com/g/unmarked/c/ICKiDUp_sZM/m/Y-8zy0F-iSkJ) and also used by Paolino et al. 2018 (https://doi.org/10.1093/jmammal/gyy075). However, for the same global model and data, the result obtained with the approach aforementioned is a little bit different from that returned by the function mb.gof.test() from the AICcmodavg package. See below:

#### Global model ####
> global_RN_rNull

Call:
occuRN(formula = ~1 ~ ArMn_2k + SHDI_0.8k + NatF_2k + Sav_2k + 
    ManF_2k + Past_0.8k + Agr_0.8k, data = umf, K = 550)

Abundance:
            Estimate    SE      z P(>|z|)
(Intercept)   1.4873 0.817  1.820 0.06880
ArMn_2k       0.4514 0.193  2.337 0.01945
SHDI_0.8k     0.6341 0.207  3.056 0.00224
NatF_2k      -0.3254 0.169 -1.931 0.05350
Sav_2k       -0.7562 0.276 -2.738 0.00617
ManF_2k      -0.0706 0.117 -0.606 0.54476
Past_0.8k    -0.0444 0.168 -0.264 0.79160
Agr_0.8k     -0.3126 0.194 -1.608 0.10779

Detection:
 Estimate    SE     z P(>|z|)
    -2.41 0.884 -2.73 0.00639

AIC: 278.817 
####################


#### GOF without AICcmodavg ####
fitstats <- function(global_RN_rNull) {
  observed <- getY(umf)
  expected <- fitted(global_RN_rNull)
  resids <- residuals(global_RN_rNull) 
  n.obs <- apply(observed,1,sum,na.rm=TRUE)
  n.pred <- apply(expected,1,sum,na.rm=TRUE)
  sse <- sum(resids^2,na.rm=TRUE)
  chisq <- sum((observed - expected)^2 / expected,na.rm=TRUE)
  freeTuke <- sum((sqrt(observed) - sqrt(expected))^2,na.rm=TRUE)
  freeTuke.n <- sum((sqrt(n.obs)-sqrt(n.pred))^2,na.rm=TRUE)
  sse.n <- sum( (n.obs -n.pred)^2,na.rm=TRUE)
  chisq.n <- sum((n.obs - n.pred)^2 / expected,na.rm=TRUE)
  out <- c(SSE=sse, Chisq=chisq, freemanTukey=freeTuke,
           SSE.n = sse.n, Chisq.n = chisq.n, freemanTukey.n=freeTuke.n)
  return(out)
}

pb_global_RN_rNull <- parboot(global_RN_rNull, fitstats, nsim=10000, report=1, parallel=F)
pb_global_RN_rNull

> pb_2nd_rNull

Call: 
parboot(object = global_2nd_step_rNull, statistic = fitstats_2nd_rNull, nsim = 10000, report = 1, parallel = F)

Parametric Bootstrap Statistics:
                  t0 mean(t0 - t_B) StdDev(t0 - t_B) Pr(t_B > t0)
SSE             45.2          2.278             2.75        0.206
Chisq          147.3        -30.612            44.97        0.794
freemanTukey    56.8         -0.951             2.96        0.606
SSE.n           51.8         -8.443             4.38        1.000
Chisq.n        792.7       -276.973           399.61        0.843
freemanTukey.n  19.2         -1.337             1.42        0.832

t_B quantiles:
                0% 2.5% 25% 50%  75% 97.5%  100%
SSE             32   37  41  43   45    48    52
Chisq          111  128 151 168  193   283  1510
freemanTukey    49   52  56  58   60    64    74
SSE.n           52   54  57  59   63    71    87
Chisq.n        606  694 844 978 1181  1974 16148
freemanTukey.n  17   18  20  20   21    24    29

t0 = Original statistic computed from data
t_B = Vector of bootstrap samples

c_hat <- pb_global_RN_rNull@t0[2]/mean(pb_global...@t.star[,2])
names(c_hat) <- "c-hat"
c_hat

> c_hat
    c-hat 
0.8279319 
####################


#### GOF AICcmodavg ####
gof_global_RN_rNull <- mb.gof.test(global_RN_rNull, nsim = 10000, plot.hist = T,
                                  ncores = 10, parallel = T)
gof_global_RN_rNull

> gof_global_RN_rNull

MacKenzie and Bailey goodness-of-fit for Royle-Nichols occupancy model

Pearson chi-square table:

Cohort Observed Expected Chi-square
0000      0       15    15.34       0.01
0001      0        8     4.11       3.69
0010      0        3     4.11       0.30
0011      0        2     2.25       0.03
0100      0        4     4.11       0.00
0101      0        1     2.25       0.70
0110      0        3     2.25       0.25
0111      0        2     1.87       0.01
1000      0        3     4.11       0.30
1001      0        2     2.25       0.03
1010      0        1     2.25       0.70
1011      0        3     1.87       0.68
1101      0        4     1.87       2.42
1110      0        4     1.87       2.42

Chi-square statistic = 15.9886 
Number of bootstrap samples = 10000
P-value = 0.2667

Quantiles of bootstrapped statistics:
  0%  25%  50%  75% 100% 
 2.2  9.7 12.7 16.3 46.3 

Estimate of c-hat = 1.2 
####################

Since the chi-square statistic value in the first approach (147.3) is very different from that obtained with mb.gof.test function (15.9886), I suspect the main difference between these two approaches is related to how detection histories are used to calculated de chi-square statistic, but I'm not sure of this.

I guess mb.gof.test() is today our best option to perform a GOF test for RN models but I would like to better understand the main difference between the two approaches that I mentioned here. I tried to read the mb.gof.test() code (https://github.com/cran/AICcmodavg/blob/master/R/mb.gof.test.R) to get some insights but I haven't figured it out yet.

If anyone can help clarify this I would be deeply grateful.

All the best,

Marc J. Mazerolle

unread,
Jun 30, 2021, 9:07:23 AM6/30/21
to unma...@googlegroups.com
Hi Nielson,

The difference in the results between the goodness of fit tests in your
example is due to the difference in the test statistic used. The
mb.gof.test( ) function is based on the MacKenzie and Bailey goodness
of fit test, which computes the observed and expected frequencies for
each detection history. The chi-square is then conducted on these
observed and expected frequencies. In contrast, the test statistics in
your fitstats( ) function are based on the raw values. For binary data,
such as in detection/non detection, you should use test statistics
based on summarized information instead of raw values. The test
statistics in your fitstats( ) function are appropriate for count data.

Best,

Marc
--
____________________________________
Marc J. Mazerolle
Professeur agrégé et directeur du bac. en environnements naturels et aménagés
Département des sciences du bois et de la forêt
2405 rue de la Terrasse
Université Laval
Québec, Québec G1V 0A6, Canada
Tel: (418) 656-2131 ext. 407120
Email: marc.ma...@sbf.ulaval.ca

Veuillez noter que je suis en télétravail. Le meilleur moyen de me rejoindre est par courriel.

-------- Message initial --------
De: Nielson Pasqualotto <pasqu...@alumni.usp.br>
Répondre à: unma...@googlegroups.com
À: unmarked <unma...@googlegroups.com>
Objet: [unmarked] GOF test with contrasting results for the same Royle-
Nichols model
Date: Tue, 29 Jun 2021 08:43:49 -0700

[Externe UL*]
Reply all
Reply to author
Forward
0 new messages