Estimate the GOF and c-hat of RN model

574 views
Skip to first unread message

Sun, Ge

unread,
Jul 7, 2015, 2:11:01 PM7/7/15
to unma...@googlegroups.com
Hello,
I wonder is there any methods to estimate the GOF and c-hat of Royle-Nichols model? I used to use the following code to estimate the GOF of single-season occupancy model and RN model:

chisq=function(fm){
  umf=getData(fm)
  y=getY(umf)
  y[y>1]=1
  fv=fitted(fm,na.rm=T)
  y[is.na(fv)]=NA
  sum((y-fv)^2/(fv*(1-fv)),na.rm=T)
  }
pb=parboot(object=fmoR, statistic=chisq, nsim=10)

But after read the comments of Marc in this thread "Problems to assess the Goodness of fit in R&N models" (https://groups.google.com/forum/?hl=zh-CN#!topic/unmarked/ICKiDUp_sZM), I learnt that this method was inappropriate and what I should compare was the probabilities of different 1-0 detection histories rather than the "y-fv".
Thus I use the mb.gof.test function of AICcmodavg to estimate the GOF and c-hat of single-season occupancy model, but I haven't found a function which can estimate the GOF and c-hat of RN model.
Therefore I wonder whether it is appropriate to use the MacKenzie and Bailey (2004) method to estimate the GOF and c-hat of RN model? I am thinking about calculating the psi and p of each site from its lamda and r first, then calculating the probability of each 1-0 combination, finally using a method similar to the mb.gof.test method to calculate the chi-square statistic.
But since I am a beginner of R, it means a great challenge. On the other side, my main purpose of calculating c-hat is to estimate the fit of my camera-trap data before analysis. Thus could I borrow the c-hat value calculated from ordinary occupancy analysis to estimate the data-fit roughly when I conduct the RN model analysis?
Look forward to all your opinions. Thank you very much!
Best wishes,
Sun, Ge

Jeffrey Royle

unread,
Jul 7, 2015, 6:34:23 PM7/7/15
to unma...@googlegroups.com
Dear Sun,

 I think you're right that the MacKenzie and Bailey approach is, in principal, the right thing to do. I have not seen anyone implement that although computing the encounter history probabilities I don't think is a huge ordeal, just someone has to roll up their sleeves and get it done.
 Given that it doesn't exist (unless Marc Mazzerole has done it!) then I think your idea of using the ordinary occupancy model as a test is a good idea.  I think it has to be the case that if the occupancy model fits then so too must the RN model and the c-hat should be less under the RN model. This would be a nice simulation study to try out....  [note: I think the result is implied by the Royle (2006, Biometrics) paper].

regards
andy


--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kery Marc

unread,
Jul 8, 2015, 2:38:13 AM7/8/15
to unma...@googlegroups.com
Dear Sun,

I also think that using the MB GOF test for the occupancy model will probably give conservative results when you want to use the RN model, since the RN probably fits better, because it eliminates part of the site-by-site heterogeneity in p.

Another cite that may be relevant for the relationship among 3 hierarchical models that give estimates of occupancy is Dorazio (2007, Ecology).

Regards  --  Marc


From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of Jeffrey Royle [jar...@gmail.com]
Sent: 08 July 2015 00:34
To: unma...@googlegroups.com
Subject: Re: [unmarked] Estimate the GOF and c-hat of RN model

Marc Mazerolle

unread,
Jul 8, 2015, 2:48:40 AM7/8/15
to unma...@googlegroups.com
Dear Sun,

Unfortunately, I have not yet implemented the MacKenzie and Bailey GOF test for the RN model yet.

Best,

Marc Mazerolle

Sun, Ge

unread,
Jul 8, 2015, 4:15:32 AM7/8/15
to unma...@googlegroups.com

Thanks a lot for all of your comments, Andy, Marc and Marc Mazerolle ! I would continue to do more readings :)
Best regards,
Sun, Ge

paol...@usp.br

unread,
Jul 9, 2015, 2:57:21 PM7/9/15
to unma...@googlegroups.com
Hello,

Recently, I've also assessed the model fit of my RN models with the function parboot and the graphical analysis of the values of chi-squere, but with a different code, as recommended by Fiske & Chandler (2011):

set.seed(1234)

chisq2 <- function(fm) {
   observed <- getY(umfRN_b200_leg2)
   expected <- fitted(fm)
   sum((observed - expected)^2/expected, na.rm=T)   
  }

pb <- parboot(rDES_lamGl, statistic = chisq2, nsim = 500)

plot(pb, main = "Ajuste do modelo", ylab="Frequência", xlab="qui-quadrado", breaks=50, ylim=c(0,150),)

It look similar than the method you said that is inappropriate, so now I'm in doubt if I did right.

I'm gonna read the papers Andy and Marc suggested, but does anyone know if I can trust in the results of this analysis?

Thank you everyone.

Best regards,

Roberta.


--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Att.

Roberta Montanheiro Paolino
Bióloga
Mestranda em Biologia Comparada - FFCLRP / USP-RP
Laboratório de Ecologia e Conservação de Mamíferos

(16) 99182 3123

Jeffrey Royle

unread,
Jul 11, 2015, 11:53:29 AM7/11/15
to unma...@googlegroups.com
hi Roberta,

 I think for this GoF assessment to work properly you would have to base the fit statistic on the aggregate total detections per site or else the encounter history frequencies.  Using the fit statistic function you have you can compute the row sum of both the observed and predicted values and use the same chi-square statistic for that.
regards
andy

Sun, Ge

unread,
Jul 22, 2015, 11:10:49 AM7/22/15
to unmarked
Hi, Dear andy,
Do you mean I could use the mean value (the row sum divided by the No. of replications) of each row of the data frame as the Chi-square statistic and conduct the GoF?
Just as below:

chisqRN=function(fm){

  umf=getData(fm)
  y=getY(umf)
  y[y>1]=1
  y=apply(y,1,mean,na.rm=T)
  fv=fitted(fm,na.rm=T)
  fv=apply(fv,1,mean,na.rm=T)
  y[is.na(fv)]=NA
  sum((y-fv)^2/(fv*(1-fv)),na.rm=T)
  }
pb=parboot(object=fmoFit, statistic=chisqRN, nsim=50)

Thanks a lot! Best regards,
Sun, Ge

Jeffrey Royle

unread,
Jul 22, 2015, 12:32:35 PM7/22/15
to unma...@googlegroups.com
I was thinking you would use the TOTAL number of detections, not the mean. Of course this should come out the same if you're also using the mean of the fitted values which you seem to be doing.

I would highly recommend doing some simulation evaluations of the approach no matter what you do.

regards
andy

paol...@usp.br

unread,
Jul 23, 2015, 11:52:39 AM7/23/15
to unma...@googlegroups.com
Hi Andy and Sun,

Thank you very much for your suggestions, they are very helpful. I'm gonna try the total numbers of detections doing the sum of observed and predicted values.

Thanks again.

Best regards,
Roberta



Sun, Ge

unread,
Jul 26, 2015, 10:52:56 PM7/26/15
to unmarked, jar...@gmail.com
Thank you very much, andy! I have done the simulation evalution and it seems trustful to me.
Reply all
Reply to author
Forward
0 new messages