Estimating detection probability with categorical covariates

charlotte brown

unread,

Jan 9, 2015, 5:19:02 PM1/9/15

to unma...@googlegroups.com

Dear Unmarked Community,

I am using a pcountOpen model with 4 continuous and 1 categorical covariate in the detection model to estimate the effects of those covariates on detection probability. My question concerns how to treat the categorical covariate when estimating the effects of the continuous ones on p. I have set the covariate of interest as a sequence of numbers between its max and min scaled values, set other continuous covariates at their mean values, and the categorical covariate at one of the three possible levels. Code below:

>fm_best <- pcountOpen(~1, ~1, ~1, ~Temp + I(Temp^2) + TAS + I(TAS^2)+ Topo + DOY + MRain , cnti.umf, K=70, mixture = "P", dynamics = "trend", se = TRUE)

>topos<-as.character(c("valley floor","rocky slope","bajada"))

>new.dat12<-data.frame(TAS=seq(-2.3467, 3.197,length=32), Temp=0,MRain=0,DOY=0, Topo=factor("valley floor",levels=c(unique(topos))))

>TAS.dat12<-predict(fm_best,type="det",newdata=new.dat12,appendData=T)

I have had no trouble predicting p like this but am unsure if I should be computing average p for each continuous covariate among all levels of the categorical covariate (e.g., different Topo levels).

(1) Is it best to average the 3 predicted values of p at each possible levels (in other words, should I use the output of rowMeans(Avg) as seen below)?

>DOY.M=seq(-3.163,2.598,length=100)

>new.datDOY<-data.frame(DOY=DOY.M,MRain=0,TAS=0,Temp=0, Topo=factor("valley floor",levels=c(unique(topos))))

>DOY.dat<-predict(fm_best,type="det",newdata=new.datDOY,appendData=T)

>new.datDOY2<-data.frame(DOY=DOY.M,MRain=0,TAS=0,Temp=0, Topo=factor("rocky slope",levels=c(unique(topos))))

>DOY.dat2<-predict(fm_best,type="det",newdata=new.datDOY2,appendData=T)

>new.datDOY3<-data.frame(DOY=DOY.M,MRain=0,TAS=0,Temp=0, Topo=factor("bajada",levels=c(unique(topos))))

>DOY.dat3<-predict(fm_best,type="det",newdata=new.datDOY3,appendData=T)

>Avg=cbind(DOY.dat[,1],DOY.dat2[,1],DOY.dat3[,1])

>rowMeans(Avg)

(2) If so, how would I calculate the CIs? In Fig. 2 of Kery (2009), “Trend estimation in populations with imperfect detection”, predictions of detection probability were calculated by averaging over the categorical covariates, but how were the 95% CI calculated?

Any help on this would be appreciated!

Charlotte

Kery Marc

unread,

Jan 10, 2015, 9:24:32 AM1/10/15

to unma...@googlegroups.com

Dear Charlotte,

I think I would simply refit the model without the factor and then you're fine with the predict() function.

In the paper you cite we used BUGS and one of the greatest things about a Bayesian MCMC-based analysis is that you can forget about the delta method: getting SE's and other measures of estimation uncertainty are coming for free.

Kind regards --- Marc

______________________________________________________________

Marc Kéry
Tel. ++41 41 462 97 93
marc...@vogelwarte.ch
www.vogelwarte.ch

Swiss Ornithological Institute | Seerose 1 | CH-6204 Sempach | Switzerland
______________________________________________________________

*** Introduction to Bayesian statistical modeling: Kéry (2010), Introduction to WinBUGS for Ecologists, Academic Press; see www.mbr-pwrc.usgs.gov/pubanalysis/kerybook
*** Book on Bayesian statistical modeling: Kéry & Schaub (2012), Bayesian Population Analysis using WinBUGS, Academic Press; see www.vogelwarte.ch/bpa
*** Upcoming workshops: www.phidot.org/forum/viewforum.php?f=8

From: unma...@googlegroups.com [unma...@googlegroups.com] on behalf of charlotte brown [charlott...@gmail.com]
Sent: 09 January 2015 23:19
To: unma...@googlegroups.com
Subject: [unmarked] Estimating detection probability with categorical covariates

--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

charlotte brown

unread,

Jan 13, 2015, 4:17:38 PM1/13/15

to unma...@googlegroups.com

Hi Marc,

Thank you for the help you have already provided me. As a follow up, I removed the Topo factor from the new dataframe I used to predict detection probabilities. However, when I do this the detection probability curve is no longer smooth (most likely because the maximum detection probabilities are much smaller for some factors of Topo compared to others?). Is there a way to mitigate this? Or am I doing something incorrectly in my code?

Attached is my R code and a picture of the detection probability plot

>fm_best <- pcountOpen(~1, ~1, ~1, ~Temp + I(Temp^2) + TAS + I(TAS^2)+ Topo + DOY + MRain , cnti.umf, K=70, mixture = "P", dynamics = "trend", se = TRUE)

>new.dat12<-data.frame(TAS=seq(-2.3467, 3.197,length=32), Temp=0,MRain=0,DOY=0)

>TAS.dat12<-predict(fm_best,type="det",newdata=new.dat12,appendData=T)

#unstandardizing

>TAS<-as.matrix(CNTI[,256:455])

>sd.TAS<-sd(c(TAS),na.rm=T)

>mean.TAS<-mean(TAS,na.rm=T)

>TAS<-(TAS-mean.TAS)/sd.TAS

>ORIG.TAS<-TAS.dat12$TAS*sd.TAS + mean.TAS

#plotting

> op<-par(mfrow=c(1,1))

>with(TAS.dat12,{

> plot(ORIG.TAS, Predicted, type="l", ylim=c(0,.13),

> xlab="TAS",ylab="Detection Probability")

> lines(ORIG.TAS, upper,col="gray")

> lines(ORIG.TAS, lower,col="gray")

>})

>par(op)

Thank you for your help.

Best,

Charlotte

Screen Shot 2015-01-13 at 2.06.30 PM.png

Kery Marc

unread,

Jan 14, 2015, 3:26:52 AM1/14/15

to unma...@googlegroups.com

Dear Charlotte,

I actually meant remove TOPO from the model, i.e., refit the model without the factor, and not simply removing if from the data frame (which I am not even sure why this works at all). I think that this will result in predictions for the continuous covariates that are automatically averaged over the effects of the TOPO factor (in the same sense as a model always implicitly averages over any factor/covariate that is not in it).

One comment about forming predictions, which is notoriously confusing when transformations are involved. Here is how I do it in a way that I get less easily confused.

(1) I first compute the "prediction covariate", which contains values in the range of the continuous covariate over which I want to predict:

TAS.original <- seq(-2.3, 3.2, length = 100) # More values result in smoother plots

(2) Then, I transform the values of the prediction covariate in the exact same way as I did in my analysis:

TAS <- (TAS.original – mean.TAS) / sd.TAS # Where mean.TAS and sd.TAS are computed from your actual data

(3) Then, I predict using the TAS covariate (which is standardised as in the analysis), i.e., I create the prediction data frame and use it for prediction:

new.dat <- data.frame(TAS=TAS, Temp=0,MRain=0,DOY=0)

predi.p.for.tas <- predict(fm_best, type="det",newdata=new.dat,appendData=T)

(4) Finally, when plotting, I use the original (unstandardised) values of TAS, e.g.

plot(TAS.original, predi.p.for.tas[1,])

Kind regards ---- Marc

______________________________________________________________

Marc Kéry
Tel. ++41 41 462 97 93
marc...@vogelwarte.ch
www.vogelwarte.ch

Swiss Ornithological Institute | Seerose 1 | CH-6204 Sempach | Switzerland
______________________________________________________________

*** Intro book on Bayesian statistical modeling: Kéry, 2010, Introduction to WinBUGS for Ecologists, Academic Press; see www.mbr-pwrc.usgs.gov/pubanalysis/kerybook
*** Book on Bayesian statistical modeling: Kéry & Schaub, 2012, Bayesian Population Analysis using WinBUGS, Academic Press; see www.vogelwarte.ch/bpa
*** Upcoming workshops: http://www.phidot.org/forum/viewforum.php?f=8

Kery Marc

unread,

Jan 14, 2015, 3:28:40 AM1/14/15

to unma...@googlegroups.com

.... where the last code line should rather be:

plot(TAS.original, predi.p.for.tas[,1])

charlotte brown

unread,

Jan 14, 2015, 6:01:21 PM1/14/15

to unma...@googlegroups.com

Hi Marc,

Thank you for clearing those things up! It was extremely helpful!

Best,

Charlotte

Paul Burr

unread,

Sep 5, 2018, 11:15:20 AM9/5/18

to unmarked

Hello Marc,

I am currently trying to do the exact same thing Charlotte did back when she posted this question. Specifically, I want to graph occupancy probability over the measured range of a continuous variable (Tree_dist). However, I have included a categorical variable in my occupancy model as well. Similar to what Charlotte was looking for, I want to predict occupancy over all values of Tree_Dist while averaging over the categorical covariate. Just like you would use the mean of another continuous variable.

I followed your suggestion by just fitting a new model without the categorical variable and used that model to predict occupancy over all values of Tree_dist. I am curious if this is still the best way to do this? When I include the categorical variable in my model, Tree_dist is NOT significant (p=0.2). However, when I remove the categorical variable and refit the model, Tree_dist IS significant (p < 0.0001).

If there were some way to predict occupancy over the range of Tree_dist while holding the categorical variable 'at its mean', I imagine the graph would look very different from the one created using the model that just didn't include the categorical variable all together.

Hopefully that all makes sense. Any thoughts or suggestions are much appreciated!