Group and season-specific occupancy estimates using colext

404 views
Skip to first unread message

Billy

unread,
Mar 23, 2020, 7:45:24 AM3/23/20
to unmarked
Hi all, 

I have a question that I haven't been able to find an answer for yet on the forum. 

I've got a multi-season dataset (21 seasons, 150 sites, uneven effort at each site) that I'm running colext models for, so that I can model occupancy dynamics over time. Summary as follows:

unmarkedFrame Object

150 sites
Maximum number of observations per site: 252 
Mean number of observations per site: 79.17 
Number of primary survey periods: 21 
Number of secondary survey periods: 12 
Sites with at least one detection: 147 

Tabulation of y observations:
    0     1  <NA> 
10620  1255 25925 

My question that I am trying to answer is how occupancy, colonisation and extinction differ between two treatments. So my sitecov matrix looks like this:

sitecovs = data.frame(Treatment = matrix(c(rep("T",25), rep("C", 25), rep("T", 25), 
                                           rep("T", 25), rep("C", 25), rep("C",25)), nrow=150))

And I also want to test for differences across seasons, so this is in the yearlysitecovs: 

season = matrix(c("1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21"), 150, 21, byrow=TRUE)


And my formula looks like this:

m1 <- colext(psiformula = ~Treatment-1, # Occupancy
              gammaformula = ~ Treatment*season-1, # Colonization
              epsilonformula = ~ Treatment*season-1, # Extinction
              pformula = ~ 1, # Detection
              data = umf)

I am wanting to calculate Occupancy at each time step at each treatment. I have been able to follow the example in the dynamic occupancy model vignette and produce an occupancy estimate over time across all my sites, but I now want to look at the two treatments specifically. 

My understanding is that I need to calculate this manually using the initial occupancy, and colonisation and extinction estimates and iterate through each time step using a function, and then use nonparboot to get the CIs. I've modified a function from a similar question which gets me most of the way there (https://groups.google.com/forum/#!topic/unmarked/xXAYH19pLvg), but I can't work out how best to modify it to capture the Treatment-specific and season-specific colonisation and extinction estimates (presumably they also require some for loops?):
psi.year<-function(fm){
    psi.hat<-plogis(c(sum(coef(fm, type='psi')),coef(fm, type='psi')[1]))
    gam.hat<-plogis(c(sum(coef(fm, type='col')),coef(fm, type='col')[1]))
    eps.hat<-plogis(c(sum(coef(fm, type='ext')),coef(fm, type='ext')[1]))
    psi.mat<-matrix(NA,21,2) #21 'seasons' of data, 2 columns - 2 treatments
    psi.mat[1,]<-psi.hat
    for(t in 2:21){
      psi.mat[t,]<-(psi.mat[t-1,]*(1-eps.hat))+((1-psi.mat[t-1,])*gam.hat)
    }
    return(c(psi.mat)) #Had to turn it into a vector so parboot would run
}

So, my question is, how should I modify the above function to get treatment and season-specific estimates of Occupancy? Or is there a better approach to achieving the same result that i'm missing (e.g. is it better to just use a stacked approach)? 

Thanks in advance, 
Billy

Billy

unread,
Mar 30, 2020, 6:32:57 AM3/30/20
to unmarked
Anyone?

Ken Kellner

unread,
Mar 30, 2020, 4:41:34 PM3/30/20
to unmarked
Hi,

If I am understanding your question correctly, a potential alternative approach would be to use the empirical Bayes posterior predictive distribution. For some discussion see here:


I wrote up a quick example based on the `colext` frog example below. In the future some of the functions I used will be built-in to unmarked (they already are in the dev version).

library(unmarked)

data
(frogs)
umf
<- formatMult(masspcru)
obsCovs
(umf) <- scale(obsCovs(umf))
     
## Use 1/4 of data just for run speed in example
umf
<- umf[which((1:numSites(umf)) %% 4 == 0),]

#Add fake site covariate as an example
#takes values 'a' or 'b'
siteCovs
(umf) <- data.frame(fakeCov=factor(sample(c('a','b'), numSites(umf),
                                    replace
=T)))
fm
<- colext(psiformula = ~ fakeCov,
             gammaformula
= ~ fakeCov,
             epsilonformula
= ~ fakeCov,
             pformula
= ~ JulianDate + I(JulianDate^2), umf)

# Empirical Bayes estimates of number of sites occupied in each year
re
<- ranef(fm)

########################################################################
#These two functions below will be built-in in next version of unmarked
#Sample from posterior
postSamples
<- function(object, nsims=100){

  N
<- dim(object@post)[1]
  K
<- dim(object@post)[2]
  T
<- dim(object@post)[3]

 
out <- array(NA, c(N, T, nsims))

 
for (n in 1:N){
   
for (t in 1:T){
       
out[n, t, ] <- sample(0:(K-1), nsims, replace=TRUE,
                              prob
=object@post[n,,t])
   
}
 
}
 
out
}

#predict method for ranef
prRanef
<- function(object, func, nsims=100){

  ps
<- postSamples(object, nsims=nsims)
  s1
<- func(ps[,,1])
  nm
<- names(s1)
  row_nm
<- rownames(s1)
  col_nm
<- colnames(s1)

 
if(is.vector(s1)){
    out_dim
<- c(length(s1), nsims)
 
} else{
    out_dim
<- c(dim(s1), nsims)
 
}

  param
<- apply(ps, 3, func)

 
out <- array(param, out_dim)
 
 
if(is.vector(s1)){
    rownames
(out) <- nm
 
} else {
    rownames
(out) <- row_nm
    colnames
(out) <- col_nm
 
}

  drop
(out)
}
###############################################################################

#Custom function to calculate mean occupancy for each group fakeCov='a' or 'b'
#in each time step, for a particular sample from posterior pred distribution
avgOcc
<- function(x){ #x takes as input M x T matrix (M sites, T years)
 
  nyr
<- umf@numPrimary
 
#output is matrix: 2x7, 2 groups (a and b), 7 years
 
out <- matrix(NA, nrow=2, ncol=nyr)
  colnames
(out) <- paste0('yr',1:nyr)
  rownames
(out) <- c('fakeCovA', 'fakeCovB')

 
#For each year, calculate the mean occupancy for each of the 2 groups
 
for (t in 1:nyr){
   
out[1,t] <- mean(x[siteCovs(umf)$fakeCov=='a',t])
   
out[2,t] <- mean(x[siteCovs(umf)$fakeCov=='b',t])
 
}

 
out

}

#Take 100 samples from posterior predictive distribution
#and apply the function each sample
set.seed(123)
pr
<- prRanef(re, func=avgOcc, nsims=100)

#Output is 2x7x100
pr

#Mean occupancy for each treatment group for each time period
apply
(pr, c(1,2), mean)

#Distribution of values for treatment a in year 1
hist
(pr[1,1,])

#Fancy plot of mean occupancy for each treatment over time
library
(tidyverse)
library
(ggplot2)

#Calculate means and 95% CI lower and upper bounds and tidy
mns
<- data.frame(apply(pr, c(1,2), mean)) %>%
  rownames_to_column
('treatment') %>%
  gather
(key='year',value='occ', yr1:yr7)

lower
<- data.frame(apply(pr, c(1,2), quantile, 0.025)) %>%
  rownames_to_column
('treatment') %>%
  gather
(key='year',value='lower', yr1:yr7)

upper
<- data.frame(apply(pr, c(1,2), quantile, 0.975)) %>%
  rownames_to_column
('treatment') %>%
  gather
(key='year',value='upper', yr1:yr7)

plot_dat
<- mns %>% left_join(lower) %>% left_join(upper)

plot_dat
%>%
  mutate
(year = as.numeric(gsub("yr", "", year))) %>% #convert yr to numeric
  ggplot
(aes(x=year, y=occ)) +
  geom_ribbon
(aes(ymin=lower, ymax=upper,fill=treatment),alpha=0.4) +
  geom_line
(aes(col=treatment)) +
  geom_point
(aes(col=treatment), size=2) +
  labs
(x='Year',y='Mean occupancy') +
  theme_bw
() +
  theme
(panel.grid.major=element_blank(), panel.grid.minor=element_blank())
   


occ_fig.png

Billy

unread,
Apr 6, 2020, 6:43:35 AM4/6/20
to unmarked
Hi Ken, 
Thanks so much for this and such a clear explanation/worked example, it worked perfectly when applied to my data. 

Cheers 
Billy

Billy

unread,
Aug 10, 2020, 10:25:17 PM8/10/20
to unmarked
Hi Ken and others, 

I have a follow up question. Is there a similar set of functions that could be used to calculate group-specific abundance estimates and CIs using the empirical Bayes posterior predictive distribution?

I am trying to do this using a capture recapture dataset with multinomPois models, but getting stuck at the prediction stage as the CIs coming out of the predict function (both SE*1.96 and the upper/lower) are very uniform, which doesn't seem right to me. 

Thanks in advance


On Tuesday, March 31, 2020 at 7:41:34 AM UTC+11, Ken Kellner wrote:

Ken Kellner

unread,
Aug 11, 2020, 9:56:47 AM8/11/20
to unmarked
The same approach should work for abundance models also. Assuming you have the latest unmarked, see the following example with multinomPois:

library(unmarked)

data(ovendata)

set.seed(123)
sc <- as.data.frame(scale(ovendata.list$covariates[,-1]))
#Add dummy factor covariate
sc$facCov <- factor(sample(c("a","b"), nrow(sc), replace=TRUE))

#Create umf
ovenFrame <- unmarkedFrameMPois(ovendata.list$data,
  siteCovs=sc,
  type = "removal")

fm1 <- multinomPois(~ 1 ~ ufc + facCov, ovenFrame)

r <- ranef(fm1)

#Function to apply to each simulation
#Mean abundance by group
group_func <- function(x){
  out <- numeric(2)
  out[1] <- mean(x[sc$facCov=="a"])
  out[2] <- mean(x[sc$facCov=="b"])
  out
}

pr <- predict(r, func=group_func, nsims=100)
#output is 2 x nsims

#Get mean and 95% CIs
plot_inp <- data.frame(group=c("a","b"), mean=rowMeans(pr),
                       lower=apply(pr, 1, quantile, 0.025),
                       upper=apply(pr, 1, quantile, 0.975))

#Plot
library(ggplot2)
library(tidyverse)
plot_inp %>%
  ggplot(aes(x=group, y=mean)) +
  geom_errorbar(aes(ymin=lower, ymax=upper), width=0.2) +
  geom_point(size=2) +
  labs(y="Mean abundance", x="Group")

Plot:

mpois.png


It's possible that your dataset or model have some other issues that are giving you strange results, but it's hard to help with that without seeing the code and output.

Ken

Richard Chandler

unread,
Aug 11, 2020, 10:18:43 AM8/11/20
to Unmarked package
Hi everyone,

I'd just like to emphasize that Ken's addition of a `predict` method for `ranef` objects is a great new feature that many users may not have noticed. It allows for posterior prediction using Empirical Bayes methods. In other words, you can compute predictive distributions for functions of latent variables such as abundance or occurrence. This is very useful for generating point estimates and confidence intervals for derived quantities such as total abundance in the surveyed region or the proportion of sites occupied. 

Unlike the `predict` methods that have always been available, which focus on the *expected* values of abundance or occurrence, the Empirical Bayes `predict` methods focus on *realized* values of abundance and occurrence, which are specific to the sampled plots and conditional on the observed data. Both types of prediction are important for different objectives, but it wasn't until recently that the posterior prediction methods were added. Thanks Ken!

Richard


--
You received this message because you are subscribed to the Google Groups "unmarked" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unmarked+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/unmarked/c522cd27-8a5d-41b5-b661-10b89eb30dd6o%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages