Why does scaling a covariate change the occupancy estimate?

Chris Baker

Jun 5, 2018, 1:01:05 PM
to unmarked
I am estimating occupancy models with an observation covariate and do not understand why the scaling of the covariate affects the occupancy estimate in the way that it does. I understand that scaling/centering the covariate may affect the interpretation of the coefficient on the covariate, but that coefficient is not of direct interest to me -- I am including the covariate only to obtain better estimates of occupancy.

To illustrate, consider the following example with a fake covariate. For small covariate values, the occupancy estimate comes out close to 1. For 'large' covariate values, e.g. 10^5, the occupancy estimate comes out close to 0.5.

Why does the multiplier on the covariate affect the occupancy estimate, and not just get absorbed into the estimated coefficient on the covariate?

library("unmarked")
data(frogs)
umf <- unmarkedFrameOccu(pfer.bin)
boxplot(
  sapply(10^(1:6), function (multiplier) {
    sapply(1:10, function (X) {
      obsCovs(umf) <- data.frame(obsvar1 = multiplier * rnorm(numSites(umf) * obsNum(umf)))
      fm <- occu(~ obsvar1 ~ 1, umf)
      backTransform(fm["state"])@estimate
    })
  }),
  xlab="power on obsCov multiplier", ylab="occupancy estimate"
)


Richard Chandler

Jun 5, 2018, 2:02:04 PM
to Unmarked package

Hi Chris,

This isn’t a problem in general, but you are using very extreme covariate values. When a covariate takes on such extreme values, it causes numerical problems on the logit scale, making it hard for the optimizer to find the MLEs. For example, if you fiddle around with this code you will see how sensitive the response is to small changes in beta1:

beta0 <- -1
beta1 <- 0.01
x <- 1e6*rnorm(100)      # covariate with extreme values
plogis(beta0 + beta1*x)

The solution is to avoid extreme covariate values by using a z-transformation or similar. This is true for most GLM-type models, not just the models in unmarked.
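
As a rough illustration of the point above, here is a minimal sketch that compares the raw and z-transformed versions of the same covariate. The names x_raw and x_std, the seed, and the 1e-6 cutoff for "saturated" are illustrative choices only, not anything taken from unmarked.

```r
set.seed(1)
beta0 <- -1
beta1 <- 0.01

x_raw <- 1e6 * rnorm(100)           # covariate with extreme values
x_std <- as.numeric(scale(x_raw))   # z-transform: mean 0, sd 1

p_raw <- plogis(beta0 + beta1 * x_raw)
p_std <- plogis(beta0 + beta1 * x_std)

# Share of probabilities pinned within 1e-6 of 0 or 1 (illustrative cutoff)
frac_saturated_raw <- mean(p_raw < 1e-6 | p_raw > 1 - 1e-6)
frac_saturated_std <- mean(p_std < 1e-6 | p_std > 1 - 1e-6)

summary(p_raw)
summary(p_std)
```

With the raw covariate, almost every probability sits at 0 or 1, which is the kind of surface an optimizer struggles with. After the z-transformation the probabilities stay well inside (0, 1).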

Richard


--
Richard Chandler
Assistant Professor
Warnell School of Forestry and Natural Resources
University of Georgia

Jim Baldwin

Jun 5, 2018, 2:50:38 PM
to unma...@googlegroups.com
Just to follow up on Richard's good advice... Your code produces the following:

[Figure not preserved: boxplots of occupancy estimate against power on obsCov multiplier]

(Note that the estimate of occupancy without any observational covariates is very close to 1.0)

You could also use your example to show the need to standardize. I've added some lines to standardize the observational covariates:

library("unmarked")
data(frogs)
umf <- unmarkedFrameOccu(pfer.bin)

# Get estimate of occupancy with no observational covariates
fm <- occu(~ 1 ~ 1, umf)
psi0 <- backTransform(fm["state"])@estimate


set.seed(12345)
boxplot(
  sapply(10^(1:6), function (multiplier) {
    sapply(1:10, function (X) {
      # Simulate the covariate at the given scale, then z-transform it
      x <- multiplier * rnorm(numSites(umf) * obsNum(umf))
      x <- (x - mean(x)) / sd(x)
      obsCovs(umf) <- data.frame(obsvar1 = x)
      fm <- occu(~ obsvar1 ~ 1, umf)
      backTransform(fm["state"])@estimate
    })
  }),
  xlab="power on obsCov multiplier", ylab="occupancy estimate"
)
# Reference line: occupancy estimate from the model with no covariates
lines(c(0,7), psi0*c(1,1), col="red", lwd=3)


The red line is the estimate of occupancy when there are no covariates.
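
On the original question of whether the multiplier should just be absorbed into the coefficient: in exact arithmetic it is, because standardizing only reparameterizes the model. The sketch below shows this with a plain logistic regression in base R as a stand-in (an assumption made for simplicity; it is not an occupancy model and does not use unmarked). The simulated data, the true coefficients, and all variable names are hypothetical.

```r
set.seed(42)
n <- 500
x_raw <- 1e5 * rnorm(n)                       # hypothetical large-scale covariate
x_std <- (x_raw - mean(x_raw)) / sd(x_raw)    # z-transform
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x_std)) # simulated binary response

# Fit on the standardized covariate, where the optimizer is well behaved
fit <- glm(y ~ x_std, family = binomial)
b_std <- coef(fit)

# Back-convert the coefficients to the original units of the covariate
b1_raw <- b_std[2] / sd(x_raw)
b0_raw <- b_std[1] - b_std[2] * mean(x_raw) / sd(x_raw)

# Fitted probabilities agree under either parameterization
p_std <- plogis(b_std[1] + b_std[2] * x_std)
p_raw <- plogis(b0_raw + b1_raw * x_raw)
max_diff <- max(abs(p_std - p_raw))
max_diff
```

So fitting on the standardized scale loses nothing: the slope can be converted back to the original units afterwards, and the fitted probabilities are the same up to floating-point error.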

Jim

Chris Baker

Jun 5, 2018, 3:19:28 PM
to unmarked
Ah, that makes sense. Thanks Richard and Jim for your quick responses!

C