Null model occupancy error


Shona Macaffer

Feb 14, 2022, 9:30:15 AM
to unmarked
Hello all, 

I am very new to R and unmarked, but I am trying to run a single-season site-occupancy model for my master's thesis. 
I have data from 89 sites with 23 surveys at each. There were only 6 detections, at 6 different sites. When I try to run the null model, without covariates, the occupancy estimate always comes out as 1. I am wondering if there is a problem with the small sample size or perhaps with the code? 

I am running the code as follows: 

> library(unmarked)
> df <- read.csv("detect_data.csv",
                 row.names = "X")

> umf <- unmarkedFrameOccu(y = as.matrix(df))

unmarkedFrame Object

89 sites
Maximum number of observations per site: 23
Mean number of observations per site: 23
Sites with at least one detection: 6

Tabulation of y observations:
   0    1
2041    6 

> # double formula: detection first, then occupancy
> fm <- occu(formula = ~ 1 ~ 1,
             data = umf)

Call:
occu(formula = ~1 ~ 1, data = umf)

Occupancy (logit-scale):
 Estimate   SE       z P(>|z|)
     23.4 3537 0.00662   0.995

Detection (logit-scale):
 Estimate    SE     z  P(>|z|)
    -5.83 0.409 -14.3 3.99e-46

AIC: 85.97085
Number of sites: 89
optim convergence code: 0
optim iterations: 24
Bootstrap iterations: 0 

> backTransform(fm, type = "state")

Backtransformed linear combination(s) of Occupancy estimate(s)

 Estimate       SE LinComb (Intercept)
        1 2.41e-07    23.4           1

Transformation: logistic 

> backTransform(fm, type = 'det')
Backtransformed linear combination(s) of Detection estimate(s)

 Estimate      SE LinComb (Intercept)
  0.00293 0.00119   -5.83           1

Transformation: logistic

> boot::inv.logit(coef(fm)[1])
psi(Int)
       1 


Any help would be really appreciated! 
Thanks, 
Shona


Marc Kery

Feb 14, 2022, 10:37:38 AM
to unmarked
Dear Shona,

even though the optim convergence code is 0, which means "good", you seem to have a numerical failure here, I think. Or at least you have what is called a boundary estimate for occupancy, i.e. one that sits at the bound of the space where the parameter is defined (here, 1). With only a single detection out of 23 visits at each of that handful of 6 sites, detection probability looks minute (plogis(-5.83) = 0.003). Even for the 23 visits combined, it is still only 1 - (1-0.003)^23 = 0.07. A simplistic estimator of the number of occupied sites is the observed number divided by detection probability. You have 6 sites with observed presences, and dividing that by 0.07 gives essentially your number of sites (up to rounding error). Hence, all sites are guessed to be occupied. This may be a reasonable result, but more likely it is not, and is due to some instability with these very sparse data.
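The arithmetic above can be checked in a few lines of base R. This is only a sketch of the back-of-the-envelope calculation, not something unmarked does internally; the variable names are made up for illustration, and the inputs (-5.83, 23 visits, 6 sites with detections, 89 sites) are taken from the output in the original post.

```r
# Per-visit detection probability from the logit-scale estimate
p_hat <- plogis(-5.83)

# Probability of at least one detection over all 23 visits at an occupied site
p_star <- 1 - (1 - p_hat)^23

# Simplistic estimate of the number of occupied sites:
# sites with detections divided by the cumulative detection probability
n_naive <- 6 / p_star

# The implied occupancy cannot exceed 1, so it is capped at the boundary
psi_naive <- min(1, n_naive / 89)

print(c(p_hat = p_hat, p_star = p_star, n_naive = n_naive, psi_naive = psi_naive))
```

Using the unrounded values, the simplistic estimate comes out slightly above the 89 sites surveyed, which is another way of seeing why the occupancy estimate is pushed to 1.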

In general, I am afraid to say, you have only a minute amount of information about your species. With even the simplest possible model giving results of doubtful use, I don't think there is any statistical modeling that can be done with these data.

Happy to hear anybody say something more optimistic!

Best regards  --- Marc



Jim Baldwin

Feb 14, 2022, 1:14:06 PM
to unma...@googlegroups.com
Not more optimistic, but just a confirmation of the boundary estimate of 1 for the occupancy probability.

The log of the likelihood for the null model is

[image: log-likelihood formula]

That is maximized at psi = 1 and p = 6/(23*89) = 0.00293112, where the log of the likelihood equals -40.985425.
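The formula image is not preserved here. Assuming the standard single-season null model with constant psi and p, and using the fact that each of the 6 sites with detections had exactly one detection in 23 visits while the other 83 sites had none, the log-likelihood would take roughly the following form. This is a reconstruction, not necessarily the exact expression in the image.

```latex
\ell(\psi, p) = 6 \log\!\left[ \psi \, p \, (1-p)^{22} \right]
              + 83 \log\!\left[ \psi \, (1-p)^{23} + 1 - \psi \right]

% On the boundary \psi = 1 this reduces to a binomial log-likelihood
% with 6 detections in 23 \times 89 = 2047 visits:
\ell(1, p) = 6 \log p + 2041 \log(1-p),
\qquad
\hat{p} = \frac{6}{2047} \approx 0.00293112,
\qquad
\ell(1, \hat{p}) \approx -40.985425
```

The value at the maximum agrees with the reported AIC, since 2 * 2 + 2 * 40.985425 = 85.97085.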

A contour plot of the log of the likelihood also confirms this (where the red dot is where the maximum likelihood occurs):

[image: contour plot of the log-likelihood]
Plotting p on a logit scale would probably make this look a little clearer but that might be unnecessary.  Here's a 3D plot:

[image: 3D plot of the log-likelihood]
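Since the plots are not preserved, here is a minimal base-R sketch that should reproduce the same conclusion. It assumes the standard null-model likelihood for 6 sites with exactly one detection in 23 visits and 83 sites with none; the function and variable names are made up for illustration and this is not the code used for the original plots.

```r
# Log-likelihood of the null occupancy model for this data summary
# (assumed form: constant psi and p, one detection at each detection site)
loglik <- function(psi, p, n_det = 6, n_sites = 89, n_visits = 23) {
  ll_det  <- n_det * (log(psi) + log(p) + (n_visits - 1) * log1p(-p))
  ll_none <- (n_sites - n_det) * log(psi * (1 - p)^n_visits + (1 - psi))
  ll_det + ll_none
}

# Profile likelihood: for each psi on a grid, maximise over p
psi_grid <- (10:100) / 100
profile_fit <- lapply(psi_grid, function(s) {
  optimize(function(p) loglik(s, p), interval = c(1e-6, 0.5),
           maximum = TRUE, tol = 1e-10)
})
profile_ll <- sapply(profile_fit, function(f) f$objective)

best    <- which.max(profile_ll)
psi_hat <- psi_grid[best]               # psi with the highest profile value
p_hat   <- profile_fit[[best]]$maximum  # maximising p at that psi
max_ll  <- profile_ll[best]
aic     <- 2 * 2 - 2 * max_ll           # two parameters: psi and p

print(c(psi_hat = psi_hat, p_hat = p_hat, max_ll = max_ll, aic = aic))

# Contour plot of the surface, drawn only in an interactive session
if (interactive()) {
  p_grid <- seq(0.001, 0.02, length.out = 100)
  z <- outer(psi_grid, p_grid, loglik)
  contour(psi_grid, p_grid, z, xlab = "psi", ylab = "p")
  points(psi_hat, p_hat, col = "red", pch = 19)
}
```

The profile keeps increasing all the way to psi = 1, which is what a boundary estimate looks like: the data never give the likelihood a reason to turn back down before the edge of the parameter space.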

"One who lives by maximum likelihood dies by maximum likelihood."  (And I think you can substitute AIC and AICc for maximum likelihood, too.)

Jim


Shona Macaffer

Feb 15, 2022, 7:49:26 AM
to unmarked
Dear Marc and Jim, 

Thank you so much for your very clear explanations. It makes a lot of sense. 
I will have to make another plan with the data. 

Cheers, 
Shona 