Hessian is singular error for Null model- help!


Vishnu r menon

Nov 1, 2021, 6:19:17 PM
to unmarked
Hello team,

I'm running a single-season, single-species occupancy model, but my null model keeps giving the error 'Hessian is singular'. Naive occupancy for the species is fairly high (ranging from 0.26 to 0.65). I tried following some previous threads but have been unable to solve the issue. I haven't had this problem when running very similar models for a different season.

[screenshot: Capture.JPG]

umf <- unmarkedFrameOccu(y= fox_det[,37:71], 
                         siteCovs = veg_combined_postfire[,c('veg20_50cm','treatment','burn','canopy','edge_dist')],
                         obsCovs = NULL)


null <- occu(~1 ~ 1, data = umf)

[screenshot: Capture.JPG]

Giving starts = c(1, 0) gives me this:
[screenshot: Capture.JPG]

Any help would be much appreciated, thanks a lot!

Regards,
Vishnu

Jeffrey Royle

Nov 1, 2021, 8:35:07 PM
to unma...@googlegroups.com
Dear Vishnu,
You're getting a singular Hessian warning because the MLE is on the boundary of the parameter space (psi = 1). With psi pinned at 1, your two-dimensional likelihood effectively lives in only one dimension, so the Hessian cannot be inverted.
 The good news is you can use a logistic regression to study the variables that affect detection probability!
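A minimal sketch of that logistic-regression idea, on simulated stand-in data (the covariate name `canopy` and the coefficient values are invented for illustration; with the real data you would use the visit-level fox detections and your site covariates):

```r
# Sketch: when psi is effectively 1, every visit outcome can be modelled
# directly with logistic regression. Data here are simulated stand-ins.
set.seed(1)
n_sites <- 147; n_visits <- 35
canopy <- runif(n_sites)                          # hypothetical site covariate
p <- plogis(-3 + 1.5 * canopy)                    # per-visit detection probability
y <- matrix(rbinom(n_sites * n_visits, 1, rep(p, n_visits)),
            nrow = n_sites)                       # detection histories
dat <- data.frame(det = as.vector(y),             # one row per site-visit
                  canopy = rep(canopy, n_visits))
fit <- glm(det ~ canopy, family = binomial, data = dat)
coef(fit)                                         # roughly recovers (-3, 1.5)
```

Treating each visit as a plain Bernoulli trial is defensible here precisely because psi is estimated at 1, so occupancy no longer confounds detection.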
regards
andy



Vishnu r menon

Nov 2, 2021, 9:16:14 PM
to unmarked
Dear Andy,

Thanks for your reply, that makes sense. This might be a silly question, but why is the standard error so high when psi is almost 1 and the naive occupancy is also high?

Regards,
Vishnu

Jim Baldwin

Nov 3, 2021, 3:27:59 AM
to unma...@googlegroups.com
I wonder if the issue is about starting values.  If you try starts = c(0.3, -3), do you get reasonable results?

I suspect that the estimate should not be on the border because the number of detections is greater than

n * v * (1 - (n0/n)^(1/v))

where n is the number of sites (147), v is the number of visits to each site (which seems to be pretty consistent at either 34 or 35 visits per site), and n0 is the number of sites with no detections (which is 147 - 70 = 77).  The number of detections is 145 and the above formula results in

 147*35*(1-(77/147)^(1/35)) = 94.2

We have 145 > 94.2, and so the "real" maximum likelihood estimate of psi shouldn't be 1.  (This formula works only for the model with no covariates.)
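Jim's arithmetic can be checked directly in R, plugging in the numbers from his message:

```r
# Jim's boundary condition: the MLE should be interior when the number
# of detections exceeds n * v * (1 - (n0/n)^(1/v)).
n  <- 147                                  # number of sites
v  <- 35                                   # visits per site
n0 <- 147 - 70                             # sites with no detections
threshold <- n * v * (1 - (n0 / n)^(1 / v))
round(threshold, 1)                        # 94.2
145 > threshold                            # TRUE: an interior MLE is expected
```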

Jim


Jim Baldwin

Nov 3, 2021, 2:04:54 PM
to unma...@googlegroups.com
Here's an update on why I think it's just a matter of better starting values.  Below is a made-up dataset that matches all of the summary statistics you gave and gives almost identical results.  (I've compacted the dataset to take up much less space than listing out all of the elements.  That's why it looks so odd.)  But then using different starting values than what you used gives the correct results.

But more importantly: you should follow Andy's advice about using logistic regression and not worry about less-than-perfect detection for this dataset.  I'd argue you want all of the sites for studying occupancy, but just the sites with at least one detection for studying the probability of detection via logistic regression.  If you choose to continue with occu, then trying multiple starting values is essential; settle on the starting values that give you the smallest AIC.  It appears that while the detection probability for an individual visit is low, you typically have 35 visits to each site, so the probability of at least one detection at an occupied site is much higher.

y = matrix(c(1,1,1,1,1,rep(0,29),NA,1,1,1,1,1,rep(0,29),NA,1,1,1,1,rep(0,30),NA,
  rep(c(rep(0,34),NA),32),rep(c(1,1,rep(0,33)), 64),1,rep(0,34),1,rep(0,34),1,
  rep(0,1609)), nrow=147, byrow=TRUE)

ufo = unmarkedFrameOccu(y = y)
summary(ufo)
# unmarkedFrame Object
#
# 147 sites
# Maximum number of observations per site: 35
# Mean number of observations per site: 34.76
# Sites with at least one detection: 70
#
# Tabulation of y observations:
#    0    1 <NA>
# 4965  145   35


occu(~1 ~ 1, ufo, starts = c(1, 0))
# Call:
# occu(formula = ~1 ~ 1, data = ufo, starts = c(1, 0))
#
# Occupancy:
#  Estimate   SE      z P(>|z|)
#      20.8 1712 0.0122    0.99
#
# Detection:
#  Estimate     SE     z P(>|z|)
#     -3.53 0.0842 -41.9       0
#
# AIC: 1322.89


So the results match up pretty well with what you presented.  However, using different starting values gives one more reasonable (and I think correct) results:

occu(~1 ~ 1, ufo, starts = c(0.3, -3))
# Call:
# occu(formula = ~1 ~ 1, data = ufo, starts = c(0.3, -3))
#
# Occupancy:
#  Estimate    SE   z P(>|z|)
#     0.316 0.225 1.4    0.16
#
# Detection:
#  Estimate    SE     z   P(>|z|)
#     -2.97 0.108 -27.5 2.34e-166
#
# AIC: 1286.282
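Not part of Jim's message, but a quick sanity check on those estimates: back-transforming from the logit scale and comparing against the naive occupancy reported in the thread (70 of 147 sites):

```r
# Back-transform the logit-scale estimates from the occu() fit above.
psi <- plogis(0.316)          # occupancy, ~0.58
p   <- plogis(-2.97)          # per-visit detection, ~0.049
p_star <- 1 - (1 - p)^35      # P(at least one detection | occupied), ~0.83
psi * p_star                  # ~0.478, matching the naive 70/147 = 0.476
```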
The moral: any time you use a program that depends on iterative methods, be prepared to give reasonable (or multiple) starting values.  (This advice doesn't just apply to occu.)
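One way to act on that advice is to fit from a small grid of starting values and keep the lowest-AIC fit. A sketch, assuming the unmarked package; a toy detection matrix is simulated here so the snippet stands alone:

```r
# Sketch: fit the null occupancy model from several starting values and
# keep whichever fit achieves the smallest AIC. The detection histories
# below are simulated, not the poster's data.
library(unmarked)
set.seed(1)
z <- rbinom(147, 1, 0.6)                              # latent occupancy states
y <- matrix(rbinom(147 * 35, 1, rep(0.1 * z, 35)),
            nrow = 147)                               # detection histories
ufo <- unmarkedFrameOccu(y = y)
starts_grid <- list(c(0, 0), c(1, 0), c(0.3, -3))
fits <- lapply(starts_grid, function(s) occu(~1 ~ 1, data = ufo, starts = s))
aics <- sapply(fits, function(f) f@AIC)
fits[[which.min(aics)]]                               # best fit by AIC
```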

Jim
