narrowly non-significant difference in occupancy between habitat types yet large overlapping error bars

Michael Wysong

Jun 24, 2015, 6:27:20 AM
to unma...@googlegroups.com
Hello unmarked group,

I am using the occu function to estimate occupancy and detection probabilities for three species from camera trap data, discretized daily over 20 days at 78 sites, with cameras spaced an average of 2.75 km apart over an area of 2,200 km^2.  I have one site-level covariate (habitat) for occupancy and one observation-level covariate (camera placement: on vs. off road) for detection probability: p(road)_psi(habitat). This model performed best, or as well as the other three models, p(.)_psi(habitat), p(road)_psi(.), and p(.)_psi(.), and I would now like to look at the effect of road on detection and of habitat on occupancy.

When I run the model for one of the species, I get a narrowly non-significant p-value (0.053) for habitat:

Call:
occu(formula = ~road ~ habitat, data = frame)

Occupancy:
            Estimate   SE     z P(>|z|)
(Intercept)     1.97 1.18  1.66  0.0969
habitatS       -2.48 1.28 -1.93  0.0533

Detection:
            Estimate   SE     z  P(>|z|)
(Intercept)    -6.13 1.02 -6.03 1.68e-09
roadon          3.86 1.02  3.77 1.67e-04

However, when I compute the confidence intervals using the predict function the confidence intervals seem to overlap quite a bit:
 
> newData <- data.frame(habitat = 1:0)
> round(predict(pc4, type = "state", newdata = newData, appendData=TRUE), 4) #spin = 1, mulga = 0
  Predicted     SE  lower  upper habitat
1    0.3752 0.1435 0.1532 0.6659       1
2    0.8772 0.1276 0.4121 0.9865       0

It seems to me that such a narrowly non-significant p-value should produce confidence intervals that do not overlap as much, and I am wondering why this isn't the case or whether I am doing something wrong.  I did try to calculate profile confidence intervals, but I get a -Inf value for the lower limit and a warning that the lower endpoint of the profile confidence interval is on the boundary.

I have two questions:

1) Do I just need to live with these wide overlapping CIs or is there perhaps something different I should try?

2) The species in question is the dingo, and like other large predators it has low detection (predicted = 0.093 on roads and 0.002 off roads) yet a large home range (~500 km^2). The long deployment was seen as necessary to get enough detections for analysis, though a sensitivity test suggested that I could perhaps go as low as 15 days.  I am concerned, however, that even at 15 days I may be violating closure.  I've tried to address the closure assumption in the design by deploying cameras over a large area relative to home range size.  Would anyone be able to comment on whether violation of closure might be a concern in my case and, if so, whether this could be the reason for the wide CIs (or is it more likely due to insufficient data)?  Is there any way to test for violation of closure in occu models?

Thanks in advance for any input

Sincerely,
Michael Wysong
PhD Candidate

Ecosystem Restoration & Intervention Ecology Research Group
School of Plant Biology (M090)
The University of Western Australia
35 Stirling Highway, Crawley WA 6009 Perth, Australia

Kery Marc

Jun 24, 2015, 6:34:39 AM
to unma...@googlegroups.com

Dear Michael,

 

I have two comments, although they may not be directly helpful.

 

(1) For GLMs such as occupancy models, it is better to use likelihood ratio tests rather than the Wald z tests reported in the summary.  The latter assume a symmetric sampling distribution of the estimates, and I think this is often not the case. The LRT is more robust in this sense, I believe, and should be used as the default significance test in GLMs.

 

(2) The idea that CIs must not overlap for a difference to be significant is merely apocryphal: I am quite certain that you can have a statistically significant difference between two groups even though their 95% CIs overlap. Surely some mathematician can prove that analytically, but for the rest of us, simulation would be the method to confirm or reject this belief.
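
A small numerical illustration of point (2), as a sketch only: the function name and the two estimates below are hypothetical, and the two estimates are assumed to be independent and approximately normal. It builds a 95% CI around each estimate and separately tests their difference.

```r
# Hypothetical helper: compare "do the two CIs overlap?" with a z test
# on the difference, assuming independent, approximately normal estimates.
compare_ci_and_test <- function(est1, se1, est2, se2, level = 0.95) {
  zcrit <- qnorm(1 - (1 - level) / 2)
  ci1 <- est1 + c(-1, 1) * zcrit * se1
  ci2 <- est2 + c(-1, 1) * zcrit * se2
  # Two intervals overlap when each starts before the other ends
  overlap <- ci1[1] <= ci2[2] && ci2[1] <= ci1[2]
  # The SE of a difference is smaller than the sum of the two SEs
  se_diff <- sqrt(se1^2 + se2^2)
  z <- (est2 - est1) / se_diff
  p <- 2 * pnorm(-abs(z))
  list(ci1 = ci1, ci2 = ci2, overlap = overlap, z = z, p = p)
}

# Made-up values: two estimates with SE = 1 that differ by 3.3
res <- compare_ci_and_test(est1 = 0, se1 = 1, est2 = 3.3, se2 = 1)
res$overlap  # the individual CIs overlap
res$p        # yet the test of the difference is below 0.05
```

The intervals overlap because each one is built from its own SE, while the test uses the SE of the difference, sqrt(se1^2 + se2^2), which is smaller than se1 + se2.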

 

Kind regards  --- Marc


Richard Chandler

Jun 24, 2015, 7:36:02 AM
to Unmarked package
Hi Michael,

To follow up on what Marc said, you shouldn't use the overlap of the CIs of two means to test a null hypothesis at the alpha=0.05 level. But you can check whether the CI of the *difference* in means includes zero (on the logit scale). You can do this in unmarked using:

confint(fm, type="state", method="normal")

or

confint(fm, type="state", method="profile")
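
For reference, the normal-approximation (Wald) interval for the habitat effect can be reproduced by hand from the estimate and SE in the summary table earlier in the thread. This is only a sketch: it assumes method="normal" corresponds to estimate ± z * SE, and it uses the rounded values printed in the summary, so the last digits will differ slightly from what confint returns.

```r
# Estimate and SE for habitatS as printed in the occu summary (logit scale)
est <- -2.48
se  <- 1.28

zcrit   <- qnorm(0.975)                   # about 1.96 for a 95% interval
wald_ci <- est + c(-1, 1) * zcrit * se    # CI of the difference on the logit scale
z       <- est / se
p_wald  <- 2 * pnorm(-abs(z))             # two-sided Wald p-value

wald_ci  # includes zero, in line with the p-value just above 0.05
p_wald
```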


Richard


--
Richard Chandler
Assistant Professor
Warnell School of Forestry and Natural Resources
University of Georgia

John Clare

Jun 24, 2015, 9:20:52 AM
to unma...@googlegroups.com
Michael,

The closure assumption is generally relaxed for camera occupancy studies, since the area in front of a camera is never permanently occupied by most species sampled with this technique.  A common interpretation of psi in camera studies is 'probability of use' over the sampling duration.  In principle, an easy way to assess closure is to split your data into separate primary occasions and fit a separate single-season model to each to determine whether the state parameter changed. But I think it would be tricky to interpret what was going on given the patch size sampled: animals actually no longer available for detection (i.e., different seasonal use patterns) vs. animals using different trails in the same area vs. animals dying or relocating, etc.  I think I'd stick with the full sampling duration and the easier interpretation.
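
A rough sketch of the splitting step, in base R only. The detection matrix here is simulated purely for illustration (78 sites by 20 days, matching the design described above), and the choice of two 10-day periods is an assumption rather than a recommendation.

```r
set.seed(1)
n_sites <- 78
n_days  <- 20

# Hypothetical detection histories (1 = detected), simulated for illustration only
y <- matrix(rbinom(n_sites * n_days, size = 1, prob = 0.05),
            nrow = n_sites, ncol = n_days)

# Assign each day to one of two primary periods (days 1-10 and 11-20)
period  <- rep(1:2, each = n_days / 2)
y_split <- lapply(split(seq_len(n_days), period),
                  function(cols) y[, cols, drop = FALSE])

# Naive occupancy (share of sites with at least one detection) per period
naive <- sapply(y_split, function(m) mean(rowSums(m) > 0))
naive

# Each element of y_split could then be passed to unmarked::unmarkedFrameOccu()
# and fitted with occu() to compare the psi estimates between periods.
```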
 
Re. CIs: you've sampled a bunch of small points for a long time, and if you are worried about the model's predictive ability, one thing to think about is the scale at which the habitat factor is measured.  If the value is drawn from a very small spatial scale (a 30 m pixel or something directly around the camera), it may not be very useful: if cameras are left out for a very long time, individual animals will visit substantial portions of their home ranges (including small patches of non-habitat), and you might better discriminate between used and unused cameras by thinking about covariates at the scale of second-order selection.

Michael Wysong

Jun 25, 2015, 9:54:24 AM
to unma...@googlegroups.com
Hi Marc,

Thank you for your feedback and yes both points are helpful.  Just to be clear, you are suggesting the following to test for the significance of habitat on occupancy:

pc1<-occu(~road~habitat, mydata)
pc2<-occu(~road~1, mydata)
LRT(pc1,pc2)

this gives me the following:
    Chisq DF  Pr(>Chisq)
1 6.70069  1 0.009637563

which now suggests that there is a significant effect of habitat on occupancy at the alpha=0.05 level (recall above the Wald test produced a p-val of 0.053).
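
The p-value in that table can be checked by hand. The sketch below assumes the LRT statistic is referred to a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between the two models (1 here); the variable names are illustrative.

```r
# Values copied from the LRT output above
chisq <- 6.70069
df    <- 1

# Upper-tail probability of the chi-square distribution
p_lrt <- pchisq(chisq, df = df, lower.tail = FALSE)
p_lrt   # about 0.0096, matching Pr(>Chisq)
```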

Confidence intervals for pc1 from the confint function, as Richard suggested, still include zero for the habitat effect:

confint(pc1, type="state"):
                   0.025      0.975
psi(Int)      -0.3551019 4.28831325
psi(habitatS) -4.9884447 0.03501107

or, as extracted from the linear combinations:
linearComb(pc1, type="state", coefficients=matrix(c(1,1,1,0),2,2,byrow=TRUE)):
       0.025     0.975
1 -1.7099478 0.6897255
2 -0.3551019 4.2883133

What I am really interested in, though, are the back-transformed occupancy estimates for graphical presentation in the paper, but these are even more equivocal (or at least misleading, given the significant LRT):

newData <- data.frame(habitat = 1:0)
round(predict(pc1, type = "state", newdata = newData, appendData=TRUE), 4):
  Predicted     SE  lower  upper habitat
1    0.3752 0.1435 0.1532 0.6659       1
2    0.8772 0.1276 0.4121 0.9865       0
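
The predict() limits above appear to be the logit-scale normal intervals from linearComb passed through the inverse logit. The sketch below checks this with base R's plogis(), using the numbers printed above; the row labels are illustrative.

```r
# Logit-scale 95% limits from the linearComb output above
logit_ci <- rbind(habitat_1 = c(-1.7099478, 0.6897255),   # intercept + habitatS
                  habitat_0 = c(-0.3551019, 4.2883133))   # intercept only
colnames(logit_ci) <- c("lower", "upper")

# Back-transform each endpoint to the probability scale
prob_ci <- logit_ci
prob_ci[] <- plogis(logit_ci)
round(prob_ci, 4)
```

The rounded result agrees with the lower and upper columns from predict(), so the plotted intervals are Wald intervals on the logit scale, back-transformed, rather than profile intervals.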

So it seems misleading to me to report habitat as significant but present a figure that seems to suggest otherwise (see attached figure for dingo). Perhaps this is OK and I just need to explain why this is the case in the discussion of this result.  Or would you suggest a different approach?

Also, thank you John for your very helpful input on the closure assumption with respect to camera trap studies.

Michael Wysong
2013_occupancy_results.jpeg

Kery Marc

Jun 25, 2015, 10:23:52 AM
to unma...@googlegroups.com

Dear Michael,

 

my understanding is that in GLMs, one best tests for significance using a likelihood ratio test (LRT). An occupancy model is a sort of combination of two linked logistic regressions (= Bernoulli GLMs), so I assume this is the same in this hierarchical GLM. Therefore, yes, this is the test I would base my conclusions on about an effect of habitat.

 

Now about the plotting of the effect, with associated uncertainty interval: I think that your interval is based on some sort of normality assumption of the estimator (as is the Wald test) and so this may not be adequate. Richard suggested this:

 

confint(fm, type="state", method="profile")

 

I think that profiled intervals would be better and might better reflect the result from the LRT.

 

So, my interpretation of the discrepancy between the numbers and the plot would be that your current plot is somewhat inadequate. And, again, you may still have overlapping CIs, because they can overlap even if the difference is significantly different from zero.

 

Kind regards  - Marc

Michael Wysong

Jun 26, 2015, 6:40:58 AM
to unma...@googlegroups.com
Hi Marc,

I have two problems with the profile confidence intervals, and I am wondering if you or anyone else has come across them:

1) when I run the code as Richard suggested above I get a warning: "Lower endpoint of profile confidence interval is on the boundary"  and the lower end of the CI collapses to -Inf.  This doesn't happen with method="normal". Example script and result:

> confint(pc4, type="state", method="profile")
Profiling parameter 1 of 2 ... done.
Profiling parameter 2 of 2 ... done.
                  0.025      0.975
psi(Int)      0.5028372 12.7659776
psi(habitatS)      -Inf -0.5551657
Warning message:
Lower endpoint of profile confidence interval is on the boundary. 

2) I can calculate profile confidence intervals for the other species, but I have no idea how to back-transform them into meaningful intervals of occupancy on the 0-1 scale (or for type="det", for that matter).  The predict function gives CIs, but as far as I can tell these are just "normal" CIs and not profile CIs.  backTransform of the linearComb also gives CIs, but these are normal as well, and as far as I can tell there doesn't appear to be a way to get profile CIs out of this either.  Of course, I am still fairly new to this, so I could be missing something.

On a final point, I am wondering what the CIs actually represent.  Are they 95% confidence limits around the predicted mean, or 95% confidence that an estimated occupancy value will fall within this range? I am assuming that it is the latter, hence the wide CIs that I am getting (I've seen this in other published studies as well).

Thanks in advance for any advice you might have,
Mike

Kery Marc

Jun 26, 2015, 7:04:24 AM
to unma...@googlegroups.com

Hi Mike,

 

Let me take a stab, interspersed with your email below.

 

Kind regards  --- Marc

 

 

From: unma...@googlegroups.com [mailto:unma...@googlegroups.com] On behalf of Michael Wysong
Sent: Friday, June 26, 2015 12:41
To: unma...@googlegroups.com
Subject: Re: [unmarked] narrowly non-significant difference in occupancy between habitat types yet large overlapping error bars

 

Hi Marc,

 

I have two problems with the profile confidence intervals and I am wondering if you or  anyone else has come across this:

 

1) when I run the code as Richard suggested above I get a warning: "Lower endpoint of profile confidence interval is on the boundary"  and the lower end of the CI collapses to -Inf.  This doesn't happen with method="normal". Example script and result:

 

> confint(pc4, type="state", method="profile")
Profiling parameter 1 of 2 ... done.
Profiling parameter 2 of 2 ... done.
                  0.025      0.975
psi(Int)      0.5028372 12.7659776
psi(habitatS)      -Inf -0.5551657
Warning message:
Lower endpoint of profile confidence interval is on the boundary

 

MK: I am not sure that the -Inf is a problem. Since the interval is on the logit scale, that corresponds to 0 on the probability scale. So the 95% profile CI for the difference between the two habitats (it's the difference, right?) would be (plogis(-Inf), plogis(-0.555)), and I would interpret that as (0, 0.365). I think that to get the CI on the probability scale, you simply back-transform the two endpoints.
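
One caution when back-transforming, offered as a sketch rather than a definitive recipe: plogis() turns a logit into a probability, which suits the intercept (the logit of occupancy in the reference habitat). The habitatS coefficient is a difference between two logits, that is, a log odds ratio, so exponentiating its endpoints gives an interval for the odds ratio rather than for a difference in occupancy probability. The variable names below are illustrative; the numbers are the profile limits quoted above.

```r
# Profile limits on the logit scale, copied from the confint output above
int_logit_ci <- c(0.5028372, 12.7659776)   # psi(Int)
hab_logit_ci <- c(-Inf, -0.5551657)        # psi(habitatS)

# Intercept: inverse logit gives occupancy probability in the reference habitat
psi_ref_ci <- plogis(int_logit_ci)

# Habitat effect: exp() gives the odds ratio of occupancy (S vs. reference);
# exp(-Inf) is 0, so the -Inf lower limit maps to an odds ratio of 0
odds_ratio_ci <- exp(hab_logit_ci)

psi_ref_ci
odds_ratio_ci   # the upper limit is below 1, in line with the significant LRT
```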

 

2) I can calculate profile confidence intervals for the other species, but I have no idea how to back-transform them into meaningful intervals of occupancy on the 0-1 scale (or for type="det", for that matter).

MK: see above

 

The predict function gives CIs but as far as I can tell these are just "normal" CIs and not profile CIs.  BackTransform of the linearComb also gives CIs but these are normal as well and as far as I can tell there doesn't appear to be a way to get profile CIs out of this either.  Of course I am still fairly new to this so I could be missing something.

MK: It's just one method of computing a frequentist 95% CI. The interval means the usual thing: i.e., the probability doesn't say anything about the parameter, but is a statement about the reliability of the method of computing the CI: if you computed 100 such intervals in a similar situation, you'd expect 95 of them to contain the true value of the parameter.
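
The coverage interpretation can be illustrated with a small simulation. This is only a sketch with a plain normal mean standing in for the occupancy parameter; the true value, sample size, and number of replicates are arbitrary choices.

```r
set.seed(42)
true_mean <- 0.4    # arbitrary "true" parameter value
n         <- 50     # observations per simulated study
n_rep     <- 2000   # number of simulated studies

# For each simulated study, compute a 95% CI and record whether it contains the truth
covered <- replicate(n_rep, {
  x  <- rnorm(n, mean = true_mean, sd = 1)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

coverage <- mean(covered)
coverage   # should be close to 0.95
```

Each individual interval either contains the true value or it does not; the 95% describes the long-run behaviour of the procedure.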
