Dear Michael,
I have two comments, although they may not be directly helpful.
(1) For GLMs such as occupancy models, it is better to use likelihood ratio tests (LRTs) rather than the Wald z tests reported in the summary. The latter assume a symmetric sampling distribution of the estimates, and I think this is often not the case. I believe the LRT is more robust in this sense and should be used as the default significance test in GLMs.
(2) The rule about non-overlapping versus overlapping CIs is merely apocryphal: I am quite certain that you can have a statistically significant difference between two groups even though their 95% CIs overlap. Surely some mathematician can prove that analytically, but for the rest of us, simulation would be the way to confirm or reject this belief.
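A quick numerical check of point (2), as a sketch only: the two estimates and their common standard error below are made-up numbers, and the check assumes independent groups and a normal approximation for both the intervals and the test of the difference.

```r
# Hypothetical estimates and standard errors for two independent groups
# (made-up numbers, chosen only to illustrate the point)
est <- c(a = 0, b = 3)
se  <- c(a = 1, b = 1)

# 95% normal-approximation intervals for each group
z     <- qnorm(0.975)
lower <- est - z * se
upper <- est + z * se

# Do the two intervals share any values?
overlap <- unname(lower["b"] < upper["a"] && lower["a"] < upper["b"])

# z test for the difference between the two estimates
se_diff <- sqrt(sum(se^2))
z_diff  <- unname((est["b"] - est["a"]) / se_diff)
p_value <- 2 * pnorm(-abs(z_diff))

print(rbind(lower, upper))
cat("CIs overlap:", overlap, "\n")
cat("p-value for the difference:", round(p_value, 3), "\n")
```

With these numbers the two intervals overlap, yet the p-value for the difference falls below 0.05. The reason is that the standard error of a difference is sqrt(se_a^2 + se_b^2), which is smaller than se_a + se_b.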
Kind regards --- Marc
Dear Michael,
My understanding is that in GLMs, significance is best tested with a likelihood ratio test (LRT). An occupancy model is a sort of combination of two linked logistic regressions (= Bernoulli GLMs), so I assume the same holds for this hierarchical GLM. Therefore, yes, this is the test on which I would base my conclusions about an effect of habitat.
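To make the LRT versus Wald comparison concrete, here is a minimal sketch for a plain Bernoulli GLM in base R. It is not an occupancy model: the data are simulated, and the habitat labels "F" and "S" and the probabilities used are illustrative assumptions only.

```r
# Simulated presence/absence data for two habitats (illustration only)
set.seed(1)
n       <- 60
habitat <- factor(rep(c("F", "S"), each = n / 2))
y       <- rbinom(n, size = 1, prob = ifelse(habitat == "F", 0.8, 0.5))

# Null model and model with a habitat effect
m0 <- glm(y ~ 1, family = binomial)
m1 <- glm(y ~ habitat, family = binomial)

# Likelihood ratio test: twice the difference in log-likelihood,
# compared with a chi-squared distribution with 1 degree of freedom
lr_stat <- as.numeric(2 * (logLik(m1) - logLik(m0)))
p_lrt   <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald z test for the same coefficient, as reported by summary()
p_wald <- summary(m1)$coefficients["habitatS", "Pr(>|z|)"]

cat("LRT statistic:", round(lr_stat, 3), " p =", signif(p_lrt, 3), "\n")
cat("Wald p =", signif(p_wald, 3), "\n")
```

The same comparison is available through anova(m0, m1, test = "Chisq"). The two p-values usually agree closely in large samples and can diverge when the estimate lies near a boundary.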
Now about the plotting of the effect, with associated uncertainty interval: I think that your interval is based on some sort of normality assumption of the estimator (as is the Wald test) and so this may not be adequate. Richard suggested this:
confint(fm, type="state", method="profile")
I think that profiled intervals would be better and might better reflect the result from the LRT.
So, my interpretation of the discrepancy between your numbers and your plot is that the current plot is somewhat inadequate. And, again, you may still have overlapping CIs, because they can overlap even if the difference is significantly different from zero.
Kind regards - Marc
Hi Mike,
let me take a stab at this; my replies are interspersed with your email below.
Kind regards --- Marc
From: unma...@googlegroups.com [mailto:unma...@googlegroups.com]
On behalf of Michael Wysong
Sent: Friday, 26 June 2015 12:41
To: unma...@googlegroups.com
Subject: Re: [unmarked] narrowly non-significant difference in occupancy between habitat types yet large overlapping error bars
Hi Marc,
I have two problems with the profile confidence intervals and I am wondering if you or anyone else has come across this:
1) when I run the code as Richard suggested above I get a warning: "Lower endpoint of profile confidence interval is on the boundary" and the lower end of the CI collapses to -Inf. This doesn't happen with method="normal". Example script and result:
> confint(pc4, type="state", method="profile")
Profiling parameter 1 of 2 ... done.
Profiling parameter 2 of 2 ... done.
                   0.025      0.975
psi(Int)       0.5028372 12.7659776
psi(habitatS)       -Inf -0.5551657
Warning message:
Lower endpoint of profile confidence interval is on the boundary
MK: I am not sure that the -Inf is a problem. Since the interval is on the logit scale, -Inf corresponds to 0 on the probability scale. So the 95% profile CI for the difference between the two habitats (it's the difference, right?) would be (plogis(-Inf), plogis(-0.555)), and I would interpret that as (0, 0.365). I think that to get the CI on the probability scale, you simply back-transform the two endpoints.
2) I can calculate profile confidence intervals for the other species, but I have no idea how to back-transform them into meaningful CIs for occupancy on the 0:1 scale (or for type="det", for that matter).
MK: see above
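The back-transformation described in the reply above can be sketched in base R. The endpoints are copied from the confint() output earlier in this message. That psi(Int) is the logit-scale occupancy of the reference habitat is an assumption (default treatment contrasts), and reading the transformed psi(habitatS) endpoints as probabilities follows the interpretation given in the reply rather than anything the code establishes.

```r
# Profile CI endpoints on the logit scale, copied from the confint() output
ci_int  <- c(0.5028372, 12.7659776)  # psi(Int)
ci_habS <- c(-Inf, -0.5551657)       # psi(habitatS)

# plogis() is the inverse logit. It is monotone, so transforming the
# endpoints keeps their order, and it maps -Inf to exactly 0.
ci_int_prob  <- plogis(ci_int)
ci_habS_prob <- plogis(ci_habS)

print(round(ci_int_prob, 3))
print(round(ci_habS_prob, 3))
```

Because the transformation is applied to the endpoints only, the same two lines work for intervals obtained with type="det".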
The predict function gives CIs, but as far as I can tell these are just "normal" CIs and not profile CIs. backTransform of a linearComb also gives CIs, but these are normal as well, and as far as I can tell there doesn't appear to be a way to get profile CIs out of this either. Of course I am still fairly new to this, so I could be missing something.
MK: It's just one method of computing a frequentist 95% CI. The interval means the usual thing: i.e., the probability doesn't say anything about the parameter, but is a statement about the reliability of the method of computing the CI: if you computed 100 such intervals in a similar situation, you'd expect 95 of them to contain the true value of the parameter.
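The coverage interpretation in the reply above can be checked by simulation. This is a sketch under simple assumptions that are not taken from the thread: normally distributed data, a known true mean of 10, and ordinary t-based intervals.

```r
# Simulate many data sets with a known mean and count how often the
# 95% CI contains it (all settings below are arbitrary choices)
set.seed(42)
true_mean <- 10
n_rep     <- 2000

covered <- replicate(n_rep, {
  x  <- rnorm(25, mean = true_mean, sd = 3)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

coverage <- mean(covered)
cat("Proportion of intervals containing the true mean:", coverage, "\n")
```

The proportion should land close to 0.95. That is a property of the procedure across repeated data sets, not a probability statement about any single interval.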