Using significance level for covariate selection (or reduction)

Abid Rahman

May 18, 2016, 3:15:36 AM

to unmarked

Dear All,

Greetings! I am using the unmarked package to estimate single-season occupancy. I have 10 site covariates, and only one pair is correlated, so after dropping one of that pair I still have 9 left. With all these covariates, I would have to run a huge number of models for model selection, and many of them would be nonsensical. I am looking for ways to choose covariates/models in conjunction with AIC.
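To put a rough number on "huge", here is a minimal sketch that counts the candidate models under stated assumptions: additive effects on occupancy only, one fixed detection structure, and no interactions. It is plain Python used only for the counting, not unmarked code. Only Area and dfSanc are covariate names from this thread; cov3 through cov9 are hypothetical placeholders.

```python
from itertools import combinations

# Only Area and dfSanc appear in the thread; cov3..cov9 are hypothetical placeholders.
covariates = ["Area", "dfSanc"] + [f"cov{i}" for i in range(3, 10)]

def additive_models(covs, max_size=None):
    """List every subset of covs (one additive occupancy model per subset),
    optionally capped at max_size covariates per model."""
    limit = len(covs) if max_size is None else max_size
    models = []
    for k in range(limit + 1):  # k = 0 is the intercept-only model
        models.extend(combinations(covs, k))
    return models

n_all = len(additive_models(covariates))                  # every additive subset
n_small = len(additive_models(covariates, max_size=3))    # at most 3 covariates each
print(n_all, n_small)
```

With 9 covariates the full additive set is 2^9 models, and capping each model at three covariates is one simple way to shrink it. Allowing interactions or varying the detection covariates would grow the count further.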

So far, I am thinking of running univariate models and checking the significance of each covariate. Would it be all right to drop the covariates that are not significant?

For example, please see this image of two univariate models:

Here, the Area covariate (in the first model) is not significant, and the dfSanc covariate (in the second model) is significant. So, the question again: should I remove the Area covariate from further analysis, i.e., not use it in any further models, and proceed with dfSanc?

Thanks a lot in advance!

Cheers

Abid

John Clare

May 19, 2016, 8:51:06 PM

to unmarked

Hi Abid,

Okay, I'll bite, but this sure isn't canon (this may be a useful reference among many others): opinions are my own, trade names do not suggest endorsement by the board or the U.S. government, etc.

I suggest quickly skimming the methods sections of a sample of papers related to your work to see how the authors performed variable selection. My guess is that you will find the process you propose is rarely, if ever, adopted. If model selection is ultimately made using an information-theoretic (I-T) approach (and it has to be), it would probably make reviewers happier if you stayed consistent with that selection criterion throughout the process. A more consistent approach for exploratory analysis would be to use the dredge function (from the MuMIn package) or some sort of stepwise procedure (see Richard Schuster's post above).

Whether this is an objectively reasonable approach is a different question. It may be more reasonable if the occupancy state or parameter feeds into a further analysis that is the underlying motivation for the research (e.g., 'species with this general trait tend to have larger distributions in the study area'), and less reasonable if the purpose is to say substantive things about what truly influences the occupancy state of a single species. If you can break your set of covariates up into a set of meaningfully distinct hypothetical models (hypotheses being the basis for both AIC and p-values), all the better if the point is inference about general influences.

One pragmatic issue with an additive significance approach is that the effect of certain covariates may only be revealed via interactions.

John
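The point about interactions can be illustrated with a toy example. This is a sketch in plain Python, not unmarked code: the two binary covariates x1 and x2 and all site counts are invented for illustration, and it uses naive occupancy proportions that ignore imperfect detection.

```python
# Hypothetical data: (x1, x2) -> (number of sites, number of occupied sites).
# Counts are invented for illustration and do not come from any real survey.
cells = {
    (0, 0): (100, 80),
    (0, 1): (100, 20),
    (1, 0): (100, 20),
    (1, 1): (100, 80),
}

def occupancy_rate(keep):
    """Naive proportion of occupied sites among cells where keep(x1, x2) is true."""
    sites = sum(n for (x1, x2), (n, occ) in cells.items() if keep(x1, x2))
    occupied = sum(occ for (x1, x2), (n, occ) in cells.items() if keep(x1, x2))
    return occupied / sites

# Marginal (univariate) contrasts: each covariate looked at on its own.
x1_effect = occupancy_rate(lambda a, b: a == 1) - occupancy_rate(lambda a, b: a == 0)
x2_effect = occupancy_rate(lambda a, b: b == 1) - occupancy_rate(lambda a, b: b == 0)

# Interaction contrast: sites where the covariates agree vs. where they differ.
interaction = occupancy_rate(lambda a, b: a == b) - occupancy_rate(lambda a, b: a != b)

print(round(x1_effect, 3), round(x2_effect, 3), round(interaction, 3))
```

In this constructed case both univariate contrasts are zero, so a screen based on single-covariate significance would discard x1 and x2, even though their combination separates high-occupancy from low-occupancy sites.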
