estimating effect sizes using the indicator variable approach to variable selection


eric.d.stolen

Apr 22, 2024, 1:07:05 PM
to hmecology: Hierarchical Modeling in Ecology

Suppose one is using an indicator variable approach to judge the relative strength of several covariates in some part of a hierarchical Bayesian model (such as the process model component of an occupancy model). One can then calculate the proportion of MCMC iterations in which each variable was included and use this as a measure of variable importance. My question is how to get estimates for the effect parameters being selected. The first thing that comes to mind is a choice between 1) using the full posterior for the parameter estimate, or 2) using only the iterations that included that variable to compute the posterior estimates. The problem with the first is that if the parameter has a low frequency of selection, the posterior is heavily influenced by the prior. The second seems to run the risk of inflating the strength of the effect (Lukacs, P., K. Burnham, and D. Anderson. 2010. Model selection bias and Freedman’s paradox. Annals of the Institute of Statistical Mathematics 62:117-125).

I am considering a different approach. What if I use a slab and spike prior for the effects, with the spike being an improper prior with infinite density centered at zero? The slab part is the normal uninformative prior for the parameter. I think this will accomplish something akin to what is achieved in model averaging using an information criterion. This is different from what is usually done with slab and spike priors, where the spike is chosen to match the parameter estimated without variable selection, purely to improve mixing. I would like to know what others think about this idea, and whether anyone has a citation that used this method.
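For concreteness, here is a minimal sketch of how options 1 and 2 could be computed from saved MCMC draws. It assumes the effect enters the linear predictor as w * beta, with w the 0/1 indicator. The draws are simulated rather than taken from a real model, and the names `w`, `beta`, `effect_full`, and `effect_cond` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_iter = 10_000

# Hypothetical MCMC output for one covariate (simulated, not real model output).
# w    : indicator draw (1 = covariate included, 0 = excluded)
# beta : coefficient draw; when w == 0 the data do not inform beta,
#        so the draw is mimicked here by a vague normal prior
w = rng.binomial(1, 0.3, size=n_iter)
beta = np.where(
    w == 1,
    rng.normal(0.8, 0.2, size=n_iter),   # iterations where the covariate is in the model
    rng.normal(0.0, 10.0, size=n_iter),  # iterations where beta is a prior draw
)

# Variable importance: proportion of iterations that included the covariate
inclusion_prob = w.mean()

# Raw beta over all iterations: dominated by the prior when inclusion is low
beta_all = beta

# Full (model-averaged) posterior: the effect is exactly 0 when excluded
effect_full = w * beta

# Conditional posterior: only the iterations that included the covariate
effect_cond = beta[w == 1]

def summarize(x):
    """Posterior mean and 95% equal-tailed interval of a vector of draws."""
    lo, hi = np.quantile(x, [0.025, 0.975])
    return {"mean": float(np.mean(x)), "lower": float(lo), "upper": float(hi)}

print("inclusion probability:", inclusion_prob)
print("raw beta, all draws  :", summarize(beta_all))
print("full (w * beta)      :", summarize(effect_full))
print("conditional (w == 1) :", summarize(effect_cond))
```

One thing the sketch makes visible: the mean of w * beta equals the inclusion probability times the conditional mean, so the full-posterior mean is the conditional mean shrunk toward zero by the inclusion probability.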

Matthijs Hollanders

Apr 23, 2024, 8:42:27 AM
to eric.d.stolen, hmecology: Hierarchical Modeling in Ecology
Hey Eric,

I remember asking about this on the nimble user forums with regard to the reversible jump MCMC implemented in that package for variable selection. I ended up reporting full posterior probabilities (including iterations where the effect is not included, i.e., = 0) as well as the RJMCMC inclusion probabilities.

Cheers,

Matt


eric.d.stolen

Apr 28, 2024, 9:28:22 AM
to hmecology: Hierarchical Modeling in Ecology
Thanks Matt, I think your approach is sensible. I think my original idea is not correct, and here's why:
Averaging an effect by forcing the estimate to be zero when it is not included in an MCMC iteration will usually not be correct, because the result depends on the other variables considered. Usually, the estimated effect for a covariate will depend on what other variables are included (which models are subject to selection). To see this, consider a case in which two variables A and B each affect a response Y and are also related to one another via some complex causal pathway. It seems clear that the estimated effect of A will differ depending on whether B is included in the model. For this reason, any model-averaged estimate of A will depend on the inclusion (or exclusion) of B from consideration in model selection or averaging. Perhaps we should think about the estimate of the effect of A when B is present as a different effect from that when B is not present, but we usually label both the same, which often misleads us.
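A small simulation can illustrate the A and B example. This is only a sketch under assumed values: the correlation `rho`, the true coefficients of 1.0, and ordinary least squares in place of a hierarchical model are all choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5_000
rho = 0.7  # assumed correlation between A and B

# A and B are correlated, and both affect Y with a true coefficient of 1.0
A = rng.normal(size=n)
B = rho * A + np.sqrt(1 - rho**2) * rng.normal(size=n)
Y = 1.0 * A + 1.0 * B + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients, with an intercept in position 0."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

# Effect of A when B is left out of the model
coef_A_only = ols(A, Y)[1]

# Effect of A when B is in the model
coef_A_with_B = ols(np.column_stack([A, B]), Y)[1]

print("A alone :", coef_A_only)
print("A with B:", coef_A_with_B)
```

With B omitted, the coefficient on A absorbs part of B's effect (in expectation 1 + rho here), whereas with B included it is close to 1. A model-averaged estimate of "the effect of A" therefore mixes two different quantities.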

I think that restricting the posterior of an effect estimate to just the MCMC iterations in which the variable was included perhaps makes sense, because then the effect is estimated over models with and without the other variables, weighted by their inclusion probabilities. But I would be interested in hearing others' thoughts on this.

Eric

Quresh Latif

Apr 29, 2024, 9:22:16 AM
to hmecology: Hierarchical Modeling in Ecology
Hi Eric. I won't comment on the nuances of slab and spike priors, as that is beyond my comfort zone. Of your original options 1 and 2, though, neither is more correct than the other in a general sense. You need to decide first what inference you want to make, and then determine which of the two options better matches that inference. For example, for prediction I would use the full posterior estimate, as that includes your uncertainty in whether the covariate matters at all. On the other hand, if you want to make inference on the relationship itself, it might make sense to focus on the conditional estimate (i.e., the posterior when the variable is included).
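As a rough sketch of the prediction case, the following compares a predicted occupancy probability computed from all iterations with one computed only from iterations that included the covariate. The draws are simulated, and the logit link, the intercept `b0`, and the covariate value `x_new` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n_iter = 10_000

# Hypothetical posterior draws (simulated for illustration only)
b0 = rng.normal(-0.5, 0.1, size=n_iter)   # intercept on the logit scale
w = rng.binomial(1, 0.3, size=n_iter)     # inclusion indicator
beta = rng.normal(0.8, 0.2, size=n_iter)  # slope draws
x_new = 1.5                               # covariate value to predict at

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

# Full posterior prediction: the covariate contributes only when w == 1,
# so uncertainty about whether it matters is carried into the prediction
psi_full = expit(b0 + w * beta * x_new)

# Conditional prediction: only iterations that included the covariate
psi_cond = expit(b0[w == 1] + beta[w == 1] * x_new)

interval_full = np.quantile(psi_full, [0.025, 0.975])
interval_cond = np.quantile(psi_cond, [0.025, 0.975])

print("full       :", psi_full.mean(), interval_full)
print("conditional:", psi_cond.mean(), interval_cond)
```

In this setup the full-posterior interval is wider than the conditional one because it spans both the "no effect" and "effect" cases.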