Variable selection using RJMCMC

gesta...@gmail.com

Oct 13, 2021, 5:25:52 PM
to nimble-users
This post ( https://r-nimble.org/variable-selection-in-nimble-using-reversible-jump-mcmc ) describes how to do variable selection using RJMCMC. This looks very cool, but as I played around with it, I realized that I have a question (probably very elementary, and I should know the answer, but I've been puzzling over it) about the interpretation of the posteriors for the regression coefficients.

Say a coefficient value is close to (but not at) zero, but the posterior inclusion probability for the covariate is about 0.6. What is the correct interpretation of the posterior for that coefficient? Is it interpreted only for the iterations where z[i] == 1 (which moves the mean away from zero), or should one use all the iterations, even when z[i] == 0 (which may result in a posterior that has two modes - a greater one at zero and a lesser one away from zero)? I'd think the latter is correct, and the greater the spike at zero, the less important that variable is.
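To make the two summaries concrete, here is a minimal sketch in Python. The arrays are hypothetical stand-ins for RJMCMC output (an indicator z and a coefficient that is exactly zero whenever z == 0); the specific values are assumptions for illustration, not output from the linked example.

```python
import numpy as np

rng = np.random.default_rng(42)
n_iter = 10000

# Hypothetical RJMCMC draws for one coefficient: when the indicator
# z == 0, the variable is excluded and the coefficient is exactly zero.
z = rng.binomial(1, 0.6, size=n_iter)              # indicator draws (inclusion ~ 0.6)
beta_if_in = rng.normal(0.3, 0.1, size=n_iter)     # draws conditional on inclusion
beta = np.where(z == 1, beta_if_in, 0.0)           # marginal draws: spike at 0 + slab

inclusion_prob = z.mean()                  # posterior inclusion probability
marginal_mean = beta.mean()                # uses ALL iterations, zeros included
conditional_mean = beta[z == 1].mean()     # only iterations where the variable is "in"

print(inclusion_prob, marginal_mean, conditional_mean)
```

The marginal mean is pulled toward zero by the spike (roughly inclusion probability times the conditional mean), while the conditional mean answers "how big is the effect, given the variable belongs in the model" - which is why the two summaries can tell different-looking stories about the same posterior.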

Or, does one choose an inclusion threshold (say 0.5), and then refit the model, including only the covariates that exceed the threshold? 

In another (fairly large) dataset where I tried to apply this variable selection approach, I had one variable whose inclusion probability was > 0.8, but the posterior mean was also very, very close to zero. Maybe that indicates that variables are more likely to be included when a dataset is very large?

Also, the example linked above used independent priors for z and beta, but I read in Hooten and Hobbs 2015 that this can cause problems if the prior for beta is too vague. If one uses a more constrained prior, is the independent-prior approach still reasonable?

The waters are deeper than I anticipated....

Glenn


John Clare

Oct 14, 2021, 2:43:18 PM
to nimble-users
Hi Glenn,

I believe NIMBLE's RJMCMC sampler avoids the key issue with indicator variable selection mentioned by Hooten and Hobbs--moving off into some part of the parameter space where it becomes almost impossible to "re-add" the term. Some of the workshop materials provide a few comparisons (e.g., https://github.com/nimble-training/nimble-TWS-workshop-2019), and the linked example shows reasonable behavior despite a diffuse N(0, sd = 100) prior.

Other folks can probably chime in with better answers regarding your broader question. For prediction, the full posterior is probably the way to go. If you had to select or interpret one model, I guess I've seen some arguments for considering the median probability model, although my understanding (limited) is that inference based on coefficients/CI from any model selection/variable selection/regularization exercise is tricky and perhaps at odds with what these techniques are designed to do.
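For what the median probability model looks like in practice (Barbieri and Berger 2004): retain every covariate whose posterior inclusion probability exceeds 0.5. A small sketch, with made-up inclusion probabilities standing in for the column means of the sampled indicator matrix:

```python
import numpy as np

# Hypothetical posterior inclusion probabilities for five covariates,
# e.g. column means of the sampled z indicator matrix from RJMCMC.
inclusion_probs = np.array([0.92, 0.61, 0.48, 0.07, 0.55])

# Median probability model: keep covariates with inclusion probability > 0.5.
selected = np.flatnonzero(inclusion_probs > 0.5)
print(selected)  # indices of the retained covariates
```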

Cheers,

John

Chris Paciorek

Oct 16, 2021, 1:18:16 PM
to gesta...@gmail.com, nimble-users
Hi Glenn,

A few thoughts.

I would interpret the posterior for beta based on all the samples, including those where beta is exactly zero. And you can still interpret the proportion of times beta is exactly zero as the posterior probability that beta is zero. As far as refitting without the variable, I think the practical answer is that people may do that in cases where the posterior is pretty conclusive that the variable doesn't matter. That said, in that case the refitted model will probably be pretty similar to the original fitted model, so it's not clear that refitting achieves much.

If you do have a large dataset, then I think that you could end up with a large posterior inclusion probability but the posterior having mass near zero. The data are probably indicating a non-zero, but small, effect, which a large dataset has enough power to determine.
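A quick illustration of that large-dataset point, using plain least squares on synthetic data (the effect size and sample size are assumptions for illustration): with n = 100,000, a true coefficient of 0.02 is estimated close to zero, yet its standard error is small enough that the data clearly favor a nonzero effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_beta = 0.02  # tiny but nonzero effect (assumed for illustration)

x = rng.normal(size=n)
y = true_beta * x + rng.normal(size=n)

# Least-squares estimate (no intercept) and its standard error
beta_hat = np.sum(x * y) / np.sum(x * x)
resid = y - beta_hat * x
se = np.sqrt(np.sum(resid**2) / (n - 1) / np.sum(x * x))

print(beta_hat, se, beta_hat / se)
# beta_hat stays near 0.02, but the standard error is ~0.003, so the
# data strongly support a nonzero (if tiny) effect - the analogue of a
# high inclusion probability with a posterior mean near zero.
```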

I'm not taking the time to look back at Hooten and Hobbs (2015), but if their reason for choosing the prior for beta to not be too vague is computational (MCMC mixing), then using RJMCMC can help with that problem. If the reason is statistical, then I don't think there is anything stopping you from setting up correlated priors in NIMBLE and then using our RJMCMC.

-chris


gesta...@gmail.com

Oct 25, 2021, 5:40:07 PM
to nimble-users
John and Chris,

Sorry to be coming back to this so long after I posted - the response notification got buried in a bunch of emails and I missed it completely. But thanks for the thoughts. I'll take a look at those examples in the workshop materials.

Glenn