Reversible Jump MCMC (RJMCMC) control question.

Troy Wixson

Jun 6, 2023, 7:35:43 PM
to nimble-users
Good Evening, 

I'm working on using RJMCMC for variable selection in a complicated model. I have found in simulation studies that (according to power and log-score) I get better results when the prior on the inclusion probability has its mass between 0.3 and 0.7 (Beta(25, 25)) than when the mass is concentrated below 0.2. This is confusing to me because the true proportion of included variables is 0.08. My guess is that the Beta(25, 25) prior encourages proposing inclusion more often than the sparser prior, so the space is explored better, but it seems problematic to artificially change the model just to encourage inclusion. Is this a plausible explanation for these strange results? Is there a better way to tune the RJMCMC?

I think I may need to adjust the control parameters for the RJMCMC. Can anyone shed some light on what these values do and if/how they could affect the exploration?

Thanks, 
-Troy 

Matthijs Hollanders

Jun 6, 2023, 7:40:14 PM
to Troy Wixson, nimble-users
Hey Troy,

This doesn’t really answer your question, but in one of the examples on the website, some simulated variables have small effect sizes (0.1 or so) and these are frequently excluded by the RJMCMC. So perhaps the “true proportion” of included variables is hard to pin down, depending on how you define “inclusion”. For instance, with small effect sizes and not that much data, the RJMCMC may probabilistically exclude those predictors even though their true effects are non-zero.

Just spitballing here, curious also to hear what others have to say. 

Matt

--
Dr. Matthijs Hollanders
Statistical Consultant – Quantecol
Postdoctoral Research Fellow – College of Science | Australian National University


Sally Paganin

Jun 7, 2023, 11:08:00 AM
to Matthijs Hollanders, Troy Wixson, nimble-users
Hi Troy, 

I second Matt's comment. I am not sure I have enough details to provide insight into this behavior. For example, with the sparse prior, do you end up with fewer or more variables included than the true number?

Anyway, I would point out that the reversible jump in nimble uses a normal proposal distribution, and you can specify the mean and the scale of this distribution when calling configureRJ via the control argument. Maybe this could help in exploring the space differently while using the same prior for the inclusion probabilities; a rough sketch is below.
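
For example (an untested sketch with made-up toy data and node names, just to show where the control values go; your actual model will of course differ):

library(nimble)

## toy data, purely for illustration
set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n, X[, 1] * 0.5, sd = 1)

code <- nimbleCode({
  psi ~ dbeta(25, 25)                 # prior on the inclusion probability
  for(i in 1:p) {
    z[i] ~ dbern(psi)                 # inclusion indicator
    beta[i] ~ dnorm(0, sd = 10)
    zbeta[i] <- z[i] * beta[i]        # coefficient is zeroed out when excluded
  }
  for(j in 1:n) {
    mu[j] <- inprod(X[j, 1:p], zbeta[1:p])
    y[j] ~ dnorm(mu[j], sd = sigma)
  }
  sigma ~ dunif(0, 10)
})

model <- nimbleModel(code,
                     constants = list(n = n, p = p, X = X),
                     data = list(y = y),
                     inits = list(psi = 0.5, z = rep(1, p),
                                  beta = rep(0, p), sigma = 1))

conf <- configureMCMC(model)

## mean and scale of the normal proposal used when re-introducing a
## coefficient into the model; the values are illustrative, not recommendations
configureRJ(conf,
            targetNodes    = "beta",
            indicatorNodes = "z",
            control        = list(mean = 0, scale = 0.5))

mcmc <- buildMCMC(conf)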

Hope this helps.

Best, 

Sally 


Troy Wixson

Jun 7, 2023, 5:13:01 PM
to nimble-users
Thanks for the comments! 

The sparser prior results in fewer included variables than the prior with mass (incorrectly) between 0.3 and 0.7. This (incorrect) prior in turn results in fewer included variables than were used in the data generation. The posterior mean of the inclusion probability hyperparameter is 0.02 in the sparse prior case and 0.04 in the incorrect prior case (and the true inclusion probability is 0.08). All of the variables that are included have the same true effect size. The incorrect prior results in including more variables in the model, which means that it will sometimes include variables that should not be included, but it also includes more of the true variables. This seems to me like a bit of a power vs. false-discovery tradeoff that I am trying to tune the priors/RJ around, because I am willing to allow for a few more false discoveries in order to identify more of the true variables.

Could y'all help me think a little about the control parameters? I'm assuming this is a random walk proposal for the effect size node when that node is included in the model (and that the sampler remains dormant when the node is excluded). I know that the effect size is always positive, so a random walk proposal centered at zero will result in impossible proposals half of the time that a node is brought from an excluded state to an included state. Is this proposal distribution used for all proposals for the node when it is included, or is it just used when trying to bring a "dead" node back into the model? It looks like the sampler for the effect size node is removed in the setup for RJMCMC and replaced with an RJ-toggled sampler, which suggests to me that this is the only sampler on the node. I ask because knowing the effect size is positive suggests that I want a proposal that is always positive when bringing nodes into the model, but I do not want a random walk that only proposes positive values once the node is in the model.

I ran a few simulations with different values for the scale parameter in the control argument and was able to find scales that are too large and too small (inclusion drops even lower). There were a few values of the scale which seemed OK and didn't make much difference to the posterior mean of the inclusion probability hyperparameter. Are there any other parts of nimble's RJMCMC that I can try tuning?
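
In case it helps, here is a stripped-down version of the kind of scan I was running (sketch only; the model, node names, and values are placeholders rather than my real setup):

## assumes an uncompiled nimbleModel called `model` with coefficients
## beta[1:p] and indicators z[1:p], as in a standard RJ setup
cmodel <- compileNimble(model)

scales <- c(0.1, 0.5, 1, 2)
for(s in scales) {
  conf_s <- configureMCMC(model)
  conf_s$addMonitors("z")                 # track the inclusion indicators
  configureRJ(conf_s,
              targetNodes    = "beta",
              indicatorNodes = "z",
              control        = list(mean = 0, scale = s))
  mcmc_s <- buildMCMC(conf_s)
  cmcmc  <- compileNimble(mcmc_s, project = model, resetFunctions = TRUE)
  samples <- runMCMC(cmcmc, niter = 5000, nburnin = 1000, setSeed = 1)
  zcols <- grep("^z\\[", colnames(samples))
  cat("scale =", s, " mean inclusion =", mean(samples[, zcols]), "\n")
}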

Thanks again, 
-Troy 

Sally Paganin

Jun 14, 2023, 1:10:05 PM
to Troy Wixson, nimble-users

Hi Troy, 
Sorry for the late follow-up, but I have been traveling the past few days.

Thanks for the explanation. However, I feel the point about incorrect/correct priors in terms of variable selection is more of a statistical question than a software question, so that will be up to you to figure out. 

That said, I can clarify what happens with the reversible jump in nimble (more in Section 7.10 of the User Manual).

The configureRJ function modifies the MCMC configuration to:

  (1) assign samplers that turn variables on and off in the model. This can happen in two ways, depending on whether or not indicator variables are written explicitly in the model. The control argument lets you specify the mean and scale of the proposal distribution, which is used only when trying to bring a "dead" node back into the model. Note that you can also provide a different proposal distribution for each coefficient, by passing vectors for the mean and scale with one entry per coefficient (see the sketch after this list).
  (2) modify the existing samplers for the regression coefficients to use ‘toggled’ versions. This is mainly to avoid sampling values for the coefficient parameters when they are out of the model (a.k.a. dead nodes); when the coefficients are in the model, the toggled samplers use the same samplers that were assigned to the parameters before calling configureRJ.
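
For instance (a sketch, assuming a fresh MCMC configuration conf built with configureMCMC, not yet passed to configureRJ, for a model with p coefficients beta[1:p] and indicators z[1:p]):

## one proposal mean/scale per coefficient; each vector needs one entry
## per target node (the values below are placeholders)
configureRJ(conf,
            targetNodes    = "beta",
            indicatorNodes = "z",
            control        = list(mean  = rep(0, p),
                                  scale = c(0.2, 0.2, rep(0.5, p - 2))))

## inspect the result: the indicators get RJ samplers, and each coefficient's
## original sampler is wrapped in a toggled version
conf$printSamplers(c("z[1]", "beta[1]"))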

Finally, if you think there may be something wrong with the algorithm while experimenting with this, please send over a reproducible example so we can look into that.

Hope this helps!

Sally 
