Hi Nels,
Thanks for making a new thread on this and also the two code ideas. This is a really interesting topic and we are hearing on-list and by other channels of interests in more complicated RJMCMC needs.
You are right that the RJ scheme already implemented is fairly limited. Each indicator can be paired with exactly one coefficient. The sampler for the indicator will use RJ logic to jointly propose making the coefficient 0 (if turning indicator to 0) or proposing a new indicator value (if turning indicator to 1). The regular sampler for the coefficient (what is used when it is "on") is placed inside of a toggle function, which simply skips it if it is off.
For background, the two issues addressed by this in general are wasted computation and coefficient mixing. If one simply codes indicators into the model without using the RJ samplers, two potentially undesirable things will happen. A coefficient will be sampled even when its indicator is 0, which represents wasted computation. And the coefficient will follow its prior when its indicator is 0, which means it might wander around for many iterations before reaching suitable values for the indicator to successfully mix to 1 again. In practice this has sometimes made people place informative priors on coefficients, essentially for implementation reasons rather than statistical/scientific reasons. That could be why you have informative priors on beta[j] and beta2[j].
In your first example, it looks to me like it will work correctly but suffers from both issues. The beta1[] and beta2[] elements for which z1[] and z2[], respectively, are 0, will continue to be mixed, wasting computation and following their priors. Possibly that is why you gave them informative priors. Also I am not seeing clearly whether this scheme allows you to have x1 or x2 not in the model at all. That could be covered if you add an extra column to X1 and X2 with zeros, so if that column is the currently selected one, in effect there is no X1 or X2. So aside from efficiency and having your hand forced to use informative priors (and I don't know if either are concerns for you in this project), it seems feasible.
In your second example, I think the I1 and I2 serve to turn the entire X1 or X2 on or off, respectively. But they don't satisfy a one-to-one indicator-coefficient pairing, so our basic RJ system won't work for them. So again it makes sense and could work unless informative priors or efficiency are concerns.
Here is another idea for setting it up with existing tools. And below I'll comment on extending the RJMCMC system. I am kind of assuming your informative priors were a concession to the implementation, so I will try to support uninformative priors.
z1 ~ dmulti(size=1,prob=rep(1/R,R))
z2 ~ dmulti(size=1,prob=rep(1/R,R))
z1i <- which(z1[1:R] == 1) # There was a minor bug discovered for compilation of "which" recently,
z2i <- which(z2[1:R] == 1) # so if this doesn't work, it could be done easily in a nimbleFunction instead
beta1_shared ~ dnorm(0, sd = 1000) # uninformative
beta2_shared ~ dnorm(0, sd = 1000) # ditto
l1 ~ dbern(phi) # (I'm not sure how well informed phi will be by two cases.)
l2 ~ dbern(phi)
Then you could do RJ with I1 indicating for beta1_shared and I2 indicating for beta2_shared. This would not save the wasted computation of updating the beta1[j] and beta2[j] elements. The idea would be that maybe the coefficients for the different r values are not wildly different, so they can share a coefficient with an uninformative prior and then have their own adjustments with informative priors. The latter will follow their priors when turned "off". There would also be wasted computation when z1 or z2 is updated if I1 or I2 is off (0), respectively. The beta1_shared and beta1[1:R] (ditto for beta2) are over-parameterized, but possibly in a way that will give reasonable interpretation (or post-processing) due to the informative priors. If the coefficient posteriors for different r values really overlap, you could possibly even remove the beta1[1:R] (ditto for beta2).
You might be able to put together variations on these ideas.
If you want to take this further, you could consider copying and modifying the RJMCMC code in the package. Although we've heard interest in this kind of thing from others as well, right now none of the core development team feels we can take time to do this ourselves, but we can support you and answer questions (on list or off list) if you want to get into it. It's all done from R in nimbleFunctions.
I hope that helps!
-Perry