Enumerate vs. MCMC in Infer


Isaac Davis

May 11, 2020, 1:14:10 PM
to webppl-dev
Hi folks. I really appreciate all the help I've been receiving. I'm having a weird issue using the 'MCMC' method in an Infer operator. The code for this project is somewhat complicated, so I thought I would ask a general question first, and if that doesn't help, I will try to construct a simplified version of the program that's giving me the issue. 

Essentially, I have a model that returns a posterior distribution over 4 possible values (call them A, B, C, and D), conditioned on some evidence E. When I run the model using the 'enumerate' method inside the Infer operator, it appears to return the correct distribution (something like "A: .4, B: .4, C: .15, D: .05"). When I run the exact same model using the 'MCMC' method, however, it returns a Delta distribution concentrated on a single value (e.g. "A: 1"). Furthermore, the value it concentrates on changes from run to run; i.e., if I run it twice with the exact same inputs (evidence), the first run might return "A: 1" while the second might return "B: 1". Again, the code is identical except that I use 'MCMC' instead of 'enumerate'. I have tried adjusting the number of samples (all the way up to 100,000, which seems like far too many for this model), as well as the lag and burn, but nothing seems to fix it.

I'm wondering if anyone has an idea as to what could be causing this discrepancy in behavior. I realize this may be impossible to answer without looking at the code, but I thought I'd see if anyone has a general suggestion as to what the problem might be before getting into the gritty details. 

(to clarify why I need to use 'MCMC' at all: the next step of this project requires embedding the current model inside a hierarchical model with continuous parameters, so I cannot use 'enumerate' for the next part). 


Thanks!
-Isaac


null-a

May 11, 2020, 2:36:43 PM
to webppl-dev
The fact that all the mass is on a single sample suggests to me that MCMC is initializing OK (that is, finding an initial assignment to all `sample` statements by sampling from the prior) but failing to successfully take any MCMC steps. Here's one possibility for why that might be. A single step of MCMC involves selecting a *single* `sample` statement to re-sample from the prior. From that, the MH acceptance probability is computed, and the new assignment to the `sample` statement is accepted or rejected accordingly. A problem can arise when the choices made at `sample` statements are very highly correlated, as it can then be next to impossible to make a move using this single-site strategy. E.g., if there are two latent variables whose sum is constrained to be a constant, then it might be impossible to take an MCMC step by modifying only one of them. (Resampling only one choice will result in the constraint being violated, which in turn leads to an acceptance probability of zero.) There are probably other potential causes, but this is the one that comes to mind.
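[Editor's note: a minimal sketch of the pathological case described above; the model and names are hypothetical, not from the thread.]

```webppl
// Two coin flips hard-constrained to sum to exactly 1.
var stuck = function() {
    var a = flip(0.5);
    var b = flip(0.5);
    condition((a ? 1 : 0) + (b ? 1 : 0) === 1);
    return a;
};
// Single-site MH can never move here: resampling only `a` (or only `b`)
// flips the sum to 0 or 2, violating the constraint, so every proposal
// has acceptance probability zero and the chain stays at its initial state.
```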

Isaac Davis

May 11, 2020, 3:05:03 PM
to webppl-dev
That is good to know, and I think that is likely the cause in this case. A quick follow up just to make sure I understand:

Suppose I have a generative model of the form you suggest, e.g.

var model = function() {
    var A = flip(.1);
    var B = flip(.6);
    // hard constraint: rule out A and B both being true
    // (booleans coerce to 0/1 under +, so A + B < 2 means "not both")
    factor(A + B < 2 ? 0 : -Infinity);
    return [A, B];
};

var D = Infer({method: 'MCMC', samples: 1000}, model);
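[Editor's note: a sanity check, not from the thread. For this toy model the conditioned distribution can be computed exactly, which gives a ground truth to compare the MCMC output against.]

```webppl
// Exact inference: enumerate sums over all four (A, B) assignments.
var exact = Infer({method: 'enumerate'}, model);

// By hand: the constraint rules out (true, true), leaving
//   (true, false):  .1 * .4 = .04
//   (false, true):  .9 * .6 = .54
//   (false, false): .9 * .4 = .36
// Normalizing by .94 gives roughly .043, .574, and .383 respectively,
// which is what a correctly mixing MCMC run should approach.
```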

When you say

"A single step of MCMC involves selecting a *single* `sample` statement to re-sample from the prior."

The way I interpret this is: for each of the 1000 proposals that are generated when computing D, the Nth proposal is generated by either

a) Sampling A from Bernoulli(.1) and using the (N-1)th value of B, or
b) Sampling B from Bernoulli(.6) and using the (N-1)th value of A

But the Nth proposal is NOT generated by

c) Sampling A from Bernoulli(.1) and then sampling B from Bernoulli(.6)

Is my interpretation correct here? If so, is it possible to control which RV gets sampled in each iteration?

Thanks again
-Isaac

null-a

May 11, 2020, 4:10:39 PM
to webppl-dev
> Is my interpretation correct here?

Yes, that sounds right.

> If so, is it possible to control which RV gets sampled in each iteration?

The MH MCMC algorithm is single-site only, I'm afraid. For continuous latent variables HMC can work in this scenario, but that doesn't help you here. You might try SMC; it's not out of the question that it will work better than MH.
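[Editor's note: a sketch of the SMC suggestion, added for illustration; the option names follow WebPPL's documented Infer interface.]

```webppl
// Swap the inference options; the model itself is unchanged.
// rejuvSteps is optional: it runs MH rejuvenation moves on each particle,
// which can help when particles collapse onto a few values.
var D = Infer({method: 'SMC', particles: 1000, rejuvSteps: 5}, model);
```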

Isaac Davis

May 11, 2020, 4:29:55 PM
to webppl-dev
This is very helpful. Thanks again
-Isaac

Isaac Davis

May 15, 2020, 12:08:46 PM
to webppl-dev
Hi again. Thanks for your help. I have a follow up question about the MCMC method. Suppose I have a hierarchical model with the following form:


-Parameter alpha is sampled from a prior P(alpha);

-An initial hidden state g_0 is sampled from a distribution 

g_0 ~ gDist(g|alpha)

-An initial observable state s_0 is sampled from a distribution 

s_0~sDist(s|g_0)

For each subsequent step, a hidden state g_t is sampled from gDist(g_t|g_(t-1), s_(t-1), alpha) (i.e. g_t depends on g_(t-1), s_(t-1), and alpha),
then an observable state s_t is sampled from sDist(s_t|s_(t-1), g_t) (i.e. s_t depends on s_(t-1) and g_t);

This continues until a terminal state is reached, resulting in the trial data s_0, s_1,....s_T (since the g-states are hidden);

Now, I want to write a program that will infer the value of parameter alpha, given data from multiple trials T_0, ..., T_N, where each T_i is a sequence of observable states.

My program has the following form:

var inferAlpha = function(trials) {
    return Infer({method: 'MCMC', samples: 1000}, function() {
        var alpha = sample(alphaPrior);

        // recursively iterate through a trial and add factors
        var observe_trial = function(s_states, g_states, trial) {
            if (s_states.length == trial.length) {
                return true;
            } else {
                // load current state values
                var current_s = s_states[s_states.length - 1];
                var current_g = g_states[g_states.length - 1];

                // sample next hidden value
                var next_g = sample(gDist(current_g, current_s, alpha));

                // compute distribution over next observable state
                var stateDist = sDist(next_g, current_s);

                // add factor for next observable state
                var next_s = trial[s_states.length];
                factor(stateDist.score(next_s));

                return observe_trial(s_states.concat(next_s), g_states.concat(next_g), trial);
            }
        };

        // map the inner loop over each trial; gDist(alpha) here is the
        // initial-state distribution g_0 ~ gDist(g|alpha)
        var addFactors = map(function(x) {
            return observe_trial([x[0]], [sample(gDist(alpha))], x);
        }, trials);
        return alpha;
    });
};

The inner loop iterates through a single trial, fills in the hidden states by sampling from the appropriate gDist distribution, and adds the likelihood of observing the given observable state, given the sampled hidden state. 

The addFactors command maps the inner loop to each trial in the data.
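[Editor's aside, not part of the original post: the factor-on-score pattern in the inner loop can also be written with WebPPL's `observe` helper, which is equivalent and slightly more idiomatic.]

```webppl
// Equivalent to: factor(sDist(next_g, current_s).score(next_s));
observe(sDist(next_g, current_s), next_s);
```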


So here's my question: from the previous discussion, my understanding is that each step of MCMC will choose a single sample statement to resample, in order to generate the next proposal. But in this case, most of the sample statements aren't actually included in the return value. The first sample statement generates a sample of the parameter alpha, but the remaining sample statements are just involved in filling in the hidden states. So, in this program, will each step of MCMC resample all of the hidden states AND the parameter, or will it just choose one of the sample statements (either the one associated with alpha, or one of the hidden states) and resample that one?

Thanks again
-Isaac               

null-a

May 15, 2020, 12:42:20 PM
to webppl-dev
> my understanding is that each step of MCMC will choose a single sample statement to resample, in order to generate the next proposal. But in this case, most of the sample statements aren't actually included in the return value.

It doesn't matter what's included in the return value; MCMC with the MH kernel (which you're using here) always(*) perturbs the value at a single `sample` statement at each MCMC step.

(*) Assuming that the model structure is fixed, i.e. that every time the model is run the same set of `sample` statements are encountered. I think this was true of the earlier models we discussed, and I guess it's true here too, but let me know if not. The story is a little bit more complicated when this isn't the case.
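[Editor's note: a hypothetical example of the caveat above, added for illustration; the model is not from the thread.]

```webppl
// A model whose set of `sample` statements changes between runs:
// which distribution gets sampled depends on the value of the first flip,
// so different executions encounter different random choices.
var model = function() {
    var isHeavyTailed = flip(0.5);
    return isHeavyTailed ? sample(Cauchy({location: 0, scale: 1}))
                         : sample(Gaussian({mu: 0, sigma: 1}));
};
```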

Isaac Davis

May 15, 2020, 12:51:02 PM
to webppl-dev
Ah ok, thanks for the clarification!

Isaac Davis

May 15, 2020, 6:39:09 PM
to webppl-dev
One more quick question: when I use the "verbose" option for MCMC, it reports the iteration number and the acceptance ratio.

Does each iteration correspond to a single "generate proposal + accept or reject," or does each iteration correspond to "generate proposals until one is accepted"?

null-a

May 16, 2020, 11:53:38 AM
to webppl-dev
No worries. The iteration counter is incremented after every proposal, whether or not it is accepted. (Since if we reject, we've still taken an MCMC step, and potentially collected a sample, it's just that we've stayed in the same state we were already in.)

Isaac Davis

May 16, 2020, 3:07:41 PM
to webppl-dev
That's what I suspected, thanks for clarifying!