Hi folks. I have a question about how the Observe operator works inside a recursive loop.
I'm working on a model with the following generative form:
1) sample a hidden state H from a prior distribution P_H
2) sample a sequence of observable states s_0, ..., s_n, where
-s_0 is drawn from a distribution P(s_0|H) (dependent on H), and
-s_t is drawn from a distribution P(s_t|s_t-1, H) (dependent on both the previous observable state and the hidden state).
The inference problem is to infer the value of H given a sequence of observable states s_0, ..., s_n.
Since the sequences can vary in length, I use the Factor operator inside a recursive loop to condition on the data:
var inferH=function(data, H_prior, transitionDist){
return Infer({method: 'MCMC', samples: 1000}, function(){
//inner recursive loop
var factor_data=function(states, H){
if (states.length==data.length){return true}
else{
var current=states[states.length-1];
var next=sample(transitionDist(current, H));
factor(next==data[states.length]? 0 : -Infinity)
return factor_data(states.concat(next), H)
}
}
//sample hidden state
var H=sample(H_prior);
//add factors
var factors=factor_data([], H);
return H
})
}
The function "transitionDist(state, H)" returns the distribution P(s_t|s_t-1, H) described above (or P(s_0|H) if the "state" argument is empty).
My understanding is that the "factor" statements will stack additively, so long as they are within the scope of the same Infer statement. I've run this program on a few test examples and it seems to work correctly. However, some of the data sequences can be quite long, and in those cases the MCMC sampler will take a very long time to initialize (or fail to initialize at all).
To circumvent this, I tried replacing most of the "sample + factor" statements with an Observe statement as follows:
var inferH_alt=function(data, H_prior, transitionDist){
return Infer({method: 'MCMC', samples: 1000}, function(){
//inner recursive loop
var observe_data=function(states, H){
if (states.length==data.length){return true}
else{
var current=states[states.length-1];
var next=data[states.length];
observe(transitionDist(current, H), next)
return observe_data(states.concat(next), H)
}
}
//sample hidden state
var H=sample(H_prior);
//add factors
var factors=observe_data([], H);
return H
})
}
This version only samples the initial hidden state H. Then it Observes each data point being generated by the appropriate transition distribution. I tried running this version on the same set of test examples. It does run, but instead of outputting the correct posterior distribution over H, it assigns all of the probability mass to a single value of H (and the value changes from run to run). That is, the first version (with sample+factor) will correctly output something like
Values: [0, 1, 2], Probs: [.3, .3, .4]
But the second version (with observe) outputs something like
Values: [0], Probs [1]
each time (and the particular value it outputs is not fixed).
So, I'm wondering if perhaps the Observe operator does not stack like the factor operator in this sort of recursive loop. I.e., perhaps the second program only counts the first/last observe statement and discards the rest? (and, if so, what else I can do about the failure to initialize on longer data sequences).
I checked the documentation on Factor and Observe (as well as the chapter on "Conditioning" in the online textbook) but could not find an answer.
Any insights or suggestions would be much appreciated.
Thanks!
-Isaac