On 9/17/13 10:41 AM, Martin Šmíra wrote:
> Hello Stan users,
>
> I'm bringing another interesting model from EJ's book
> (https://webfiles.uci.edu/mdlee/BB_Free.pdf). It is the latent mixture
> model Exam_2 on page 79.
I've been meaning for ages to write a mixture model like this
for Mechanical Turkers who submit completely random results.
It's a little more complicated in that there's no official answer
key, but not much. It's why I recommended checking out the
Dawid and Skene (1979) model. Maybe Lee and Wagenmakers get to it later.
> The purpose of the model is to separate examinees who were just guessing from those who have some knowledge.
Why use a (0,1)-truncated normal rather than a Beta
distribution for a chance-of-success parameter?
Even if I did use a Gaussian, I wouldn't recommend
the gamma(0.001,0.001) prior.
> My model (below) is working and doing the job, but I'm not sure about some parts of the code. Especially in the generated
> quantities block - whether I transformed lp__ for the mixture in the right way.
You need the Jacobian for the sigma = 1/sqrt(lambda) transform.
> The mixing actually differs slightly from mixing
> I have in Jags.
Not surprising. You're also missing the index on your log prob
sums. But I'm recommending just putting it in the model and
using our log_sum_exp op. See below.
> Isn't there a better way how to extract mixing proportions? And any other suggestions how to improve the model?
>
> I've also encountered one minor error when I was trying to put truncation on the sampling of phi: phi ~
> normal(mu,sigma)T[0,1]. It states: no matches for function name="normal_cdf" ... lower bound in truncation type does not
> match sampled variate in distribution's type. Why is that?
Truncation doesn't play very nicely with vectorization
in the way we're evaluating. We need a better error message.
Specific comments on the code below.
>
> /model ="/
> /# Exam Scores/
> /data { /
> / int<lower=0> p;/
> / int<lower=0> k[p];/
> / int<lower=0> n;/
> /}/
> /parameters {/
> / vector<lower=0,upper=1>[p] phi;/
> / simplex[2] mix[p];/
> / real<lower=.5,upper=1> mu;/
> / real<lower=0> lambda;/
> /} /
> /transformed parameters {/
> / real<lower=0> sigma;/
> / matrix[p,2] lp_for_z;/
> /
> /
> / sigma <- 1 / sqrt(lambda);/
> / for (i in 1:p) {/
> / lp_for_z[i,1] <- exp(log(mix[i,1]) + binomial_log(k[i],n,phi[i]));/
> / lp_for_z[i,2] <- exp(log(mix[i,2]) + binomial_log(k[i],n,.5));/
> / }/
See below.
> /}/
> /model {/
> / mu ~ beta(1,1)T[.5,1];/
> / lambda ~ gamma(.001,.001);/
This is where you need the Jacobian adjustment.
  log | d/d lambda  1/sqrt(lambda) |
    = log( (1/2) * lambda^(-3/2) )      // lambda > 0
    = log(1/2) - (3/2) * log(lambda)
and we can drop constants, so that's
lp__ <- lp__ - 1.5 * log(lambda);
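The closed form is easy to sanity-check numerically. A quick sketch in Python (not Stan), comparing the formula above against a central finite difference of sigma = lambda^(-1/2):

```python
import math

# Transform: sigma = lambda^(-1/2).  The log absolute derivative is
#   log |d sigma / d lambda| = log(1/2) - 1.5 * log(lambda)
def sigma_of(lam):
    return lam ** -0.5

def log_jacobian(lam):
    return math.log(0.5) - 1.5 * math.log(lam)

def log_jacobian_fd(lam, eps=1e-6):
    # Central finite difference of sigma_of, for checking the closed form.
    return math.log(abs(sigma_of(lam + eps) - sigma_of(lam - eps)) / (2 * eps))
```

Evaluating both at a few values of lambda shows they agree to well within floating-point finite-difference error.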
But you should really use a different prior altogether
here. The 0.001 business is just a (failed)
attempt to provide a "non-informative" prior.
The simplest thing to do is just constrain sigma
to a (reasonable) interval (not [0,10000] for this
model, where the sd can't be large because
of the truncation!).
> / phi ~ normal(mu,sigma); # !!!/
Here you need to loop over p.
for (i in 1:p)
  phi[i] ~ normal(mu,sigma) T[0,1];
(Note the loop index: phi[i], not phi[p], and don't reuse n,
which is already a data variable.)
I think it'd also make sense (actually not --- the whole
use of normal here is funny) to truncate to [0.5,1].
But really, just use a Beta prior!
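For what the T[0,1] is doing: truncation just renormalizes the density by the probability mass inside the interval. A Python sketch with the normal pdf/cdf written out by hand (illustrative only, not the Stan internals):

```python
import math

def norm_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def norm_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def trunc_norm_pdf(x, mu, sigma, lo=0.0, hi=1.0):
    # T[lo,hi] divides by the mass the untruncated normal puts on [lo, hi];
    # that's why the error message above mentions normal_cdf.
    if not lo <= x <= hi:
        return 0.0
    mass = norm_cdf(hi, mu, sigma) - norm_cdf(lo, mu, sigma)
    return norm_pdf(x, mu, sigma) / mass
```

A midpoint-rule integral over [0,1] confirms the renormalized density integrates to 1, unlike the raw normal pdf restricted to the interval.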
> /
> /
> / for (i in 1:p) {/
> / lp__ <- lp__ + log(sum(lp_for_z));/
You need the index lp_for_z[i] here and it should be right.
I assume this is a typo given that you had the loop.
The transformed params you had are these, but I'd just
get rid of them altogether:
> / for (i in 1:p) {/
> / lp_for_z[i,1] <- exp(log(mix[i,1]) + binomial_log(k[i],n,phi[i]));/
> / lp_for_z[i,2] <- exp(log(mix[i,2]) + binomial_log(k[i],n,.5));/
> / }/
Instead, replace the above loop with
for (i in 1:p)
lp__ <- lp__ + log_sum_exp(log(mix[i,1]) + binomial_log(k[i],n,phi[i]),
log(mix[i,2]) + binomial_log(k[i],n,.5));
It's more arithmetically stable. You can assign the inner terms
to a local variable if you think it's clearer.
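To see why the log_sum_exp form matters, here's a toy Python version of the trick. With log probabilities far below exp()'s underflow threshold, the naive exp-then-log route collapses to zero while the max-subtracted version is exact:

```python
import math

def log_sum_exp(a, b):
    # Subtract the max before exponentiating so neither exp() underflows.
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

# Two log probabilities well below exp()'s underflow threshold:
lp1, lp2 = -800.0, -805.0

naive_sum = math.exp(lp1) + math.exp(lp2)   # underflows to 0.0
stable = log_sum_exp(lp1, lp2)              # = -800 + log(1 + exp(-5))
```

Per-observation mixture likelihoods hit exactly this regime once n gets large, which is why working on the log scale throughout is the safe choice.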
> / }/
> /}/
> /generated quantities {/
> / vector[p] prob;/
> / int<lower=0,upper=1> z[p];/
> / vector<lower=0,upper=1>[p] theta;/
> //
> / for (i in 1:p) {/
> / prob[i] <- lp_for_z[i,1] / (lp_for_z[i,1] + lp_for_z[i,2]);/
> / z[i] <- bernoulli_rng(prob[i]);/
> / theta[i] <- (z[i] == 0) * .5 + (z[i] == 1) * phi[i];/
> / } /
> /}"/
You can do this, but it's more efficient for downstream inference
if you just work with the expectations for z. (See the Rao-Blackwell
theorem.)
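Concretely, that means reporting the posterior mixture-membership probability itself instead of a bernoulli_rng draw from it. A Python sketch with made-up numbers for one examinee (k, n, phi, and the mixture weights here are hypothetical, not from the model output):

```python
import math
from math import comb, log

# Hypothetical values for one examinee: k correct out of n items,
# success rate phi if they know the material, equal mixture weights.
n, k, phi, mix = 40, 35, 0.9, (0.5, 0.5)

def binomial_log(k, n, p):
    # Binomial log-pmf, analogous to Stan's binomial_log.
    return log(comb(n, k)) + k * log(p) + (n - k) * log(1 - p)

lp1 = log(mix[0]) + binomial_log(k, n, phi)   # component: knows the material
lp2 = log(mix[1]) + binomial_log(k, n, 0.5)   # component: pure guessing

# Rao-Blackwellized quantity: E[z | parameters, data], computed on the
# log scale, rather than a noisy bernoulli_rng draw of z itself.
m = max(lp1, lp2)
prob_knows = math.exp(lp1 - m) / (math.exp(lp1 - m) + math.exp(lp2 - m))
```

Averaging prob_knows over posterior draws gives the same posterior expectation as averaging sampled z's, but with strictly lower Monte Carlo variance.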
- Bob