I'd like to fit a truncated multivariate normal distribution. My model block would look like
model {
y ~ left_truncated_multi_normal_cholesky(mu, L, lower_bound);
}
But that sampling statement (in bold) isn't available. So my next idea was:
functions {
real left_truncated_multi_normal_cholesky_log (vector y, vector mu, matrix L, vector lower_bound) {
return( multi_normal_cholesky_log(y, mu, L) - (1 - multi_normal_cholesky_cdf_log(lower_bound, mu, L)) );
# I'm not sure if this is even the correct syntax, but hopefully it gets the idea across
}
}
model {
y ~ left_truncated_multi_normal_cholesky(mu, L, lower_bound);
}
But those aren't available either.
The former would be easy to implement (I think).
The latter, maybe not: https://github.com/stan-dev/stan/issues/158. That request links to a paper (http://www.math.wsu.edu/faculty/genz/papers/mvn.pdf) that gives an algorithm for computing the CDF of the multivariate normal. However, that algorithm (top of page 3) depends on being able to generate standard uniform random variables. I am also aware that this is also not implemented in Stan outside of the Generated Quantities block, and I feel like it might not be a good idea to try and build a pseudorandom number generator into my Stan program. And I don't know a thing about numerical integration. So basically I have no idea how else I might go about implementing multi_normal_cholesky_cdf_log or, for that matter, left_truncated_multi_normal_cholesky.
Of course, it'd be really nice if I could write a program like
data {
int K;
vector[K] y;
}
transformed data {
vector[K] lower_bound;
lower_bound <- to_vector(rep_array(0, K));
}
parameters {
vector[K] mu;
}
model {
y ~ multi_normal_cholesky(mu, L) T[lower_bound, ];
}
but I imagine that's not going to happen any time soon. Interestingly, this also seems to be impossible in JAGS, but I think it's possible in OpenBUGS. I would much rather use Stan and not OpenBUGS for this project, but as of right now I'm stuck on how.
That said, I'm only using the truncated normal for convenience and generality, not for any substantive reason. The alternative model I had in mind was a Gaussian copula with zero-inflated Gamma marginals. Should I just forget the truncation business and use that instead?