Hierarchical Bayesian model and constrained optimization in RStan?!


Mohammed Effat

unread,
May 9, 2017, 3:34:27 AM5/9/17
to Stan users mailing list
Hi all,
I have a complex pdf based on a hierarchical Bayesian formalism, where x depends on the hyperpriors w' and w'' as w = Gamma(beta/2, 2/zeta), where Gamma is the gamma distribution with parameters beta and zeta. The resulting negative log likelihood for the joint posterior pdf is in the attached picture. I want to sample all of w', w'', and x. My approach is to use the Hamiltonian Monte Carlo sampler. I have the following questions:

1) I thought before that I could split the sampling process: I sample w' and w'' from the gamma distribution first. Once they become known values, they can be taken out of the exponential as normalizing constants, leaving only x to be sampled. As I have the constraint x >= 0, this makes the posterior a truncated multivariate Gaussian. Consequently, I thought of sampling x using the exact algorithm by Pakman and Paninski. Is this approach feasible?

2) I want to use the HMC algorithm in Stan; however, I want to know whether the optimizer in Stan for estimating the MAP supports the constraint x >= 0, or only x > 0.

3) Can you refer me to a good RStan example to train myself how to sample complex hierarchical posteriors like the one attached?

Thanks and best regards

Notes:
Exact HMC algorithm: https://arxiv.org/abs/1208.4118
lambda, beta, zeta are given input parameters


Bob Carpenter

unread,
May 10, 2017, 2:32:13 PM5/10/17
to stan-...@googlegroups.com

> On May 9, 2017, at 3:34 AM, Mohammed Effat <mbef...@gmail.com> wrote:
>
> Hi all,
> I have a complex pdf based on hierarchical Bayesian formalism where x depends on the hyperpriors w' and w'' as w=Gamma(beta/2,2/zeta)

??? w' and w'' don't show up in this expression. Can you write out
what your density is or how to define a random variable with that
distribution?

> where Gamma is the gamma distribution with parameters beta and zeta. The resulting negative log likelihood for joint posterior pdf is in the attached picture. I want to sample both of w', w'', and x. My approach is to use the Hamiltonian Monte Carlo sampler. I have the following questions below:
>
> 1) I thought before that I can split the sampling process that, I sample w' and w'' from the gamma distribution first.

No. Stan doesn't split anything. You want to do everything jointly. All you
need to do is give it a continuously differentiable log density with parameters defined
with appropriate constraints.
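
For concreteness, a minimal Stan sketch of that joint model might look like
this -- the variable names and the gamma parameterization are assumptions on
my part, since the actual density is only in your attachment:

  data {
    int<lower=1> N;
    real<lower=0> beta;
    real<lower=0> zeta;
  }
  parameters {
    real<lower=0> w1;          // w' (name assumed)
    real<lower=0> w2;          // w'' (name assumed)
    vector<lower=0>[N] x;      // x >= 0 comes from the declared lower bound
  }
  model {
    // Stan's gamma takes (shape, rate); if 2/zeta in your notation is a
    // scale parameter, the corresponding rate is zeta / 2, as written here
    w1 ~ gamma(0.5 * beta, 0.5 * zeta);
    w2 ~ gamma(0.5 * beta, 0.5 * zeta);
    // ... likelihood tying x to w1 and w2 goes here, from the attachment
  }

Note that <lower=0> maps the parameter to the positive reals through a log
transform, so the sampler works on an unconstrained scale and the boundary
point x = 0 itself has probability zero.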

> And as they become known values, they can be taken out from the exponential as normalizing constants, leaving only x to be sampled. As I have the constraint x >= 0, this makes the posterior a truncated multivariate Gaussian. Consequently, I thought of sampling x using the exact algorithm by Pakman and Paninski. Is this approach feasible?
>
> 2) I want to use the HMC algorithm by Stan, however I want to know if the optimizer in Stan to estimate the MAP supports the x>=0 constraint or it is for x>0 only?

Don't use MAP if you can fit a Bayesian estimate unless someone's making you do it because they don't like Bayes.

>
> 3) Can you refer me to a good RStan example to train myself how to sample complex hierarchical posteriors like the one attached?

Most of our examples are of that form. Check out the case studies
on the web.

- Bob

Bob Carpenter

unread,
May 10, 2017, 2:34:15 PM5/10/17
to stan-...@googlegroups.com
If you just want to code up what you attached, you can code it exactly
as written, or you can modify that "penalty" term to look like a normal
prior on coefficients. Then you can do maximum likelihood. But it
might not be defined if it's hierarchical---usually you need MML
(maximum marginal likelihood) for hierarchical MLEs. Stan should be able
to do the Bayesian version, though, which doesn't suffer from the
pathologies of MLE for hierarchical models.
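
As a sketch of that rewrite, assuming the penalty is quadratic,
lambda * sum(x^2), with lambda among the given inputs mentioned in the
original post:

  data {
    int<lower=1> N;
    real<lower=0> lambda;
  }
  parameters {
    vector<lower=0>[N] x;
  }
  model {
    // log normal(x | 0, sigma) = -x^2 / (2 * sigma^2) + const, so a
    // penalty of lambda * x^2 corresponds to sigma = 1 / sqrt(2 * lambda)
    x ~ normal(0, 1 / sqrt(2 * lambda));
  }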

- Bob


Boylan, Ross

unread,
May 12, 2017, 2:56:45 PM5/12/17
to stan-...@googlegroups.com

>If you just want to code up what you attached you can code it exactly
>as written or you can modify that "penalty" term to look like a normal
>prior on coefficients. Then you can do maximum likelihood. But it
>might not be defined if it's hierarchical---usually you need MML for
>hierarchical MLEs. Stan should be able to do the Bayesian version, though,
>which doesn't suffer from the pathologies of MLE for hierarchical models.

>- Bob

Can you provide a pointer to those pathologies?
Ross Boylan

Bob Carpenter

unread,
May 13, 2017, 7:16:34 PM5/13/17
to stan-...@googlegroups.com
Better yet, I can give you the math:

parameters {
  vector[N] alpha;
  real mu;
  real<lower=0> sigma;    // scale must be positive
}
model {
  alpha ~ normal(mu, sigma);
  mu ~ ... ;
  sigma ~ ... ;
}

as sigma -> 0 and alpha[n] -> mu for all n in 1:N,
the density grows without bound so there's no maximum likelihood
solution.
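
Concretely, dropping constants, the log density contributed by alpha is

  log p(alpha | mu, sigma)
    = -N * log(sigma) - sum_{n in 1:N} (alpha[n] - mu)^2 / (2 * sigma^2)

and with alpha[n] = mu for all n, the sum vanishes, leaving -N * log(sigma),
which goes to +infinity as sigma -> 0.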

Greene's probability theory book goes over it in detail, and
there's a nice discussion in MacKay's book under "EM goes boom".

It's also the reason systems like lme4 use max marginal likelihood
---the max likelihood itself doesn't exist.

- Bob

Ben B

unread,
May 13, 2017, 8:15:48 PM5/13/17
to stan-...@googlegroups.com
Minima get really finicky even if it's just a minimization you're working on.

It's pretty common to have problems that seem like they would have minima, but in the end something crazy happens instead. It's more than just local minima keeping you from the global minimum (assuming at least one exists). There can be issues with saddle-point sorts of things, for instance (where you're always going down but limiting to something much larger than the global minimum -- this might require the domain of your function to be unbounded, I forget).

I like running the Stan optimizer on my problems occasionally just to see where it will go. It's one of the easier ways to validate to yourself or your colleagues (or your boss) all that work you did on the Bayesian stuff :D.

Ben
