> On Mar 13, 2017, at 2:09 AM, Christian Schuhegger <christian....@gmail.com> wrote:
>
> Hi Michael,
>
> thanks a lot for the link to your paper! This was exactly what I needed for developing a better understanding. My background is physics and your explanation of the analogy of HMC to gravitational systems in phase space was very helpful. It also makes clear to me now what the word “Hamiltonian” means in that context. I still need to read more and your references in that paper are on my reading list now :)
>
> I understand that NUTS is a variant/improvement of HMC, but otherwise the same “rules” apply for HMC as for NUTS. Is there any reason to continue to use the HMC sampler in Stan rather than NUTS?
Only for comparison and as a building block for other algorithms.
> If this analogy with the gravitational system is right
Potential energy = negative log density
Kinetic energy = from a momentum drawn as a random standard normal
at the start of each iteration
The algorithm then uses the leapfrog integrator to simulate the
Hamiltonian dynamics and draws a point along the trajectory. NUTS then
just controls how long the Hamiltonian dynamics are simulated forward
and backward in time and how a point is selected from along
the trajectory for the next draw.
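
To make the pieces concrete, here is a minimal sketch of plain HMC in
one dimension, using only the Python standard library. It is not Stan's
implementation. It assumes a unit mass, a fixed step size and number of
steps, and it takes the end point of the trajectory with a Metropolis
accept/reject step rather than selecting a point along the trajectory.
The names leapfrog, hmc_step, U and grad_U are made up for illustration.

```python
import math
import random

def leapfrog(q, p, grad_U, eps, n_steps):
    """Simulate Hamiltonian dynamics for n_steps leapfrog steps of size eps."""
    p -= 0.5 * eps * grad_U(q)        # initial half step for momentum
    for i in range(n_steps):
        q += eps * p                  # full step for position
        if i < n_steps - 1:
            p -= eps * grad_U(q)      # full step for momentum
    p -= 0.5 * eps * grad_U(q)        # final half step for momentum
    return q, p

def hmc_step(q, U, grad_U, eps, n_steps, rng):
    """One HMC iteration: U is the potential (negative log density)."""
    p0 = rng.gauss(0.0, 1.0)          # momentum: fresh standard normal draw
    q_new, p_new = leapfrog(q, p0, grad_U, eps, n_steps)
    h_old = U(q) + 0.5 * p0 * p0      # potential + kinetic at the start
    h_new = U(q_new) + 0.5 * p_new * p_new
    # Metropolis correction for the integrator's discretization error
    if math.log(1.0 - rng.random()) < h_old - h_new:
        return q_new
    return q

# Assumed toy target: standard normal, so U(q) = q^2 / 2 up to a constant.
U = lambda q: 0.5 * q * q
grad_U = lambda q: q

rng = random.Random(0)
q = 0.0
draws = []
for _ in range(2000):
    q = hmc_step(q, U, grad_U, eps=0.2, n_steps=10, rng=rng)
    draws.append(q)
```

NUTS replaces the fixed n_steps with an adaptive rule for how far to
simulate in each direction, which this sketch does not attempt.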
> then I have a question about multi-modal distributions. By analogy, a multi-modal system could have more than one stable orbit, e.g. a stable orbit around the earth plus a stable orbit around the moon. Will HMC/NUTS find all of the orbits, or do I have to be careful in such scenarios?
No. Nothing will. If there are only a few, finding them
won't be the big problem; deciding how much time to spend
on each will be.
> Would R_hat still be a valid “problem indicator” in such a scenario? I guess that R_hat would be different from 1 if two different chains find two different orbits? So my question would be how to deal with multi-modal systems?
Rhat will explode if the different chains fall into different modes.
That's one thing we use it to diagnose.
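
As a rough illustration, the sketch below computes the classic
(non-split) potential scale reduction factor from between-chain and
within-chain variance. Stan reports a split variant, so the numbers
will not match Stan's output exactly. The function name rhat and the
simulated chains are assumptions made up for this example.

```python
import random
import statistics

def rhat(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance
    W = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: variance of the chain means (between-chain variance over n)
    B_over_n = statistics.variance(chain_means)
    var_plus = (n - 1) / n * W + B_over_n
    return (var_plus / W) ** 0.5

rng = random.Random(1)

# Four simulated chains that all explore the same mode.
same_mode = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four simulated chains stuck in two well-separated modes.
two_modes = [[rng.gauss(mu, 1.0) for _ in range(1000)]
             for mu in (-3.0, -3.0, 3.0, 3.0)]

r_same = rhat(same_mode)
r_split = rhat(two_modes)
```

With the chains split across modes, the between-chain term dominates
and the value ends up far above 1, while the chains sharing one mode
give a value close to 1.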
> The original problem I am trying to solve, which led me to your video, is a mixture model of shifted gammas. This is not a real-world problem, just an exercise that I’ve chosen for myself in order to learn MCMC techniques.
You probably want to read Michael's case study (see our web site
under DOCS) on mixture models.
There aren't really any good techniques for fitting general multimodal
models. Mixture models we usually code directly and they wind up not
being so multimodal on their parameters.
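
One reading of "code directly" is to sum the component densities for
each observation on the log scale, so that no discrete component
indicator has to be sampled. The sketch below does that for a
two-component mixture of shifted gammas. It is an illustration under
that assumption, not the case study's code, and the names log_sum_exp,
shifted_gamma_lpdf and mixture_log_lik, along with the
(shape, rate, shift) parameterization, are made up here.

```python
import math

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    if m == -math.inf:
        return -math.inf
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def shifted_gamma_lpdf(y, shape, rate, shift):
    """Log density of a gamma(shape, rate) variable shifted right by shift."""
    x = y - shift
    if x <= 0:
        return -math.inf              # outside the support
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)

def mixture_log_lik(ys, theta, comp1, comp2):
    """Log likelihood with mixing proportion theta in (0, 1).

    comp1 and comp2 are (shape, rate, shift) tuples.
    """
    total = 0.0
    for y in ys:
        total += log_sum_exp(
            math.log(theta) + shifted_gamma_lpdf(y, *comp1),
            math.log1p(-theta) + shifted_gamma_lpdf(y, *comp2),
        )
    return total

# Example call with made-up data and parameter values.
ll = mixture_log_lik([1.2, 3.5, 6.1], 0.4, (2.0, 1.0, 0.5), (3.0, 1.0, 3.0))
```

The negative of this log likelihood, plus the negative log prior, would
play the role of the potential in the sampler.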
- Bob