Unfortunately we’re overdue for a serious rewrite of the manual,
especially given our rapidly evolving understanding of subtle
theoretical MCMC issues. I have everything collected into
slides, but turning it into text will have to wait until I have some
free time. But I’ll summarize the results here.
MCMC is just a method for numerically estimating integrals,
which is hard in high dimensions because of something called
concentration of measure. Essentially the only parameters
that contribute to integrals lie on a hypersurface in parameter
space, and if we want to estimate integrals then we need to
find that hypersurface and then explore it. That’s exactly
what a Markov chain does: eventually we have a series
of states distributed across the hypersurface, { y_{n} }, that
we can use as a grid for doing really simple quadrature,
\int dy p(y) f(y) ~ 1/N \sum_{n = 1}^{N} f(y_{n})
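To make the quadrature concrete, here is a minimal sketch in base R. It assumes p(y) is a standard normal and uses independent rnorm draws as a stand-in for the Markov chain states, so it shows only the averaging step, not the sampling. The choice f(y) = y^2 and the names mc_estimate and mc_se are mine, picked because the exact answer is 1.

```r
set.seed(1)

# Stand-in for the Markov chain states { y_n }: independent draws from
# p(y) = Normal(0, 1). A real chain would give correlated draws.
N <- 10000
y <- rnorm(N)

# Function whose expectation we want. With f(y) = y^2 the exact value of
# the integral is Var(y) = 1.
f <- function(y) y^2

# Monte Carlo estimate: (1/N) * sum_n f(y_n)
mc_estimate <- mean(f(y))

# Monte Carlo standard error. This formula is valid here only because the
# draws are independent; for a real chain N would be replaced by an
# effective sample size.
mc_se <- sd(f(y)) / sqrt(N)
```

The estimate should land within a few mc_se of 1, and the error shrinks like 1/sqrt(N).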
Everything relies on our being able to find and then explore
the _entire_ hypersurface (which we call the typical set).
Whether we have managed that is incredibly hard to determine
theoretically, so we’re left with a bunch of necessary but not
sufficient diagnostics to identify when we’re not exploring
everything.
All of these apply only to post-warmup samples. Because we
dynamically adapt sampling parameters in warmup, those samples
will behave very differently. Again, ignore anything you see
in warmup.
Traceplots:
The characteristic way pathologies manifest is the Markov
chain getting stuck at the boundary of a pathological region
of parameter space. If you run long enough you’ll literally
see the chain get stuck for long periods of time. You can
also compare chains to identify multimodality and other
problems.
R-hat:
If our Markov chain can explore the entire target distribution
then any realization of it should look the same. R-hat
essentially performs an analysis of variance on a bunch of
chains (and subsets of chains) looking for any deviations.
The more chains you use the more sensitive you’ll be to
potential pathologies.
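To show what the analysis of variance looks like, here is a rough base R sketch of the classic split-chain potential scale reduction. It is not necessarily the exact estimator Stan reports. The function name split_rhat and the simulated chains are made up for illustration.

```r
# chains: numeric matrix with one column per chain, one row per iteration.
split_rhat <- function(chains) {
  n <- nrow(chains) %/% 2
  # Split every chain in half so drift within a chain also shows up
  # as a between-chain difference.
  halves <- cbind(chains[1:n, , drop = FALSE],
                  chains[(n + 1):(2 * n), , drop = FALSE])
  W <- mean(apply(halves, 2, var))      # average within-chain variance
  B <- n * var(colMeans(halves))        # between-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(2)
# Four simulated chains that all target Normal(0, 1).
good <- matrix(rnorm(4000), ncol = 4)
# Same chains, but the first one is stuck around a different location.
bad <- good
bad[, 1] <- bad[, 1] + 3

rhat_good <- split_rhat(good)
rhat_bad  <- split_rhat(bad)
```

rhat_good should sit very close to 1, while the shifted chain pushes rhat_bad well above it.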
n-divergent:
Hamiltonian Monte Carlo has a unique diagnostic: regions
of the typical set that are hard to explore induce particular
numerical divergences that we can capture and report. This
is much more sensitive than looking at trace plots! If
you see any divergences in sampling then you have to be
careful. For a few divergences you can increase the target
acceptance probability (adapt_delta), but for any nontrivial
number of divergences you’ll probably have to consider tweaking
your model to use stronger priors or different parameterizations.
In RStan you can grab the divergences with
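fit <- stan(file='model_name.stan', data=input_data,
            iter=2000, chains=1, seed=4938483)

# Total number of post-warmup divergent transitions, summed over all chains.
# (Recent RStan releases name this column 'divergent__'.)
count_divergences <- function(fit) {
  sampler_params <- get_sampler_params(fit, inc_warmup=FALSE)
  sum(sapply(sampler_params, function(x) sum(x[,'n_divergent__'])))
}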
Finally the treedepth issue is not so much a diagnostic of a
bad Markov chain but rather a safeguard we’ve built in to
avoid infinite loops caused by ill-formed posteriors which
require the sampler to explore all the way to infinity and back
(requiring an infinite tree depth and infinite time). Sometimes
your sampler will need to explore beyond our default safeguard
value, in which case you have to increase it manually. This
doesn’t affect the validity of your results, just the performance
of the algorithm.
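For reference, both knobs mentioned so far (the target acceptance probability and the maximum treedepth) go through the control argument of stan(). The sketch below only builds the settings and wraps the call. The values 0.95 and 15 are arbitrary illustrations rather than recommendations, and refit, model_file and input_data are placeholder names.

```r
# Sampler settings passed through rstan's `control` argument.
sampler_control <- list(
  adapt_delta   = 0.95,  # target acceptance probability (default 0.8)
  max_treedepth = 15     # treedepth safeguard (default 10)
)

# Hypothetical wrapper: refit the same model with the adjusted settings.
# Nothing is run until refit() is called, and it needs rstan installed.
refit <- function(model_file, input_data, control = sampler_control) {
  rstan::stan(file = model_file, data = input_data,
              iter = 2000, chains = 4, control = control)
}
```

You’d call it as refit('model_name.stan', input_data) and then rerun the diagnostics on the new fit.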
I like looking at a histogram of the treedepths,
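# Histogram of post-warmup treedepths pooled over all chains;
# the vertical line marks the default maximum treedepth of 10.
hist_treedepth <- function(fit) {
  sampler_params <- get_sampler_params(fit, inc_warmup=FALSE)
  depths <- unlist(lapply(sampler_params, function(x) x[,'treedepth__']))
  hist(depths, breaks=0:20, main="", xlab="Treedepth")
  abline(v=10, col=2, lty=1)
}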
Technically the sampler tries one more step after maxdepth
which is why you might see it exceed that value by one. I
can never remember the indexing and ordering to state whether
you should see maxdepth or maxdepth + 1 but in the end it
doesn’t really matter as the bad behavior (histogram saturating
at a large value) is pretty obvious.
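If you’d rather have a number than eyeball the histogram, something like the following would do. It is a sketch that assumes the list-of-matrices layout get_sampler_params returns. The function name and the mock input are invented for illustration, and it uses >= so the maxdepth versus maxdepth + 1 ambiguity doesn’t matter.

```r
# sampler_params: list with one matrix per chain, in the layout returned by
# get_sampler_params(fit, inc_warmup = FALSE), with a 'treedepth__' column.
frac_at_max_treedepth <- function(sampler_params, max_depth = 10) {
  depths <- unlist(lapply(sampler_params, function(x) x[, 'treedepth__']))
  # Fraction of iterations that reached (or exceeded) the safeguard.
  mean(depths >= max_depth)
}

# Made-up sampler output for two chains, only to exercise the function.
mock_params <- list(
  cbind(treedepth__ = c(3, 4, 10, 10)),
  cbind(treedepth__ = c(5, 10, 4, 3))
)

frac <- frac_at_max_treedepth(mock_params)
```

A fraction well above zero is the numeric version of the histogram piling up at the cutoff.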