example of flaky chain

Bob Carpenter

unread,

Mar 11, 2014, 8:11:54 AM3/11/14

to stan...@googlegroups.com

Should we create a separate repo for problematic model/data + sampling
configs?

For the model I posted on Andrew's blog (data and model attached),
I was finding that with different seeds, one of the four chains would
sometimes fail.

With this call:

./pk_iv_oral sample id=19 random seed=1 data file=data.R

I get a traceplot for c0 where a chain temporarily jumps to a value
like 1e16, whereas other chains give it a posterior mean of around 9.
This, of course, drives n_eff to roughly 0 and tanks the R-hat stat.

The treedepth mean goes from 4 to 6, too.

You can tell from the traceplot that it's not purely an adaptation
problem.

- Bob

data.R

pk_iv_oral.stan

c0-traceplot.png

Ben Goodrich

unread,

Mar 11, 2014, 9:16:59 AM3/11/14

to stan...@googlegroups.com

On Tuesday, March 11, 2014 8:11:54 AM UTC-4, Bob Carpenter wrote:

Should we create a separate repo for problematic model/data + sampling
configs?

I don't know if we need a separate repo, but there is no great shortage of flaky BUGS examples. All of these were for 21 chains with a base seed of 12345 that is advanced by the chain_id but otherwise kept the default settings

Stan doesn't parse: fire
Stan blows up: salm2, salm, orange
Stan takes way too long: schools (not eight_schools)
Stan doesn't converge: hepatitisME, hepatitis, air, eyes
Stan has bad mixing at a minimum: blocker, kidney, leukfr, magnesium, seeds, birats, stagnant

In each of these cases (except fire), the model wasn't much better with JAGS, so I think it is more a situation where the models are bad (gamma(0,0) priors, etc.). And it underscores that these BUGS examples are not good for unit-testing, but it would be nice if Stan could do a little better.

Ben

Daniel Lee

unread,

Mar 11, 2014, 9:32:21 AM3/11/14

to stan...@googlegroups.com

Bob, I'd go with a separate repo. I think it makes sense if we're talking about an on-going effort of collecting models. If you don't want it to live under stan-dev for now, create it as a public repo in stan-privy (I can help if you come up with a name).

I think we should have some way to tag models and actually mark when they've been "solved." I think it would also be nice to have some history of how we solved it -- either through reparameterization, restricting the geometry of the problem through priors, or if it just got fixed in Stan's defaults.

Bob Carpenter

unread,

Mar 11, 2014, 9:32:46 AM3/11/14

to stan...@googlegroups.com

So the question is whether we should "fix" the priors in those
models to be more reasonable. I'll ping Andrew separately, as he
seems to be ignoring stan-dev, and will have the strongest opinions
on these.

We want examples people can follow to build models. Comparisons with
JAGS isn't the major goal any more, I don't think.

- Bob

> --
> You received this message because you are subscribed to the Google Groups "stan development mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-dev+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Bob Carpenter

unread,

Mar 11, 2014, 9:34:49 AM3/11/14

to stan...@googlegroups.com

I'd rather put things in stan-dev where they'd be visible.
The whole point (at least in my mind) is to create examples
for users. If our existing BUGS models are not good, I think
we should change them.

- Bob

Daniel Lee

unread,

Mar 11, 2014, 9:47:13 AM3/11/14

to stan...@googlegroups.com

Sure -- I was suggesting using a public stan-privy repo as a staging area.

We've managed to keep the clutter down on stan-dev and I think it makes sense that we continue to do that. We could start out the project: setting up the repo, making sure it has the right structure, seeding it with one or more examples, before moving it over to stan-dev. The transfer of repos is easy. And, of course, this would still be a public repo, just within stan-privy until we get it off the ground.

Bob Carpenter

unread,

Mar 11, 2014, 10:01:44 AM3/11/14

to stan...@googlegroups.com

However you'd like to structure it is OK with me.
Do you want to set it up? I can set it up.

I'm afraid I hate config so much that I haven't
even tried the new CmdStan setup. Speaking of which,
are there instructions somewhere of what I need to
do other than emails? I still need to get the manual
refactored for the next release.

- Bob

Daniel Lee

unread,

Mar 11, 2014, 10:09:20 AM3/11/14

to stan...@googlegroups.com

On Tue, Mar 11, 2014 at 10:01 AM, Bob Carpenter <ca...@alias-i.com> wrote:

However you'd like to structure it is OK with me.
Do you want to set it up? I can set it up.

If you have an idea of how it'll look, go ahead and set it up. I'm not sure I can think my way through it right now.

Michael Betancourt

unread,

Mar 11, 2014, 10:12:14 AM3/11/14

to stan...@googlegroups.com

Just so long as we have before and after. So, for example,

models/new_model/before

models/new_model/after

before has the nominal .stan file and a discussion of the pathology (with graphics and such).

after has the fixed .stan file with a discussion of the fix and possible evidence/comparisons.

Aki Vehtari

unread,

Mar 11, 2014, 1:55:20 PM3/11/14

to stan...@googlegroups.com

Bob: please include me to the prior discussions.

Aki

Bob Carpenter

unread,

Mar 11, 2014, 2:58:52 PM3/11/14

to stan...@googlegroups.com

There weren't any recent prior discussions.

To bring you up to speed, we've been planning to set up a repo
that has models that don't work, an explanation of why, a fixed version,
and an explanation of the fix.

Ben's listing some of the BUGS models that don't work in
Stan (or apparently very well in BUGS, either). The issue
is priors.

We also want to make a practice of storing this kind of
before and after remodeling (that was my best pun in ages) for
user-submitted models.

- Bob

Ben Goodrich

unread,

May 6, 2014, 6:53:23 PM5/6/14

to stan...@googlegroups.com

On Tuesday, March 11, 2014 9:16:59 AM UTC-4, Ben Goodrich wrote:

I don't know if we need a separate repo, but there is no great shortage of flaky BUGS examples. All of these were for 21 chains with a base seed of 12345 that is advanced by the chain_id but otherwise kept the default settings

Stan doesn't parse: fire
Stan blows up: salm2, salm, orange
Stan takes way too long: schools (not eight_schools)
Stan doesn't converge: hepatitisME, hepatitis, air, eyes
Stan has bad mixing at a minimum: blocker, kidney, leukfr, magnesium, seeds, birats, stagnant

In each of these cases (except fire), the model wasn't much better with JAGS, so I think it is more a situation where the models are bad (gamma(0,0) priors, etc.). And it underscores that these BUGS examples are not good for unit-testing, but it would be nice if Stan could do a little better.

There is a discussion on stan-users about the blocker example in volume 1 of BUGS. Whether you get reasonable results depends on the seed, number of chains, number of iterations, etc., which is to say I don't think that Stan gives reasonable results. With these arguments:

library(rstan)
blocker <- stan_demo("blocker", seed = 12345, chains = 21, iter = 10000)

you get low Rhat values for all quantities

> summary(summary(blocker)$summary[,"Rhat"])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.000   1.001   1.001   1.002   1.002   1.010

but chain 6, for example, does not mix well (see the clump below the diagonal)

pairs(blocker, pars = c("d", "delta_new", "sigma_delta", "lp__"), chain_id = list(6, c(1:5,7:21)))

Hopefully a user would notice that the ratio of n_eff to nm is low for several parameters, but a more concise way of saying it is that it takes at least 20 lags for the draws to become essentially independent (and it looks like the adaptation was thrown off somewhat):

Finally, you would reject the null hypothesis that all the chains have the same distribution, which is appropriate in light of chain 6:

> convergence(blocker, thin = 25, R = 199)
[1] "Multivariate loss = 1.004"
disco(x = mat, factors = chain_id, distance = FALSE, R = R)

Distance Components: index  1.00
Source                 Df   Sum Dist  Mean Dist    F-ratio    p-value
factors                41  543.49225   13.25591      1.865      0.005
Within               4158 29555.97387    7.10822
Total                4199 30099.46612

There are some mitigating circumstances (it's a bad model, many of the chains seem fine, BUGS gets similar results, etc.) but this isn't great.

Ben

Reply all

Reply to author

Forward