poisson vs. neg_binomial for salm.stan example


Linas Mockus

Apr 17, 2014, 6:08:54 PM
to stan-...@googlegroups.com
Hi,

For overdispersed data, the negative binomial is commonly used. I am currently working with overdispersed particle count data, and a model analogous to salm fits quite well. I am wondering whether replacing

    y[dose,plate] ~ poisson(exp(alpha_star
                                + beta * centered_logx[dose]
                                + gamma * centered_x[dose]
                                + lambda[dose,plate]));

with

    mu[dose,plate] <- alpha_star
                      + beta * centered_logx[dose]
                      + gamma * centered_x[dose]
                      + lambda[dose,plate];
    y[dose,plate] ~ neg_binomial(shape[dose], shape[dose] / exp(mu[dose,plate]));

will do even better. What prior should I put on shape: gamma(0.001, 0.001)?

I already tried the negative binomial on my problem, but it didn't converge. After removing lambda it converged, but the WAIC was then higher than with the Poisson. Any suggestions on how to improve the model?

Thanks for help,
Linas

Bob Carpenter

Apr 18, 2014, 9:04:17 AM
to stan-...@googlegroups.com

On Apr 18, 2014, at 12:08 AM, Linas Mockus <linasm...@gmail.com> wrote:

> Hi,
>
> For overdispersed data negative binomial is commonly used. I am currently working with overdispersed particle count data and the model analogous to salm fits quite well. I am wondering if replacing
> y[dose,plate] ~ poisson(exp(alpha_star
> + beta * centered_logx[dose]
> + gamma * centered_x[dose]
> + lambda[dose,plate]) );

Just as an aside, we have a poisson_log() distribution defined
directly on the log scale:

poisson_log(y|alpha) = poisson(y|exp(alpha))
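
A quick numerical check of that identity, written as a sketch in plain Python rather than Stan. The function names below are illustrative stand-ins, not Stan's API; they only spell out the Poisson log-probability on the two scales.

```python
import math

def poisson_lpmf(y, rate):
    """Log-probability of count y under Poisson(rate)."""
    return y * math.log(rate) - rate - math.lgamma(y + 1)

def poisson_log_lpmf(y, alpha):
    """Same log-probability, parameterized by alpha = log(rate).

    Working directly with alpha avoids the log(exp(alpha)) round trip.
    """
    return y * alpha - math.exp(alpha) - math.lgamma(y + 1)

# Compare the two forms over a small grid of counts and log-rates.
for y in (0, 3, 17):
    for alpha in (-2.0, 0.5, 3.0):
        assert math.isclose(poisson_lpmf(y, math.exp(alpha)),
                            poisson_log_lpmf(y, alpha),
                            rel_tol=1e-9, abs_tol=1e-12)
```

In the model above, this would mean passing the linear predictor to poisson_log directly instead of wrapping it in exp().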

> with
> mu[dose,plate] <-alpha_star
> + beta * centered_logx[dose]
> + gamma * centered_x[dose]
> + lambda[dose,plate];
> y[dose,plate] ~ neg_binomial(shape[dose], shape[dose]/exp(mu[dose,plate]) );

And this makes it seem pretty clear that we want a version of
neg binomial directly parameterized in terms of mean and
dispersion.
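
To make the mapping concrete, here is a small Python sketch (not Stan code) that checks what the quoted reparameterization implies. It assumes neg_binomial(alpha, beta) takes a shape alpha and an inverse scale beta, so that the mean is alpha/beta; the helper names and the numeric values are illustrative only.

```python
import math

def neg_binomial_lpmf(y, alpha, beta):
    """Log-pmf in the (alpha, beta) form: shape alpha, inverse scale beta."""
    return (math.lgamma(y + alpha) - math.lgamma(y + 1) - math.lgamma(alpha)
            + alpha * math.log(beta / (beta + 1.0))
            - y * math.log(beta + 1.0))

def moments(alpha, beta, upper=2000):
    """Mean and variance by summing the pmf over 0..upper-1."""
    probs = [math.exp(neg_binomial_lpmf(y, alpha, beta)) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Illustrative values: shape 4 and a target mean of exp(mu) = 10.
shape, mu = 4.0, math.log(10.0)
mean, var = moments(shape, shape / math.exp(mu))

# Under this parameterization the mean is exp(mu) and the
# variance is mean + mean^2 / shape.
assert math.isclose(mean, 10.0, rel_tol=1e-9)
assert math.isclose(var, 10.0 + 10.0 ** 2 / shape, rel_tol=1e-9)
```

So passing shape and shape/exp(mu) yields a mean of exp(mu), with shape acting as an inverse dispersion: larger shape moves the variance back toward the Poisson case.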

>
> will do even better. What prior should I put on shape - gamma(0.001,0.001)?

You should choose something more realistic for what you think the shape
might be. Really fat gamma priors can lead to problems in sampling due
to posteriors getting too fat.

>
> I already tried using negative binomial for my problem but it didn't converge. After removing lambda it converged but then waic was higher than the one when using poisson.

If you don't have hierarchical priors for the lambda and reasonable
priors on alpha_star, beta, and gamma, you can get a very fat
posterior here, too --- basically a non-identification of the parameters,
which have an additive invariance in alpha_star and lambda.
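
The additive invariance can be seen with a few lines of Python. This is only a sketch with made-up values (none come from the salm data), and log_rate is a hypothetical helper mirroring the linear predictor in the quoted model.

```python
def log_rate(alpha_star, beta, gamma, lam, centered_logx, centered_x):
    """Linear predictor on the log scale for one (dose, plate) cell."""
    return alpha_star + beta * centered_logx + gamma * centered_x + lam

# Illustrative values only.
alpha_star, beta, gamma = 2.0, 0.3, -0.001
lambdas = [0.10, -0.25, 0.40]
centered_logx, centered_x = 0.7, 35.0

# Add a constant to alpha_star and subtract it from every lambda.
shift = 5.0
original = [log_rate(alpha_star, beta, gamma, lam, centered_logx, centered_x)
            for lam in lambdas]
shifted = [log_rate(alpha_star + shift, beta, gamma, lam - shift,
                    centered_logx, centered_x)
           for lam in lambdas]

# The predictor, and hence the likelihood, is unchanged up to rounding.
max_diff = max(abs(a - b) for a, b in zip(original, shifted))
assert max_diff < 1e-12
```

Because the likelihood cannot tell these two settings apart, only the priors pin down where alpha_star ends and lambda begins.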

- Bob

> Any suggestions how to improve the model?
>
> Thanks for help,
> Linas
>

Luc Coffeng

Apr 19, 2014, 8:17:29 PM
to stan-...@googlegroups.com
Working on NB regression modeling currently (not in Stan though, unfortunately), I got excited seeing the topic of this post!

Bob, if the development team is considering a direct parameterization of the NB distribution in terms of mean and dispersion, it'd be great to make a distinction between the NB1 and NB2 parameterizations (Linas is using the latter), two commonly used alternatives in negative binomial regression (Cameron & Trivedi, Regression Analysis of Count Data, 2013):

NB1:
  y ~ poisson(lambda)
  lambda ~ gamma(mu/dispersion, dispersion)
  variance = mu + mu * dispersion = mu * (1 + dispersion)

NB2:
  y ~ poisson(lambda)
  lambda ~ gamma(1/dispersion, mu*dispersion)
  variance = mu + mu^2 * dispersion = mu * (1 + dispersion * mu)

Thanks,

Luc

Marco Inacio

Apr 21, 2014, 10:35:46 AM
to stan-...@googlegroups.com

NB2 is already included in the Stan development branch.

It's the first time I've seen the NB1 you describe; I think it's the seventh different parametrization of the negative binomial I've seen.

Avraham Adler

Apr 23, 2014, 1:32:23 AM
to stan-...@googlegroups.com
Agreed. Unfortunately, the negative binomial has so many different parametrizations that it tends to cause massive confusion when people from different backgrounds try to talk about the same problem. At the very least, I think we should have a decently sized footnote in this section of the manual making it absolutely clear that Stan uses the Gelman et al. parametrization; those of us more comfortable with other versions should use increment_log_prob.

Luc Coffeng

Apr 27, 2014, 6:51:24 PM
to stan-...@googlegroups.com
Yeah, I can totally appreciate why you wouldn't want to include seven parameterizations. A footnote in the manual would be great!
Luc

Marco Inacio

Apr 29, 2014, 12:35:46 AM
to stan-...@googlegroups.com
I created a pull request to improve this part of the documentation: https://github.com/stan-dev/stan/pull/629

