Help with poisson regression stan code

972 views
Skip to first unread message

Noah

unread,
May 19, 2014, 8:51:40 PM5/19/14
to stan-...@googlegroups.com
Hi,

It's been ages since I've used Stan, and I've become stuck on what should be a trivial model.  Can somebody take a look and tell me what I'm missing here.

This is a poisson regression for student enrollments.  I have 27 predictor variables and 13 campuses.  There is a row of data for each county in the study (about 2,000)  The dependent variable is an integer count of student enrollment.

Inputs:
N = number of rows in the data set
J = number of campuses
campus = a vector of length N with an integer for each campus
X = my covariates

I'm trying to make this a multi-level model, where the campus effect is sampled from a normal distribution.  (later, I might force that effect to be centered on 0)  Also, from the look of the data, I might want to expand this to a zero-inflated model.  But first, I want to a basic hierarchical poisson model working.

Attempting to do this all in RStan to make life easier.

Stan gives me an error for the line "theta[campus[i]]"  The error is "indexes inappropriate for expression".

Code is below.

Thanks!

Enter code here...data {
   
int<lower=0> N;
   
int <lower=0> J;
    vector
[N] Y;
   
int<lower=0> campus;
    matrix
[N,27] X;

}
parameters
{
    vector
[27] betas;
    vector
[J] theta;
    real
<lower=0> sigma;
    real
<lower=0> tau;
}
transformed parameters
{
    vector
[N] Xbeta;
   
for(i in 1:N){
       
Xbeta[i] <-  theta[campus[i]] + X[i] * betas;
   
}
}
model
{
    sigma
~ gamma(0.01, 0.01);
    tau
~ gamma(0.01,0.01);
    betas
~ normal(0, sigma);

   
for (j in 1:J){
        theta
[j] ~ normal(0, tau);
   
}

    Y
~ poisson(Xbeta);

}


Doogan, Nathan

unread,
May 19, 2014, 9:08:15 PM5/19/14
to stan-...@googlegroups.com
It looks like you've initialized 'campus' as a scalar rather than a vector.


--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Noah Silverman

unread,
May 19, 2014, 9:23:42 PM5/19/14
to stan-...@googlegroups.com
Thanks Nathan,

That was it. 

Now, I just need to add a "zero inflated" layer to this.  Any suggestions?
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/xgK42dCNu4Y/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+...@googlegroups.com.

Marco Inacio

unread,
May 19, 2014, 11:42:24 PM5/19/14
to stan-...@googlegroups.com
A while ago I developed zi_poisson_log_log and zi_neg_binomial_log_log, so you would be able to call directly via:

Y ~ zi_poisson_log(Xbeta);

They are available here: https://github.com/randommm/stan/commits/feature/zi_dist

I haven't requested them to be included in Stan because, among other things, they are only parametrized in log-link function (that is I only have zi_poisson_log_log, not zi_poisson_log).

So you would need to compile it to be able to use.

As an alternative you can write the likelihood function and use increment_lp():
parameters {
    vector[27] betas;
    vector[27] gammas;

    vector[J] theta;
    real<lower=0> sigma;
    real<lower=0> tau;
}

transformed parameters {
    vector[N] Xbeta;
    vector[N] Xgamma;

    for(i in 1:N){
        Xbeta[i] <-  theta[campus[i]] + X[i] * betas;
        Xgamma[i] <-  theta[campus[i]] + X[i] * gammas;

    }
}
model{
    sigma ~ gamma(0.01, 0.01);
    tau ~ gamma(0.01,0.01);
    betas ~ normal(0, sigma);

    for (j in 1:J){
        theta[j] ~ normal(0, tau);
    }
  
    //Note 1: you can split your data Y in zeros and non-zeros at the beginning and vectorize all this.
    //Note 2: Poisson model with log-link function, bernoulli model with logit link function
    //Note 3: Very that this code is correct with a simple model.
    for (i in 1:N) {
      if (Y[i] == 0)
        increment_log_prob(log_sum_exp(Xgamma[i],-exp(Xbeta[i])) - log1p_exp(Xgamma[i]));
      else {
        Y[i] ~ poisson_log(Xbeta[i]);
        0 ~ bernoulli_logit(Xgamma[i]);
      }
    }

}



Also, in your code:

Y ~ poisson(Xbeta);

Are you intentionally using identity link function or do you wish to use log-link function:

Y ~ poisson_log(Xbeta);

Bob Carpenter

unread,
May 20, 2014, 10:58:13 AM5/20/14
to stan-...@googlegroups.com
There's a discussion of zero inflation along the same lines as
Marco's Stan code in the comparison to BUGS in the appendix to the
manual.

It wound up there by coincidence due to comparisons with
BUGS. Given that I'm now knee deep in the manual for 2.3.0, I just
added a section on zero inflation to the mixture models chapter.

- Bob
Reply all
Reply to author
Forward
0 new messages