Help with Poisson AR(1) model


Fabio Zottele

unread,
Feb 1, 2016, 1:48:01 PM2/1/16
to Stan users mailing list
Hi all.
I am really new to both hierarchical modelling and Stan, and I am trying to fit a Poisson autoregressive model of order 1 as described in Brandt & Williams (2000) (http://www.utdallas.edu/~pbrandt/pests/parp.pdf).
The model is formalised on page 9:

y_t ~ Poisson(m_t)  for all t
m_t = \rho * y_{t-1} + (1 - \rho) * exp(a + b*t)
m_t ~ \Gamma(\sigma_{t-1} * m_{t-1}, \sigma_{t-1})

For a reproducible example I use R's "gsarima" package:

library(gsarima)

N   <- 100
rho <- 0.8
y   <- garsim(n = N, phi = c(rho), family = "poisson",
              X = data.frame(intercept = rep(1, N), x = seq(0:(N-1))),
              beta = c(intercept = 0, alpha = 0.6))
data <- data.frame(X = seq(0:(N-1)), Y = y)
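[For readers without gsarima, here is a rough Python sketch of the same kind of simulation. It assumes an identity link for the conditional mean, i.e. m_t = rho*y_{t-1} + (1-rho)*(a + b*t); the exact defaults of garsim may differ, and the parameter values are illustrative only.]

```python
import numpy as np

def simulate_par1(n=100, rho=0.8, a=0.0, b=0.6, seed=42):
    """Simulate a Poisson AR(1)-style count series where the conditional
    mean mixes the previous count with a linear trend (identity link)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n, dtype=int)
    # mean for the first observation is just the trend at t = 0
    y[0] = rng.poisson(max(a, 1e-8))
    for t in range(1, n):
        m = rho * y[t - 1] + (1 - rho) * (a + b * t)
        y[t] = rng.poisson(max(m, 1e-8))  # clamp to keep the rate positive
    return y

y = simulate_par1()
```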



With the following Stan code I would like to fit the values of \rho, the intercept and \beta. Here's the code:

data {
    int<lower=1> N;
    int Y[N];
    real X[N];
}
parameters {
    real beta;
    real intercept;
    real rho;
    real<lower=0> sigma0;
}
transformed parameters {
    real m[N];
    m[1] <- exp(beta*X[1] + intercept);
    for (t in 2:N) {
        m[t] <- rho*Y[t-1] + (1-rho)*exp(beta*X[t] + intercept);
    }
}
model {
    real sigma[N];
    beta ~ cauchy(0, 5);
    intercept ~ cauchy(0, 5);
    rho ~ uniform(0, 1);
    sigma0 ~ gamma(3, 1);
    sigma[1] <- sigma0;
    for (t in 2:N) {
        m[t] ~ gamma(sigma[t-1]*m[t-1], sigma[t-1]);
    }
    for (i in 1:N) {
        Y[i] ~ poisson(m[i]);
    }
}

 
No luck however.  I get a bunch of errors and the model won't fit.

[....]
[595] "Error evaluating the log probability at the initial value."
[596] "Informational Message: The current Metropolis proposal is about to be rejected because of the following issue:"                       
[597] "Exception thrown at line 37: stan::math::gamma_log: Shape parameter is nan, but must be > 0!"                                         
[598] "If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,"
[599] "but if this warning occurs often then your model may be either severely ill-conditioned or misspecified."                             
[600] "Rejecting initial value:"                                                                                                             
[601] "  Error evaluating the log probability at the initial value."                                                                         
[602] "Initialization between (-2, 2) failed after 100 attempts. "
[603] " Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model."                                
error occurred during calling the sampler; sampling not done

I tried playing with the priors but without luck. Any help is appreciated.

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                
 [9] LC_ADDRESS=C               LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

other attached packages:
[1] rstan_2.9.0   ggplot2_2.0.0 gsarima_0.1-4

loaded via a namespace (and not attached):
 [1] colorspace_1.2-6 scales_0.3.0     MASS_7.3-45      plyr_1.8.3     
 [5] tools_3.2.3      parallel_3.2.3   inline_0.3.14    gtable_0.1.2   
 [9] gridExtra_2.0.0  Rcpp_0.12.3      codetools_0.2-14 grid_3.2.3     
[13] stats4_3.2.3     munsell_0.4.2

AR1.R
AR1.stan

Bob Carpenter

unread,
Feb 1, 2016, 2:03:22 PM2/1/16
to stan-...@googlegroups.com
Thanks so much for sending the model in math notation along
with the full Stan code and errors! It makes it sooo much
easier on us responding.

You need to define variables before using them in Stan.
They are all initialized as NaN to help debugging in
cases like these. Here, you have this:

model {
    real sigma[N];
    beta ~ cauchy(0, 5);
    intercept ~ cauchy(0, 5);
    rho ~ uniform(0, 1);
    sigma0 ~ gamma(3, 1);
    sigma[1] <- sigma0;
    for (t in 2:N) {
        m[t] ~ gamma(sigma[t-1]*m[t-1], sigma[t-1]);
    }
}

so you're using terms sigma[2], ..., sigma[N] that aren't
defined.

You want to define the vector sigma as a parameter.

parameters {
    ...
    vector<lower=0>[N] sigma;
    ...
}

model {
    ...
    sigma[1] ~ gamma(3, 1);
    sigma[2:N] ~ ...; // whatever prior you want here for the remaining sigmas

    ... the rest of your model
}

Also, every parameter needs to have support (non-zero density) at every
value its declaration allows, which rho violates:

parameters {
    ...
    real rho;
    ...
}

model {
    ...
    rho ~ uniform(0, 1);
}

So if you really want rho to be uniform(0, 1), you need to declare

real<lower=0, upper=1> rho;

After you get the model working, send it back and we can
talk about efficiency.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
> <AR1.R><AR1.stan>

Bob Carpenter

unread,
Feb 1, 2016, 2:22:21 PM2/1/16
to stan-...@googlegroups.com
The other issue you'll run into with this parameterization
is that you'll need to account for the Jacobian in transforming
m[t]. I don't understand why the authors formulated the model
this way.
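[For context, this refers to the standard change-of-variables rule: if a distribution is placed on a quantity that is a deterministic transform of parameters, the density in terms of the raw parameters picks up a Jacobian term. A sketch of the one-dimensional case (notation mine, not from the paper; the recursive multi-parameter case in the paper is harder, which is the point of the remark):]

```latex
% If m = f(\theta) is a one-to-one scalar transform and we assert
% m \sim \mathrm{Gamma}(\alpha, \beta), the implied log density on
% the raw parameter \theta must include the log absolute derivative:
\log p(\theta) = \log \mathrm{Gamma}\!\bigl(f(\theta) \mid \alpha, \beta\bigr)
               + \log \bigl| f'(\theta) \bigr|
```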

- Bob

> On Feb 1, 2016, at 1:48 PM, Fabio Zottele <fabio....@fmach.it> wrote:
>

Fabio Zottele

unread,
Feb 2, 2016, 7:30:39 AM2/2/16
to Stan users mailing list
Thank you Dr. Carpenter for the quick response and for pointing me down the right path.
The \rho domain is now correctly set, and the sigmas are initialized too (see AR1.stan attached).
The results are not encouraging, so I re-checked both the code and the paper, which is why it took me a while to answer.
Unfortunately, I think the problem (as you wrote in your second message) lies not only in the code but in the structure of the model, specifically where the Poisson mean is sampled from the gamma distribution:

m_t ~ \Gamma(\sigma_{t-1} * m_{t-1}, \sigma_{t-1})

Moreover, with the imposed distribution I obtain a very bad fit (see AR1.png attached, obtained from the corrected AR1.stan).
So I rewrote the model from scratch, pruning every line I do not understand. The model I get - without knowing whether it is correct or not - is attached as AR2.stan; basically:

y_t ~ Poisson(m_t)  for all t
m_t = \rho * y_{t-1} + (1 - \rho) * (a + b*t)  # no more exp here
# no other hierarchical level here

Now I get a lot of numerical warnings - "validate transformed params: m[k0__] is -1.83315, but must be greater than or equal to 0" - but the parameters are estimated much better (see AR2.png).
So now I am stuck between a published model that I cannot tame and a handmade-but-incorrect model. :-(
Is there an example of a correct Poisson autoregressive linear model, even the simplest, implemented in Stan from which I can learn?

 


AR1.stan
AR1.png
AR2.stan
AR2.png

Bob Carpenter

unread,
Feb 2, 2016, 12:08:24 PM2/2/16
to stan-...@googlegroups.com
I couldn't understand that paper's prior structure on m[t] (I do
get that gamma-Poisson is negative binomial).

Do you just want

for (t in 2:T)
    y[t] ~ poisson(exp(rho * y[t - 1] + (1 - rho) * (a + b * t)));

What is

y[1] ~ ?

Or do you just not want to model it?

With some reasonable priors on rho, a, and b?

- Bob

P.S. I dug through that paper you sent and found my way to Chib and Winkelmann,
whose language I understood and whose example I managed to recreate, but
it's multivariate rather than a time series. I'll write up a bunch of these
examples as I understand them. Here's that example, followed by R code (yes, it
really needs tree depth 12).

/**
* Likelihood derived from the following paper:
*
* Markov Chain Monte Carlo Analysis of Correlated Count Data
* Author(s): Siddhartha Chib and Rainer Winkelmann
* Source: Journal of Business & Economic Statistics,
* Vol. 19, No. 4 (Oct., 2001), pp. 428-435
* Stable URL: http://www.jstor.org/stable/1392277
*
* Notes:
* -- replaced Wishart and normal priors
* -- assume each outcome J has the same covariates (if not,
* can inefficiently pad with zeros)
*/
data {
    int<lower=0> I;       // subjects
    int<lower=0> J;       // outcomes
    int<lower=0> K;       // covariates
    int y[I, J];          // observed counts
    matrix[I, K] x;       // covariates
}
transformed data {
    vector[J] zeros;      // mean noise
    zeros <- rep_vector(0, J);
}
parameters {
    cholesky_factor_corr[J] L_Omega;   // outcome correlations
    vector<lower=0>[J] sigma;          // outcome scale
    matrix[K, J] beta;                 // regression coefficients
    row_vector[J] b[I];                // noise terms
}
model {
    // priors
    L_Omega ~ lkj_corr_cholesky(4);
    sigma ~ normal(0, 2);
    to_vector(beta) ~ normal(0, 2);
    b ~ multi_normal_cholesky(zeros, diag_pre_multiply(sigma, L_Omega));
    // likelihood
    for (i in 1:I)
        y[i] ~ poisson_log(x[i] * beta + b[i]);
}


==========================

# assumes I, J, K, x, and Sigma are already defined, e.g.:
# library(MASS); I <- 500; J <- 3; K <- 2;
# x <- matrix(rnorm(I * K), I, K); Sigma <- diag(J);
beta <- matrix(rnorm(K*J, 0.5, 2), K, J);
b <- matrix(NA, I, J);
for (i in 1:I)
    b[i, ] <- mvrnorm(1, rep(0, J), Sigma);
y <- matrix(0, I, J);
for (i in 1:I) {
    for (j in 1:J) {
        log_lambda <- sum(x[i,] * beta[,j]) + b[i,j];
        y[i, j] <- rpois(1, exp(log_lambda));
    }
}

library(rstan);
fit <- stan("chib-winkelmann.stan", data=c("I", "J", "K", "y", "x"),
            iter=1000, refresh=1, chains=1, init=0.5,
            control=list(stepsize=0.02, adapt_delta=0.90, max_treedepth=12));
> <AR1.stan><AR1.png><AR2.stan><AR2.png>

Bob Carpenter

unread,
Feb 2, 2016, 4:43:23 PM2/2/16
to stan-...@googlegroups.com
Here's a very simple Poisson AR(1) model (based on
Cameron and Trivedi's book Regression Analysis of Count
Data):

For count time series y, the likelihood uses a log-link GLM:

y[t] ~ poisson(exp(alpha + u[t]));

with intercept alpha and an additional noise term u modeled as AR(1):

u[t] ~ normal(rho * u[t - 1], sigma);

I added some weakly informative priors. The simulator and Stan
model are attached (with citation of chapter and verse of
the reference to Cameron and Trivedi). You'll see that I configured 10K
iterations; it's mixing, just very slowly.

The reason mixing is slow is that there's only weak identification
of alpha and u because of the additive invariance (add to alpha,
subtract from u, get the same likelihood). You see this right away
in the pairs plot:

poisson-ar1.stan
sim-poisson-ar1.R
Rplot001.png
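[For anyone following along without the attachments, a minimal Python sketch of simulating from this data-generating process; the parameter values are my own illustrative choices, not those in the attached files.]

```python
import numpy as np

def simulate_poisson_ar1(T=200, alpha=1.0, rho=0.5, sigma=0.3, seed=7):
    """Simulate y[t] ~ Poisson(exp(alpha + u[t])) with a latent
    Gaussian AR(1) noise term u[t] ~ Normal(rho * u[t-1], sigma)."""
    rng = np.random.default_rng(seed)
    u = np.zeros(T)
    # draw u[0] from the stationary distribution of the AR(1) process
    u[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho**2))
    for t in range(1, T):
        u[t] = rng.normal(rho * u[t - 1], sigma)
    y = rng.poisson(np.exp(alpha + u))
    return y, u

y, u = simulate_poisson_ar1()
```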

Fabio Zottele

unread,
Feb 7, 2016, 4:17:18 AM2/7/16
to Stan users mailing list

Thank you Dr.Carpenter for the two answers.
I'll first try the Cameron and Trivedi model: I feel more comfortable with it than with the Chib & Winkelmann model. I am just taking my first steps with such models, and a multivariate one is - for me - "a beast rather difficult to tame". I need more statistical theory and math, so first I will check the paper and the book; I need a little time to do that.
However, I will try both models and compare their performance on my generated datasets. Probably Chib and Winkelmann is "a model to rule them all".
So, thanks for putting me on the right path. If, in the future, you plan to write a "Stan for dummies" book, I would love to be a tester!



andre.p...@googlemail.com

unread,
Feb 27, 2016, 11:05:15 PM2/27/16
to Stan users mailing list
Bob,

the latent-variable Poisson process does not reflect that the
variance becomes smaller as t grows; see page 10, bottom,
in:

So wouldn't it be the same if we model

  y[t] ~ poisson(exp(alpha + u[t]));

with a negative binomial and let the variance follow a gamma
process (or Cauchy process)?

Andre

Bob Carpenter

unread,
Feb 28, 2016, 8:09:55 PM2/28/16
to stan-...@googlegroups.com
I don't think I'm the one you want working out that math.

I would suggest replacing your expression with

y[t] ~ poisson_log(alpha + u[t]);

which takes the parameter to be on the log scale.
Also, if u is a vector and y is an array of integers, you
can simplify further:

y ~ poisson_log(alpha + u);
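[As a quick numerical illustration (my own Python/SciPy check, not part of the thread's attachments): poisson_log takes the rate on the log scale, so the log-pmf is computed directly from the linear predictor without an exp/log round-trip. Where exp() of the log-rate is still representable, it agrees with poisson(exp(x)):]

```python
import numpy as np
from scipy.stats import poisson
from scipy.special import gammaln

def poisson_log_lpmf(k, log_lambda):
    """Poisson log-pmf parameterized by the log of the rate, in the
    spirit of Stan's poisson_log: k*log_lambda - exp(log_lambda) - log(k!)."""
    return k * log_lambda - np.exp(log_lambda) - gammaln(k + 1)

# agrees with the usual rate parameterization
assert np.isclose(poisson_log_lpmf(150, 5.0),
                  poisson.logpmf(150, np.exp(5.0)))
```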

- Bob

andre.p...@googlemail.com

unread,
Feb 29, 2016, 6:18:38 AM2/29/16
to Stan users mailing list
Bob,

nothing to add. I was referring to your question,

"I don't understand why the authors formulated the model
this way."

and I thought this paper could bring some clarity to Fabio and
answer your question. Sorry, I didn't express myself clearly
and stole your time.

Andre



Fabio Zottele

unread,
Feb 29, 2016, 7:44:17 AM2/29/16
to stan-...@googlegroups.com
Dear Dr. Carpenter and dear Dr. Pfeuffer,
thank you for your interest.
I am still studying the foundations of time series and running simulation after simulation...
However, it is still hard for me to produce datasets and train my models with them!
So, following the paper linked by Dr. Pfeuffer, I found the PESTS software
(Poisson Estimators for State-space Time Series), which could help me test my models
on simulated data.
This helps me learn both Stan and the statistics of time series.
As soon as I have clear and sound documentation, I will be happy to share
it with the whole Stan community!

F.



andre.p...@googlemail.com

unread,
Mar 2, 2016, 10:32:59 PM3/2/16
to Stan users mailing list
Dear Fabio Zottele,

to get the model to work, as Bob already stated, we would need the Jacobian
of m[t]. This should be possible for the gamma distribution. The recursive m[t]
structure would need some additional thought.

Andre