# Hierarchical AR(K) Models

121 views

### Vishv Jeet

Jun 5, 2017, 3:25:27 PM6/5/17
I have various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.

My question is: my Stan code effectively doesn't model data (and parameters) for any time-series that has less than 4 observations, is that the right thing to do? I think I have no choice but to skip the series with just 1 observation but what about series with 2 or 3 observations?

Below is my Stan code:

data {
int<lower=1> N; // total number of observations
int<lower=1> K; // number of lagged factor
int<lower=1> numG; // number of groups
vector[N] y; // data for Hierarchical AR(K)
int sizes[numG]; // sizes of observations across groups
}
parameters {
real<lower = 0> local_sigma;
vector<lower = 0>[K] global_sigma;
vector[K] z[numG];
vector<lower = -2, upper = 2>[K] global_beta; // global AR(K) multipliers
}
transformed parameters {
vector[K] group_betas[numG]; // group level AR(K) multipliers
for(i in 1:numG) {
group_betas[i] = global_beta + global_sigma .* z[i]; // NCP formulation, no correlations
}
}
model {
int pos;
pos = 1;
for(i in 1:numG) {  // loop over groups
int local_N;
vector[sizes[i]] local_y;
local_N = sizes[i];
local_y = segment(y, pos, local_N);
for (n in (K + 1):local_N) { // loop over observations
real mu;
mu = 0;
for (k in 1:K) { // loop over lags
mu = mu + group_betas[i][k] * local_y[n - k];
}
y[n] ~ normal(mu, local_sigma);
}
z[i] ~ normal(0, 1);
pos = pos + local_N;
}

// hyperpriors
local_sigma ~ cauchy(0, 2);
global_sigma ~ cauchy(0, 2);
global_beta ~ cauchy(0, 2);
}

### Bob Carpenter

Jun 6, 2017, 2:35:07 PM6/6/17
It is not unusual to employ separate models for the first few observations
in a time series, e.g.

p(y) = p(y)
* p(y | y)
* p(y | y[1:2])
* PROD_{n > 3} p(y[n] | y[n-3:n-1]) ...

You just need to formulate models for the first three terms.

- Bob

> On Jun 5, 2017, at 3:25 PM, Vishv Jeet <vishv...@gmail.com> wrote:
>
> I have a various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.