Hierarchical AR(K) Models

147 views

Skip to first unread message

Vishv Jeet

unread,

Jun 5, 2017, 3:25:27 PM6/5/17

to stan-...@googlegroups.com

I have various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.

My question is: my Stan code effectively doesn't model data (and parameters) for any time-series that has less than 4 observations, is that the right thing to do? I think I have no choice but to skip the series with just 1 observation but what about series with 2 or 3 observations?

Below is my Stan code:

data {

int<lower=1> N; // total number of observations

int<lower=1> K; // number of lagged factor

int<lower=1> numG; // number of groups

vector[N] y; // data for Hierarchical AR(K)

int sizes[numG]; // sizes of observations across groups

}

parameters {

real<lower = 0> local_sigma;

vector<lower = 0>[K] global_sigma;

vector[K] z[numG];

vector<lower = -2, upper = 2>[K] global_beta; // global AR(K) multipliers

}

transformed parameters {

vector[K] group_betas[numG]; // group level AR(K) multipliers

for(i in 1:numG) {

group_betas[i] = global_beta + global_sigma .* z[i]; // NCP formulation, no correlations

}

model {

int pos;

pos = 1;

for(i in 1:numG) { // loop over groups

int local_N;

vector[sizes[i]] local_y;

local_N = sizes[i];

local_y = segment(y, pos, local_N);

for (n in (K + 1):local_N) { // loop over observations

real mu;

mu = 0;

for (k in 1:K) { // loop over lags

mu = mu + group_betas[i][k] * local_y[n - k];

}

y[n] ~ normal(mu, local_sigma);

}

z[i] ~ normal(0, 1);

pos = pos + local_N;

}

// hyperpriors

local_sigma ~ cauchy(0, 2);

global_sigma ~ cauchy(0, 2);

global_beta ~ cauchy(0, 2);

}

Bob Carpenter

unread,

Jun 6, 2017, 2:35:07 PM6/6/17

to stan-...@googlegroups.com

It is not unusual to employ separate models for the first few observations
in a time series, e.g.

p(y) = p(y[1])
* p(y[2] | y[1])
* p(y[3] | y[1:2])
* PROD_{n > 3} p(y[n] | y[n-3:n-1]) ...

You just need to formulate models for the first three terms.

- Bob

> On Jun 5, 2017, at 3:25 PM, Vishv Jeet <vishv...@gmail.com> wrote:
>
> I have a various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.

> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Reply all

Reply to author

Forward

0 new messages