Hierarchical AR(K) Models

144 views
Skip to first unread message

Vishv Jeet

unread,
Jun 5, 2017, 3:25:27 PM6/5/17
to stan-...@googlegroups.com
I have various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.

My question is: my Stan code effectively doesn't model data (and parameters) for any time-series that has less than 4 observations, is that the right thing to do? I think I have no choice but to skip the series with just 1 observation but what about series with 2 or 3 observations?

Below is my Stan code:

data {
  int<lower=1> N; // total number of observations
  int<lower=1> K; // number of lagged factor
  int<lower=1> numG; // number of groups
  vector[N] y; // data for Hierarchical AR(K)
  int sizes[numG]; // sizes of observations across groups
}
parameters {
  real<lower = 0> local_sigma; 
  vector<lower = 0>[K] global_sigma;
  vector[K] z[numG];
  vector<lower = -2, upper = 2>[K] global_beta; // global AR(K) multipliers
}
transformed parameters {
  vector[K] group_betas[numG]; // group level AR(K) multipliers
  for(i in 1:numG) {
    group_betas[i] = global_beta + global_sigma .* z[i]; // NCP formulation, no correlations
  }
}
model {
  int pos;
  pos = 1;
  for(i in 1:numG) {  // loop over groups
    int local_N;
    vector[sizes[i]] local_y;
    local_N = sizes[i];
    local_y = segment(y, pos, local_N);
    for (n in (K + 1):local_N) { // loop over observations
      real mu;
      mu = 0;
      for (k in 1:K) { // loop over lags
        mu = mu + group_betas[i][k] * local_y[n - k];
      }
      y[n] ~ normal(mu, local_sigma);
    }
    z[i] ~ normal(0, 1);
    pos = pos + local_N;
  }
  
  // hyperpriors
  local_sigma ~ cauchy(0, 2);
  global_sigma ~ cauchy(0, 2);
  global_beta ~ cauchy(0, 2);
}


Bob Carpenter

unread,
Jun 6, 2017, 2:35:07 PM6/6/17
to stan-...@googlegroups.com
It is not unusual to employ separate models for the first few observations
in a time series, e.g.

p(y) = p(y[1])
* p(y[2] | y[1])
* p(y[3] | y[1:2])
* PROD_{n > 3} p(y[n] | y[n-3:n-1]) ...

You just need to formulate models for the first three terms.

- Bob

> On Jun 5, 2017, at 3:25 PM, Vishv Jeet <vishv...@gmail.com> wrote:
>
> I have a various time-series of different lengths (usually called panel data). I want to run AR(K=3) regression on each of them but some of them are too short for a meaningful regression so I decided to use a Bayesian Hierarchical model in which the 3 multipliers for each series are sampled from a common multivariate normal distribution, so far so good, the model fits just fine and there are no divergent transitions and all Rhat values are extremely close to 1.
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages