vector autoregression with custom order of lags


Jack Shim

Apr 12, 2017, 11:43:10 AM
to pystatsmodels
My question is two-fold.

1. My understanding is that the current VAR estimation method (statsmodels.tsa.vector_ar.var_model.VAR.fit) controls the lag structure only through the maxlags parameter: with maxlags=p, the estimation includes lags of order 1, 2, ..., p. The problem is that sometimes only a subset of the lags is needed in the estimation. For instance, I have no way of estimating the empirical model y_t = A y_{t-12} + e_t, since specifying maxlags=12 would include all lags from 1 to 12. A point of comparison: perhaps for this reason, Stata's VAR estimation accepts a list of lags instead of a maximum lag.

Is there a way to get around this problem in the current version? If not, is it planned to be improved upon in future releases?

2. In an attempt to get around this problem, I noticed that there is a method to estimate a VAR from a formula (statsmodels.tsa.vector_ar.var_model.VAR.from_formula), but I could not find documentation for the formula syntax. I have read this documentation, but it doesn't cover what's relevant for VAR. Did I miss something?

josef...@gmail.com

Apr 12, 2017, 12:19:19 PM
to pystatsmodels
On Wed, Apr 12, 2017 at 11:07 AM, Jack Shim <jack...@gmail.com> wrote:
> My question is two-fold.
>
> 1. My understanding is that the current VAR estimation method
> (statsmodels.tsa.vector_ar.var_model.VAR.fit) is capable of including
> different orders of lags via the maxlags parameter - with maxlags p, the
> estimation would include lags of order 1,2,...,p. The problem is that
> sometimes only a subset of the lags are necessary in the estimation. For
> instance, I have no way of implementing the empirical model y_t = A
> y_{t-12} + e_t, since specifying maxlags=12 would include all lags from 1 to
> 12. A point of comparison: perhaps because of this reason, STATA VAR
> estimation accepts list of lags instead of max lag.
>
> Is there a way to get around this problem in the current version? If not, is
> it planned to be improved upon in future releases?

There is currently no plan for this, mainly because it never came up
and I never considered this case.

Note: there is now also a statespace version, which does not allow
selected lags either, AFAICS:
http://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.varmax.VARMAX.html

In terms of estimation:
If we drop lags, but keep the same regressors or lags in each
equation, then the estimation problem remains the same, which is
essentially OLS for the parameters. One way to use this with the
current models is to add the regressors for additional lags after a
gap as `exog`. This is currently not supported in VAR, but is in
VARMAX, and should be available soon also for VAR.
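
A minimal sketch of the OLS point above, assuming the same selected lags
enter every equation. It uses only NumPy; the helper name
`fit_var_selected_lags` and the simulated data are hypothetical and are
not part of statsmodels.

```python
import numpy as np


def fit_var_selected_lags(y, lags, add_const=True):
    """OLS estimate of a VAR that uses only the listed lags in every equation.

    y    : (T, k) array of observations
    lags : iterable of positive ints, e.g. [12] or [1, 12]
    Returns (const, coefs), where coefs[lag] is the (k, k) matrix A_lag
    in y_t = const + sum_lag A_lag y_{t-lag} + e_t.
    """
    y = np.asarray(y, dtype=float)
    lags = sorted(lags)
    p = lags[-1]                      # longest lag fixes the usable sample
    T, k = y.shape
    target = y[p:]                    # y_t for t = p, ..., T-1
    # One block of k columns per selected lag, aligned with target rows
    X = np.hstack([y[p - lag:T - lag] for lag in lags])
    if add_const:
        X = np.hstack([np.ones((T - p, 1)), X])
    # Same regressors in every equation, so one least-squares solve suffices
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    offset = 1 if add_const else 0
    const = beta[0] if add_const else np.zeros(k)
    coefs = {}
    for i, lag in enumerate(lags):
        block = beta[offset + i * k:offset + (i + 1) * k]
        coefs[lag] = block.T          # transpose: rows of beta are regressors
    return const, coefs


# Simulate y_t = A y_{t-12} + e_t and recover A using lag 12 only
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
T = 5000
e = rng.standard_normal((T, 2))
y = np.zeros((T, 2))
for t in range(T):
    y[t] = e[t] + (A_true @ y[t - 12] if t >= 12 else 0.0)

const, coefs = fit_var_selected_lags(y, lags=[12])
print(np.round(coefs[12], 2))
```

This gives point estimates only; standard errors and the usual VAR
results class are not reproduced here.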

AFAIK, Stata delegates the estimation to SUR (seemingly unrelated
regressions) or sysreg (?). We don't have the equivalent general
estimator yet, but parts of it, such as MultivariateOLS, have been
started, and SUR is in an old PR.
The estimation in the VAR model could be adjusted to using selected
common lags without fundamental changes.

post-estimation:
However, even if estimation can be adjusted, I'm not sure how the post
estimation support would work for things like impulse response
functions and similar. That might need a full model with zero
parameters in the gaps.
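
One possible reading of "a full model with zero parameters in the gaps",
as a sketch: pad the estimated lag matrices with zero matrices for the
omitted lags, then run the standard recursion for non-orthogonalized
impulse responses. The helper names `pad_lags` and `impulse_responses`
are hypothetical, and the coefficient values are made up for
illustration.

```python
import numpy as np


def pad_lags(coefs, k):
    """Turn {lag: (k, k) matrix} into a (p, k, k) array, zeros for omitted lags."""
    p = max(coefs)
    full = np.zeros((p, k, k))
    for lag, A in coefs.items():
        full[lag - 1] = A
    return full


def impulse_responses(full, horizon):
    """Non-orthogonalized responses: Phi_0 = I, Phi_h = sum_j A_j Phi_{h-j}."""
    p, k, _ = full.shape
    phi = np.zeros((horizon + 1, k, k))
    phi[0] = np.eye(k)
    for h in range(1, horizon + 1):
        for j in range(1, min(h, p) + 1):
            phi[h] += full[j - 1] @ phi[h - j]
    return phi


# Model with lag 12 only: y_t = A y_{t-12} + e_t (illustrative values)
A = np.array([[0.5, 0.1], [0.0, 0.3]])
full = pad_lags({12: A}, k=2)
phi = impulse_responses(full, horizon=24)
print(phi[12])   # response after 12 periods
print(phi[24])   # response after 24 periods
```

With only lag 12 present, the responses are zero at horizons that are
not multiples of 12, which is what the zero padding implies.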

Do you need just the parameter estimate with inference or the full VAR
post estimation results?


The more general case:
We discussed in the past and have open issues for arbitrary zero
constraints in the VAR model. This is more difficult because the
equivalence to OLS breaks down and we need a different estimation
method based on nonlinear optimization.



>
> 2. In the attempt to get around the said problem, I noticed that there is a
> method to estimate VAR from a formula
> (statsmodels.tsa.vector_ar.var_model.VAR.from_formula). But I could not find
> a documentation for the syntax for the formula.I have read this
> documentation, but it doesn't cover what's relevant for VAR. Did I miss
> something?

I think `from_formula` is inherited and does not apply to VAR. It
could be supported when there are additional `exog`, but AFAIR formula
support in VAR models has not come up yet.

Josef

Brock Mendel

Apr 12, 2017, 9:19:25 PM
to pystatsmodels
Chad can weigh in on this, but I'm pretty sure what you're describing is handled by tsa.statespace.sarimax.SARIMAX. For your purposes you only need the SAR part.

josef...@gmail.com

Apr 12, 2017, 9:26:10 PM
to pystatsmodels
On Wed, Apr 12, 2017 at 9:19 PM, Brock Mendel <jbrock...@gmail.com> wrote:
> Chad can weigh in on this, but I'm pretty sure what you're describing is handled by tsa.statespace.sarimax.SARIMAX. For your purposes you only need the SAR part.

SARIMAX allows selected lags including seasonal patterns but it is
univariate, the question was for VAR.

Josef

Chad Fulton

Apr 13, 2017, 12:25:01 AM
to Statsmodels Mailing List
The main issue with non-consecutive lags is that I don't know any transformation to ensure the VAR coefficient matrices are consistent with a stationary process. That makes MLE via scoring pretty tough to include naturally. We could probably do it with the EM algorithm, or maybe with subspace methods?

But as Josef says, you can make your own exog matrix with the lagged values.

Chad

Brock Mendel

Apr 13, 2017, 8:15:03 PM
to pystatsmodels
Veering away from usefulness for OP:

> [...] I don't know any transformation to ensure the VAR coefficient matrices are consistent with a stationary process.

We could check whether the estimated coefficients indicate a stationary process and warn/raise if they do not.
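
A rough sketch of such a check, assuming the coefficients are stored as a
(p, k, k) array with zero matrices for any omitted lags. It builds the
companion matrix and tests whether all eigenvalues lie strictly inside
the unit circle. The names `companion_matrix` and `is_stable` are
hypothetical, not existing statsmodels functions.

```python
import warnings

import numpy as np


def companion_matrix(full):
    """Companion form of a VAR(p) given coefficients of shape (p, k, k)."""
    p, k, _ = full.shape
    comp = np.zeros((k * p, k * p))
    comp[:k, :] = np.hstack(list(full))          # top block row: [A_1 ... A_p]
    if p > 1:
        comp[k:, :-k] = np.eye(k * (p - 1))      # identity blocks below it
    return comp


def is_stable(full, warn=True):
    """True if every companion eigenvalue has modulus below 1."""
    eigvals = np.linalg.eigvals(companion_matrix(full))
    max_mod = float(np.max(np.abs(eigvals)))
    stable = bool(max_mod < 1.0)
    if warn and not stable:
        warnings.warn(
            f"largest eigenvalue modulus is {max_mod:.3f}; "
            "coefficients do not imply a stationary process"
        )
    return stable


# Lag-12-only model with illustrative coefficients; lags 1..11 are zero
full = np.zeros((12, 2, 2))
full[11] = np.array([[0.5, 0.1], [0.0, 0.3]])
print(is_stable(full))
```

Whether to warn or raise is a design choice; the sketch only warns, so
estimation results remain available for inspection.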