Example of a multivariate autoregressive (MAR) time series model

1,217 views
Skip to first unread message

Ryan Batt

unread,
Dec 9, 2015, 7:07:37 PM12/9/15
to Stan users mailing list
Hi all,

I was just wondering if anyone had come across an example of a MAR(1) model (or more complicated, if it exists).

I've seen posts about a VAR model here, and a few cases with univariate time series (e.g., several in manual; AR, MA, ARMA), but no multivariate AR.

If not, I'll take a whack at simulating a time series and fitting it in R and in Stan, and post my example MAR back here.

-Ryan

andre.p...@googlemail.com

unread,
Dec 10, 2015, 11:52:38 AM12/10/15
to Stan users mailing list
I did a Vector ARMA(1,1)-GARCH(1,1) student-t based on the VAR published here. 
It is incredible slow in estimate, 1 hour(+). It has the advantage of being robust.

BTW, the VAR model needs stability restrictions of the  parameters.

Andre

Ryan Batt

unread,
Dec 10, 2015, 1:27:22 PM12/10/15
to Stan users mailing list
Hello Andre,


That model is wonderful, and you and others are providing great help!

I must confess that I am only casually familiar with VAR and GARCH models, having never used them myself.

That being said, it is my impression that 1) VAR models, although multivariate, don't hold quite the same meaning as the MAR [I can expand my thinking if asked]; 2) GARCH (like ARMA), is univariate, and my interests are specifically in the multivariate case.

Am I correct to understand that your models, wonderful as they are, are not necessarily substitutes for a MAR model? I will, however, be looking at ARMA, GARCH, and VAR examples that I've found to understand some of the relevant coding tricks.

For a model of the form X_t = X_{t-1} B + E_t , where X is an 1 x N matrix, B is a N x N matrix, E_t are shocks at time t (across all t, shocks are temporally-independent), with the shocks at any time step having a multivariate normal distribution with mean 0 and covariance matrix \Sigma.

Initially I want to hold the off-diagonal elements of B constant at 0, and just fit the diagonals. Same for the covariance matrix for the E. Then I want a fancier model where the off diagonals are parameters to fit, too.


My field of study is ecology. This Ecological Monograph by Ives et al shows the applications and interpretations in our field (which I don't think are terribly uncommon or unique): https://paperpile.com/shared/lpCAF7 I'm providing that simply as a reference to my language, in case I'm making any mistakes. And to illustrate my motivation for the MAR, in case that helps point out why I am pursuing MAR instead of something like VAR.

I'll probably get around to writing my own model tomorrow.

-Ryan

Bob Carpenter

unread,
Dec 10, 2015, 3:40:01 PM12/10/15
to stan-...@googlegroups.com

> On Dec 10, 2015, at 1:27 PM, Ryan Batt <bat...@gmail.com> wrote:
>
> ...

> That being said, it is my impression that 1) VAR models, although multivariate, don't hold quite the same meaning as the MAR [I can expand my thinking if asked]; 2) GARCH (like ARMA), is univariate, and my interests are specifically in the multivariate case.
>
> Am I correct to understand that your models, wonderful as they are, are not necessarily substitutes for a MAR model? I will, however, be looking at ARMA, GARCH, and VAR examples that I've found to understand some of the relevant coding tricks.

Wouldn't you be able to extend MAR analogously to the extensions
to AR?

> For a model of the form , where is an 1 x N matrix, is a N x N matrix, are shocks at time (across all , shocks are temporally-independent), with the shocks at any time step having a multivariate normal distribution with mean 0 and covariance matrix .
>
> Initially I want to hold the off-diagonal elements of B constant at 0, and just fit the diagonals. Same for the covariance matrix for the E. Then I want a fancier model where the off diagonals are parameters to fit, too.

I'd strongly recommend parameterizing the covariance of the error
distribution using a scaled Cholesky factor for a correlation matrix
with an LKJ prior (see the regression chapter of the manual).

Are you going to try to impose any constraints on B? And what kind
of prior makes sense? I saw that it has a stationary distribution if B's
eigenvalues all "live in the unit circle." We don't usually recommend imposing
such hard constraints (i.e., assuming the process is stationary), but even
if we did, I don't know how we'd code that as a constraint in Stan. ​

> My field of study is ecology. This Ecological Monograph by Ives et al shows the applications and interpretations in our field (which I don't think are terribly uncommon or unique): https://paperpile.com/shared/lpCAF7 I'm providing that simply as a reference to my language, in case I'm making any mistakes.

Thanks --- that paper's a very nice overview. We're trying to write
things like that for Stan. I already did the Dorazio and Royle
occupancy model --- I went through the whole paper and
replicated it with Stan (RStan + knitr, specifically, which works
well for this kind of small-scale coding and presentation):

https://github.com/stan-dev/example-models/tree/master/knitr/dorazio-royle-occupancy

It would be fun to take this paper and do the same thing --- replicate it
with Stan code. We're trying to collect more case studies like the above
from users to distribute as a set of Stan case studies. We're going to
put them up on our web site.

> And to illustrate my motivation for the MAR, in case that helps point out why I am pursuing MAR instead of something like VAR.
>
> I'll probably get around to writing my own model tomorrow.

Let us know how it goes or you want suggestions for parameterizations.

- Bob

Krzysztof Sakrejda

unread,
Dec 10, 2015, 4:04:28 PM12/10/15
to Stan users mailing list
Looking at Ives's paper the models he discusses are commonly referred to as VAR (he does use the MAR acronym but the usual VAR equations... I'm pretty sure...). Could you clarify the distinction you are making?

Bob Carpenter

unread,
Dec 10, 2015, 4:25:45 PM12/10/15
to stan-...@googlegroups.com
I'm not familiar with the terminology, but as far as I can
tell the definition in that paper matches this one:

https://en.wikipedia.org/wiki/Vector_autoregression

- Bob

> On Dec 10, 2015, at 4:04 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:
>
> Looking at Ives's paper the models he discusses are commonly referred to as VAR (he does use the MAR acronym but the usual VAR equations... I'm pretty sure...). Could you clarify the distinction you are making?
>
> --
> You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> To post to this group, send email to stan-...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

andre.p...@googlemail.com

unread,
Dec 10, 2015, 7:18:03 PM12/10/15
to Stan users mailing list

I'm not familiar with the terminology, but as far as I can
tell the definition in that paper matches this one:

Exactly, I don't see the differences between VAR and MAR also,
which does not mean much. 

Ryan,
ARMA, GARCH can be extended to multidimensional. In case of 
GARCH the co-variance defined as a process over time. 

Shortcut. If you set B entries off diagonal to 0 and same for the innovations (errors),
then the model could also expressed as independent uni-variate models.  

Andre

Ryan Batt

unread,
Dec 10, 2015, 8:21:04 PM12/10/15
to Stan users mailing list
TLDR: Looks like MAR is what ecologists call VAR. But I still have need to extend the great example provided by Rob Trangucci.

Thanks everyone.

Ryan,
ARMA, GARCH can be extended to multidimensional. In case of 
GARCH the co-variance defined as a process over time. 
Shortcut. If you set B entries off diagonal to 0 and same for the innovations (errors),
then the model could also expressed as independent uni-variate models.  
Andre

Broader context: my motivation for the MAR is to have a multivariate autoregressive component to a more complicated model (a multispecies dynamic occupancy model). With off-diagonals as 0, I think I can still increase model speed by using the matrix multiplication. But your comment here reinforces some other distinctions among these models, and that's helping me along. Thank you!

As for the multivariate extensions, that's what I figured. But I also know that there are a lot of functions in Stan for handling special types of matrices. That, combined with some of the other vocabulary I wasn't familiar with made me think about trying to find a multivariate example before diving in.

Thanks --- that paper's a very nice overview.  We're trying to write 
things like that for Stan.  I already did the Dorazio and Royle 
occupancy model --- I went through the whole paper and 
replicated it with Stan (RStan + knitr, specifically, which works 
well for this kind of small-scale coding and presentation): 
https://github.com/stan-dev/example-models/tree/master/knitr/dorazio-royle-occupancy 

The multispecies occupancy model you have there is what I'm extending. I'm extending by 1) adding covariates for occupancy; 2) sites aren't the same; 3) It'll be dynamic, with terms for colonization and extinction (you have a model for that, too, somewhere, but not with some of the other elements). 

In some context you might need to make the VAR account for observation error by using a state space representation. I don't think I'll need it in my model, though.


Are you going to try to impose any constraints on B?  And what kind 
of prior makes sense?  I saw that it has a stationary distribution if B's 
eigenvalues all "live in the unit circle."  We don't usually recommend imposing 
such hard constraints (i.e., assuming the process is stationary), but even 
if we did, I don't know how we'd code that as a constraint in Stan.  ​ 

The only constraints are that I will initially set the off-diagonals of B to 0. I don't like to code things with the constraint of stationarity, either.


It would be fun to take this paper and do the same thing --- replicate it 
with Stan code.  We're trying to collect more case studies like the above 
from users to distribute as a set of Stan case studies.   We're going to 
put them up on our web site. 

Definitely. That paper has the data online. I know those systems too. I'll also keep in mind that you're looking to make a compendium of those case studies. I can share my stuff as a knitr report like you did.

I fit simulated time series with Rob Trangucci's model. The effective sample size is pretty low; I read in another thread that this seems to happen a lot with time series models. I believe it. Some of the covariance parameters had an Rhat of NaN, which was weird (large ess though). Otherwise the model fit the simulated data pretty well.

What worries me about the ESS is that my occupancy model is super slow, but converges at around 50-100 iterations ... very nice! But I guess that'll disappear once it's dynamic.

Tomorrow I'll extend Rob Trangucci's model to have covariates. I might also put it in a state space representation to hand observation error. I can share that code, if I get working. 

Whenever I end up getting the dynamic occupancy model (multispecies, with covariates, non-identical sites) working, I'll share that too. I'll share things using rmarkdown+knitr.

-Ryan

Krzysztof Sakrejda

unread,
Dec 10, 2015, 8:37:44 PM12/10/15
to Stan users mailing list
The other threads about AR models being slow point out that the problems are related to parameterization so making the algebra go faster will not improve the speed of the models. Priors and the right parameterization might. Also, including colonization and extinction in these models will almost certainly make for poorly identified parameters but I'm add excited as everyone else to see them work. Sorry about the top posting, hard to write on a tiny screen.

Michael Betancourt

unread,
Dec 10, 2015, 8:54:56 PM12/10/15
to stan-...@googlegroups.com
Non-centering these time series has the potential to drastically
improve performance, but nonidentifiability is a serious concern,
especially if the model does not have a strong generative
interpretation.

On Dec 10, 2015, at 8:37 PM, Krzysztof Sakrejda <krzysztof...@gmail.com> wrote:

> The other threads about AR models being slow point out that the problems are related to parameterization so making the algebra go faster will not improve the speed of the models. Priors and the right parameterization might. Also, including colonization and extinction in these models will almost certainly make for poorly identified parameters but I'm add excited as everyone else to see them work. Sorry about the top posting, hard to write on a tiny screen.
>

Ryan Batt

unread,
Dec 10, 2015, 10:16:10 PM12/10/15
to Stan users mailing list
The other threads about AR models being slow point out that the problems are related to parameterization so making the algebra go faster will not improve the speed of the models. 
It doesn't solve the AR-related speed problem. But that's no reason to include slow-downs. It's my understanding that the matrix algebra is a basic increase, regardless of other speed problems. In my case, my multivariate time series can have 400-1000 state variables.

Priors and the right parameterization might. 
Some of the LKJ and Cholesky stuff is tough for me still. Not used to it.  Is there something you'd do differently here: https://groups.google.com/d/msg/stan-users/8RerHVzxjUQ/iqj6sW3uBQAJ ?

Also, including colonization and extinction in these models will almost certainly make for poorly identified parameters but I'm add excited as everyone else to see them work.
Yeah, I agree. Right now I'm nervous + excited ... you all know how it goes ;) That said, people have fit models like these before. I'm a little unsure how exactly to implement the model. The models have Psi, the logit of the probability of occupancy, as the target of one of the submodels (the other submodel is Theta, prob of detection). I think I'm going to start by modelling Psi_t = phi * Psi_t-1 +  gamma * (-1 * Psi_t-1) + ..... Where phi is basically the B matrix (AR1 on the diagonal, 0 elsewhere), Psi is on a logit-scale, and gamma is set up like phi. It's weird because in the JAGS models, you only ever get the phi term OR the gamma term activated, because they model a binary (0,1) latent parameter (Z, occupancy status). Lastly, the .... are the other terms predicting presence/ absence in a vanilla multispecies occupancy model. So maybe the phi and gamma aren't really "persistence" and "colonization" like you might like them to be; they're also not functions of sets of other covariates (but they are species-specific, hierarchically).

You might think I'm crazy, and maybe I am, but I'll give it a try.

I'm add excited as everyone else to see them work
Yeah, me too!

Non-centering these time series has the potential to drastically 
improve performance, but nonidentifiability is a serious concern, 
especially if the model does not have a strong generative 
interpretation. 
Like above, is that VAR model not non-centered? I'll look again late tonight, and tomorrow, but I thought it was when I ran it earlier ... Could you elaborate on your comment about nonidentifiability, and the "strong generative interpretation"? Especially that second statement, but with respect to the first. In the Ives paper he points out how systems with slow return times (large eigen value in the B, I believe; I always get confused by which convention is being used, so maybe it's other way around) are easier to identify. Makes sense. Is that what you mean?

Separate note: what does the "Mark as complete" button mean in Google Groups? I never know if I'm supposed to click it. 

Ryan Batt

unread,
Dec 10, 2015, 11:38:29 PM12/10/15
to Stan users mailing list
Psi_t = phi * Psi_t-1 +  gamma * (-1 * Psi_t-1) + ..... 
Yeah, that's wrong. I guess it needs to be:

Psi_t = logit(phi * inv_logit(Psi_t-1)  + gamma * (1 - inv_logit(Psi_t-1) + ...)

Or maybe I convert all of the Psi to 0-1 scale first. Either way, sorry for the bad algebra.

Ryan Batt

unread,
Dec 11, 2015, 12:12:26 PM12/11/15
to Stan users mailing list
I think my basic formulation of the dynamic occupancy model is hugely wrong. So ignore everything I've said there :-/ Marginalizing out the presence-absence parameter is more complicated than simply replacing it with probability of being present, I'm pretty sure.

So when Z is 0 or 1, and you have Psi_t = Z_t-1 * phi + (1 - Z_t-1) * gamma, you can't simply replace Z with Psi and call it a day. 
Here's a a paper w/o Z: https://paperpile.com/shared/xlmG6j
Here's a paper w/ Z (in JAGS/ BUGS): https://paperpile.com/shared/EOSUWP

But this is getting pretty off topic from the point of this thread. I might wind up creating a new one.

Stuff about MAR and VAR still fair though.

Sorry.

Bob Carpenter

unread,
Dec 11, 2015, 12:35:33 PM12/11/15
to stan-...@googlegroups.com

> On Dec 10, 2015, at 8:21 PM, Ryan Batt <bat...@gmail.com> wrote:
>
...

> The only constraints are that I will initially set the off-diagonals of B to 0. I don't like to code things with the constraint of stationarity, either.

Don't literally set the off-diagonals of a covariance
matrix to 0. Better to just use the vectorized non-multivariate
normal. That is, rather than something like this:

vector<lower=0>[K] sigma;
vector[K] mu;
matrix[N, K] x;

for (n in 1:N)
x[n] ~ multi_normal(mu, diag_matrix(sigma));

I'd recommend

for (k in 1:K)
col(x, k) ~ normal(mu[k], sigma[k]);

which gives you the same model but will be much much faster.

- Bob

Bob Carpenter

unread,
Dec 11, 2015, 12:40:25 PM12/11/15
to stan-...@googlegroups.com
Bad algebra will bite you if the results are near boundaries.
There are lots of more stable compound functions in Stan like
log_inv_logit and log1m_inv_logit. Stan does inverse logit
transforms on variables constrained in (0, 1) and applies
the Jacobian to give them a uniform distribution on (0, 1).

The Cholesky factorizations are worth doing for stability as
well as speed.

- Bob

Ryan Batt

unread,
Dec 11, 2015, 3:42:11 PM12/11/15
to stan-...@googlegroups.com
Hey all,

I'm attaching a write-up of a VAR/ MAR model with covariates. Hopefully this is a useful/ meaningful contributions. The Stan model itself is 95% Rob Trangucci's, all I did was add the covariate. But the simulation and ecological writing is me. 

EDIT: Fixed an incorrect statement in the Summary, run with packages already loaded (avoid including a bunch of print outs). Both the .R and .pdf files updated.

-Ryan
VAR_p_cov.stan
MAR_VAR.pdf
MAR_VAR.R
Message has been deleted

Julian King

unread,
Sep 13, 2016, 8:31:42 PM9/13/16
to Stan users mailing list
Hi Ryan,

Thanks for a really useful set of code.

One quick comment - I think you have your lags back to front

You have the following code line:
mus[t] <- mus[t] + B[p] * Y[t+p-1]

Assume you have P=4 lags.

If t is 10, then t in the original data is t+P = 14.

So then:
  • p = 1 means you are referencing 10+1-1 = 10 in the original time step
  • p = 2 means you are referencing 10+2-1 = 11 in the original time step
  • p = 3 means you are referencing 10+3-1 = 12 in the original time step
  • p = 4 means you are referencing 10+4-1 = 13 in the original time step.

Would it not make more sense to use t+P-p instead of t+p-1, so that B[1] corresponds to the first lag?


Julian

Ryan Batt

unread,
Oct 3, 2016, 12:28:23 PM10/3/16
to stan-...@googlegroups.com
I think I grabbed this specific piece of code from someone else on here. Think I linked to it.

--
You received this message because you are subscribed to a topic in the Google Groups "Stan users mailing list" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stan-users/PmGMnqXRD28/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stan-users+unsubscribe@googlegroups.com.

To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Ryan D. Batt
Aquatic Ecologist
Postdoctoral Researcher, Rutgers University
Reply all
Reply to author
Forward
0 new messages