fast simulation of VAR (vector autoregressive process)

787 views
Skip to first unread message

josef...@gmail.com

unread,
May 22, 2016, 5:52:58 PM5/22/16
to pystatsmodels
Is there a fast way to simulate a VAR process, or MIMO in signal processing?
Is there anything like a multivariate version of scipy.signal.lfilter?
Can the new statespace model be used for fast VARMA filtering?

6 to 8 years ago the answer to the first two questions was negative, but I haven't been following this area since then.

The problem is that VAR estimation is simple linear algebra, but simulation requires the recursive time loop and is expensive in pure python.



I'm just thinking of whether we have support for Monte Carlo studies for estimators in the VAR neighborhood.
I'd like to try this out for variable selection or penalization in VAR estimators, and it's one of the tools needed by theoretical econometricians (which start to use python really soon now.:)

Josef






Tem Pl

unread,
May 22, 2016, 10:10:28 PM5/22/16
to pystatsmodels
What about an option Numba dependency?

Or, Numba emitting ahead of time compiled code. 

josef...@gmail.com

unread,
May 22, 2016, 10:51:13 PM5/22/16
to pystatsmodels
On Sun, May 22, 2016 at 10:10 PM, Tem Pl <rtem...@gmail.com> wrote:
What about an option Numba dependency?

Or, Numba emitting ahead of time compiled code. 

I thought about it, but I never used Numba.
How good is it now with dot products?

I will think about adding an optional numba dependency as soon as enough users or developers think it's worth it.

Another option is having a simple recipe for cython, so that even inexperienced users or developers like me could add a simple, efficient cython function.



A quick try shows that python is fast enough for my purpose with nobs at 500 or below 1000, but it will get slow with several thousand time periods.

-------
Here is my quickly written version with the main part a double loop nobs times number of lags.

def simulate_var(params, innovation='gaussian', nobs=None):

    p = len(params)

    k_endog = params[0].shape[0]

    

    if innovation in ['g', 'gaussian']:

        if nobs is None:

            raise ValueError('nobs needs to be an integer if innovation is not array')

        y = np.random.normal(size=(nobs + p, k_endog))

    else: 

        y = np.asarray(innovation).copy()

        if y.dtype != np.float:

            import warnings

            warnings.warn('dtype of innovation is not float')

    

    n, k = y.shape

    nobs = n - p

   

    for t in range(p, n):

        y[t] += sum(y[t - k].dot(A[k - 1]) for k in range(1, p+1))

    return y



and a quick check with just a simple 3 variable VAR(2) with 1000 observations


VAR estimate lag 1

[[ 0.62  0.54 -0.04]

 [ 0.01  0.46  0.06]

 [ 0.06 -0.01  0.54]]

true lag 1

[[ 0.6  0.5  0. ]

 [ 0.   0.5  0.1]

 [ 0.   0.   0.5]]


VAR estimate lag 2

[[ 0.06  0.12  0.14]

 [ 0.08  0.1   0.1 ]

 [ 0.08  0.12  0.1 ]]

true lag 2

[[ 0.1  0.1  0.1]

 [ 0.1  0.1  0.1]

 [ 0.1  0.1  0.1]]


time.time for the VAR simulation loop 0.00586


Josef

Chad Fulton

unread,
May 23, 2016, 5:11:19 PM5/23/16
to Statsmodels Mailing List
VAR and VARMAs can be simulated using the statespace library, and in the improved code (https://github.com/statsmodels/statsmodels/pull/2845) the simulation is Cythonized. I'll try and take a look at its performance later this week.

Chad

Chad Fulton

unread,
Jul 16, 2016, 2:20:23 PM7/16/16
to Statsmodels Mailing List
This was a pretty long time ago, but I've finally gotten around to testing it. On my computer the `simulate_var` function above runs a VAR(2) with nobs=1000 in: "100 loops, best of 3: 9.32 ms per loop".

The corresponding state-space code is:

mod = sm.tsa.VARMAX(np.ones((1, 3)), order=(2, 0), trend='nc')
params = ... (raveled versions of the autoregressive matrices + raveled lower triangle of innovation covariance cholesky)
mod.simulate(params, nsimulations=1000)

In the new branch (with the Cython-based simulation smoother doing the simulation), mod.simulate(.) function runs in: "1000 loops, best of 3: 1.36 ms per loop".
Reply all
Reply to author
Forward
0 new messages