Strange error in statsmodels.tsa.stattools.grangercausalitytests?


Paul Sawaya

Jun 29, 2012, 6:12:50 PM
to pystatsmodels
Hi all,

I'm trying to run the grangercausalitytests on a 2D numpy array, and
I'm getting the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/statsmodels/tsa/stattools.py", line 771, in grangercausalitytests
    res2down = OLS(dta[:,0], dtaown).fit()
  File "/usr/lib64/python2.6/site-packages/statsmodels/regression/linear_model.py", line 506, in __init__
    super(OLS, self).__init__(endog, exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/regression/linear_model.py", line 399, in __init__
    super(WLS, self).__init__(endog, exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/regression/linear_model.py", line 154, in __init__
    super(GLS, self).__init__(endog, exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/model.py", line 69, in __init__
    super(LikelihoodModel, self).__init__(endog, exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/model.py", line 33, in __init__
    self._data = handle_data(endog, exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/data.py", line 316, in handle_data
    return klass(endog, exog=exog)
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/data.py", line 21, in __init__
    self._check_integrity()
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/data.py", line 101, in _check_integrity
    if len(self.exog) != len(self.endog):
TypeError: len() of unsized object

I just started playing with numpy/statsmodels, so I wouldn't be
surprised if I'm making a mistake. I'm pretty sure I'm calling the
function correctly, though, as it even fails with just
grangercausalitytests(array(((1, 2), (3, 4))), 1).

Any ideas?

Thanks,

Paul

josef...@gmail.com

Jun 29, 2012, 6:25:13 PM
to pystat...@googlegroups.com
Awful error message.

I think the array is too short; you need more rows.

With more rows:

>>> res = stm.tsa.stattools.grangercausalitytests(np.random.randn(3,2),1)

Granger Causality
number of lags (no zero) 1
ssr based F test: F=0.0000 , p=1.#QNB , df_denom=0, df_num=1
ssr based chi2 test: chi2=19.3580 , p=0.0000 , df=1
likelihood ratio test: chi2=4.7366 , p=0.0295 , df=1
parameter F test: F=-1.#IND , p=1.#QNB , df_denom=0, df_num=1

>>> res = stm.tsa.stattools.grangercausalitytests(np.random.randn(4,2),1)

Granger Causality
number of lags (no zero) 1
ssr based F test: F=0.0000 , p=1.#QNB , df_denom=0, df_num=1
ssr based chi2 test: chi2=348845645175717420000000000000.0000, p=0.0000 , df=1
likelihood ratio test: chi2=200.7774, p=0.0000 , df=1
parameter F test: F=-1.#IND , p=1.#QNB , df_denom=0, df_num=1

>>> res = stm.tsa.stattools.grangercausalitytests(np.random.randn(5,2),1)

Granger Causality
number of lags (no zero) 1
ssr based F test: F=8.2574 , p=0.2132 , df_denom=1, df_num=1
ssr based chi2 test: chi2=33.0296 , p=0.0000 , df=1
likelihood ratio test: chi2=8.9017 , p=0.0028 , df=1
parameter F test: F=8.2574 , p=0.2132 , df_denom=1, df_num=1

I never thought of calling grangercausality with very small samples,
so I'm not sure what the minimum requirement is (maybe 5
observations). But it won't have any power with tiny samples.

Josef


Paul Sawaya

Jun 29, 2012, 7:16:58 PM
to pystat...@googlegroups.com
Thanks, Josef. I had made a silly mistake and had my columns/rows backwards. I transposed them, and my code works.

I'm still getting another error from grangercausalitytests, however. It happens with the matrix below:

b = array(((555, 13), (798, 13), (755, 14)))
grangercausalitytests(b, 1)

I see:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/site-packages/statsmodels/tsa/stattools.py", line 804, in grangercausalitytests
    ftres = res2djoint.f_test(rconstr)
  File "/usr/lib64/python2.6/site-packages/statsmodels/base/model.py", line 1162, in f_test
    cparams = np.dot(r_matrix, self.params[:, None])
ValueError: objects are not aligned

Obviously it's pointless to look for a causal relationship in these test data, but I'm curious whether there's anything else I'm doing wrong, and whether there's anything I can do to test for the error before it is raised.

Paul

josef...@gmail.com

Jun 29, 2012, 7:39:28 PM
to pystat...@googlegroups.com
On Fri, Jun 29, 2012 at 7:16 PM, Paul Sawaya <m...@paulsawaya.com> wrote:
> Thanks, Josef. I had made a silly mistake and had my columns/rows
> backwards. I transposed them, and my code works.
>
> I'm still getting another error from grangercausalitytests, however.
> It happens with the matrix below:
>
> b = array(((555, 13), (798, 13), (755, 14)))
> grangercausalitytests(b, 1)
>
> I see:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib64/python2.6/site-packages/statsmodels/tsa/stattools.py", line 804, in grangercausalitytests
>     ftres = res2djoint.f_test(rconstr)
>   File "/usr/lib64/python2.6/site-packages/statsmodels/base/model.py", line 1162, in f_test
>     cparams = np.dot(r_matrix, self.params[:, None])
> ValueError: objects are not aligned
>
> Obviously it's pointless to look for a causal relationship in these
> test data, but I'm curious whether there's anything else I'm doing
> wrong, and whether there's anything I can do to test for the error
> before it is raised.

With one lag, the full model has 3 parameters to estimate: a constant
plus two lag parameters. Because of the initial condition, using lags
we lose one observation, so that leaves 3 parameters with only 2
observations, if my calculation is correct.
With 4 observations, we would have the same number of observations in
OLS as parameters to estimate. So 5 is the minimum number of
observations needed to have (strictly) more observations than
parameters in the full OLS model. Most parts of OLS will then require
at least 5 observations in the Granger causality tests to have a
"standard" problem.

I don't think we ever checked the various methods of OLS for the
minimal number of observations; it will vary by method (f_test, pinv,
...), so I'm not surprised that it breaks at different places in this
case. I would have to work my way through the code to see why it
breaks at a different point here.

grangercausalitytests takes the results from OLS, and we would have to
check the behavior of the OLSResults methods when there is an
insufficient number of observations.

Briefly playing with some random samples, it looks like with 7
observations the different test statistics in grangercausalitytests
start to agree.


Josef

josef...@gmail.com

Jun 29, 2012, 7:51:14 PM
to pystat...@googlegroups.com
nobs - maxlag > 2 * maxlag + addconstant
nobs > 3 * maxlag + addconstant

To have more equations than parameters:
with maxlag = 1: nobs > 3 * 1 + 1 = 4
with maxlag = 2: nobs > 3 * 2 + 1 = 7

>>> res = stm.tsa.stattools.grangercausalitytests(np.random.randn(7,2), 2)

breaks again at a different place:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new\statsmodels\statsmodels\tsa\stattools.py", line 804, in grangercausalitytests
    ftres = res2djoint.f_test(rconstr)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new\statsmodels\statsmodels\base\model.py", line 1176, in f_test
    cov_p=cov_p))
  File "C:\Python26\lib\site-packages\numpy\linalg\linalg.py", line 445, in inv
    return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
  File "C:\Python26\lib\site-packages\numpy\linalg\linalg.py", line 328, in solve
    raise LinAlgError, 'Singular matrix'
numpy.linalg.linalg.LinAlgError: Singular matrix
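
Until checks are added to the function itself, a minimal guard in user
code could look like this (granger_with_check is just an illustrative
name; it only applies the inequality derived above):

import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

def granger_with_check(data, maxlag, addconst=True):
    # enforce nobs > 3 * maxlag + addconst, the inequality above,
    # before handing the data to statsmodels
    nobs = np.asarray(data).shape[0]
    if nobs <= 3 * maxlag + int(addconst):
        raise ValueError("too few observations: "
                         "need nobs > 3 * maxlag + addconst")
    return grangercausalitytests(data, maxlag)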

Josef

josef...@gmail.com

Jun 30, 2012, 12:25:53 PM
to pystat...@googlegroups.com
Someone also found a small-nobs problem with the default settings in adfuller:

http://stackoverflow.com/questions/11265518/adf-test-in-statsmodels-in-python

https://github.com/statsmodels/statsmodels/issues/347

Paul, thanks for pointing this out. We will add more checks to these functions.

Josef

Paul Sawaya

Jun 30, 2012, 12:52:39 PM
to pystat...@googlegroups.com
I see. Now, at least, I know to check the number of observations and lag size. Thanks for getting to the root of this, Josef!

Paul

justin

Jul 1, 2012, 11:07:52 PM
to pystat...@googlegroups.com
This question might be more appropriate for the Scipy boards, but I
thought I would ask here first.

Suppose I have a function g that takes a 1d array as input. I want to
minimize g (using fmin_bfgs), but within g I assign a value to one of
the parameters. For a simple example:

def g(x):
    x[0] = 1
    return 2 * x[0] * x[1]**2.

In essence, I am doing this to constrain the optimization: the
function is optimized while x[0] is held constant.

I'm not too familiar with the optimization algorithms, so I'm not sure
of the reason, but it seems this might be inefficient, since the
optimizer will vary a parameter that remains constant while trying to
find the minimum.

So bottom line, is constraining an optimization this way inefficient on
a larger scale? Please let me know if I should switch this to the scipy
boards.

Thanks,

Justin

josef...@gmail.com

Jul 2, 2012, 5:35:02 AM
to pystat...@googlegroups.com
I never tried to do this, and I'm a bit surprised that it works.
The gradient in the fixed direction will be zero. If you start with
the same initial value, then finding the minimum might work without
much extra work. However, I think the Hessian will be singular, which
might cause problems with Hessian-approximation-based methods like
bfgs, and it won't be available/invertible after the optimization.

It would be better to work with the bound version, fmin_l_bfgs_b, or
to shrink the parameter space for the optimization.
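
For the bound version, a minimal sketch (a variant of your g without
the in-place assignment; x[0] is pinned by giving it an equal lower
and upper bound, and the starting values are just illustrative):

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def g(x):
    return 2 * x[0] * x[1]**2.

x0 = np.array([1.0, 0.5])
# equal bounds fix x[0] at 1; x[1] stays free
xopt, fval, info = fmin_l_bfgs_b(g, x0, approx_grad=True,
                                 bounds=[(1.0, 1.0), (None, None)])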

For shrinking the parameter space, I have some code similar to this:

import numpy as np

mask = np.array([True, False])    # True marks the fixed parameter x[0]
fixed = np.array([1.0, np.nan])   # fixed value for x[0], nan for the free slot

def expand_params(x_reduced):
    # rebuild the full parameter vector from the free parameters only
    x = fixed.copy()
    x[~mask] = x_reduced
    return x

def g(x_reduced):
    x = expand_params(x_reduced)
    return 2 * x[0] * x[1]**2.
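
A usage sketch for the reduced problem (the start value is just
illustrative):

from scipy.optimize import fmin_bfgs

x0_reduced = np.array([0.5])  # start value for the single free parameter x[1]
xopt_reduced = fmin_bfgs(g, x0_reduced)
full_params = expand_params(xopt_reduced)  # x[0] back at its fixed value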

Josef
