ValueError: data already contains a constant

Francesco

Jul 28, 2015, 1:16:28 PM
to pystatsmodels
Hello,

I am trying to use the vector_ar module in statsmodels, but I run into a problem when I feed it a numpy array named grid_dT. This is my first time using statsmodels.
_______________ _______________ _______________ _______________
import datetime
import pandas as pd
import numpy as np
import statsmodels.tsa.vector_ar.var_model as tsa

numVariables,time_bin = grid_dT.shape
base = datetime.datetime(2008,1,1)
date_list = [base + datetime.timedelta(days=x) for x in range(0, time_bin)]

TEST_DATA = pd.DataFrame(np.transpose(grid_dT),columns=[codes])
zero_columns = TEST_DATA.apply(lambda x: np.all(x==0)) #remove all columns that only have zeros
zero_columns = zero_columns[zero_columns].index.tolist()
TEST_DATA = TEST_DATA.drop(zero_columns,1)
TEST_DATA.index = date_list

MODEL_TEST = tsa.VAR(TEST_DATA)
#print(TEST_DATA)

results_TEST = MODEL_TEST.fit(14)
_______________ _______________ _______________ _______________ 

However, I get an error using fit: 

/Users/Francesco/anaconda/lib/python3.4/site-packages/statsmodels/tsa/vector_ar/var_model.py in fit(self, maxlags, method, ic, trend, verbose)
    433         self.nobs = len(self.endog) - lags
    434 
--> 435         return self._estimate_var(lags, trend=trend)
    436 
    437     def _estimate_var(self, lags, offset=0, trend='c'):

/Users/Francesco/anaconda/lib/python3.4/site-packages/statsmodels/tsa/vector_ar/var_model.py in _estimate_var(self, lags, offset, trend)
    452         y = self.y[offset:]
    453 
--> 454         z = util.get_var_endog(y, lags, trend=trend, has_constant='raise')
    455         y_sample = y[lags:]
    456 

/Users/Francesco/anaconda/lib/python3.4/site-packages/statsmodels/tsa/vector_ar/util.py in get_var_endog(y, lags, trend, has_constant)
     31     if trend != 'nc':
     32         Z = tsa.add_trend(Z, prepend=True, trend=trend,
---> 33                           has_constant=has_constant)
     34 
     35     return Z

/Users/Francesco/anaconda/lib/python3.4/site-packages/statsmodels/tsa/tsatools.py in add_trend(X, trend, prepend, has_constant)
     40     trend = trend.lower()
     41     if trend == "c":    # handles structured arrays
---> 42         return add_constant(X, prepend=prepend, has_constant=has_constant)
     43     elif trend == "ct" or trend == "t":
     44         trendorder = 1

/Users/Francesco/anaconda/lib/python3.4/site-packages/statsmodels/tools/tools.py in add_constant(data, prepend, has_constant)
    316         if np.any(var0):
    317             if has_constant == 'raise':
--> 318                 raise ValueError("data already contains a constant.")
    319             elif has_constant == 'skip':
    320                 return data

ValueError: data already contains a constant.
__________________ __________________ __________________ __________________ __________________ 

Does anyone know what I'm doing wrong? I am following the test example at http://statsmodels.sourceforge.net/stable/vector_ar.html#var
The dT matrix is quite sparse: most entries are zero, though the values do vary.

Appreciate any suggestions.

Cheers,

Francesco

josef...@gmail.com

Jul 28, 2015, 2:24:15 PM
to pystatsmodels
Check whether TEST_DATA has a constant column, i.e. a column with variance equal to zero.

The exception message indicates that there is already a constant when it tries to add a constant.

Currently most tsa models only allow a constant to be added through the trend option. But since VAR doesn't take any exog, I guess you are trying to fit a VAR model in which one of the endogenous variables is a constant.

Just guessing.

Josef
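
A minimal sketch of the check suggested above, assuming the data is a pandas DataFrame like TEST_DATA. The helper name constant_columns and the small demo frame are made up for illustration and are not part of statsmodels. Note that the original filter only removes all-zero columns, so a column stuck at some nonzero value would slip through; this version flags any column with a single distinct value.

```python
import numpy as np
import pandas as pd


def constant_columns(df):
    """Return the labels of columns that hold a single distinct value.

    Hypothetical helper for illustration: such columns have zero variance,
    which is the situation the "data already contains a constant" check
    is complaining about.
    """
    is_const = (df.nunique(dropna=False) <= 1).to_numpy()
    return df.columns[is_const].tolist()


# Made-up stand-in for TEST_DATA: "b" is all zeros, "c" is a nonzero constant.
demo = pd.DataFrame({
    "a": [0.0, 1.0, 0.0, 2.0],
    "b": [0.0, 0.0, 0.0, 0.0],
    "c": [3.0, 3.0, 3.0, 3.0],
})

const_cols = constant_columns(demo)
cleaned = demo.drop(columns=const_cols)  # keep only columns that actually vary

print("constant columns:", const_cols)
print("remaining columns:", cleaned.columns.tolist())
```

If const_cols comes back non-empty for the real data, dropping those columns before calling VAR should be worth trying. With very sparse data it may also be worth checking the same thing on the sample that is actually used after lagging.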