Dear Josef,
I am also trying to compute Driscoll-Kraay standard errors, but I always get a MemoryError.
For instance:
index                        Y         X1         X2  ......  X17  GroupID
2012-01-25 12:30:00  -1.809030   2.126177   0.522877                     1
2012-01-25 12:31:00  -0.434571  -1.809030   2.126177                     1
2012-01-25 12:32:00   0.500806  -0.434571  -1.809030                     1
2012-01-25 12:33:00  -0.877922   0.500806  -0.434571                     1
2012-01-25 12:34:00   0.427819  -0.877922   0.500806                     1
The data has shape 1410 by 17. I have four groups, so GroupID runs from 1 to 4.
Now if I try the following:
# convert my time index to the number of seconds since 1970-01-01
time = [(t - datetime.datetime(1970, 1, 1)).total_seconds() for t in df.index]
res = sm.OLS(df.Y, df.X).fit(cov_type='nw-groupsum',
    cov_kwds={'time': time, 'groups': np.array(df.GroupID), 'maxlags': 5})
I get this error:
Traceback (most recent call last):
File "<ipython-input-81-be983d62f538>", line 4, in <module>
'groups': np.array(dec_all.Pid), 'maxlags':1})
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 211, in fit
cov_type=cov_type, cov_kwds=cov_kwds, use_t=use_t)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 1099, in __init__
use_t=use_t, **cov_kwds)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\regression\linear_model.py", line 1873, in get_robustcov_results
use_correction=use_correction)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\stats\sandwich_covariance.py", line 871, in cov_nw_groupsum
S_hac = S_hac_groupsum(xu, time, nlags=nlags, weights_func=weights_func)
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\stats\sandwich_covariance.py", line 477, in S_hac_groupsum
x_group_sums = group_sums(x, time).T #TODO: transpose return in grou_sum
File "C:\Users\chamar.stu\AppData\Local\Continuum\Anaconda\lib\site-packages\statsmodels\stats\sandwich_covariance.py", line 437, in group_sums
for col in range(x.shape[1])])
MemoryError
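The last frame is in group_sums, so the size of the time labels may matter. Below is a minimal sketch of relabeling the timestamps as consecutive integers, assuming (I have not confirmed this in the statsmodels source) that group_sums bins rows with something like np.bincount keyed on the time labels. The toy index and the names time_codes, unique_stamps, and x are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real index: 3 one-minute timestamps,
# repeated for 2 groups (the real data has 4 groups).
stamps = pd.to_datetime(["2012-01-25 12:30:00",
                         "2012-01-25 12:31:00",
                         "2012-01-25 12:32:00"])
index = stamps.append(stamps)

# What the call currently passes: seconds since 1970-01-01, so the labels
# are on the order of 1.3e9.
epoch_seconds = np.asarray((index - pd.Timestamp("1970-01-01")).total_seconds())

# Alternative: relabel the timestamps as consecutive integers 0, 1, 2, ...
time_codes, unique_stamps = pd.factorize(index, sort=True)

# np.bincount returns max(label) + 1 bins, so small labels keep the
# result at one entry per time period.
x = np.arange(len(index), dtype=float)  # made-up column to sum per period
sums_per_period = np.bincount(time_codes, weights=x)
```

If that assumption holds, passing time_codes instead of the epoch seconds as 'time' would keep the per-period arrays at one entry per distinct timestamp. I have not verified that this resolves the MemoryError.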
What am I doing wrong? Thanks, Josef.