Dear all,
is it possible to specify both 'hac-panel' and an index to cluster standard error for robust OLS covariance calculations?
Thanks,
Giacomo M
On Thursday, April 14, 2016 at 4:01:44 PM UTC+2, josefpktd wrote:
On Thu, Apr 14, 2016 at 9:01 AM, Giacomo Marangoni <jack...@gmail.com> wrote:
Dear all,
is it possible to specify both 'hac-panel' and an index to cluster standard error for robust OLS covariance calculations?

I'm not sure I understand what you mean.

`hac-panel`, where the keyword is actually `nw-panel`, calculates the HAC kernel sum for each time series defined by groups, and then aggregates, if I read the code and remember correctly.

Based on the code in regression: the group_idx is internally calculated based on the time index, under the assumption that we have equally spaced time periods with no missing values in the interior (time series for individual panel units can differ in length, as in an unbalanced panel, but only by truncation at the beginning or end). It looks like the time index is only used to calculate where panel units begin in the array. The time index or period labels themselves are not used.

`nw-groupsum` (Driscoll Kraay) uses time periods as labels to sum over all observations with the same time label, and then calculates the HAC kernel over the sums for each period, assuming that the array with cross-section sums is a time series with equally spaced periods.

cluster_2groups: this just aggregates according to the labels of the two groups.

not implemented:

unequally spaced hac plus groups:
An *obvious* extension would have been to allow for kernels as in Newey-West or similar for arbitrary distance measures based on time periods interpreted in continuous time (or points in space, or any other distance measure), and to allow for groups in another direction. This would interpret the "time" index as the actual location for calculating the distance between two observations, and `groups` as the index for a discrete 0-1 distance.

I gave up on implementing this because I didn't find a reference and it got a bit messy to implement. IIRC I stopped halfway through implementing this generic kernel covariance.

(Now that I think about it again, this might be a similar application to the product kernels for mixed continuous and discrete variables in kde and kernel regression.)

Does this help, or can you clarify your question?
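To make the difference between the two HAC variants described above concrete, here is a minimal pure-Python sketch of the "meat" of the sandwich estimator. It is not the statsmodels code: the function names (`bartlett_weights`, `hac_sum`, `meat_panel`, `meat_groupsum`) are made up for illustration, each observation is assumed to contribute a single scalar score (one regressor times the residual), the kernel is assumed to be Bartlett, and the rows are assumed to be sorted by panel unit and then by equally spaced time.

```python
from collections import defaultdict


def bartlett_weights(maxlags):
    """Bartlett (Newey-West) kernel weights for lags 0..maxlags."""
    return [1.0 - k / (maxlags + 1.0) for k in range(maxlags + 1)]


def hac_sum(series, maxlags):
    """Kernel-weighted sum of cross products for one scalar time series."""
    w = bartlett_weights(maxlags)
    total = sum(u * u for u in series)  # lag 0 term, weight 1
    for k in range(1, maxlags + 1):
        cross = sum(series[t] * series[t - k] for t in range(k, len(series)))
        total += 2.0 * w[k] * cross  # lag k enters symmetrically, hence the 2
    return total


def meat_panel(scores, groups, maxlags):
    """Panel variant: HAC sum within each unit, then add up over units."""
    by_unit = defaultdict(list)
    for u, g in zip(scores, groups):
        by_unit[g].append(u)  # relies on rows being time-ordered within a unit
    return sum(hac_sum(s, maxlags) for s in by_unit.values())


def meat_groupsum(scores, times, maxlags):
    """Group-sum variant: sum scores per period, then HAC over those sums."""
    by_time = defaultdict(float)
    for u, t in zip(scores, times):
        by_time[t] += u
    series = [by_time[t] for t in sorted(by_time)]
    return hac_sum(series, maxlags)


# Hypothetical data: two units observed over three periods each.
scores = [1.0, 2.0, 3.0, 1.0, -1.0, 2.0]
groups = [0, 0, 0, 1, 1, 1]
times = [0, 1, 2, 0, 1, 2]

panel_meat = meat_panel(scores, groups, maxlags=1)
groupsum_meat = meat_groupsum(scores, times, maxlags=1)
print(panel_meat)     # 25.0
print(groupsum_meat)  # 37.0
```

In the panel variant only the unit boundaries matter, so cross products never span two units. In the group-sum variant the time labels matter, because observations sharing a label are pooled before the kernel is applied.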
Thanks a lot, Josef. It definitely helps. Just a few more questions: if I'm fitting an OLS model where my variables are defined on both "individual" and "time" indices, and I have individual fixed effects and potential serial correlation over time, I could use .fit(cov_type='nw-panel', cov_kwds={'groups':'individual', 'time':'time'}), correct?
If I have both individual and time fixed effects, should I use cluster_2groups? In this case, how do I specify two groups in cov_kwds?
Fortunately I have equally spaced time series, although they are not always complete, so I'll have to interpolate.