Pandas rolling window OLS: how do I get the coefficients and intercept out?


Thomas Browne

Mar 11, 2010, 3:58:07 PM
to pystatsmodels
Hello,

So thanks to your useful help yesterday I am now playing around with a
big DataMatrix of daily returns for 22 foreign exchange series, about
2500 points each (since 3 Jan 2000).

For each currency pair I want to find the linear combination of the
other 21 pairs which "best" replicates it, at any point in history,
always using a 262 day historical sample (so 1 year).

Please can you explain how I get those coefficients out of
pandas.stats.ols.MovingOLS (and ideally the intercept too, if I were to
decide to regress the actual series rather than their returns, for RV
purposes)? I don't see which method to use.

Can you also confirm that this is what MovingOLS does, i.e. it puts a
rolling window onto the series and provides regression stats for each
window "snapshot" (I am specifying window_type = 'ROLLING' in the
call)?

What does window_type 'EXPANDING' do?

Thanks.

Wes McKinney

Mar 11, 2010, 4:09:21 PM
to pystat...@googlegroups.com

Well, I apologize for the lack of documentation. That should change soon enough.

First off, I would use the ols function in pandas.stats.api for all
of these, so you'd do:

from pandas.stats.api import ols
model = ols(y=y, x=x, window_type='rolling', window=262, min_periods=100)

or something like that. It's going to compute statistics for a moving
window with each regression being labeled by the last period in the
window.

For coefficients: model.beta
If you include an intercept (the default), you can get its coefficient
with: model.beta['intercept']
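
As a rough sketch of what a moving-window estimation like this amounts to, here is a plain NumPy version. It is not the pandas implementation: rolling_ols and the synthetic data are made up for illustration, and the way min_periods is handled (start fitting once that many observations are available, then cap the window at 262) is an assumption.

```python
import numpy as np
import pandas as pd


def rolling_ols(y, x, window=262, min_periods=100):
    """Hypothetical helper: OLS of y on x (plus intercept) over a trailing window.

    Row t of the result holds the coefficients fitted on at most `window`
    observations ending at t; rows with fewer than `min_periods`
    observations available are left as NaN.
    """
    # Design matrix with a trailing column of ones for the intercept.
    X = np.column_stack([x.to_numpy(dtype=float), np.ones(len(x))])
    Y = y.to_numpy(dtype=float)
    cols = list(x.columns) + ["intercept"]
    beta = pd.DataFrame(np.nan, index=x.index, columns=cols)
    for end in range(min_periods, len(x) + 1):
        start = max(0, end - window)
        coef, *_ = np.linalg.lstsq(X[start:end], Y[start:end], rcond=None)
        # Label the regression by the last period in its window.
        beta.iloc[end - 1] = coef
    return beta


# Synthetic data standing in for daily returns (not real FX data).
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-03", periods=600, freq="B")
x = pd.DataFrame(rng.normal(size=(600, 3)), index=dates, columns=["a", "b", "c"])
y = 0.5 * x["a"] - 0.25 * x["b"] + 0.1 + rng.normal(scale=0.01, size=600)

beta = rolling_ols(y, x)
print(beta.dropna().tail())          # one row of coefficients per window end date
print(beta["intercept"].iloc[-1])    # intercept from the most recent window
```

The result has one row per date and one column per regressor plus 'intercept', which is the same general layout as indexing model.beta by column name.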

There are lots of other model results very similar to the
scikits.statsmodels results classes except there will be an extra
dimension due to many regressions being run.

window_type='expanding' grows the size of the estimation window as
more data becomes available.

For example, you might want to do:

model = ols(y=y, x=x, window_type='expanding', min_periods=262)

to start the window size at 262, and it will grow to the full sample
size at the end, so the last set of coefficients / results should
match as though you did a full sample estimation like:

model = ols(y=y, x=x)
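
To illustrate the expanding-window idea, here is a minimal NumPy sketch (again not the pandas code; fit_ols, expanding_ols and the synthetic data are hypothetical names and inputs used only for this example). Each fit uses every observation from the start of the sample up to the labeled date, so the final row should agree with a single full-sample fit.

```python
import numpy as np
import pandas as pd


def fit_ols(y, x):
    """Hypothetical helper: OLS of y on x plus an intercept, over all rows given."""
    X = np.column_stack([x.to_numpy(dtype=float), np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(X, y.to_numpy(dtype=float), rcond=None)
    return pd.Series(coef, index=list(x.columns) + ["intercept"])


def expanding_ols(y, x, min_periods=262):
    """Hypothetical helper: refit on observations 0..t for each t >= min_periods - 1."""
    rows = {
        x.index[end - 1]: fit_ols(y.iloc[:end], x.iloc[:end])
        for end in range(min_periods, len(x) + 1)
    }
    # Keys become the row labels after transposing: one row per window end date.
    return pd.DataFrame(rows).T


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
dates = pd.date_range("2000-01-03", periods=400, freq="B")
x = pd.DataFrame(rng.normal(size=(400, 2)), index=dates, columns=["a", "b"])
y = 1.5 * x["a"] - 0.5 * x["b"] + rng.normal(scale=0.1, size=400)

expanding = expanding_ols(y, x)
full = fit_ols(y, x)

# The last expanding window covers the whole sample, so it matches the full fit.
print(np.allclose(expanding.iloc[-1].to_numpy(), full.to_numpy()))
```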

The ols function is very useful: it decides which class (OLS,
MovingOLS, PanelOLS, MovingPanelOLS) to use based on the types (Series
/ TimeSeries / DataFrame, etc.) of your inputs.

Hope this helps,
Wes
