Using SARIMAX...?

Jeremy Pepper

Apr 19, 2015, 5:06:19 PM
to pystat...@googlegroups.com
I'm trying to understand how to use SARIMAX.  To learn, I'm using examples found at:

http://nbviewer.ipython.org/gist/ChadFulton/5127108f4c7025ed2648

Based on what I see there, I believe I've done what I need to do, but graphing is giving me trouble.  I think this is really due to data type issues.

The most relevant example references 'friedman2.dta' as a data file.  It does the sort of multi-step-ahead predictions I need to do.  The problem is, I can't access the 'friedman2.dta' dataset to determine what the example dataset looks like.  That means I can't actually run the example; I can just sort of try to copy it...

As of the present time, using my code to make predictions, I'm getting numpy array outputs that are annoying because I don't know how to use them for much of anything.  Worse yet, I ONLY get the predicted value, not the confidence interval.  Herein lies my biggest problem... I switched APIs believing I would be able to get confidence interval outputs from my code, but I can't seem to figure that part out.  The relevant lines are:

# In-sample one-step-ahead predictions and 95% confidence intervals (forecast without data)

predict = res.predict(alpha=0.05)

ax.plot(predict.index[-npredict-npre:], predict[0, -npredict-npre:], 'r--', label='One-step-ahead forecast');


On the left-hand side of the equal sign I used to have: "predict, cov, ci, idx", as in the example.  Now I just have: "predict."

When I include the "cov, ci, idx" terms, I get an error.  When I don't, I get a lovely output of predictions, but I don't get bounds on those predictions.  As the version of statsmodels I'm using is pre-release, there also don't seem to be any help lines on available functionality and proper implementation.

Can someone help me understand why I might be having this problem?

-Jeremy

josef...@gmail.com

Apr 19, 2015, 5:52:24 PM
to pystatsmodels
General reply, not to the specific questions

There were some changes to SARIMAX before we merged it. The latest
versions of the examples are the notebooks in the examples folder in
statsmodels master, with names starting with `statespace_`

https://github.com/statsmodels/statsmodels/tree/master/examples/notebooks

Those should be compatible with the current interface.

Unfortunately, our automatic doc updating doesn't work right now, so
they cannot be seen on the documentation website

Josef

Jeremy Pepper

Apr 19, 2015, 6:17:59 PM
to pystat...@googlegroups.com
Josef,

Perhaps I'm displaying my ignorance, but I followed the link and it looked more like issues logs to me.  I'm not an experienced enough coder to know what exactly I'm looking at.  I saw something related to confidence intervals, but I can't make heads or tails out of what to do with what I see.

-Jeremy

josef...@gmail.com

Apr 19, 2015, 6:35:34 PM
to pystatsmodels
On Sun, Apr 19, 2015 at 6:17 PM, Jeremy Pepper <aac...@gmail.com> wrote:
> Josef,
>
> Perhaps I'm displaying my ignorance, but I followed the link and it looked
> more like issues logs to me. I'm not an experienced enough coder to know
> what exactly I'm looking at. I saw something related to confidence
> intervals, but I can't make heads or tails out of what to do with what I
> see.

Those files are the raw notebooks without any output. Raw notebooks
are in json format and mainly for reading by the computer.
If you are set up to run notebooks, you can download them either one by
one or by checking out the repository on github.
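
As a rough illustration of what "json format" means here, the code cells can be pulled out of a notebook file with the standard library alone. This is only a sketch: it assumes the nbformat 4 layout (a top-level "cells" list; older notebooks nest their cells differently), and the helper name `code_cells` and the tiny example notebook are made up for illustration.

```python
import json


def code_cells(notebook_text):
    """Return the source of each code cell in a notebook JSON string.

    Assumes the nbformat 4 layout (a top-level "cells" list).
    """
    nb = json.loads(notebook_text)
    sources = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines.
        sources.append(src if isinstance(src, str) else "".join(src))
    return sources


# Tiny stand-in for a downloaded .ipynb file (hypothetical content).
example = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# SARIMAX example"]},
        {"cell_type": "code", "source": ["x = 1\n", "print(x)"]},
    ],
})

for source in code_cells(example):
    print(source)
```

For a real notebook, the text would come from reading the downloaded file instead of the `example` string.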

The content can be seen in human-readable form by pasting the link
into nbviewer.ipython.org,
for example:
http://nbviewer.ipython.org/github/statsmodels/statsmodels/blob/master/examples/notebooks/statespace_sarimax_stata.ipynb

It doesn't show the output of the python code in it, since we don't save that.

It looks like, in the example, the forecast confidence interval is
calculated explicitly using the provided forecast standard errors.

I remember we dropped extra results from predict, but don't remember
if they were added in another way.
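
For reference, "calculated explicitly" boils down to forecast plus or minus a normal critical value times the standard error. A minimal sketch, assuming normally distributed forecast errors; the function name `confidence_interval` and the numbers are invented for illustration and are not part of statsmodels:

```python
from statistics import NormalDist


def confidence_interval(forecast, std_error, alpha=0.05):
    """Two-sided (1 - alpha) interval, assuming normal forecast errors."""
    # Critical value, about 1.96 for alpha = 0.05.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return forecast - z * std_error, forecast + z * std_error


# Made-up numbers purely for illustration.
lower, upper = confidence_interval(forecast=10.0, std_error=2.0)
print(round(lower, 2), round(upper, 2))
```

The standard error is the square root of the forecast error variance, which is why a covariance output is enough to build the interval.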

Josef

Chad Fulton

Apr 19, 2015, 9:06:25 PM
to Statsmodels Mailing List
Hi Jeremy,


On Sun, Apr 19, 2015 at 1:47 PM, Jeremy Pepper <aac...@gmail.com> wrote:
> I'm trying to understand how to use SARIMAX. To learn, I'm using examples
> found at:
>
> http://nbviewer.ipython.org/gist/ChadFulton/5127108f4c7025ed2648
>
> Based on what I see there, I believe I've done what I need to do, but
> graphing is giving me trouble. I think this is really due to data type
> issues.
>
> The most relevant example references 'friedman2.dta' as a data file. It
> does the sort of, multi-step-ahead predictions I need to do. The problem
> is, I can't access the 'friedman2.dta' dataset to determine what the example
> dataset looks like. That means I can't actually run the example, I can just
> sort of try to copy it...

The friedman2.dta file is from Stata's documentation, and is available
at http://www.stata-press.com/data/r12/ts.html

>
> As of the present time, using my code to make predictions, I'm getting numpy
> array outputs that are annoying because I don't know how to use them for
> much of anything. Worse yet, I ONLY get the predicted value, not the
> confidence interval. Herein lies my biggest problem... I switched APIs
> believing I would be able to get confidence interval outputs from my code,
> but I can't seem to figure that part out. The relevant lines are:
>
> # In-sample one-step-ahead predictions and 95% confidence intervals
> (forecast without data)
>
> predict = res.predict(alpha=0.05)
>
> ax.plot(predict.index[-npredict-npre:], predict[0, -npredict-npre:], 'r--',
> label='One-step-ahead forecast');
>
>
> On the left hand on the equal sign I used to have: "predict, cov, ci, idx",
> as in the example. Now I just have: "predict."

As Josef suggested, this was due to a change before the merge with
Statsmodels to make it compatible with other models' `predict` and
`forecast` methods.

>
> When I include the "cove, ci, idx" terms, I get an error. When I don't, I
> get a lovely output of predictions, but I don't get bounds on those
> predictions. As the version of statsmodels I'm using is pre-release, there
> also don't seem to be any help lines on available functionality and proper
> implementation.
>

To get the confidence intervals, you need to retrieve the full results
object (using `full_results=True` as an argument to the `predict` or
`forecast` methods). Then the predictions are available as the
`forecasts` attribute (shape = (k_endog, nobs)), and the associated
covariance matrices are available as the `forecasts_error_cov`
attribute (shape = (k_endog, k_endog, nobs)).

The full code looks like the following:

import numpy as np
from scipy.stats import norm

# In-sample one-step-ahead predictions
predict_res = res.predict(full_results=True)

predict = predict_res.forecasts          # shape (k_endog, nobs)
cov = predict_res.forecasts_error_cov    # shape (k_endog, k_endog, nobs)
idx = res.data.predict_dates._mpl_repr()

# 95% confidence intervals
critical_value = norm.ppf(1 - 0.05 / 2.)
# Diagonal of each covariance matrix = forecast error variances
std_errors = np.sqrt(cov.diagonal().T)
ci = np.c_[
    (predict - critical_value*std_errors)[:, :, None],
    (predict + critical_value*std_errors)[:, :, None],
]


You can then use the confidence intervals, computed from the covariance
matrices, as usual. To see a full example, go to the "ARIMA Postestimation:
Example 1 - Dynamic Forecasting" section of the notebook link that
Josef referenced.
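
To make the array shapes concrete without a fitted model, here is a small self-contained sketch. The arrays are synthetic stand-ins for `predict_res.forecasts` and `predict_res.forecasts_error_cov` (the sizes and values are made up), and the critical value comes from the standard library rather than scipy:

```python
import numpy as np
from statistics import NormalDist

k_endog, nobs = 1, 5  # one observed series, five periods (made-up sizes)

# Synthetic stand-ins for the forecasts and their error covariances.
predict = np.linspace(10.0, 12.0, nobs).reshape(k_endog, nobs)
cov = np.full((k_endog, k_endog, nobs), 4.0)  # variance 4 -> std error 2

critical_value = NormalDist().inv_cdf(1 - 0.05 / 2)

# diagonal() runs over the first two axes and returns (nobs, k_endog);
# transposing gives (k_endog, nobs), matching `predict`.
std_errors = np.sqrt(cov.diagonal().T)

# Stack lower and upper bounds on a new last axis: (k_endog, nobs, 2).
ci = np.stack(
    [predict - critical_value * std_errors,
     predict + critical_value * std_errors],
    axis=-1,
)

# For a univariate model, row 0 holds the series to plot.
point = predict[0]
lower, upper = ci[0, :, 0], ci[0, :, 1]
print(ci.shape)
```

With matplotlib, `lower` and `upper` could then be shaded around the forecast line with something like `ax.fill_between(idx, lower, upper, alpha=0.1)`.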


Chad

Jeremy Pepper

Apr 20, 2015, 9:59:15 PM
to pystat...@googlegroups.com
Thanks for the help guys.  I'm not done implementing all this yet, but I do have a graph with prediction intervals (bad ones at present, but I'll get there).  Thanks again!  I'll probably be in touch more, but I've got something now (which is more than I had before).