Generating random samples from ARIMA(p,d,q) model

Katatonia Sh

Mar 18, 2021, 7:19:20 AM
to pystatsmodels
I want to generate samples from an ARIMA(p,d,q) model in Python, but I could not find a method for this. There is a method to generate samples from ARMA(p,q), but not from ARIMA.
Also, how can I generate samples with a given initial value? For example, in the current model the samples start from 0, but I want them to start at 50.

Regards

jseabold

Mar 18, 2021, 9:46:23 AM
to pystatsmodels
On Thursday, March 18, 2021 at 6:19:20 AM UTC-5 katato...@gmail.com wrote:
> I want to generate samples from an ARIMA(p,d,q) model in Python, but I could not find a method for this. There is a method to generate samples from ARMA(p,q), but not from ARIMA.
> Also, how can I generate samples with a given initial value? For example, in the current model the samples start from 0, but I want them to start at 50.

I believe you can use the unintegrate functions for this.

In [276]: from statsmodels.tsa.tsatools import unintegrate, unintegrate_levels

In [277]: from statsmodels.tsa.arima_process import arma_generate_sample
     ...: from statsmodels.tsa.arima.model import ARIMA  # used in In [281] and In [282]

In [278]: y = arma_generate_sample([1, -.2, -.1], [1, .7], nsample=1000)

In [279]: levels = [50]

In [280]: unstationary_y = unintegrate(y, levels)

In [281]: ARIMA(y, order=(2, 0, 1), trend='c').fit().summary()
Out[281]:
<class 'statsmodels.iolib.summary.Summary'>
"""
                               SARIMAX Results
==============================================================================
Dep. Variable:                      y   No. Observations:                 1000
Model:                 ARIMA(2, 0, 1)   Log Likelihood               -1394.270
Date:                Thu, 18 Mar 2021   AIC                           2798.540
Time:                        08:41:11   BIC                           2823.079
Sample:                             0   HQIC                          2807.866
                               - 1000
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0471      0.072      0.654      0.513      -0.094       0.188
ar.L1          0.2572      0.059      4.354      0.000       0.141       0.373
ar.L2          0.0264      0.049      0.540      0.589      -0.069       0.122
ma.L1          0.6636      0.048     13.887      0.000       0.570       0.757
sigma2         0.9509      0.045     21.368      0.000       0.864       1.038
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 1.56
Prob(Q):                              0.99   Prob(JB):                         0.46
Heteroskedasticity (H):               0.94   Skew:                             0.06
Prob(H) (two-sided):                  0.58   Kurtosis:                         2.85
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
"""

In [282]: ARIMA(unstationary_y, order=(2, 1, 1), trend='t').fit().summary()
Out[282]:
<class 'statsmodels.iolib.summary.Summary'>
"""
                               SARIMAX Results
==============================================================================
Dep. Variable:                      y   No. Observations:                 1001
Model:                 ARIMA(2, 1, 1)   Log Likelihood               -1394.270
Date:                Thu, 18 Mar 2021   AIC                           2798.540
Time:                        08:41:28   BIC                           2823.079
Sample:                             0   HQIC                          2807.866
                               - 1001
Covariance Type:                  opg
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0471      0.072      0.654      0.513      -0.094       0.188
ar.L1          0.2572      0.059      4.354      0.000       0.141       0.373
ar.L2          0.0264      0.049      0.540      0.589      -0.069       0.122
ma.L1          0.6636      0.048     13.887      0.000       0.570       0.757
sigma2         0.9509      0.045     21.368      0.000       0.864       1.038
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 1.56
Prob(Q):                              0.99   Prob(JB):                         0.46
Heteroskedasticity (H):               0.94   Skew:                             0.06
Prob(H) (two-sided):                  0.58   Kurtosis:                         2.85
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
"""
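
The same idea can be written without the statsmodels helpers. What follows is only a minimal sketch, assuming d = 1 and Gaussian noise: it simulates the ARMA(2,1) differences by direct recursion and then cumulatively sums them from a starting level of 50. The function `simulate_arima_d1` and its parameters are hypothetical names, not part of statsmodels. Note the sign convention: `phi=[0.2, 0.1]` and `theta=[0.7]` correspond to the lag polynomials `[1, -.2, -.1]` and `[1, .7]` passed to `arma_generate_sample` above.

```python
import random
from itertools import accumulate


def simulate_arima_d1(phi, theta, nsample, start, sigma=1.0, burnin=200, seed=None):
    """Simulate an ARIMA(p, 1, q) path whose first value is `start`.

    The differences w follow the ARMA recursion
        w[t] = sum_i phi[i] * w[t-1-i] + e[t] + sum_j theta[j] * e[t-1-j]
    with e[t] ~ N(0, sigma^2). (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    total = nsample + burnin
    e = [rng.gauss(0.0, sigma) for _ in range(total)]
    w = [0.0] * total
    for t in range(total):
        ar = sum(phi[i] * w[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        w[t] = ar + e[t] + ma
    w = w[burnin:]  # discard the start-up transient
    # Integrate once: y[0] = start, y[t] = y[t-1] + w[t-1]
    return list(accumulate(w, initial=start))


y = simulate_arima_d1(phi=[0.2, 0.1], theta=[0.7], nsample=1000, start=50, seed=0)
print(len(y), y[0])
```

As with `unintegrate`, the result has one more observation than the differenced series (1001 here), because the starting level is prepended.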

 

Katatonia Sh

Mar 18, 2021, 1:03:10 PM
to pystatsmodels
Thank you so much. 
I checked the results and there are some negative values. These time series represent demand, which should be non-negative. Is it possible to generate time series with non-negative values? In addition, is it possible to generate time series that fall in an interval [l,u]?
What does "unintegrate" do in your code? I would be thankful if you could give me a reference on the math behind generating these samples.

Regards

Skipper Seabold

Mar 18, 2021, 9:38:27 PM
to pystat...@googlegroups.com
On Thu, Mar 18, 2021 at 5:03 PM Katatonia Sh <katato...@gmail.com> wrote:
> I checked the results and there are some negative values. These time series represent demand, which should be non-negative. Is it possible to generate time series with non-negative values? In addition, is it possible to generate time series that fall in an interval [l,u]?

Hmm. Nothing obvious comes to mind but I don't have a deep intuition about this.

I guess you'll have to calibrate the parameters for the simulated
values and/or play with the level value. Maybe clip the series with
some noise, if you don't really care about the ARMA parameter values.
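
One way to read the clipping suggestion, as a rough sketch only: force every simulated value into [l, u] after the series has been generated. The helper `clip_series` is a hypothetical name and the demand values are made-up numbers for illustration. Clipping changes the distribution of the series, so the result no longer follows the original ARIMA parameters exactly.

```python
def clip_series(values, lower, upper):
    """Return a copy of `values` with every entry forced into [lower, upper].

    Hypothetical helper for illustration; not part of statsmodels.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return [min(max(v, lower), upper) for v in values]


# Made-up simulated demand with one negative and one too-large value
demand = [50.0, 48.7, -1.3, 0.4, 120.2]
clipped = clip_series(demand, lower=0.0, upper=100.0)
print(clipped)  # [50.0, 48.7, 0.0, 0.4, 100.0]
```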

> What does "unintegrate" do in your code? I would be thankful if you could give me a reference on the math behind generating these samples.

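It reverses the differencing: given a differenced series and the starting level(s), it cumulatively sums back to the original scale. It's what we use to turn ARIMA forecasts of the differenced series back into levels.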

https://github.com/statsmodels/statsmodels/blob/6d4d588b00547296f678af2e8de8bf3bbea1102e/statsmodels/tsa/tsatools.py#L758
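
On the math: if w_t = y_t - y_{t-1} is the first difference, then y_t = y_0 + w_1 + ... + w_t, so a cumulative sum seeded with y_0 recovers the levels. For d > 1 the cumulative sum is applied d times, each pass seeded with the starting value of the corresponding difference order. Here is a pure-Python sketch of that idea. `unintegrate_sketch` is a hypothetical name, and the ordering of `levels` (starting value of the original series first, then of its first difference, and so on) is an assumption that should be checked against the linked source.

```python
from itertools import accumulate


def unintegrate_sketch(x, levels):
    """Undo len(levels) rounds of differencing by repeated cumulative sums.

    levels[0] is assumed to be the starting value of the original series,
    levels[1] the starting value of its first difference, and so on.
    """
    out = list(x)
    # Work from the highest difference order back down to the levels.
    for x0 in reversed(levels):
        out = list(accumulate(out, initial=x0))
    return out


print(unintegrate_sketch([1, -2, 3], [50]))   # [50, 51, 49, 52]
print(unintegrate_sketch([1, 1], [10, 2]))    # [10, 12, 15, 19]
```

In the second call the input holds second differences: one pass gives the first differences [2, 3, 4], and the next gives the levels [10, 12, 15, 19].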