
Time Series Analysis Help


Idgarad

Nov 20, 2007, 6:24:20 PM
OK, I am not a statistics guru, I admit, but I have been trying to do
some basic forecasting that would meet some basic statistical
requirements. I have the following data:

Date MIPS
1/5/2004 306.203
1/12/2004 364.29
1/19/2004 384.779
1/26/2004 387.91
2/2/2004 339.041
2/9/2004 414.383
2/16/2004 313.764
2/23/2004 335.001
3/1/2004 323.978
3/8/2004 312.729
3/15/2004 343.589
3/22/2004 333.252
3/29/2004 376.878
4/5/2004 390.825
4/12/2004 356.892
4/19/2004 383.517
4/26/2004 325.227
5/3/2004 254.279
5/10/2004 255.221
5/17/2004 266.575
5/24/2004 270.073
5/31/2004 293.269
6/7/2004 309.114
6/14/2004 311.633
6/21/2004 350.444
6/28/2004 296.203
7/5/2004 332.153
7/12/2004 306.23
7/19/2004 368.466
7/26/2004 334.271
8/2/2004 349.002
8/9/2004 378.682
8/16/2004 333.731
8/23/2004 380.037
8/30/2004 298.417
9/6/2004 288.728
9/13/2004 342.81
9/20/2004 382.866
9/27/2004 419.828
10/4/2004 379.289
10/11/2004 400.749
10/18/2004 453.514
10/25/2004 388.742
11/1/2004 333.935
11/8/2004 341.659
11/15/2004 281.586
11/22/2004 305.749
11/29/2004 310.391
12/6/2004 317.704
12/13/2004 380.804
12/20/2004 319.389
12/27/2004 361.442
1/3/2005 369.1764612
1/10/2005 416.6238169
1/17/2005 459.5359423
1/24/2005 365.4009445
1/31/2005 413.3630776
2/7/2005 291.3910135
2/14/2005 305.105
2/21/2005 464.8482752
2/28/2005 363.0336105
3/7/2005 264.7677899
3/14/2005 344.880868
3/21/2005 325.8519595
3/28/2005 321.1775701
4/4/2005 404.5693965
4/11/2005 392.0416371
4/18/2005 430.7946661
4/25/2005 427.1631644
5/2/2005 411.8648374
5/9/2005 386.8547968
5/16/2005 383.4840298
5/23/2005 381.5493873
5/30/2005 315.0086187
6/6/2005 354.5324168
6/13/2005 327.772
6/20/2005 369.0157653
6/27/2005 408.0830566
7/4/2005 434.5275972
7/11/2005 371.5106324
7/18/2005 408.1991382
7/25/2005 405.0429881
8/1/2005 373.8240641
8/8/2005 364.0034462
8/15/2005 369.6471424
8/22/2005 382.0108071
8/29/2005 410.7909099
9/5/2005 330.9051756
9/12/2005 368.7685134
9/19/2005 270.4893379
9/26/2005 404.0606091
10/3/2005 383.8872826
10/10/2005 466.5515718
10/17/2005 486.673
10/24/2005 448.0580021
10/31/2005 373.5319544
11/7/2005 358.4208151
11/14/2005 398.9761027
11/21/2005 318.3299946
11/28/2005 358.0366431
12/5/2005 344.9174087
12/12/2005 386.8313941
12/19/2005 294.1100542
12/26/2005 293.881162
1/2/2006 433.7141952
1/9/2006 476.274226
1/16/2006 475.7067041
1/23/2006 459.1203218
1/30/2006 361.2039406
2/6/2006 363.7221527
2/13/2006 380.1952852
2/20/2006 442.1721436
2/27/2006 357.9469694
3/6/2006 395.7442366
3/13/2006 450.9923943
3/20/2006 367.7855186
3/27/2006 402.778072
4/3/2006 493.4095257
4/10/2006 493.468
4/17/2006 469.1306141
4/24/2006 450.0128534
5/1/2006 442.5117675
5/8/2006 428.8031172
5/15/2006 470.2158386
5/22/2006 446.2431756
5/29/2006 317.8183222
6/5/2006 369.3162037
6/12/2006 410.4558021
6/19/2006 443.1421911
6/26/2006 397.1971946
7/3/2006 481.3922888
7/10/2006 525.2947246
7/17/2006 473.5077361
7/24/2006 517.5520329
7/31/2006 466.9906984
8/7/2006 431.1475016
8/14/2006 399.5471642
8/21/2006 440.8823488
8/28/2006 439.6991779
9/4/2006 362.8644597
9/11/2006 406.762618
9/18/2006 363.0828509
9/25/2006 491.8909378
10/2/2006 527.5336233
10/9/2006 516.9000381
10/16/2006 554.2020878
10/23/2006 650.9110702
10/30/2006 527.429268
11/6/2006 520.5231633
11/13/2006 419.1709031
11/20/2006 441.3769311
11/27/2006 407.7421329
12/4/2006 423.0796675
12/11/2006 541.489909
12/18/2006 395.1153918
12/25/2006 407.3078582
1/1/2007 555.9770864
1/8/2007 484.9516878
1/15/2007 554.6924101
1/22/2007 547.1910996
1/29/2007 498.570364
2/5/2007 532.9759432
2/12/2007 432.4194752
2/19/2007 497.8181418
2/26/2007 407.4818148
3/5/2007 463.2326725
3/12/2007 547.1052888
3/19/2007 499.1447529
3/26/2007 441.1002226
4/2/2007 435.5250358
4/9/2007 510.0561347
4/16/2007 460.6838179
4/23/2007 508.6014031
4/30/2007 514.7918906
5/7/2007 506.1699276
5/14/2007 538.0826675
5/21/2007 497.6096175
5/28/2007 434.4788358
6/4/2007 528.1184467
6/11/2007 432.9866137
6/18/2007 510.1264458
6/25/2007 487.4279266
7/2/2007 495.274668
7/9/2007 508.7542205
7/16/2007 572.8591187
7/23/2007 657.6611519
7/30/2007 594.0857848
8/6/2007 590.5344634
8/13/2007 604.0715949
8/20/2007 533.396821
8/27/2007 498.3182266
9/3/2007 491.3865539
9/10/2007 548.296464
9/17/2007 459.3107549
9/24/2007 543.1050647

That data is weekly usage of a system. I have done what research I
can and some basic forecasting by comparing against the previous year
and projecting from that. I am trying to find a more accurate way to
forecast this, and my research has brought me to the ARIMA method for
looking at seasonal data.

Poring through the resources I have, I found Gretl as a potential
tool. I need to generate a forecast up to 24 weeks in advance, but I
am at a loss. Each time I try to produce a forecast, I do not get
realistic results, due to my lack of statistical knowledge and a poor
understanding of most statistical software (Gretl included). I keep
coming back to ARIMA(0,1,1)(0,1,1) with a seasonal period of 12 weeks.
I know this to be wrong, but without a strong math background (I am a
technical guru, not a statistical guru) I have hit a brick wall.

Can someone help explain what I need to do, using Gretl or some
similar tool, to do accurate forecasting based on the above data? I
need to repeat this process weekly.

The activity is roughly quarterly but there is some drift on when a
quarter starts and ends (by up to two weeks either direction) so ARIMA
seemed to be the best method for forecasting.

Help!

da...@autobox.com

Nov 20, 2007, 8:33:51 PM


Idgarad,

Please review http://www.autobox.com/idgarad and find some output from
AUTOBOX.

http://www.autobox.com/idgarad/accff.jpg

You will find in this case

1. There are significant level shifts at time points 65 and 114 ...both
to the upside ...NO TREND HERE ...just two level shifts.
2. There are a number of anomalous observations which need to be
accommodated so that they don't mask the model.
3. A number of holidays are important.
4. There is a strong week-of-the-year effect.
5. The ARIMA model is simply a (1,1):

(1 - 0.746 B)^(-1) (1 - 0.276 B)
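
One way to read the operator in item 5, assuming it is applied to a white-noise term a_t in the usual Box-Jenkins way (that reading is an assumption, the post does not spell it out), is Y_t = 0.746 Y_(t-1) + a_t - 0.276 a_(t-1). A minimal Python sketch of the weights this puts on past shocks; the function name is made up for illustration:

```python
def arma11_psi_weights(phi, theta, n):
    """First n psi weights of (1 - theta*B) / (1 - phi*B).

    psi_0 = 1, psi_1 = phi - theta, psi_j = phi * psi_(j-1) for j >= 2.
    """
    weights = [1.0]
    for j in range(1, n):
        if j == 1:
            weights.append(phi - theta)
        else:
            weights.append(phi * weights[-1])
    return weights

# Coefficients as quoted in the post (AR = 0.746, MA = 0.276).
psi = arma11_psi_weights(0.746, 0.276, 6)
print([round(w, 4) for w in psi])
```

The weights decay geometrically at rate 0.746, so under this reading a shock fades over a handful of weeks rather than persisting.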

At this juncture you can simply buy AUTOBOX or some similar commercial
program or simply program the following

a. Detect simultaneously the presence of

Pulses, Level Shifts, Seasonal Pulses, Local Time Trends

The point(s) in time where the parameters of the model may have
changed, suggesting too much data

The form of the SARIMA MODEL

Any needed transformations to homogenize the variance of the
errors

What Holiday indicators are important and what the temporal
response is to each (viz. contemporaneous, lag, lead)

What weeks of the year are important.

Pursue all of these until the plot of your residuals looks like

http://www.autobox.com/idgarad/res.jpg which suggests that the signal
has been removed from the data
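
As a rough illustration of just one item on that list, a single level shift can be located by trying every split point and keeping the one that minimizes the squared error around two separate means. This is a simplified stand-in, not AUTOBOX's procedure; it ignores pulses, seasonality, and ARIMA structure, and the function name and toy data are invented for the example:

```python
def find_level_shift(series, min_segment=5):
    """Return (index, sse): the split that best fits two constant levels.

    `index` is the position of the first observation at the second level.
    """
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_index, best_sse = None, float("inf")
    for i in range(min_segment, len(series) - min_segment + 1):
        total = sse(series[:i]) + sse(series[i:])
        if total < best_sse:
            best_index, best_sse = i, total
    return best_index, best_sse

# Toy series: level 350 for 20 points, then level 450 for 20 points,
# with a small deterministic wiggle standing in for noise.
toy = [350 + (t % 3) for t in range(20)] + [450 + (t % 3) for t in range(20)]
shift_at, _ = find_level_shift(toy)
print(shift_at)
```

Finding two shifts, as reported for time points 65 and 114, would mean repeating the search on each segment or searching over pairs of split points.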

http://www.autobox.com/idgarad/actfore.jpg

The R-Squared for the final model was 86.5%

There are a number of success stories on our web site regarding daily
and weekly predictive models.

If I can help please give me a call.

Dave Reilly

da...@autobox.com

Nov 20, 2007, 8:53:10 PM
On Nov 20, 6:24 pm, Idgarad <idga...@gmail.com> wrote:


idgarad,

Please review http://www.autobox.com/idgarad

and note that a reasonable weekly model yielding an R-squared of 86%
can be accomplished by programming:
- a procedure to detect level shifts and local time trends
- a check of the importance of a number of possible holidays
- tests for constancy of parameters over time
- tests for homogeneity of variance of the errors

http://www.autobox.com/idgarad/ab50pro.123
http://www.autobox.com/idgarad/accff.jpg
http://www.autobox.com/idgarad/actfore.jpg
http://www.autobox.com/idgarad/actres.jpg
http://www.autobox.com/idgarad/res.jpg
http://www.autobox.com/idgarad/fore.jpg
http://www.autobox.com/idgarad/stat.htm
http://www.autobox.com/idgarad/model.bmp
http://www.autobox.com/idgarad/verbal.txt

You can try it out by downloading the FREEWARE VERSION of AUTOBOX
called FREEFORE

http://www.autobox.com/freef.exe

Just form your data like http://www.autobox.com/idgarad/idgard.asc

and you should be able to run the free software each week ...develop a
model automatically ...and even get your 1 week ahead forecast all
without charge.

Hope this helps

Dave Reilly
Automatic Forecasting Systems
http://www.autobox.com
215-675-0652


Idgarad

Dec 3, 2007, 12:01:06 PM
I will try to go through it today. The only thing is, how do I
backforecast? (I have to show the last 6 months and the next 6 months
from the current date that I process.)

aruzinsky

Dec 3, 2007, 2:21:05 PM
On Nov 20, 5:24 pm, Idgarad <idga...@gmail.com> wrote:
> ...

> coming back to ARIMA(0,1,1)(0,1,1) with a seasonal period of 12 weeks.
> ...

Thirty years ago, I was well versed in time series analysis but have
forgotten > 90%.

What is the significance of the second "(0,1,1)"?

According to

http://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average

it doesn't belong there.

Idgarad

Dec 3, 2007, 4:08:31 PM

The seasonal portion

aruzinsky

Dec 3, 2007, 8:01:47 PM
> The seasonal portion

The notation doesn't ring a bell. I suspect you differenced, 1 -
B^12, where B is the backshift operator, to remove a periodic
component. I seem to recall that a common mistake (in my day) was to
unnecessarily combine (1 - B) to remove a trend with (1 - B^n) to
remove a periodic component, because any difference, (1 - B^k), also
removes a trend in the original series. Did you make this mistake?
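
A small numerical check of that claim, using an invented series (a straight-line trend plus a pattern that repeats every 12 steps): a single seasonal difference 1 - B^12 leaves only a constant, so no extra 1 - B is needed to remove a linear trend. The helper name is illustrative:

```python
def difference(series, lag):
    """Apply (1 - B^lag): subtract the value `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

k = 12
seasonal_pattern = [0, 5, 9, 4, -3, -8, -6, 1, 7, 3, -2, -10]
series = [2 * t + seasonal_pattern[t % k] for t in range(60)]

seasonally_differenced = difference(series, k)
print(set(seasonally_differenced))  # every value equals slope * k
```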


Idgarad

Dec 6, 2007, 2:46:21 PM
> in the original series. Did you make this mistake?

---(From Duke)----
A seasonal ARIMA model is classified as an ARIMA(p,d,q)x(P,D,Q) model,
where P=number of seasonal autoregressive (SAR) terms, D=number of
seasonal differences, Q=number of seasonal moving average (SMA) terms
---()---

That's what I was referring to. As I mentioned, I am on my own in
learning all this, so feel free to slap some sense into me as needed.

I have gone back through, though, and found a calendar of events that
I can factor into the projection. The seasonality is now 52 weeks
rather than 12, as there are considerable differences in the
activities from quarter to quarter.

Regardless, I have run into an additional snag (i.e. a requirement):
any model I use has to be backforecast to show its accuracy against
known existing data. Oh brother, I feel like Charlie Brown today.
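
What is described here as backforecasting is usually done as a holdout test: hide the most recent weeks, forecast them from the earlier data only, and compare against what actually happened. A minimal sketch, using a seasonal-naive forecast (repeat the value from one season earlier) purely as a placeholder for whatever model is eventually chosen; the function names and the toy numbers are illustrative:

```python
def seasonal_naive(history, horizon, period):
    """Forecast by repeating the last observed season."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]

def holdout_mape(series, horizon, period, forecaster=seasonal_naive):
    """Mean absolute percentage error over the last `horizon` points."""
    train, actual = series[:-horizon], series[-horizon:]
    predicted = forecaster(train, horizon, period)
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Toy data: a repeating 4-step pattern that grows 10% per cycle.
pattern = [100.0, 120.0, 90.0, 110.0]
toy = [v * (1.1 ** cycle) for cycle in range(5) for v in pattern]
print(round(holdout_mape(toy, horizon=4, period=4), 2))
```

With the weekly MIPS data from the first post, the same check would be called with horizon=26 and period=52, which needs at least 78 weeks of history.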

What this is all about is there is a mainframe with different virtual
computers inside. I have to forecast each virtual computer's usage and
factor that against capacity to figure out when all hell is going to
break loose.

In short:

A : Is production, anything A doesn't use can be borrowed.
B-G : are virtual computers. They get to use a given amount but if
they need to can borrow A's left overs.

I need to learn how to do a seasonally sensitive forecast of A-G
(separately) so I can determine how much they can borrow (if any).
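
Once per-partition forecasts exist, the borrowing question for a given week is simple bookkeeping. The sketch below assumes A has a fixed capacity figure and each of B-G has a fixed entitlement; those names, the function, and every number are invented for the example:

```python
def borrowing_check(a_capacity, a_forecast, entitlements, forecasts):
    """Compare what partitions B-G want to borrow with what A leaves unused.

    All arguments are for a single forecast week, in the same units (e.g. MIPS).
    """
    headroom = max(a_capacity - a_forecast, 0.0)
    wanted = {
        name: max(forecasts[name] - entitlements[name], 0.0)
        for name in entitlements
    }
    total_wanted = sum(wanted.values())
    return headroom, wanted, total_wanted <= headroom

# Invented numbers for one week, only to show the bookkeeping.
headroom, wanted, ok = borrowing_check(
    a_capacity=600.0,
    a_forecast=520.0,
    entitlements={"B": 50.0, "C": 40.0},
    forecasts={"B": 70.0, "C": 35.0},
)
print(headroom, wanted, ok)
```

Running this for each of the 24 forecast weeks, and flagging the first week where the check fails, gives a rough answer to when capacity runs out.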

aruzinsky

Dec 6, 2007, 3:57:48 PM
> (separately) so I can determine how much they can borrow (if any).

The big P and Q don't ring a bell.

I understand that you want an estimator that incorporates seasonally
periodic information, but how far into the future do you want to
forecast?


Idgarad

Dec 7, 2007, 9:13:09 AM
> forecast?

Me personally, I am only interested in going 2 quarters ahead (roughly
24 weeks), but if a reasonable amount of accuracy is possible, a year
at most (which would allow for some nice "What-If" checks). The graph
I produce is a sliding 52-week graph, so the first half is existing
known data and the second half is the projections. (Thus the last
known data point is always in the middle of the graph.)

aruzinsky

Dec 7, 2007, 8:43:27 PM
> data point is always in the middle of the graph.)

I don't think I can help you much with canned software. I always did
my own programming. But, as I suggested before:

1. If you see a periodic component of period k, first apply 1 - B^k to
the data.

2. Then, only if you see a trend in the result of step 1, also apply
1 - B. The reason is that (1 - B) is a factor of the polynomial
(1 - B^k), so (1 - B^k) has already removed any trend in the original
data. That info is missing in many textbooks.
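
The factorization behind step 2 is 1 - B^k = (1 - B)(1 + B + ... + B^(k-1)). It can be checked by multiplying the coefficient lists; the helper below is written only for this illustration:

```python
def poly_mul(a, b):
    """Multiply polynomials in B given as coefficient lists (lowest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

k = 12
one_minus_b = [1, -1]            # 1 - B
running_sum = [1] * k            # 1 + B + ... + B^(k-1)
product = poly_mul(one_minus_b, running_sum)
print(product)                   # coefficients of 1 - B^k
```

Read this way, 1 - B^k takes a first difference and then adds up k consecutive differences, which is why a linear trend does not survive it.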

Also, after the data is made stationary by differencing, a constant
representing the mean might remain. Differencing again will remove
it, but it would be better to estimate the constant as a regression
parameter.
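
A sketch of that point with simulated data (the constant, noise level, and seed are arbitrary choices for the example): when the differenced series is a constant plus white noise, the least-squares estimate of the constant is just the sample mean, whereas differencing again discards the constant and roughly doubles the noise variance.

```python
import random

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

random.seed(0)
true_constant = 3.0
# Stand-in for an already differenced, stationary series: constant + noise.
w = [true_constant + random.gauss(0.0, 1.0) for _ in range(2000)]

# Option 1: estimate the constant by least squares (the sample mean).
constant_hat = sum(w) / len(w)
residuals = [v - constant_hat for v in w]

# Option 2: difference again, which removes the constant but inflates the noise.
w_diff = [w[t] - w[t - 1] for t in range(1, len(w))]

print(round(constant_hat, 2), round(variance(residuals), 2), round(variance(w_diff), 2))
```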

Another small piece of advice that I can give you is that general
purpose LS can be used to estimate parameters of an AR series that is
nonstationary due to poles outside and on the unit circle. You can
examine the estimated poles to see whether differencing is needed to
make the series stationary. Unfortunately, you can't tell what is
under the hood of canned software, and maybe the estimated poles were
constrained to lie inside the unit circle.
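
For the simplest case, an AR(1) fit with no constant, the least-squares estimate is a single ratio and the estimated pole is the coefficient itself. The sketch below uses simulated data with an arbitrary seed; higher-order AR fits would need the full normal equations and a root finder, which are omitted here. An estimate close to 1 points to a pole on the unit circle, i.e. a series that needs differencing:

```python
import random

def ar1_least_squares(y):
    """Unconstrained LS estimate of phi in y_t = phi * y_(t-1) + e_t."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

def simulate_ar1(phi, n, seed):
    """Generate n points of an AR(1) series driven by unit-variance noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

random_walk = simulate_ar1(1.0, 500, seed=1)   # pole on the unit circle
stationary = simulate_ar1(0.5, 500, seed=1)    # pole well inside it

print(round(ar1_least_squares(random_walk), 3))
print(round(ar1_least_squares(stationary), 3))
```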
