ARIMA forecast method

dirceu.s...@gmail.com

Apr 14, 2016, 11:34:13 AM
to Time Series for Spark (the spark-ts package)
Hello,
I'm very new to data science and I'm trying to implement a product forecast using ARIMA.
Once we have an ARIMAModel, why do we need to pass a time series vector to the ARIMA forecast method?
I'm using a monthly time series. How do I fit ARIMA to a time series longer than one year?

Kind Regards,
Dirceu


Sandy Ryza

Apr 18, 2016, 12:10:13 PM
to dirceu.s...@gmail.com, Time Series for Spark (the spark-ts package)
Hi Dirceu,

Regarding your first question, an ARIMA model is something that helps you turn observations about the recent past into predictions about the future.  With the forecast method, you pass in a time series vector representing the recent past, and it spits out a vector of predictions about the future.  Is that helpful?

I'm not sure I understand your problem with using ARIMA with a time series bigger than a year.  Are you running into issues?

-Sandy  

--
You received this message because you are subscribed to the Google Groups "Time Series for Spark (the spark-ts package)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to spark-ts+u...@googlegroups.com.
To post to this group, send email to spar...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/spark-ts/9672ec20-6de5-4a63-8c0b-5adf0b640047%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Dirceu Semighini Filho

Apr 18, 2016, 1:43:32 PM
to Sandy Ryza, Time Series for Spark (the spark-ts package)
Hi Sandy, thanks for the reply.
In fact, I didn't understand how to use ARIMA in sparkts.
I've used ARIMA for forecasting in R, and there I train a model on a recent-past time series and then predict the future without having to pass a time series again.
Let's say that my time series is
ts = (10,34,54,78) 
each value representing a month.
model = ARIMA.fitModel(1,0,1,ts)
To predict the next 4 months, what do I have to pass to my model?
forecast = model.forecast(ts,4)
Is that what should be done?
I don't understand why we have to pass ts again to the forecast method if it has already been used to fit the model.

Dirceu

Sandy Ryza

Apr 18, 2016, 6:13:44 PM
to Dirceu Semighini Filho, Time Series for Spark (the spark-ts package)
That's right, you'd pass ts again.  I'm not familiar with the R implementation, but spark-ts forecast accepts a ts param because, conceptually, there's no reason the model parameters (i.e. the differencing order and the coefficients for the AR and MA terms) need to be tied to the time series that was used to determine them.  For example, what if you collected another four months of data and then wanted to apply the model you had trained earlier to those four new months to make predictions?
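
To make the point about parameters being separate from the data concrete, here is a minimal sketch in plain Python. It is not the spark-ts API: `Ar1Model` is a hypothetical stand-in that holds only an intercept and one AR coefficient, the parameter values and the "newer" observations are made up, and its `forecast` returns only the future points.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Ar1Model:
    """Hypothetical stand-in for a fitted model: it stores parameters only."""
    intercept: float  # constant term
    phi: float        # AR(1) coefficient

    def forecast(self, ts: Sequence[float], n_future: int) -> List[float]:
        """Forecast n_future steps ahead, starting from the last value of ts."""
        if not ts:
            raise ValueError("ts must contain at least one observation")
        preds = []
        last = ts[-1]
        for _ in range(n_future):
            # Each prediction feeds the next step of the recursion.
            last = self.intercept + self.phi * last
            preds.append(last)
        return preds


# Parameters assumed to have been estimated earlier (made-up values).
model = Ar1Model(intercept=10.0, phi=0.5)

original_months = [10.0, 34.0, 54.0, 78.0]
newer_months = [80.0, 76.0, 90.0, 84.0]  # hypothetical later observations

# The same parameters produce different forecasts for different histories.
from_original = model.forecast(original_months, 2)
from_newer = model.forecast(newer_months, 2)
```

The model object never stores a series, so the caller decides which recent history the forecast should start from.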

Another thing to be aware of is that if you call model.forecast(ts, n), it will actually return a forecast time series with more than n elements.  The last n elements in the returned series will be the forecasted values, and the earlier elements in the returned series will be the model's forecasted values for time points you've already observed.  These earlier elements allow you to compare the model's forecasts with what you've observed in the past.
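
The layout described above can be handled with a small helper. This is a sketch in plain Python, not spark-ts code: `split_forecast` is a hypothetical function, and the `returned` values are made up to stand in for whatever the forecast call gives back.

```python
from typing import List, Sequence, Tuple


def split_forecast(returned: Sequence[float],
                   n_future: int) -> Tuple[List[float], List[float]]:
    """Split a returned series into (in-sample part, last n_future values)."""
    if not 0 < n_future <= len(returned):
        raise ValueError("n_future must be between 1 and len(returned)")
    cut = len(returned) - n_future
    return list(returned[:cut]), list(returned[cut:])


observed = [10.0, 34.0, 54.0, 78.0]
# Made-up stand-in for the returned series: 4 in-sample values + 4 future values.
returned = [10.0, 30.0, 50.0, 70.0, 90.0, 95.0, 97.0, 98.0]

in_sample, future = split_forecast(returned, 4)

# Residuals: observed values minus the model's values for the same time points.
residuals = [obs - fit for obs, fit in zip(observed, in_sample)]
```

Here `future` holds the genuine forecasts, while `in_sample` lines up one-to-one with the observations that were passed in.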

-Sandy

celio...@objective.com.br

Apr 19, 2016, 10:30:58 AM
to Time Series for Spark (the spark-ts package), dirceu.s...@gmail.com
Hi Sandy,

I've tested the ARIMA forecast method and noticed that the last n elements in the returned forecast array converge to a constant value. That does not look like the expected result to me.

I have a time series of 250 values, which I split as follows:
- I fit a model on the first 200 values of the time series.
- I want to use the last 50 values of the time series to test the model.

In the forecast method, I passed the same time series that I used for training, with n=50. So, an array of length 250 was returned.

If I want to compare the forecast against the last 50 values of the time series (the test data), what should I do? Should I use the first 50 positions of the array returned by the forecast method, or the last 50? For me, the last 50 positions converge to a constant value.

Please tell me if you see a mistake in my approach.
Thank you

Sandy Ryza

Apr 19, 2016, 12:07:33 PM
to celio...@objective.com.br, Time Series for Spark (the spark-ts package), Dirceu Semighini Filho
Hi Celio,

You should use the last 50 positions to compare with your holdout set.

If you think the model is incorrect (because it's converging to a constant value, and it shouldn't), I would take a look at the residuals, i.e. the difference between the model's predictions for the first 200 and the actual values for the first 200.  It's not a simple task, but http://www.itl.nist.gov/div898/handbook/pmc/section6/pmc624.htm has some good info on how to interpret these.  Regrettably, spark-ts does not currently natively provide the plots mentioned in that link, so if you want to use them you'd need to generate them yourself.
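
As a rough illustration of checking residuals and holdout error by hand, here is a sketch in plain Python. The function names and sample numbers are hypothetical and nothing here comes from spark-ts. It computes the sample autocorrelation of the residuals (which should be near zero at every lag if the model has captured the structure) and the mean absolute error on a holdout set. The ±1.96/sqrt(N) band is the usual approximate 95% reference for white noise.

```python
import math
from typing import Sequence


def autocorrelation(xs: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(xs) - 1")
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    if denom == 0:
        raise ValueError("xs is constant, autocorrelation is undefined")
    num = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return num / denom


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up residuals that alternate in sign, i.e. they still contain structure.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf1 = autocorrelation(residuals, 1)
band = 1.96 / math.sqrt(len(residuals))
has_structure = abs(acf1) > band  # True suggests the model missed a pattern

# Made-up holdout comparison.
holdout_mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
```

Plotting `autocorrelation(residuals, k)` for a range of lags `k` gives a hand-made version of the residual autocorrelation plot discussed in the NIST handbook.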

-Sandy



hanc...@gmail.com

Feb 1, 2017, 2:18:00 AM
to Time Series for Spark (the spark-ts package), celio...@objective.com.br, dirceu.s...@gmail.com
Hi Sandy, 

I'm using the ARIMA.autoFit function to fit the model with 100 data points, and I plan to forecast 20 future values. The model learns the pattern pretty well within the first 100 values when I look at the residuals. But, as mentioned above, the forecasted values converge to a single value. Please suggest a solution for this as soon as possible.


The attachment consists of the actual and predicted values for the first 100 points, plus the 20 forecasted values.

Thanks in advance. Also, I'm using sparkts version 0.4.0.

sophi...@gmail.com

Jun 12, 2017, 2:21:48 AM
to Time Series for Spark (the spark-ts package), celio...@objective.com.br, dirceu.s...@gmail.com, hanc...@gmail.com
You need to refit the model with new data to predict more than a few time units into the future.
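
A sketch of what refitting as new data arrives can look like, in plain Python rather than spark-ts. It uses a deliberately simple AR(1) model fitted by least squares as a stand-in for ARIMA; `fit_ar1`, `forecast_ar1`, and `walk_forward` are hypothetical helpers and the series is made up. It also shows why long-range forecasts flatten: for a stationary AR(1) model, repeated multi-step forecasts decay toward intercept / (1 - phi), so a flat tail may be expected behavior rather than a bug.

```python
from typing import List, Sequence, Tuple


def fit_ar1(ts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = intercept + phi * x[t-1]."""
    xs, ys = ts[:-1], ts[1:]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least 3 observations")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("series is constant, phi is not identifiable")
    phi = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - phi * mx, phi


def forecast_ar1(intercept: float, phi: float,
                 last: float, n_future: int) -> List[float]:
    """Multi-step forecast: each prediction is fed back into the recursion."""
    preds = []
    for _ in range(n_future):
        last = intercept + phi * last
        preds.append(last)
    return preds


def walk_forward(train: Sequence[float],
                 holdout: Sequence[float]) -> List[float]:
    """Refit on all data seen so far, predict one step, then reveal the actual."""
    history = list(train)
    preds = []
    for actual in holdout:
        intercept, phi = fit_ar1(history)
        preds.append(intercept + phi * history[-1])
        history.append(actual)
    return preds


# Made-up series that follows x[t] = 10 + 0.5 * x[t-1] exactly.
series = [0.0, 10.0, 15.0, 17.5, 18.75, 19.375]
train, holdout = series[:4], series[4:]

intercept, phi = fit_ar1(train)
long_range = forecast_ar1(intercept, phi, train[-1], 50)  # flattens near 20
rolling = walk_forward(train, holdout)                    # tracks the holdout
```

The one-step-ahead `rolling` predictions use each new observation as it arrives, which is why they keep following the data while `long_range` settles at a constant.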

sajal...@gmail.com

Jun 15, 2017, 3:20:16 AM
to Time Series for Spark (the spark-ts package), celio...@objective.com.br, dirceu.s...@gmail.com, hanc...@gmail.com, sophi...@gmail.com
Hi,

I am new to spark-ts and I don't know where to start with the code. Can you please post an example of ARIMAModel in Python?