Hi again. After forecasting with statsmodels.tsa, how do I reverse np.diff(np.log(data), ..) on the predicted time series, since the prediction will be on this transformed scale? Should I first apply np.cumsum and then np.exp?
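For reference, a minimal sketch of that inversion, assuming the differenced series is d = np.diff(np.log(x)) and the starting level is known. The array x below is made-up illustration data, not the file used in this thread. Each element of d is log(x[t]) - log(x[t-1]), so the cumulative sum gives the total log change since the starting level; cumsum therefore comes first, then exp, then scaling by the starting level:

```python
import numpy as np

# Hypothetical level series, for illustration only.
x = np.array([100.0, 102.0, 101.0, 105.0, 110.0])

# Forward transform: log differences (one element shorter than x).
d = np.diff(np.log(x))

# Inverse: cumulate the log changes, exponentiate, scale by the first level.
recovered = x[0] * np.exp(np.cumsum(d))

# recovered matches x[1:] up to floating-point error.
print(np.allclose(recovered, x[1:]))
```

For a forecast f in log-difference units, the same recipe would start from the last observed level instead of x[0], e.g. last * np.exp(np.cumsum(f, axis=0)).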
Hmm.. now that things are working, the prediction seems to be way off. It climbs too fast.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.api import VAR
import matplotlib.pyplot as plt
def pad(data):
    # Fill NaN entries by linear interpolation between valid neighbours.
    bad_indexes = np.isnan(data)
    good_indexes = np.logical_not(bad_indexes)
    good_data = data[good_indexes]
    interpolated = np.interp(bad_indexes.nonzero()[0], good_indexes.nonzero()[0], good_data)
    data[bad_indexes] = interpolated
    return data
data = np.genfromtxt('/home/burak/daily3.csv', skip_header=1, delimiter=',')
a = np.squeeze(data[:,0])
b = np.squeeze(data[:,1])
data = np.vstack((a,b)).T
last = data[-1,:]
print("last", last)
data = np.diff(np.log(data), axis=0)
data = np.apply_along_axis(pad, 0, data)
model = VAR(data)
res = model.fit(4)
f = res.forecast(data[4:], 30)
print(f)
a=np.append(a,last[0] * np.cumprod(1+f[:,0]))
b=np.append(b,last[1] * np.cumprod(1+f[:,1]))
plt.plot(range(len(a)),a,'.')
plt.show()
plt.plot(range(len(b)),b,'.')
plt.show()
If you want to run it, I can send the data.
I tested the code with macrodata.csv too, and the direction of the prediction is always the same. It sort of fits realcons, but the realinv forecast does not fit. There is no interpolation in this one, so the error cannot be there. Unless I am doing something else wrong, of course. I used 4 lags and forecast 30 quarters ahead.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.api import VAR
import matplotlib.pyplot as plt
data = np.genfromtxt('macrodata.csv', skip_header=1, delimiter=',')
data = data[:,3:5]
orig = data[:]
data = np.diff(np.log(data), axis=0)
model = VAR(data)
res = model.fit(4)
f = res.forecast(data[4:], 30)
print(f)
a=np.append(orig[:,0],orig[-1,0] * np.cumprod(1+f[:,0]))
b=np.append(orig[:,1],orig[-1,1] * np.cumprod(1+f[:,1]))
plt.plot(range(len(a)),a,'.')
plt.show()
plt.plot(range(len(b)),b,'.')
plt.show()
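One thing worth checking in both scripts is the back-transformation. np.cumprod(1+f) inverts simple percentage changes, whereas the series here was built from log differences, whose exact inverse is np.exp(np.cumsum(f)). The sketch below compares the two using a made-up constant forecast f and starting level last (both hypothetical, not taken from the data above). Because exp(r) >= 1 + r, the cumprod version sits at or below the exact one for small positive growth rates, so this gap on its own would tend to understate the levels rather than make them climb faster:

```python
import numpy as np

# Hypothetical inputs: last observed level and a 30-step forecast
# of log differences (a constant 1% per step, for illustration only).
last = 100.0
f = np.full(30, 0.01)

approx = last * np.cumprod(1 + f)    # treats f as simple returns
exact = last * np.exp(np.cumsum(f))  # exact inverse of diff(log(.))

# Relative gap at the end of the horizon; it grows with step size and horizon.
rel_gap = (exact[-1] - approx[-1]) / exact[-1]
print(approx[-1], exact[-1], rel_gap)
```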
Negative indexing? Do you mean using
f = res.forecast(data[-4:], 30)
instead of
f = res.forecast(data[4:], 30)
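To make the difference between the two slices concrete, here is a small NumPy-only sketch with a made-up 10-row array standing in for data. data[4:] drops the first four rows and keeps everything else, while data[-4:] keeps only the last four rows, which is as many observations as the lag order used in model.fit(4). Both slices end with the same rows; whether the forecast itself changes depends on how forecast reads its starting values, which this sketch does not test:

```python
import numpy as np

# Hypothetical stand-in for the differenced data: 10 observations, 2 series.
data = np.arange(20, dtype=float).reshape(10, 2)
lags = 4  # same lag order as model.fit(4)

skip_first = data[lags:]   # rows 4..9: everything except the first four
keep_last = data[-lags:]   # rows 6..9: only the last four observations

print(skip_first.shape)  # (6, 2)
print(keep_last.shape)   # (4, 2)

# The two slices share the same final rows.
print(np.array_equal(skip_first[-lags:], keep_last))  # True
```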