Hello!
I'm trying to see how feasible it would be to fit a full ARIMA process in INLA. In the econometrics literature, they define an MA process as lags of the errors, and so I thought if I could use INLA's AR(1) model to analogously fit that.
Here's a snippet of code where I simulated an AR(1) and an ARIMA(1,0,1):
require(INLA)
require(data.table)
## AR and MA params
rho = 0.8
phi = 0.3
prec = 10
n = 1000
## Simulate ARIMA(1,0,1)
dt1 = as.vector(arima.sim(list(order = c(1,0,1), ar = rho, ma = phi), n = n,
sd = sqrt(1/prec)))
## Simulate ARIMA(1,0,0)
dt2 = as.vector(arima.sim(list(order = c(1,0,0), ar = rho), n = n,
sd = sqrt(1/prec)))
## Make sure we can recover params by fitting the ARIMAs
arima(dt1, c(1,0,1))
arima(dt2, c(1,0,0))
Here's the results from first fitting an AR(1) process with the second dataset. I fit the AR(1) using both the 'ar1' model and also by using the lag of the variable.
## Fit the AR(1) from dt2 in INLA
ar1_dt2 <- data.table(y = dt2, t = 1:n)
ar1_dt2[, lag_y := shift(y)]
## Fit with model
inla_ar1_dt2_1 <- inla(formula = y ~ f(t, model = 'ar1'),
data = ar1_dt2)
summary(inla_ar1_dt2_1)
## Fit as lag FE
inla_ar1_dt2_2 <- inla(formula = y ~ lag_y,
data = ar1_dt2)
summary(inla_ar1_dt2_2)
### Both the models recovered the AR(1) parameter in either the FE or the RE ###
The results are quite similar and this is where I got a bit confused, because an AR(1) on residuals versus an AR(1) on the dependent variable aren't equivalent right (or maybe I'm wrong..)?
Finally, I attempt to fit an ARIMA(1,0,1) by including the lag value and the 'ar1' model (with the hope that the former will pick up the AR(1) part, and the latter will pick up the MA(1) part):
## What if we try to fit the ARIMA in INLA?
arima_dt1 = data.table(y = dt1, t = 1:n)
arima_dt1[, lag_y := shift(y)]
inla_arima_dt1 <- inla(formula = y ~ lag_y + f(t, model = 'ar1'),
data = arima_dt1)
summary(inla_arima_dt1)
"inla(formula = y ~ lag_y + f(t, model = \"ar1\"), data = arima_dt1)"
Time used:
Pre-processing Running inla Post-processing Total
0.3056 2.3598 0.1302 2.7956
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.0500 0.0328 -0.0104 0.0482 0.1196 0.0441 0
lag_y 0.4967 0.0896 0.3591 0.4733 0.6622 0.4615 0
Random effects:
Name Model
t AR1 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for the Gaussian observations 1.872e+04 1.841e+04 1354.9766 1.333e+04 6.718e+04 3768.1493
Precision for t 5.877e+00 1.238e+00 4.1083 5.644e+00 8.891e+00 5.1237
Rho for t 6.334e-01 8.810e-02 0.4215 6.494e-01 7.612e-01 0.6879
Expected number of effective parameters(std dev): 998.53(1.969)
Number of equivalent replicates : 1.002
Marginal log-Likelihood: -303.71
Neither the rho nor the lag_y coefficient are what I expected it to be...
I guess my main question is: what's the closest implementation we can do to estimate an ARIMA model in INLA?
Thanks!
N