Arima() in R vs SARIMAX in Python (statsmodels) for multivariate forecasting

T H

Sep 13, 2023, 6:18:15 PM
to pystatsmodels

Hi, I am trying to replicate the results of Arima() in R using Python, for multivariate forecasting. I used SARIMAX in Python with the same p, d, q that I used in Arima(). I was aware that SARIMAX uses 'lbfgs' as its default method while Arima() in R uses 'bfgs', so I forced the method to be 'bfgs' in SARIMAX. I am still not getting the same results. Can anyone please help me figure this out? Or which function should I use to reproduce the results from Arima() in R? Any help would be much appreciated.

The inputs for both models are exactly the same. I have updated my question to include a plot of the results I got. The differences are not big, but I need exactly the same results. If that is not possible, I would like to understand why. The coefficients from R Arima() and Python SARIMAX are different too.

Here is the R code :

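'''
library(dplyr)
library(forecast)

# Exogenous regressors for the development sample
h1 <- dplyr::select(sample_data_dev, x1, x2, x3) %>% as.matrix()

ts_h <- ts(sample_data_dev$y, start = c(year_start_dev, quarter_start_dev), end = c(year_end_dev, quarter_end_dev), frequency = 4)

# Regression with ARIMA(1,1,1) errors and no seasonal part
h_model1 <- Arima(y = ts_h, order = c(1, 1, 1), seasonal = c(0, 0, 0), xreg = h1)

# Forecast h steps ahead from the future regressor values
h_pred1 <- as_tibble((forecast::forecast(h_model1, xreg = dplyr::select(out_data_forecast[1:h, ], x1, x2, x3) %>% as.matrix()))$mean)
'''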

Here is the python code that I use :

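'''
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

dev_start = '1992-06-30'
dev_end = '2017-12-31'
date_range = pd.date_range(start=dev_start, end=dev_end, freq='Q')

# Assumes y has one value per quarter in date_range; .to_numpy() keeps the
# values from being re-aligned against the new date index (which yields NaN)
h_data = pd.Series(sample_data_dev['y'].to_numpy(), index=date_range)

# hel is assumed to hold the exogenous regressors (x1, x2, x3) for the development sample.
# The seasonal argument is named seasonal_order, and the optimizer is chosen in fit().
h_model = SARIMAX(endog=h_data, exog=hel, order=(1, 1, 1), seasonal_order=(0, 0, 0, 0))
h_results = h_model.fit(method='bfgs')

forecast_steps = h
forecast_x = out_data_forecast.loc[0:h-1, ['x1', 'x2', 'x3']]

forcst = h_results.get_forecast(steps=forecast_steps, exog=forecast_x)
forecaset_mean = forcst.predicted_mean
forecaset_mean
'''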


[Plot: forecast results from R and Python]
I posted this on Stack Overflow as well.

https://stackoverflow.com/questions/77100478/arima-in-r-vs-sarimax-in-pythonstatmodels-for-multivariate-forcasting?noredirect=1#comment135919827_77100478
