I'm going to start working towards improving the usability / polish of some of the TSA package (at least starting with state space models where I know the code well, but maybe others too). If there are things that people know that don't work, let me know.

Although little things are also important, here are two examples of bigger things I recently was reminded of when I was doing some forecasting with tsa models.

- Model selection (esp. information criteria, maybe including pseudo-out-of-sample RMSE or something).
- Dates for `fit` vs dates of given data: I wonder if it would pay to be more flexible with the data for tsa models. For example, we may want to fit the model on a subset of the available date range (for example for pseudo out-of-sample forecasting), or we may want to allow passing in "too much" `exog` data, so that we don't have to worry about passing new exog back in when forecasting.
I find it so cumbersome to have to split my exog so that it fits into the model, and then try and figure out how to grab just the part I need for forecasting.
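
To make the exog bookkeeping above concrete, here is a minimal sketch of the manual split that currently has to be done by hand. The helper name `split_exog` and the toy data are hypothetical and not part of statsmodels; the assumption is that the in-sample part would go to the model constructor and the remainder to the `exog` argument at forecast time.

```python
def split_exog(exog, nobs, steps):
    """Split an oversized exog sequence into (in-sample, forecast) parts.

    exog  : sequence of rows covering the estimation sample AND future periods
    nobs  : number of observations used for estimation
    steps : forecast horizon
    (Hypothetical helper, for illustration only.)
    """
    if nobs + steps > len(exog):
        raise ValueError("exog is too short for the requested horizon")
    # Rows [0, nobs) line up with endog; rows [nobs, nobs + steps) are for forecasting.
    return exog[:nobs], exog[nobs:nobs + steps]


# Toy data: 8 periods of exog are known, but only 5 endog values are observed.
exog_all = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
endog = [1.0, 1.2, 1.1, 1.4, 1.3]

exog_fit, exog_fcast = split_exog(exog_all, nobs=len(endog), steps=2)
```

If models accepted "too much" `exog` directly, this slicing could presumably happen inside the model instead of in user code.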

I have created a project for this topic https://github.com/statsmodels/statsmodels/projects/3 (hopefully helpful for organization, although I still don't quite understand how projects work).

I'm hoping that as people run across things that may work but not very "nicely", or if there are common tasks that are annoying to perform, that we could collect those things and hopefully improve some of them.

Chad

Cross-validation for time series models would be a very welcome addition. IIUC, it may require in-depth knowledge of the model structure, so it may be difficult to implement generically, outside of statsmodels?

-Dave
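
For reference, a generic version that treats the model as a black box could look roughly like the sketch below: an expanding training window, a refit at every split, and a pooled out-of-sample RMSE. All names here are hypothetical, and the "model" is just a naive last-value forecaster standing in for a real fit. Refitting from scratch avoids needing any knowledge of the model structure, at the cost of being slow.

```python
import math


def expanding_window_splits(nobs, min_train, horizon=1):
    """Yield (train_end, test_indices): train on [0, train_end), test on the next `horizon` points."""
    for train_end in range(min_train, nobs - horizon + 1):
        yield train_end, list(range(train_end, train_end + horizon))


def naive_forecast(train, horizon):
    """Placeholder forecaster: repeat the last observed value `horizon` times."""
    return [train[-1]] * horizon


def out_of_sample_rmse(y, forecaster, min_train, horizon=1):
    """Pool squared forecast errors over all expanding-window splits and return the RMSE."""
    sq_errors = []
    for train_end, test_idx in expanding_window_splits(len(y), min_train, horizon):
        preds = forecaster(y[:train_end], horizon)  # "refit" using only past data
        sq_errors.extend((y[i] - p) ** 2 for i, p in zip(test_idx, preds))
    return math.sqrt(sum(sq_errors) / len(sq_errors))


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rmse = out_of_sample_rmse(y, naive_forecast, min_train=3)
```

A model-aware implementation could avoid the full refit at each split, which is where the in-depth knowledge of the model structure would come in.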

On Friday, January 6, 2017 at 6:06:33 PM UTC-8, Chad Fulton wrote:

> I'm going to start working towards improving the usability / polish of some of the TSA package (at least starting with state space models where I know the code well, but maybe others too). If there are things that people know that don't work, let me know.
>
> Although little things are also important, here are two examples of bigger things I recently was reminded of when I was doing some forecasting with tsa models.
>
> - Model selection (esp. information criteria, maybe including pseudo-out-of-sample RMSE or something).

What do you think about a mixin class for the information criteria?

1) cut boilerplate code that shows up in a bunch of different places,
2) some classes calculate AIC etc using formulas equivalent to those from tools.eval_measures; others use formulas equivalent to tsa.vector_ar.var_model. It would be nice to have them in one place to clarify when each is appropriate.
3) Similar situation with resid vs wresid etc.
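
A rough sketch of what such a mixin might look like, using the standard likelihood-based definitions of AIC, BIC and HQIC. The class names and the assumption that the host class exposes `llf`, `nobs` and `df_modelwc` are hypothetical, not an existing statsmodels interface. The VAR-style criteria would need a separate variant, which is not shown here.

```python
import math


class InfoCriteriaMixin:
    """Hypothetical mixin. The host class is assumed to define:
    llf        : log-likelihood at the estimated parameters
    nobs       : number of observations
    df_modelwc : number of estimated parameters, including the constant
    """

    @property
    def aic(self):
        # Akaike: -2 log L + 2 k
        return -2.0 * self.llf + 2.0 * self.df_modelwc

    @property
    def bic(self):
        # Schwarz / Bayesian: -2 log L + k log(n)
        return -2.0 * self.llf + math.log(self.nobs) * self.df_modelwc

    @property
    def hqic(self):
        # Hannan-Quinn: -2 log L + 2 k log(log(n))
        return -2.0 * self.llf + 2.0 * math.log(math.log(self.nobs)) * self.df_modelwc


class ToyResults(InfoCriteriaMixin):
    """Stand-in results class, for illustration only."""

    def __init__(self, llf, nobs, df_modelwc):
        self.llf = llf
        self.nobs = nobs
        self.df_modelwc = df_modelwc


res = ToyResults(llf=-100.0, nobs=50, df_modelwc=3)
```

With something like this, a results class would only need to supply the three attributes, and the docstrings would be the one place to document which formula applies.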

> - Dates for `fit` vs dates of given data: I wonder if it would pay to be more flexible with the data for tsa models. For example, we may want to fit the model on a subset of the available date range (for example for pseudo out-of-sample forecasting), or we may want to allow passing in "too much" `exog` data, so that we don't have to worry about passing new exog back in when forecasting.

A few days ago when plotting a forecast from a VAR I noticed that the labels on the X-axis were not date-like. I'll look into this sometime in the next month or so; working with matplotlib always feels like pulling teeth.

> I find it so cumbersome to have to split my exog so that it fits into the model, and then try and figure out how to grab just the part I need for forecasting.
>
> I have created a project for this topic https://github.com/statsmodels/statsmodels/projects/3 (hopefully helpful for organization, although I still don't quite understand how projects work).
>
> I'm hoping that as people run across things that may work but not very "nicely", or if there are common tasks that are annoying to perform, that we could collect those things and hopefully improve some of them.

Any preference between bringing these up here vs. Issues vs the project page?