Overwrite `params` to get different predictions

VincentAB

Jun 22, 2023, 5:07:35 PM

to pystatsmodels

Hi all,

I'm not sure if this is the right place to ask this newbie question. Please point me in the right direction if it isn't.

I would like to overwrite the `params` of a fitted model object in such a way that calling `res.predict()` will make different predictions, based on the arbitrary parameter values that I supplied instead of the original (estimated) ones.

Background: I want to use numerical differentiation to get derivatives of predictions (and of functions of those predictions) w.r.t. the parameters, for some Delta Method applications. I'm exploring the possibility of porting my `marginaleffects` package for R to Python and `statsmodels`: https://vincentarelbundock.github.io/marginaleffects/

Concretely, this is what I need:

# load and estimate
import copy
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
df = sm.datasets.get_rdataset("Guerry", "HistData").data
mod = smf.ols("Literacy ~ Pop1831 + Desertion", df)
res = mod.fit()

# overwrite the `params` attribute on a copy of the results object
# (a plain `res2 = res` would only bind a second name to the same object)
res2 = copy.copy(res)
res2.params = pd.Series([1., 2., 3.], index=res.params.index)

# These two commands should now make different predictions, based on their different `params`
res.predict(df.head())
res2.predict(df.head())

Thanks for your time!

Vincent

josef...@gmail.com

Jun 22, 2023, 5:49:56 PM

to pystat...@googlegroups.com

Hi Vincent,

You can use the model's own `predict` method (`model.predict`), which takes `params` as its first argument.

The other difference is that `model.predict` expects `exog` to be a NumPy array, while `results.predict` can take a pandas DataFrame, which is transformed with the formula in the same way as the training sample data.

> Background: I want to use numerical differentiation to get derivatives of predictions (and functions of) w.r.t. parameters, for some Delta Method applications. I'm exploring the possibility of porting my `marginaleffects` package for R to Python and `statsmodels`: https://vincentarelbundock.github.io/marginaleffects/

That would be great. I looked at it and similar R packages in the last year.

I did most of the background implementation already, e.g. the delta method for predictions is available through `_test_wald_nonlinear`, which can take user-provided functions.

It's currently used in `get_prediction`.
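For readers who want to see the idea in isolation, here is a generic sketch of the delta method with a numerical Jacobian. It is not the `_test_wald_nonlinear` implementation and does not use its signature; the helper names `numerical_jacobian` and `delta_method_se`, the toy logistic prediction function, and the parameter and covariance values are all assumptions made for illustration.

```python
import numpy as np

def numerical_jacobian(func, params, eps=1e-6):
    """Central-difference Jacobian of func(params), shape (n_out, n_params)."""
    params = np.asarray(params, dtype=float)
    cols = []
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = eps
        cols.append((func(params + step) - func(params - step)) / (2 * eps))
    return np.column_stack(cols)

def delta_method_se(func, params, cov_params):
    """Standard errors of func(params) from the first-order delta method."""
    J = numerical_jacobian(func, params)
    cov = J @ cov_params @ J.T  # approximate covariance of func(params)
    return np.sqrt(np.diag(cov))

# Toy inputs (hypothetical values, not estimates from any real model).
exog = np.array([[1.0, 0.5], [1.0, -1.0]])
params = np.array([0.2, 1.5])
cov_params = np.array([[0.04, 0.01], [0.01, 0.09]])

def predict_prob(p):
    # Logistic prediction, chosen because it is nonlinear in the parameters.
    return 1.0 / (1.0 + np.exp(-(exog @ p)))

se = delta_method_se(predict_prob, params, cov_params)
```

With a fitted statsmodels model, `func` could be something like `lambda p: mod.predict(p, exog)` and `cov_params` could come from `res.cov_params()`.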

I have notebooks to illustrate how to use it for computing predictive margins and marginal/partial effects (with some unit tests against get_margeff)

The two main missing pieces are:

- creating "interesting" exog, sets of explanatory variables that can be used in predict. (I showed some of my experiments with pandas in an earlier comment in the mailing list)

- figuring out terms in the formulas and their derivative, e.g. interaction effects, or polynomials and similar

https://github.com/statsmodels/statsmodels/issues/8746

https://github.com/statsmodels/statsmodels/issues/7071

The main issue for discussing the implementation is https://github.com/statsmodels/statsmodels/issues/5387

My notebooks are not published, so I have to look for them.

Cheers,

Josef

--
You received this message because you are subscribed to the Google Groups "pystatsmodels" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pystatsmodel...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pystatsmodels/1409382b-6c30-4e20-89e1-e961b191b55cn%40googlegroups.com.

josef...@gmail.com

Jun 22, 2023, 6:01:32 PM

to pystat...@googlegroups.com

> My notebooks are not published, so I have to look for them

https://gist.github.com/josef-pkt/c4f31a650210e5cdb032db9c7b487c02

https://gist.github.com/josef-pkt/c2a00519351a3fe09d3cce84a9515abb

Both notebooks are "dirty". They are just a collection of experiments to see how margins for nonlinear terms and interaction terms can be implemented based on the nonlinear delta-method covariance.

VincentAB

Jun 23, 2023, 8:40:30 AM

to pystatsmodels

Thanks Josef, this is great.

I tried it this morning and things seem to work as expected. Excellent!

If you look at the `marginaleffects` website, you'll see that things have changed a lot in the last year, and that there are now *a ton* of features. It'll take me a while to get anywhere close to parity (and I'm leaving on vacation next week). But once I have a working Python prototype I'll ping you. We can then chat to see if it makes sense to integrate it in `statsmodels` or if it would be best as a standalone product.

Cheers!

Vincent

josef...@gmail.com

Jun 23, 2023, 9:26:34 AM

to pystat...@googlegroups.com

The core computation will have to be integrated in the models.

We need the supporting model methods, e.g. derivatives https://github.com/statsmodels/statsmodels/issues/8833 (margeff ignores offset).

But marginal/partial/predictive effects are in high demand and we will need to support them directly.

That's why I was working on it during the last year.

I extended `get_prediction` for 0.14 to already support some of the computation, but I was focused mainly on discrete models.

Related issue:

margeff follows the Stata implementation.

In the "causal" analysis tradition, a computation similar to the "overall" margins is used for the average treatment effect (ATE). However, the variance computation for the ATE differs from the delta method in margeff.

(Greene versus Wooldridge in econometrics)

https://github.com/statsmodels/statsmodels/issues/8767

Essentially, `margins` assumes a parametric model, while ATE allows for non-parametric identification with heterogeneity.

This might share some of the code with margeff.

The topic is relatively new to me and I don't have a clear overview yet of what we need to do.

I will also be on vacation in July.

Josef


VincentAB

Jun 23, 2023, 9:35:44 AM

to pystatsmodels

Sounds great.

I'm not sure much needs to be changed in the models themselves. As you can see, the `marginaleffects` package for R supports 80+ different modeling packages, and I didn't have to make any changes to those model fitting functions (maintained by different developers). The only thing that mattered was that their `predict()` methods supported the required options (e.g., offsets).

I've become pretty familiar with this area, and have dealt with a lot of user feedback, so I feel like I have a good sense of what (many) users are asking for. Will show you a prototype when it's ready.

Vincent
