Does 'predict' with a fitted GLM use the offset?


jordan....@gmail.com

Nov 17, 2021, 12:10:29 PM
to pystatsmodels
Hello,

I've fit a model with offsets from a different model like so:

import numpy as np
import patsy
import statsmodels.api as sm

offset_formula = "cm_pure_premium ~ new_auto_m_score - 1"
y, x = patsy.dmatrices(offset_formula, df_d1, return_type='matrix')

weight_factor = np.array(df_d1['comp_eu'])
offset_factor = np.array(df_d1['offset_factor'])

model_d_m = sm.GLM(y, x, family=sm.families.Poisson(),
                   freq_weights=weight_factor,
                   offset=offset_factor).fit(scale="x2")

I've tested this and the offset is working correctly.  

When I run 'model_d_m.predict(x)', can anyone confirm whether the model includes the offset in the prediction? Or is the offset only considered in the fit?

josef...@gmail.com

Nov 17, 2021, 12:45:41 PM
to pystatsmodels
It's a bit tricky, and we had some bugs in this.

If exog is not specified in predict, then all model arrays (exog, offset, exposure, ...) are used.

If exog in predict is user-provided, then
if offset is also provided, it is used (similarly for other extra arrays in different models/families);
if offset is NOT provided, the default is 0.

So
'model_d_m.predict(x)' will not use the offset (it sets offset=0).
'model_d_m.predict(x_predict, offset=offset_predict)' will use offset_predict as the offset.
'model_d_m.predict()' uses the in-sample arrays for exog and offset from the model.
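
To see why a dropped offset matters, here is a minimal sketch of the arithmetic for a log-link model such as the Poisson GLM above. It is not statsmodels code: the helper poisson_mean and all numbers are made up for illustration. With a log link the mean is exp(x . beta + offset), so leaving the offset at 0 scales each prediction by exp(-offset).

```python
import math


def poisson_mean(x, beta, offset=0.0):
    """Mean of a log-link model for one row: exp(x . beta + offset)."""
    linpred = sum(xi * bi for xi, bi in zip(x, beta)) + offset
    return math.exp(linpred)


beta = [0.5]        # hypothetical fitted coefficient
x_row = [2.0]       # hypothetical regressor value for one row
offset_row = 0.7    # hypothetical offset for that row

mu_with = poisson_mean(x_row, beta, offset=offset_row)
mu_without = poisson_mean(x_row, beta)  # offset left at its default of 0

# The two predictions differ by the factor exp(offset_row).
ratio = mu_with / mu_without
```

The same factor applies row by row, so predictions made without the offset are only correct for rows whose offset is exactly 0.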
 


josef...@gmail.com

Nov 17, 2021, 12:47:25 PM
to pystatsmodels
Best is always to verify.
E.g. these two should differ:
'model_d_m.predict(x_predict, offset=offset_predict)'
'model_d_m.predict(x_predict, offset=1 + offset_predict)'

These two should be the same if exog and offset are the model data:
'model_d_m.predict(exog[:5], offset=offset[:5])'
'model_d_m.predict()[:5]'
but they should differ from
'model_d_m.predict(exog[:5])'