Hello,
I've fit a model with offsets from a different model like so:
offset_formula = "cm_pure_premium ~ new_auto_m_score - 1"
y, x = patsy.dmatrices(offset_formula, df_d1,
                       return_type='matrix')
weight_factor = np.array(df_d1['comp_eu'])
offset_factor = np.array(df_d1['offset_factor'])
model_d_m = sm.GLM(y, x, family=sm.families.Poisson(),
                   freq_weights=weight_factor,
                   offset=offset_factor).fit(scale="x2")
I've tested this and the offset is working correctly.
When I run 'model_d_m.predict(x)', can anyone confirm whether the model includes the offset in the prediction? Or is the offset only considered in the fit?
It's a bit tricky, and we had some bugs in this.
If exog x is not specified in predict, then all model arrays (exog, offset, exposure, ...) are used.
If exog x in predict is user provided, then
- if offset is also provided, then it is used (similar for other extra arrays in different models/families)
- if offset is NOT provided, then the default is 0.
So
'model_d_m.predict(x)' will not use the offset (it will set offset=0).
'model_d_m.predict(x_predict, offset=offset_predict)' will use offset_predict as the offset.
'model_d_m.predict()' uses the in-sample arrays for exog and offset from the model.