Weights and Offsets do not seem to move the model results


jordan....@gmail.com

unread,
Sep 21, 2021, 2:19:44 PM
to pystatsmodels
Hello, 

I have run a Tweedie multivariate GLM three times: once without weights or offsets, once with weights only, and once with offsets only.  The results are exactly the same every time.

Is that normal?

josef...@gmail.com

unread,
Sep 21, 2021, 2:45:50 PM
to pystatsmodels
It's not normal in the general case.
Any non-zero offset should at least change the constant in params.

Which weights: var_weights or freq_weights?

If they are non-constant, then params should change.

Prediction might not change much, e.g. if the model adjusts to compensate for the offset.

 


Jordan Howell

unread,
Sep 21, 2021, 3:02:56 PM
to pystat...@googlegroups.com
I just used "weights".  But it should probably be freq_weights. 



--
Respectfully,

Jordan Howell
253-266-8088

josef...@gmail.com

unread,
Sep 21, 2021, 3:03:41 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 2:45 PM <josef...@gmail.com> wrote:


On Tue, Sep 21, 2021 at 2:19 PM jordan....@gmail.com <jordan....@gmail.com> wrote:
Hello, 

I have run a Tweedie multivariate GLM three times: once without weights or offsets, once with weights only, and once with offsets only.  The results are exactly the same every time.

Is that normal?

It's not normal in the general case.
Any non-zero offset should at least change the constant in params.

Which weights: var_weights or freq_weights?

If they are non-constant, then params should change.

Prediction might not change much, e.g. if the model adjusts to compensate for the offset.

Most of GLM, and essentially all estimation with IRLS, is family independent.
It's unlikely that there is a bug in regular cases.

So most likely it's something specific to your data or to what you are doing.
There can always be problems with corner cases.
Did your fit converge?

Jordan Howell

unread,
Sep 21, 2021, 3:04:10 PM
to pystat...@googlegroups.com
Yep. That was the issue.

josef...@gmail.com

unread,
Sep 21, 2021, 3:05:53 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 3:02 PM Jordan Howell <jordan....@gmail.com> wrote:
I just used "weights".  But it should probably be freq_weights. 

weights will just be swallowed by **kwargs and not do anything.

We still don't have a proper check in the models for which kwargs are allowed.


 

jordan....@gmail.com

unread,
Sep 21, 2021, 3:18:08 PM
to pystatsmodels
Is using the offset argument the same as adding the offset column into the formula?  So instead of:

target ~ var1 * var2 * var3

I do:

target ~ var1 * var2 * var3 + offset_factor

josef...@gmail.com

unread,
Sep 21, 2021, 3:23:07 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 3:18 PM jordan....@gmail.com <jordan....@gmail.com> wrote:
Is using the offset argument the same as adding the offset column into the formula?  So instead of:

target ~ var1 * var2 * var3

I do:

target ~ var1 * var2 * var3 + offset_factor

You need to use GLM(..., offset=my_offset_factor); then it will be included as in your second expression.

If you add it to the design matrix, exog, then its coefficient will not be fixed to 1.
It would get an estimated parameter just like all other explanatory variables.



 

Jordan Howell

unread,
Sep 21, 2021, 3:28:41 PM
to pystat...@googlegroups.com
OK.  I'm getting the exact same coefficients whether I include the offset or take it out.  The offset is derived from multiple coefficients, multiplied together, from a previous model.  I'm not sure why it's not changing the resulting parameters.

josef...@gmail.com

unread,
Sep 21, 2021, 3:39:59 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 3:28 PM Jordan Howell <jordan....@gmail.com> wrote:
OK.  I'm getting the exact same coefficients whether I include the offset or take it out.  The offset is derived from multiple coefficients, multiplied together, from a previous model.  I'm not sure why it's not changing the resulting parameters.

Is it close to perfectly collinear?
E.g. run OLS(offset_factor, exog_in_glm)
and see whether the R-squared is close to 1 and the residual scale is close to zero.
Close to perfect collinearity could be a reason that it doesn't have any effect.
With perfect collinearity, the algorithm will find an "arbitrary" solution, where "arbitrary" is defined by `pinv`.

And to check that you are using offset correctly:

offset2 = offset_factor + s * np.random.randn(len(offset_factor))

use `s` large enough compared to the magnitude of values in offset_factor.



 

josef...@gmail.com

unread,
Sep 21, 2021, 3:46:11 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 3:39 PM <josef...@gmail.com> wrote:


On Tue, Sep 21, 2021 at 3:28 PM Jordan Howell <jordan....@gmail.com> wrote:
OK.  I'm getting the exact same coefficients whether I include the offset or take it out.  The offset is derived from multiple coefficients, multiplied together, from a previous model.  I'm not sure why it's not changing the resulting parameters.

Is it close to perfectly collinear?
E.g. run OLS(offset_factor, exog_in_glm)
and see whether the R-squared is close to 1 and the residual scale is close to zero.
Close to perfect collinearity could be a reason that it doesn't have any effect.
With perfect collinearity, the algorithm will find an "arbitrary" solution, where "arbitrary" is defined by `pinv`.

Does the previous model use the same explanatory variables as the current model, or a subset of them?
Then any linear combination would have to be perfectly collinear.

You need to have at least one extra variable in the previous model.
(No formal proof, but by analogy to similar two-stage estimation problems.)

josef...@gmail.com

unread,
Sep 26, 2021, 3:43:49 PM
to pystatsmodels
On Tue, Sep 21, 2021 at 3:05 PM <josef...@gmail.com> wrote:


On Tue, Sep 21, 2021 at 3:02 PM Jordan Howell <jordan....@gmail.com> wrote:
I just used "weights".  But it should probably be freq_weights. 

weights will just be swallowed by **kwargs and not do anything.

We still don't have a proper check in the models for which kwargs are allowed.

I added the invalid kwarg check to several models.
This will issue a ValueWarning in the upcoming release 0.13.

It's a bit tricky because classes in the hierarchy use different valid kwargs.

Josef