class PoissonMixed2(MixedMixin, Poisson):
    pass

class PoissonPenalized2(PenalizedMixin, Poisson):
    pass

class PoissonPenalizedMixed(PenalizedMixin, MixedMixin, Poisson):
    pass
and we have penalized maximum likelihood for a Poisson model with cluster-specific random effects:
- Poisson provides the underlying distribution,
- MixedMixin integrates and aggregates, and
- PenalizedMixin adds a penalty term.
`PoissonPenalizedMixed` has the submodels, including plain Poisson, as special cases, so it would be the only class we really need, were it not for the complex signatures.
And in two more lines we can do the same for GLM:
class GLMMixedPenalized(PenalizedMixin, MixedMixin, GLM):
    pass
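The way these mixins compose through Python's method resolution order can be sketched with stand-in classes (an illustrative toy, not the statsmodels implementation; all names and the penalty form here are made up):

```python
# Each mixin wraps loglike() and chains to the next class via super(),
# so stacking them in the bases composes their behavior.
class Base:
    def loglike(self, params):
        # stand-in for the model's log-likelihood
        return -sum(p ** 2 for p in params)

class PenaltyMixin:
    def loglike(self, params):
        # subtract a (made-up) penalty term, then delegate down the MRO
        return super().loglike(params) - 0.5 * sum(abs(p) for p in params)

class AggregatingMixin:
    def loglike(self, params):
        # placeholder for integrating/aggregating over random effects
        return super().loglike(params)

class PenalizedAggregated(PenaltyMixin, AggregatingMixin, Base):
    pass

m = PenalizedAggregated()
print(m.loglike([1.0, 2.0]))  # -5.0 - 1.5 = -6.5
```

The call chain follows the MRO `PenaltyMixin -> AggregatingMixin -> Base`, which is exactly why the order of the bases in `PoissonPenalizedMixed` matters.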
I like it. Will the penalized mixin override `fit`, or provide a separate `fit_regularized` method?
On Fri, May 8, 2015 at 12:03 AM, Kerby Shedden <kshe...@umich.edu> wrote:

> I like it. Will the penalized mixin override `fit`, or provide a separate `fit_regularized` method?

Neither, either, or both.

We can always add extra methods that are not in the inheritance chain, either for internal use or user facing, like `fit_regularized` or `_fit_regularized`.

However, the user will always have the `fit` method available, whether inherited or modified, so it needs to work and will have to be part of the user-facing API. Right now my PenalizedMixin does neither: I just use the inherited `fit` method, which calls the standard optimizer and creates the standard model-specific results class. (To clean up my PenalizedMixin, I will have to override `fit` at least for GLM, because it will not work with method IRLS, which is still our default.)

We need to override and modify the inherited `fit` method if we don't use the default optimizers, if we want to adjust the created results instance, or if the inherited methods don't work without changes. For example, if you only want to provide a regularized fit with a special optimizer, then you could just name your fit_regularized `fit` so that it replaces the inherited one, or add a switch between optimizers (similar to GLM.fit).

As an example of inherited methods that don't work: MixedMixin adds additional parameters to `params`, and some inherited methods won't work with those.
I need to override the inherited `predict` to strip the extra parameters before the super().predict call, and I have to override `fit` because the default start_params don't include the extra parameters.

I think we should provide several user-facing `fit_xxx` methods only if they provide clearly distinct functionality; currently `fit` is unregularized, while `fit_regularized` does the penalized fit (and in discrete_models also returns a different results class). Another possible reason to provide a second official fit function is if the signature is very different.

My guess for the specific case of elastic net: if we provide special `XXXPenalized` classes, then it would be better to override `fit` and delegate to an internal `_fit_elasticnet` or `_fit_regularized` method. My main worry is about which attributes we have to attach to the model and whether they could get out of sync if we have several fit methods. It might be easier to have one main `fit` method that is in "control" overall.

But I don't have a very strong opinion about this yet. I have to go through the standard `fit` channel because I'm using the standard inherited optimizers. For elastic net, both ways can be made to work.
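The "one main fit in control" option could look roughly like this (a hypothetical sketch; `_fit_elasticnet`, the `method` switch, and the dict results are stand-ins, not statsmodels API):

```python
# A penalized subclass overrides fit() as the single entry point and
# delegates to an internal solver, falling back to the inherited
# optimizer for other methods (similar in spirit to GLM.fit's switch).
class Model:
    def fit(self, start_params=None, **kwargs):
        # stand-in for the standard inherited optimizer
        return {"method": "newton", "params": start_params or [0.0]}

class ElasticNetPenalized(Model):
    def fit(self, start_params=None, method="elastic_net", **kwargs):
        if method == "elastic_net":
            return self._fit_elasticnet(start_params, **kwargs)
        # any other method goes through the inherited fit
        return super().fit(start_params=start_params, **kwargs)

    def _fit_elasticnet(self, start_params, alpha=0.0, **kwargs):
        # placeholder for the coordinate-descent solver
        return {"method": "elastic_net", "alpha": alpha,
                "params": start_params or [0.0]}

res = ElasticNetPenalized().fit(alpha=0.5)
print(res["method"])  # elastic_net
```

With this layout there is only one place where model attributes get attached, which is the "out of sync" concern mentioned above.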
# note: repeating the same class in the bases raises
# "TypeError: duplicate base class", so each nesting level
# needs its own (trivial) subclass of MixedMixin
class MixedMixinL2(MixedMixin):
    pass

class MixedMixinL3(MixedMixin):
    pass

class GLMMixedNested3(MixedMixinL3, MixedMixinL2, MixedMixin, GLM):
    pass
When you merge your PenalizedMixin I can rebase #2385 on it.

For non-smooth penalties and coordinate-descent type algorithms, we don't want to include the non-smooth part of the penalty in like/score/hess. We use like/score/hess to obtain a quadratic approximation to the likelihood plus the smooth part of the penalty, then we use the structure of the non-smooth terms to do each one-dimensional optimization. This can still be handled within your mixin, but penal_func, score, etc. would not reflect the entire penalty.
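As a generic illustration of this split, here is plain coordinate descent for a lasso on least squares: the quadratic part stays in the objective, while the non-smooth L1 term is handled inside each one-dimensional update by soft-thresholding (a textbook sketch, not the statsmodels or PHReg code):

```python
import numpy as np

def soft_threshold(z, t):
    # exact minimizer of the 1-d quadratic-plus-absolute-value problem
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 + n*alpha*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_norm2 = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r_j = y - X @ beta + X[:, j] * beta[j]
            # quadratic approximation is exact here; the L1 part is
            # absorbed by the soft-threshold in the 1-d update
            beta[j] = soft_threshold(X[:, j] @ r_j, n * alpha) / col_norm2[j]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
y = X[:, 0] * 2.0 + rng.standard_normal(100) * 0.1
beta = lasso_cd(X, y, alpha=0.1)
```

For an actual likelihood, `like`/`score`/`hess` would supply the quadratic approximation at each outer iteration, and the inner loop would stay the same.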
Also for coordinate descent, it might be useful to have a method that "spawns" a 1-dimensional restricted model along a given coordinate. For PHReg this would make it possible to recycle a lot of the setup calculations that only depend on endog.
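In its simplest form, the "spawn a restricted model" idea just freezes all coordinates but one (names here are illustrative, not an existing API; a real version would also carry over cached setup that depends only on the data):

```python
import numpy as np

def restrict_to_coordinate(objective, params, j):
    """Return a scalar function of coordinate j with the others frozen."""
    params = np.asarray(params, dtype=float)

    def objective_1d(x):
        p = params.copy()
        p[j] = x
        return objective(p)

    return objective_1d

# usage: freeze all but coordinate 1 of a toy quadratic objective
f = lambda p: float(np.sum((p - np.array([1.0, 2.0, 3.0])) ** 2))
f1 = restrict_to_coordinate(f, [0.0, 0.0, 0.0], 1)
print(f1(2.0))  # 10.0: (0-1)^2 + (2-2)^2 + (0-3)^2
```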