new category "other" MLE models, similar to gamlss


josef...@gmail.com

Jan 4, 2021, 2:57:27 PM

to pystatsmodels

I have problems coming up with a folder name and location for maximum
likelihood models based on a variety of distributions.

The structure of the models will be similar to `discrete`, whose models are
all based on explicitly coded likelihoods.
However, `discrete` has "discrete" in its name, so adding other models there,
for example for continuous outcomes on the (0, 1) interval or on the positive
real line, would make the name a misnomer.

Some non-discrete models that I would like to have in statsmodels are

- those used for survival and lifetime applications, at first without censoring,
- flexible models based on 3- or 4-parameter distributions for when we
want more control over skew and kurtosis,
- mixed discrete and continuous distributions, like zero-inflated beta
distributions.
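
To make the last item concrete, here is a minimal sketch of an explicitly
coded log-likelihood for a zero-inflated beta distribution, using only the
standard library. The function names and the parameterization (`prob_zero`
for the point mass at zero, `a` and `b` for the beta shape parameters) are
assumptions for illustration, not an existing statsmodels API.

```python
import math


def beta_logpdf(y, a, b):
    """Log density of a Beta(a, b) distribution at y in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(y) + (b - 1) * math.log1p(-y)


def zi_beta_loglikeobs(y, prob_zero, a, b):
    """Log-likelihood of one observation from a zero-inflated beta.

    The distribution mixes a point mass at zero (probability prob_zero)
    with a continuous Beta(a, b) density on (0, 1).
    """
    if y == 0:
        return math.log(prob_zero)
    return math.log1p(-prob_zero) + beta_logpdf(y, a, b)


def zi_beta_loglike(sample, prob_zero, a, b):
    """Log-likelihood of an i.i.d. sample (hypothetical helper)."""
    return sum(zi_beta_loglikeobs(y, prob_zero, a, b) for y in sample)


# Example: three observations, one of them an exact zero.
llf = zi_beta_loglike([0, 0.2, 0.7], prob_zero=0.25, a=2.0, b=3.0)
```

A regression version would replace the constant parameters by functions of
explanatory variables through link functions; that part is left out here.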

We also currently have `statsmodels.miscmodels`, but that is mainly for
prototype models and as a testing ground for `GenericLikelihoodModel`.

I didn't manage to come up with a good directory name, only something
like `othermod`.

Any better ideas?

Aside
I'm currently trying to go in two opposite directions

- allow inference and related methods under misspecification: "all models
are wrong", but we can still use misspecification-robust methods
- add more support for modelling the "right" distribution for when we
are interested in more than mean parameters and robust inference
https://github.com/statsmodels/statsmodels/issues/7142
This started with count models, for which we have a reasonably good
selection (although we are still missing several basic models).

And as a link between the two: specification tests and diagnostics to
figure out how wrong our model is and in which direction.

Josef

josef...@gmail.com

Jun 15, 2021, 5:02:13 PM

to pystatsmodels

In another detour, I added count distributions based on discretizing a continuous distribution.

This allows us to go into flexible discrete distributions with stronger underdispersion or heavier tails than Poisson and our other pure count distributions.
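
As a rough illustration of the discretization idea (not the code in the pull request), a count variable can be defined as Y = floor(X) for a continuous X on the positive real line, so that P(Y = k) = S(k) - S(k + 1), where S is the survival function of X. The sketch below assumes a Weibull distribution for X, which is my choice for the example; the function names are hypothetical.

```python
import math


def weibull_sf(x, shape, scale):
    """Survival function of a continuous Weibull distribution, x >= 0."""
    return math.exp(-((x / scale) ** shape))


def discretized_pmf(k, shape, scale):
    """P(Y = k) = S(k) - S(k + 1) for counts k = 0, 1, 2, ..."""
    return weibull_sf(k, shape, scale) - weibull_sf(k + 1, shape, scale)


def mean_and_variance(shape, scale, upper=2000):
    """Mean and variance by direct summation, truncated at `upper`."""
    probs = [discretized_pmf(k, shape, scale) for k in range(upper)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


# A large shape parameter gives variance below the mean (underdispersion),
# a shape parameter below one gives a heavy right tail (overdispersion).
under_mean, under_var = mean_and_variance(shape=5.0, scale=10.0)
over_mean, over_var = mean_and_variance(shape=0.7, scale=5.0)
print(under_mean, under_var)
print(over_mean, over_var)
```

A Poisson distribution would have variance equal to the mean in both cases, so the two parameter settings show dispersion on either side of it within one family.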

Generalized Poisson allows some underdispersion, but not too much, and lacks flexibility.

https://github.com/statsmodels/statsmodels/pull/7488

https://gist.github.com/josef-pkt/c94158a8fd0c7b60670c0a8c03f69419

I don't have regression models for this yet, only fitting distribution parameters without explanatory variables.

We are still missing a setup for multiple links for model or distribution parameters.

These models are good for fitting quantiles and predictive distributions, but they don't have a mean parameterization and usually do not have closed-form expressions for moments.
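
To illustrate that trade-off, here is a small sketch, again assuming a discretized Weibull with Y = floor(X) and hypothetical function names. Quantiles follow directly from the continuous cdf, while the mean has to be computed numerically, here from the identity E[Y] = sum over k >= 1 of P(Y >= k).

```python
import math


def discretized_cdf(k, shape, scale):
    """P(Y <= k) = F(k + 1), with F the continuous Weibull cdf."""
    return -math.expm1(-(((k + 1) / scale) ** shape))


def discretized_quantile(q, shape, scale):
    """Smallest count k with P(Y <= k) >= q, for 0 < q < 1."""
    # Continuous Weibull quantile, then map to the count below it.
    x = scale * (-math.log1p(-q)) ** (1.0 / shape)
    k = max(math.ceil(x) - 1, 0)
    # Guard against floating point rounding at integer boundaries.
    while k > 0 and discretized_cdf(k - 1, shape, scale) >= q:
        k -= 1
    while discretized_cdf(k, shape, scale) < q:
        k += 1
    return k


def discretized_mean(shape, scale, tol=1e-12, max_terms=100000):
    """E[Y] = sum_{k >= 1} S(k), summed until the terms drop below tol."""
    total = 0.0
    for k in range(1, max_terms):
        term = math.exp(-((k / scale) ** shape))
        total += term
        if term < tol:
            break
    return total


median = discretized_quantile(0.5, shape=1.0, scale=2.0)
mean = discretized_mean(shape=1.0, scale=2.0)
print(median, mean)
```

With shape equal to one the discretized Weibull reduces to a geometric distribution, which does have a closed-form mean, 1 / (exp(1 / scale) - 1), so that special case can serve as a check on the numerical sum.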