towards second moments


josef...@gmail.com

Apr 21, 2016, 9:50:11 PM
to pystatsmodels
I like big generic topics.

Starting with weights and Tweedie for GLM I looked for some time into
various approaches for modeling variance functions,
heteroscedasticity, dispersion in GLM/LEF and related models.

Tweedie has a messy likelihood function (it's compound Poisson with an
infinite sum). However, for a given dispersion parameter, the mean
function is just a GLM where only the mean and the variance as a
function of the mean are relevant.
So it's easy to estimate.
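
To make the "easy to estimate" part concrete, here is the quasi-score equation for the mean parameters. The notation is assumed for illustration and is not from any particular reference: h is the inverse link, p is the Tweedie variance power treated as given, and phi is a constant dispersion.

```latex
\sum_{i=1}^{n} \frac{\partial \mu_i}{\partial \beta}\,
  \frac{y_i - \mu_i}{\phi\,\mu_i^{p}} = 0,
\qquad
\mu_i = h(x_i^\top \beta), \quad
\operatorname{Var}(y_i) = \phi\,\mu_i^{p}
```

A constant phi cancels from this equation, so the estimate of beta needs only the mean and the variance function, not the compound Poisson likelihood.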

There are various approaches to estimating the dispersion parameters,
but the full MLE especially sounded much too expensive to me, and
approximations to the loglikelihood function are sometimes shaky.

So the easiest or cheapest solution seemed to be to use a moment
estimator, which is referred to as the Pearson function estimator in
some of the literature. It's analogous to, but more complicated than,
the simple Pearson dispersion estimator.
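
For reference, the simple Pearson dispersion estimator can be sketched as below. This is a minimal illustration, not statsmodels code: the function name `pearson_dispersion` is hypothetical, the variance function is assumed to be of the Tweedie form phi * mu**var_power, and `var_power` is treated as known.

```python
import numpy as np


def pearson_dispersion(y, mu, n_params, var_power=1.5):
    """Pearson moment estimator of a constant dispersion phi.

    Assumes Var(y_i) = phi * mu_i ** var_power, with var_power known.
    mu holds the fitted means and n_params is the number of mean
    parameters, used for the degrees-of-freedom correction.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Pearson chi-square: squared residuals scaled by the variance function.
    pearson_chi2 = np.sum((y - mu) ** 2 / mu ** var_power)
    return pearson_chi2 / (len(y) - n_params)
```

The more complicated version would add moment conditions for further variance parameters instead of solving for a single constant.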

One approach in the literature is to use double exponential models for
similar modeling of the variance or dispersion function, using
GLM-Gamma to essentially solve the weighted moment condition for the
dispersion parameters.
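
A rough sketch of that Gamma step, under stated assumptions: the fitted means `mu` come from a separate mean model, the dispersion follows phi_i = exp(Z_i @ gamma), and the variance function is mu**var_power. The function name `fit_dispersion_function` is hypothetical, and the fit is plain numpy IRLS rather than a statsmodels call.

```python
import numpy as np


def fit_dispersion_function(y, mu, Z, var_power=1.5, max_iter=100, tol=1e-10):
    """Fit log(phi_i) = Z_i @ gamma by a Gamma GLM with log link.

    The response is the squared Pearson residual
    d_i = (y_i - mu_i)**2 / mu_i**var_power, whose expectation is phi_i
    if the mean model is correct. Assumes the first column of Z is a
    constant.
    """
    y, mu, Z = (np.asarray(a, dtype=float) for a in (y, mu, Z))
    d = (y - mu) ** 2 / mu ** var_power
    gamma = np.zeros(Z.shape[1])
    gamma[0] = np.log(d.mean())  # start from a constant dispersion
    for _ in range(max_iter):
        eta = Z @ gamma
        phi = np.exp(eta)
        # Gamma variance phi**2 with a log link gives constant IRLS
        # weights, so each step is an unweighted least squares fit of
        # the working response on Z.
        working = eta + (d - phi) / phi
        gamma_new = np.linalg.lstsq(Z, working, rcond=None)[0]
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    return gamma
```

In a full two-part procedure the mean model would then be refit with weights 1 / phi_i, alternating between the two steps until both stop changing.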


This is essentially extending feasible generalized least squares with
heteroscedasticity (GLSHet) to the exponential family and GLM.

GEE does something a bit similar but on the correlation and not the
variance itself.

And if we reduce the estimators to their bare minimum, then they are
just GMM estimators. Looking at it as GMM has the advantage that we
can combine moment conditions for inference if needed, which is not so
easy in general two-equation models without a full likelihood function.
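
One way to write the combined moment conditions, as a sketch in assumed notation (h is the inverse link of the mean model, V the variance function, z_i the regressors of a log-linear dispersion function):

```latex
g_i(\beta, \gamma) =
\begin{pmatrix}
  x_i \, h'(x_i^\top \beta)\, \dfrac{y_i - \mu_i}{\phi_i \, V(\mu_i)} \\[2ex]
  z_i \, \dfrac{d_i - \phi_i}{\phi_i}
\end{pmatrix},
\qquad
d_i = \frac{(y_i - \mu_i)^2}{V(\mu_i)}, \quad
\phi_i = \exp(z_i^\top \gamma), \quad
\mathrm{E}\left[ g_i(\beta_0, \gamma_0) \right] = 0
```

The system is exactly identified, so GMM reduces to solving the sample moment equations, and a joint sandwich covariance for (beta, gamma) follows from the stacked g_i.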


The simplest next step seems to be to implement double exponential
models, and normal MLE with a heteroscedasticity function.

It will also require finally deciding how to handle
multi-part/multi-equation models on top of existing models.


One possible use case: inference on rates and proportions if
dispersion varies by group.
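
As a small illustration of that use case, the following sketch computes a Pearson-type dispersion separately for each group of binomial counts. The function name `groupwise_pearson_dispersion` and its arguments are hypothetical, the fitted probabilities are assumed to come from some mean model, and no degrees-of-freedom correction is applied.

```python
import numpy as np


def groupwise_pearson_dispersion(successes, trials, prob, groups):
    """Mean squared Pearson residual per group for binomial counts.

    prob holds fitted success probabilities from any mean model.
    A value near 1 is consistent with binomial variance; values above 1
    indicate overdispersion in that group.
    """
    successes = np.asarray(successes, dtype=float)
    trials = np.asarray(trials, dtype=float)
    prob = np.asarray(prob, dtype=float)
    groups = np.asarray(groups)
    mean = trials * prob
    var = trials * prob * (1.0 - prob)  # binomial variance function
    d = (successes - mean) ** 2 / var
    return {g.item(): d[groups == g].mean() for g in np.unique(groups)}
```

Inference on the proportions could then scale the standard errors in each group by the square root of that group's dispersion.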


The main idea for now is to extend approaches that are pretty common
for linear models to GLM/LEF:

- specification tests: to check if we have strong violations of the
underlying assumptions
- robust covariance matrices, robust estimation: use estimators and
inference that are valid even if some basic assumptions are violated,
or, more precisely, where we don't impose those assumptions.
- model deviations from the simple models (explicitly model
correlation and dispersion in the current case)


Josef