Convergence Criteria


jordan....@gmail.com
May 11, 2021, 1:30:35 PM
to pystatsmodels
Hello,

At my place of employment, we're trying to put a piece of proprietary software to bed and just use statsmodels.

Between the two (Willis Towers Watson's software and statsmodels), using the same model (a Poisson GLM) and the same data, we're getting convergence warnings from statsmodels.

I noticed the following on the documentation page:

"ncomplete convergence in maximum likelihood estimation

In some cases, the maximum likelihood estimator might not exist, parameters might be infinite or not unique (e.g. (quasi-)separation in models with binary endogenous variable). Under the default settings, statsmodels will print a warning if the optimization algorithm stops without reaching convergence. However, it is important to know that the convergence criteria may sometimes falsely indicate convergence (e.g. if the value of the objective function converged but not the parameters). In general, a user needs to verify convergence."

How does one check for convergence?  What is driving whether a model converges or not?


Peter Quackenbush
May 11, 2021, 2:09:46 PM
to pystat...@googlegroups.com
There is an iterative process.

For IRLS fits, you can check either that the deviance converges or that the params converge.  Note that I think the default convergence criterion assumes a smaller dataset; I'm not sure how big yours is.


I'd recommend trying method='newton' or method='lbfgs' along with optim_hessian='eim'. Experiment with the options; it depends on the data.

optim_hessian='oim' assumes the data follow the selected family. Crazy corner cases might hurt you.
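
Roughly something like this (an untested sketch with made-up toy data; swap in your real endog/exog):

```python
import numpy as np
import statsmodels.api as sm

# toy stand-in for your data -- replace with your own endog/exog
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(1000, 3)))
y = rng.poisson(np.exp(X @ np.array([0.5, 0.2, -0.1, 0.3])))

model = sm.GLM(y, X, family=sm.families.Poisson())

res_irls = model.fit()  # default IRLS

# gradient-based fits, as suggested above
res_newton = model.fit(method='newton', optim_hessian='eim')
res_lbfgs = model.fit(method='lbfgs', optim_hessian='eim')

# compare the parameter estimates across methods
print(res_irls.params, res_newton.params, res_lbfgs.params, sep='\n')
```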


josef...@gmail.com
May 11, 2021, 3:07:34 PM
to pystatsmodels

On May 11, 2021, at 12:30 PM, jordan....@gmail.com <jordan....@gmail.com> wrote:

Hello,

At my place of employment, we're trying to put a piece of proprietary software to bed and just use statsmodels.

Between the two (Willis Towers Watson's software and statsmodels), using the same model (a Poisson GLM) and the same data, we're getting convergence warnings from statsmodels.

Which optimization `method` are you using?

AFAICS, irls doesn't print a warning
 

I noticed the following on the documentation page:

"ncomplete convergence in maximum likelihood estimation

In some cases, the maximum likelihood estimator might not exist, parameters might be infinite or not unique (e.g. (quasi-)separation in models with binary endogenous variable). Under the default settings, statsmodels will print a warning if the optimization algorithm stops without reaching convergence. However, it is important to know that the convergence criteria may sometimes falsely indicate convergence (e.g. if the value of the objective function converged but not the parameters). In general, a user needs to verify convergence."

How does one check for convergence?  What is driving whether a model converges or not?

There should be a `converged` boolean attribute in the results instance.

Also, the results instance has `mle_retvals` when fitting with scipy optimizers, but it looks like we haven't added it to irls fits.
For IRLS, there is a `fit_history` attribute; non-convergence can only occur when maxiter is reached, i.e. when the number of `iteration`s equals maxiter.
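
Roughly, something like this (a sketch with made-up data; whether `converged` and `mle_retvals` are present depends on the fit method and the statsmodels version):

```python
import numpy as np
import statsmodels.api as sm

# toy data as a placeholder for the real dataset
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(500, 2)))
y = rng.poisson(np.exp(X @ np.array([0.1, 0.3, -0.2])))
model = sm.GLM(y, X, family=sm.families.Poisson())

res_irls = model.fit()  # default IRLS
print(getattr(res_irls, 'converged', 'no `converged` attribute'))

# IRLS records a fit_history; hitting maxiter (default 100) means non-convergence
hist = getattr(res_irls, 'fit_history', {})
print(hist.get('iteration'), hist.get('deviance', [])[-3:])

# scipy-optimizer fits carry the optimizer output in mle_retvals
res_lbfgs = model.fit(method='lbfgs')
print(getattr(res_lbfgs, 'mle_retvals', 'no mle_retvals'))
```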

For nice datasets, convergence is usually fast. 
There are many possible problems with messy datasets that can prevent convergence. 
Our defaults don't handle very difficult cases very well; in those cases, changing options or repeated fits with different starting values are needed.
(e.g. maxiter could be too low or convergence tolerance too tight.)
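
For example (continuing the sketch above, where `model` is the GLM instance; the numbers are just illustrative):

```python
# more iterations and a looser tolerance than the defaults (maxiter=100, tol=1e-8)
res1 = model.fit(maxiter=300, tol=1e-6)

# or restart from the previous estimates as starting values
res2 = model.fit(start_params=res1.params, maxiter=300)
```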

Some models also break down if the model is not appropriate for the data at all, but that shouldn't be the case with GLM or Poisson.
(e.g. Negative Binomial assumes overdispersion relative to Poisson and can break down if the data has underdispersion.)

Josef



Jordan Howell
May 11, 2021, 3:20:15 PM
to pystat...@googlegroups.com
I'm not picking the method, so I guess it's whatever the default is.

Jordan

josef...@gmail.com
May 11, 2021, 4:30:33 PM
to pystatsmodels
On Tue, May 11, 2021 at 3:20 PM Jordan Howell <jordan....@gmail.com> wrote:
I'm not picking the method, so I guess it's whatever the default is.

Are you really using GLM, and not discrete Poisson?

I cannot find a convergence warning in the code for the default irls GLM.fit.

Josef

 