Convergence Criteria


jordan....@gmail.com
May 11, 2021, 1:30:35 PM
to pystatsmodels
Hello,

At my place of employment, we're trying to put a piece of proprietary software to bed and just use statsmodels.

Between the two (Willis Towers Watson's software and statsmodels), using the same model (a Poisson GLM) and the same data, we're getting convergence warnings from statsmodels.

I noticed the following on the documentation page:

"ncomplete convergence in maximum likelihood estimation

In some cases, the maximum likelihood estimator might not exist, parameters might be infinite or not unique (e.g. (quasi-)separation in models with binary endogenous variable). Under the default settings, statsmodels will print a warning if the optimization algorithm stops without reaching convergence. However, it is important to know that the convergence criteria may sometimes falsely indicate convergence (e.g. if the value of the objective function converged but not the parameters). In general, a user needs to verify convergence."

How does one check for convergence?  What is driving whether a model converges or not?


Peter Quackenbush
May 11, 2021, 2:09:46 PM
to pystat...@googlegroups.com
There is an iterative process.

For IRLS fits, you can check either that the deviance converges or that the params converge.  Note that I think the default convergence criterion assumes a smaller dataset; I'm not sure how big yours is.


I'd recommend trying method='newton' or method='lbfgs' along with optim_hessian='eim'. Experiment with the options; it depends on the data.

optim_hessian='oim' assumes the data follow the selected family. Crazy corner cases might hurt you.
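
Roughly something like this (an untested sketch with made-up toy data; swap in your real endog/exog):

```python
import numpy as np
import statsmodels.api as sm

# toy stand-in for your data -- replace with your own endog/exog
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(1000, 3)))
y = rng.poisson(np.exp(X @ np.array([0.5, 0.2, -0.1, 0.3])))

model = sm.GLM(y, X, family=sm.families.Poisson())

res_irls = model.fit()  # default IRLS

# gradient-based fits, as suggested above
res_newton = model.fit(method='newton', optim_hessian='eim')
res_lbfgs = model.fit(method='lbfgs', optim_hessian='eim')

# compare the parameter estimates across methods
print(res_irls.params, res_newton.params, res_lbfgs.params, sep='\n')
```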


josef...@gmail.com
May 11, 2021, 3:07:34 PM
to pystatsmodels

On May 11, 2021, at 12:30 PM, jordan....@gmail.com <jordan....@gmail.com> wrote:

Hello,

At my place of employment, we're trying to put a piece of proprietary software to bed and just use statsmodels.

Between the two (Willis Towers Watson's software and statsmodels), using the same model (a Poisson GLM) and the same data, we're getting convergence warnings from statsmodels.

Which optimization `method` are you using?

AFAICS, irls doesn't print a warning
 

I noticed the following on the documentation page:

"ncomplete convergence in maximum likelihood estimation

In some cases, the maximum likelihood estimator might not exist, parameters might be infinite or not unique (e.g. (quasi-)separation in models with binary endogenous variable). Under the default settings, statsmodels will print a warning if the optimization algorithm stops without reaching convergence. However, it is important to know that the convergence criteria may sometimes falsely indicate convergence (e.g. if the value of the objective function converged but not the parameters). In general, a user needs to verify convergence."

How does one check for convergence?  What is driving whether a model converges or not?

There should be a `converged` boolean attribute in the results instance.

Also, the results instance has `mle_retvals` when fitting with scipy optimizers, but it looks like we haven't added it to irls fits.
For IRLS, there is a `fit_history` attribute; non-convergence can only occur when maxiter is reached, i.e. when the number of `iteration`s equals maxiter.
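
Roughly, something like this (a sketch with made-up data; whether `converged` and `mle_retvals` are present depends on the fit method and the statsmodels version):

```python
import numpy as np
import statsmodels.api as sm

# toy data as a placeholder for the real dataset
rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(500, 2)))
y = rng.poisson(np.exp(X @ np.array([0.1, 0.3, -0.2])))
model = sm.GLM(y, X, family=sm.families.Poisson())

res_irls = model.fit()  # default IRLS
print(getattr(res_irls, 'converged', 'no `converged` attribute'))

# IRLS records a fit_history; hitting maxiter (default 100) means non-convergence
hist = getattr(res_irls, 'fit_history', {})
print(hist.get('iteration'), hist.get('deviance', [])[-3:])

# scipy-optimizer fits carry the optimizer output in mle_retvals
res_lbfgs = model.fit(method='lbfgs')
print(getattr(res_lbfgs, 'mle_retvals', 'no mle_retvals'))
```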

For nice datasets, convergence is usually fast. 
There are many possible problems with messy datasets that can prevent convergence. 
Our defaults don't handle very difficult cases very well; in those cases, changing options or repeated fits with different starting values are needed.
(e.g. maxiter could be too low or convergence tolerance too tight.)
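
For example (continuing the sketch above, where `model` is the GLM instance; the numbers are just illustrative):

```python
# more iterations and a looser tolerance than the defaults (maxiter=100, tol=1e-8)
res1 = model.fit(maxiter=300, tol=1e-6)

# or restart from the previous estimates as starting values
res2 = model.fit(start_params=res1.params, maxiter=300)
```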

Some models also break down if the model is not appropriate for the data at all, but that shouldn't be the case with GLM or Poisson.
(e.g. Negative Binomial assumes overdispersion relative to Poisson and can break down if the data has underdispersion.)

Josef



Jordan Howell
May 11, 2021, 3:20:15 PM
to pystat...@googlegroups.com
I'm not picking the method, so I guess it's whatever the default is.

Jordan

josef...@gmail.com
May 11, 2021, 4:30:33 PM
to pystatsmodels
On Tue, May 11, 2021 at 3:20 PM Jordan Howell <jordan....@gmail.com> wrote:
I'm not picking the method, so I guess it's whatever the default is.

Are you really using GLM, and not discrete Poisson?

I cannot find a convergence warning in the code for the default irls GLM.fit.

Josef

 