"SVD did not converge" for GLM - no NaNs


R S

Jan 14, 2015, 4:45:01 PM1/14/15
to pystat...@googlegroups.com
Hey,

I'm experiencing the following issue:

In [552]: glm_binom = sm.GLM(endog, exog, family=sm.families.Binomial())                                                                                                                        

In [553]: glm_binom.fit()
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-553-814e0115c842> in <module>()
----> 1 glm_binom.fit()

....
    101 def get_linalg_error_extobj(callback):

LinAlgError: SVD did not converge

I have seen that this is a common issue when there are NaNs, but that is not the case here. The data is attached for debugging. I'm not sure how to proceed. Thanks!
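
For reference, a minimal version of the check I mean (a sketch, assuming the attached files load with numpy.loadtxt):

import numpy as np

exog = np.loadtxt("exog.txt")
endog = np.loadtxt("endog.txt")

# Both should print True if the inputs are clean (no NaNs, no infs).
print(np.isfinite(exog).all())
print(np.isfinite(endog).all())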

R

exog.txt
endog.txt

josef...@gmail.com

Jan 14, 2015, 5:27:40 PM1/14/15
to pystatsmodels
It's still possible that the logit transformation introduces NaNs during the optimization.
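
A minimal sketch (plain numpy, not statsmodels internals) of how the logit link blows up at the boundary:

import numpy as np

mu = np.array([0.0, 0.5, 1.0])  # fitted probabilities hitting the boundary
with np.errstate(divide="ignore"):
    eta = np.log(mu / (1.0 - mu))  # logit link
print(eta)  # [-inf   0.  inf]

Once an inf like that enters the IRLS working weights, the SVD in the weighted least squares step can fail even though the input data itself contains no NaNs.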

Which statsmodels version are you using? IIRC we had a change for a corner case like this recently.


If I read your data into pandas correctly, I don't get the SVD failure; the fit finishes, but the results look a bit strange.
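
Roughly how I loaded the files (a guess, assuming whitespace-separated text; if the delimiter or header guess is wrong here, that could explain the difference):

import pandas as pd
import statsmodels.api as sm

exog = pd.read_csv("exog.txt", delim_whitespace=True, header=None)
endog = pd.read_csv("endog.txt", delim_whitespace=True, header=None)

# endog has two columns (successes, failures), hence "[0, 1]" below
res = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()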

>>> print(res.summary())
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                 [0, 1]   No. Observations:                   36
Model:                            GLM   Df Residuals:                       26
Model Family:                Binomial   Df Model:                            9
Link Function:                  logit   Scale:                             1.0
Method:                          IRLS   Log-Likelihood:                    nan
Date:                Wed, 14 Jan 2015   Deviance:                          nan
Time:                        17:13:01   Pearson chi2:                 1.98e+18
No. Iterations:                    13                                        
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const       2.545e+14   2.44e+07   1.04e+07      0.000      2.54e+14  2.54e+14
x1          4.829e+14   8.76e+05   5.52e+08      0.000      4.83e+14  4.83e+14
x2         -3.812e+14   2.75e+06  -1.39e+08      0.000     -3.81e+14 -3.81e+14
x3         -1.647e+13   3.86e+04  -4.27e+08      0.000     -1.65e+13 -1.65e+13
x4          1.631e+12   5.22e+04   3.12e+07      0.000      1.63e+12  1.63e+12
x5          8.522e+12   7.05e+04   1.21e+08      0.000      8.52e+12  8.52e+12
x6          1.235e+11    423.555   2.92e+08      0.000      1.23e+11  1.23e+11
x7          1.021e+11    344.035   2.97e+08      0.000      1.02e+11  1.02e+11
x8         -1.074e+11    782.263  -1.37e+08      0.000     -1.07e+11 -1.07e+11
x9         -5.571e+10    577.962  -9.64e+07      0.000     -5.57e+10 -5.57e+10
==============================================================================



It looks like a perfect prediction case. We warn or raise in discrete Logit, but I guess we don't have a check for it in GLM. I don't know whether Binomial with counts can have a perfect prediction problem, though; I've never heard of it.

>>> res.fittedvalues.values
array([ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.])
>>> res.model.endog
array([ 0.        ,  0.        ,  0.05645161,  0.        ,  0.0546875 ,
        0.        ,  0.00234742,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.00409836,
        0.        ,  0.        ,  0.01744186,  0.04268293,  0.        ,
        0.03846154,  0.5       ,  0.        ,  0.04545455,  0.02325581,
        0.        ,  0.        ,  0.        ,  0.01639344,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.        ,  0.        ])

Or something else is strange in this case.
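
A hand-rolled separation check in that spirit (a sketch; discrete Logit does something similar internally, GLM currently does not):

import numpy as np

# Fitted probabilities pinned to the 0/1 boundary for every observation
# are a symptom of (quasi-)perfect separation.
fitted = np.asarray(res.fittedvalues)
pinned = (fitted < 1e-10) | (fitted > 1 - 1e-10)
print(pinned.all())  # True here: every fitted value sits on the boundary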


Josef




R S

Jan 15, 2015, 7:34:00 AM1/15/15
to pystat...@googlegroups.com
I was using version 0.5.0 (the default in Anaconda). I updated to 0.6.1, ran into this issue, downgraded scipy to 0.14, and then it ran without crashing.
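
In case it helps, the versions in play can be confirmed with:

import scipy
import statsmodels

print(statsmodels.__version__)  # 0.6.1 after the update
print(scipy.__version__)        # 0.14.x after the downgrade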
Thanks!

josef...@gmail.com

Jan 15, 2015, 8:15:06 AM1/15/15
to pystatsmodels
On Thu, Jan 15, 2015 at 7:34 AM, R S <reg...@gmail.com> wrote:
> I was using version 0.5.0 (which is the default in anaconda). I updated to 0.6.1, ran into this issue, downgraded scipy to 0.14, and it ran without crashing.

Do you get the same or similar numbers (parameter estimates and so on) as I did?

My impression is still that those numbers are "useless" and we should find out which corner case this is hitting.
Maybe the parameters are not identified, there are convergence problems, or something else.
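
One quick diagnostic for the identification side would be the condition number of the design matrix (a sketch, assuming exog is still the loaded array):

import numpy as np

# A huge condition number means near-collinear columns, i.e. the
# parameters are barely identified and IRLS can produce wild estimates.
print(np.linalg.cond(exog))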

Josef

R S

Jan 15, 2015, 8:20:22 AM1/15/15
to pystat...@googlegroups.com
This is what I'm getting:

(14:54:04) In [10]: exog = loadtxt("exog.txt")                                                                                                            
(15:17:11) In [11]: endog = loadtxt("endog.txt")

(15:17:14) In [12]: glm_binom = sm.GLM(endog, exog, family=sm.families.Binomial())                                                                                                           
(15:17:45) In [13]: print glm_binom.fit().summary()                                                                                                                                             
                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:           ['y1', 'y2']   No. Observations:                   36
Model:                            GLM   Df Residuals:                       26
Model Family:                Binomial   Df Model:                            9
Link Function:                  logit   Scale:                             1.0
Method:                          IRLS   Log-Likelihood:                -28.227
Date:                Thu, 15 Jan 2015   Deviance:                       26.371
Time:                        15:18:02   Pearson chi2:                     3.19
No. Iterations:                    20                                         
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const        208.4345    603.534      0.345      0.730      -974.469  1391.339
x1             4.9253     12.208      0.403      0.687       -19.001    28.852
x2           -29.0834     81.505     -0.357      0.721      -188.830   130.664
x3            -0.0937      0.338     -0.277      0.782        -0.756     0.569
x4            -0.1737      0.372     -0.466      0.641        -0.904     0.556
x5             1.1037      2.992      0.369      0.712        -4.761     6.969
x6            -0.0007      0.001     -0.575      0.565        -0.003     0.002
x7             0.0044      0.010      0.433      0.665        -0.016     0.024
x8            -0.0006      0.006     -0.105      0.916        -0.011     0.010
x9            -0.0122      0.032     -0.378      0.705        -0.076     0.051
==============================================================================

It looks much better...

josef...@gmail.com

Jan 15, 2015, 8:25:18 AM1/15/15
to pystatsmodels
Yes, that looks much better, and the numbers look reasonable.
I guess I messed up in my pandas data handling.

Thanks for the feedback.

Josef

Skipper Seabold

Jan 15, 2015, 8:48:50 AM1/15/15
to pystat...@googlegroups.com

FWIW, I also got wild numbers in the solution when I tried this on master yesterday.

Skipper
