Re: [pystatsmodels] NaNs in Multinomial Logistic Regression (MNLogit)

Skipper Seabold

Sep 4, 2012, 11:59:18 AM

to pystat...@googlegroups.com

On Tue, Sep 4, 2012 at 11:58 AM, Bokononisms <na...@voxy.com> wrote:
> I am creating Multinomial Logistic Regression Models using MNLogit from
> statsmodels.discrete.discrete_model. Occasionally, I receive NaN as the
> function's value after termination.
>
> I have equivalent code in R, and I do not receive NaN values. Can anyone
> explain this behavior? Basically, I am trying to predict one of three
> categories from several explanatory variables that are continuous in nature.

Can you send me some code and data to replicate the problem?

Skipper

josef...@gmail.com

Sep 4, 2012, 12:18:29 PM

to pystat...@googlegroups.com

And making it into a test case would be good, if there is some problem
with unusual cases.

One option that sometimes helps is to try different optimizers, for
example method='ncg' or a plain method='nm' (which might require
increasing maxfun and maxiter).

Josef

>
> Skipper

Bokononisms

Sep 4, 2012, 12:28:43 PM

to pystat...@googlegroups.com

Skipper,

Thanks for the super fast reply. Due to the project's nature, I cannot expose the code and data. However, I can put up the function I used to create the model.

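import numpy as np
from statsmodels.discrete.discrete_model import MNLogit

def create_regression_model(train_data):
    # Response variable: the category to predict.
    endog = train_data['x13']
    # Stack the explanatory variables x3..x12 as columns (no constant is added).
    exog = np.column_stack((
        train_data['x3'], train_data['x4'], train_data['x5'],
        train_data['x6'], train_data['x7'], train_data['x8'],
        train_data['x9'], train_data['x10'], train_data['x11'],
        train_data['x12']))
    mod = MNLogit(endog, exog)  # unfitted model; call mod.fit() to estimate
    return mod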

Would it help if the explanatory variables were scaled between 0 and 1? In the meantime, I will try the solution mentioned by Josef.

Best,

Na'im

Skipper Seabold

Sep 4, 2012, 12:34:44 PM

to pystat...@googlegroups.com

On Tue, Sep 4, 2012 at 12:28 PM, Bokononisms <na...@voxy.com> wrote:
> Skipper,
>
> Thanks for the super fast reply. Due to the project's nature, I cannot
> expose the code and data. However, I can put up the function I used to
> create the model.

Can you send me (a subset of) the data off-list, with the understanding
that I will not publish or use this data anywhere? You can also strip
it of all identifying information. If I can't replicate the problem, I
can't figure out why you're seeing these results. You can also send
some similar random data if you're able to replicate the failure with it.

Another guess is that you have missing data somewhere. By default we
don't do anything to handle missing data, whereas your R code might,
though this is likely to change in the next release.
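
A rough sketch of checking for missing values before fitting, assuming the R side drops incomplete rows. The helper name drop_rows_with_nan and the small arrays are hypothetical, made up for illustration:

```python
import numpy as np

def drop_rows_with_nan(endog, exog):
    """Keep only rows where neither endog nor any exog column is NaN."""
    endog = np.asarray(endog, dtype=float)
    exog = np.asarray(exog, dtype=float)
    keep = ~(np.isnan(endog) | np.isnan(exog).any(axis=1))
    return endog[keep], exog[keep]

# Made-up data with one NaN in exog (row 1) and one in endog (row 4).
endog = np.array([0, 1, 2, 1, np.nan])
exog = np.array([[1.0, 2.0],
                 [np.nan, 0.5],
                 [0.3, 0.1],
                 [0.7, 0.9],
                 [0.2, 0.4]])

print("rows with NaN in exog:", np.isnan(exog).any(axis=1).sum())
clean_endog, clean_exog = drop_rows_with_nan(endog, exog)
print(clean_endog.shape, clean_exog.shape)
```

A single NaN in exog is enough to propagate through the likelihood, so counting NaNs first is a cheap way to rule this cause in or out.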

Note also that we do not include a constant by default, so to match R
you need to add one to exog using sm.add_constant in your function
(unless leaving it out is intentional).
