nominal logistic regression with formula/patsy

Thomas Haslwanter

May 13, 2013, 12:19:19 PM
to pystat...@googlegroups.com
Is it possible to do a nominal logistic regression with a formula/patsy in statsmodels?
And is there a way to set the frequencies (case weights)?

In R, one can set the frequency in "multinom" (in the library "nnet") with the parameter "weights=freq".

In statsmodels, I currently build the endog and exog matrices by hand, which is somewhat clunky
(full code at https://github.com/thomas-haslwanter/dobson/blob/master/dobson.py):

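    import numpy as np
    import patsy
    import statsmodels.api as sm

    # get_data is a helper defined in dobson.py (linked above)
    inFile = r'GLM_data/Table 8.1 Car preferences.xls'
    df = get_data(inFile)

    # The categorical response comes back as one indicator column per level
    y, X = patsy.dmatrices('response~age+sex', data=df)

    # Convert the indicator columns to a single category index per row
    endog_ind = np.asarray(y).argmax(axis=1)

    # Repeat each grouped row according to its observed frequency
    freq = df['frequency'].values.astype(int)
    endog = np.repeat(endog_ind, freq, axis=0)
    exog = np.repeat(np.asarray(X), freq, axis=0)

    model = sm.MNLogit(endog, exog).fit()
    print(model.summary())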

What I would like to do is something like the following (which currently does not work for me):

      model = sm.MNLogit.from_formula('response~age+sex', data=df).fit()
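
For readers unfamiliar with the frequency expansion used above, here is a minimal sketch of what np.repeat does to a grouped table. The three rows, their category indices, and the counts are made up for illustration and are not taken from the car preferences data.

```python
import numpy as np

# Hypothetical grouped table: one row per covariate pattern and response.
response = np.array([0, 1, 2])        # category index per grouped row
exog = np.array([[1.0, 0.0],
                 [1.0, 1.0],
                 [1.0, 0.0]])         # intercept plus one dummy variable
freq = np.array([2, 1, 3])            # observed count for each grouped row

# Each grouped row is repeated freq[i] times, giving one row per observation.
endog_long = np.repeat(response, freq)
exog_long = np.repeat(exog, freq, axis=0)

print(endog_long)
print(exog_long.shape)
```

The expanded arrays have one row per individual observation, which is the layout MNLogit is given in the code above. The cost is memory: the row count grows from the number of grouped rows to the sum of the frequencies.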

josef...@gmail.com

May 13, 2013, 1:09:35 PM
to pystat...@googlegroups.com
you would have to add a cweights argument, I guess.


It's not possible right now. I think this is the problem with the
missing case weights that I mentioned before.

It wouldn't be difficult to do, but it is quite a bit of work to add the
case weights/frequencies everywhere and then write the unit tests to make
sure they work correctly.

Josef
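
As an illustration of what case weights would mean for the likelihood, here is a minimal pure-Python sketch. It is not statsmodels code: the function names, the toy data, and the coefficient values are all hypothetical. It shows that weighting each grouped row's log-likelihood contribution by its frequency gives the same total as expanding the rows first.

```python
import math

def mnlogit_probs(params, x):
    """Category probabilities for one row x; category 0 is the reference.

    params[j] holds the coefficients for category j + 1.
    """
    scores = [0.0] + [sum(b * xi for b, xi in zip(beta, x)) for beta in params]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loglike(params, rows, weights=None):
    """Sum of weight * log P(y | x) over rows given as (y, x) pairs."""
    if weights is None:
        weights = [1] * len(rows)
    return sum(w * math.log(mnlogit_probs(params, x)[y])
               for (y, x), w in zip(rows, weights))

# Hypothetical grouped data: (response category, [intercept, covariate]).
grouped = [(0, [1.0, 0.0]), (1, [1.0, 0.0]), (2, [1.0, 1.0]), (1, [1.0, 1.0])]
freq = [3, 2, 4, 1]

# Arbitrary coefficients for categories 1 and 2 (not fitted values).
params = [[0.2, -0.5], [-0.1, 0.8]]

# Expanded data: each grouped row repeated freq times.
expanded = [row for row, f in zip(grouped, freq) for _ in range(f)]

ll_weighted = loglike(params, grouped, weights=freq)
ll_expanded = loglike(params, expanded)
print(abs(ll_weighted - ll_expanded) < 1e-9)
```

Because the two log-likelihoods agree for any coefficients, maximizing either gives the same estimates. That is why repeating rows works as a stand-in for a frequency argument until one exists.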