Multinomial Logistic Regression


Benjamin Deonovic

Mar 25, 2016, 12:51:51 PM
to julia-stats
Can someone point me to the best place to do multinomial logistic regression in Julia? I've tried https://github.com/lindahua/Regression.jl, but the results don't match what my simulation says they should be, I'm not sure how to interpret the output from that package (the documentation is rather lacking), and I haven't gotten any response to my issue (https://github.com/lindahua/Regression.jl/issues/14).


Andreas Noack

Mar 25, 2016, 12:58:43 PM
to julia...@googlegroups.com
I think that Dahua doesn't have as much time for Julia package development as he used to have. I'm not aware of other implementations but there could easily be some that I don't know of. For now, it will probably be easiest to roll your own version or fork Dahua's package.




Cedric St-Jean

Mar 27, 2016, 1:34:13 PM
to julia-stats
Hi Benjamin,

You can use the scikit-learn version via ScikitLearn.jl if you don't mind the (moderate) cost of sending your data to Python.

using ScikitLearn
@sk_import linear_model: LogisticRegression

model = fit!(LogisticRegression(multi_class="multinomial"), X_train, y_train)
predict(model, X_test)

It was released recently, so don't hesitate to file an issue if you run into a problem with it.

Cédric

Benjamin Deonovic

Apr 6, 2016, 5:30:27 PM
to julia-stats
I got the following error:

julia> model = ScikitLearn.fit!(LogisticRegression(multi_class="multinomial"), X, y)
ERROR: PyError (:PyObject_Call) <type 'exceptions.ValueError'>
ValueError('Solver liblinear does not support a multinomial backend.',)
  File "/home/benjamin/bin/miniconda/lib/python2.7/site-packages/sklearn/linear_model/logistic.py", line 1148, in fit
    self.dual, sample_weight)
  File "/home/benjamin/bin/miniconda/lib/python2.7/site-packages/sklearn/linear_model/logistic.py", line 412, in _check_solver_option
    "a multinomial backend." % solver)

 [inlined code] from /home/benjamin/.julia/v0.4/PyCall/src/exception.jl:81
 in pycall at /home/benjamin/.julia/v0.4/PyCall/src/PyCall.jl:344
 in call at /home/benjamin/.julia/v0.4/PyCall/src/PyCall.jl:372
 in fit! at /home/benjamin/.julia/v0.4/ScikitLearn/src/Skcore.jl:75

Benjamin Deonovic

Apr 6, 2016, 5:32:48 PM
to julia-stats
I figured it out: I just need to change the solver from the default "liblinear" to "lbfgs", i.e. pass solver="lbfgs" to LogisticRegression.

Benjamin Deonovic

Apr 6, 2016, 5:40:31 PM
to julia-stats
How do I extract the coefficient values after I have fit the model? 



Cedric St-Jean

Apr 6, 2016, 6:06:09 PM
to julia-stats
All Python attributes are accessible with [:attribute_name]. model[:coef_] and model[:intercept_] should work.

Benjamin Deonovic

Apr 6, 2016, 6:28:57 PM
to julia-stats
Great, thanks! I'm having trouble reproducing some R results for multinomial regression. Could you check whether I am using the ScikitLearn package correctly? http://stackoverflow.com/questions/36463072/julia-multinomial-regression-with-time-series-lagged-values

Cedric St-Jean

Apr 6, 2016, 7:09:09 PM
to julia-stats
The ScikitLearn.jl call looks OK. Some ideas:

- scikit-learn's regression is regularized by default. You can disable it with LogisticRegression(C=1.e9)
- you convert y to Vector{Int64}, might want to try Vector{Float64}
- isn't multinomial logistic regression non-identifiable under certain parametrizations? I would try computing the output (i.e. predict(model, X)) in both models and see if they differ

Otherwise, trying a third package might be useful in figuring out which one is wrong.

Cédric

Cedric St-Jean

Apr 6, 2016, 7:19:38 PM
to julia-stats
Note that not all of the \beta_k vectors of coefficients are uniquely identifiable. This is due to the fact that all probabilities must sum to 1, making one of them completely determined once all the rest are known. As a result there are only k-1 separately specifiable probabilities, and hence k-1 separately identifiable vectors of coefficients. One way to see this is to note that if we add a constant vector to all of the coefficient vectors, the equations are identical:

$\frac{\exp((\beta_c + C) \cdot X_i)}{\sum_k \exp((\beta_k + C) \cdot X_i)} = \frac{\exp(\beta_c \cdot X_i) \exp(C \cdot X_i)}{\exp(C \cdot X_i) \sum_k \exp(\beta_k \cdot X_i)} = \frac{\exp(\beta_c \cdot X_i)}{\sum_k \exp(\beta_k \cdot X_i)}$

From Wikipedia. So it's expected that the coefficients are different, even if the predictions are the same. What do you want to use the coefficients for?
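
The invariance described in the quoted passage can be checked numerically. The following is only a minimal sketch in plain Python, assuming the softmax parametrization; the helper `softmax_probs`, the observation `x`, the coefficient rows in `betas`, and the vector `shift` are all made-up illustrations and are not taken from R, Regression.jl, or scikit-learn.

```python
import math

def softmax_probs(x, betas):
    """Class probabilities for one observation x under the softmax parametrization."""
    scores = [sum(b_j * x_j for b_j, x_j in zip(beta, x)) for beta in betas]
    m = max(scores)  # subtract the largest score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical data: one observation and one coefficient row per class.
x = [1.0, 0.5, -2.0]
betas = [[0.2, -1.0, 0.3],
         [0.0, 0.4, -0.7],
         [-0.5, 0.1, 0.9]]

# Add the same arbitrary vector to every class's coefficients.
shift = [3.0, -2.0, 0.5]
shifted = [[b + c for b, c in zip(beta, shift)] for beta in betas]

p_original = softmax_probs(x, betas)
p_shifted = softmax_probs(x, shifted)

# The coefficients differ, but the predicted probabilities agree.
max_diff = max(abs(a - b) for a, b in zip(p_original, p_shifted))
print(max_diff < 1e-12)
```

Two fits can therefore report quite different coefficient tables and still describe the same model, which is why comparing predicted probabilities is the more reliable check.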

Benjamin Deonovic

Apr 7, 2016, 3:00:23 PM
to julia-stats
Thanks. I don't need the coefficients, just the estimated probabilities (in Wikipedia's notation), $P(Y_i = k) = \exp(X\beta_k)/(1 + \sum_u \exp(X\beta_u))$. These should be identifiable, correct?

Cedric St-Jean

Apr 7, 2016, 3:06:24 PM
to julia...@googlegroups.com
Yeah. Use `predict_proba(model, X)` to get them.
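
To see why those probabilities are identifiable even though the coefficients are not, here is a rough sketch in plain Python, assuming the last class is taken as the reference category. The helpers `to_reference_coding` and `pivot_probs` and all numbers are hypothetical illustrations, not part of any package's API. It rewrites a full set of softmax coefficients in reference-category form and evaluates the formula above, $\exp(X\beta_k)/(1 + \sum_u \exp(X\beta_u))$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax_probs(x, betas):
    """Probabilities with one coefficient vector per class (over-parametrized form)."""
    scores = [dot(beta, x) for beta in betas]
    m = max(scores)  # subtract the largest score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def to_reference_coding(betas):
    """Subtract the last class's vector from every class and drop the resulting zero row."""
    ref = betas[-1]
    return [[b - r for b, r in zip(beta, ref)] for beta in betas[:-1]]

def pivot_probs(x, betas_nonref):
    """exp(x.beta_k) / (1 + sum_u exp(x.beta_u)); the reference class gets 1 / (1 + sum)."""
    exps = [math.exp(dot(beta, x)) for beta in betas_nonref]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps] + [1.0 / denom]

# Hypothetical observation and coefficients (one row per class).
x = [1.0, 0.5, -2.0]
betas = [[0.2, -1.0, 0.3],
         [0.0, 0.4, -0.7],
         [-0.5, 0.1, 0.9]]

p_softmax = softmax_probs(x, betas)
p_pivot = pivot_probs(x, to_reference_coding(betas))

# Both parametrizations give the same probabilities, class by class.
print(all(abs(a - b) < 1e-12 for a, b in zip(p_softmax, p_pivot)))
```

If one package reports coefficients relative to a reference class and another reports one vector per class, a conversion along these lines is one way to put them on the same footing before comparing.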


Benjamin Deonovic

Apr 7, 2016, 4:20:10 PM
to julia-stats
Thanks Cedric, you've been a great help. It looks like the predicted probabilities are the same across all the different implementations (R, Regression.jl, and ScikitLearn). Thanks for taking the time to help me with this.