comparing CVXPY to python sci-kit learn yields different results for logistic regression with L1 regularization [code included]


Dan Caspi

unread,
Mar 9, 2016, 5:59:05 AM3/9/16
to cvxpy
Hey,
I am pretty sure it's a "me" problem and not a CVXPY bug, but maybe I am misunderstanding how CVXPY works:
I trained a logistic regression model in both and got completely different results. Could someone please tell me what I am missing? I tried to emulate the logistic loss function from here:

many thanks in advance,
Dan

from cvxpy import *
import numpy as np
from sklearn import linear_model, datasets
import matplotlib.pyplot as plt

iris = datasets.load_iris()
X = iris.data[:, :5]
y = iris.target
y[y>0]=1
y[y==0]=-1

clf = linear_model.LogisticRegression(C=1.0,penalty='l1')
m=clf.fit(X,np.ravel(y))
acc = m.score(X,np.ravel(y))
coefs = m.coef_
intercept = m.intercept_
coefs = np.hstack((intercept.reshape(1,1),coefs))


#### now compare coefs with CVXPY
gamma=Parameter(sign='positive')
gamma.value=1
beta = Variable(X.shape[1]+1)  # extra entry to accommodate the intercept

loss = sum(logistic(vstack(1, X[i,:]).T*beta*-y[i]) for i in range(X.shape[0]))/X.shape[0]  # average logistic loss; a 1 is stacked onto each row for the intercept

penalty = gamma*norm(beta,1)
objective = Minimize(loss + penalty)
p = Problem(objective)
res = p.solve(verbose=True, max_iters=10000)


res2 = np.column_stack((np.ones(X.shape[0]), X)).dot(coefs.T)
res1 = np.column_stack((np.ones(X.shape[0]), X)).dot(beta.value)

 

Steven Diamond

unread,
Mar 9, 2016, 8:53:06 PM3/9/16
to cvxpy
I looked into this and I'm not sure why they're different. My best guess is gamma has a different value in the two problems. Could you compute the regularization path for cvxpy or scikit-learn and see if the answers ever match?
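For what it's worth, a regularization path can be traced without either library. The sketch below is my own illustration (not code from the thread): it runs proximal gradient descent on the same objective the CVXPY code uses, mean logistic loss plus gamma times the L1 norm, and records the L1 norm of the solution as gamma grows.

```python
import numpy as np

def l1_logreg_norms(X, y, gammas, iters=2000, step=0.1):
    """Proximal gradient on: mean logistic loss + gamma * ||beta||_1.
    Returns the L1 norm of the solution for each gamma."""
    n, d = X.shape
    norms = []
    for gamma in gammas:
        beta = np.zeros(d)
        for _ in range(iters):
            margins = np.clip(y * X.dot(beta), -30, 30)
            # gradient of the mean logistic loss (1/n) * sum log(1+exp(-y_i x_i.beta))
            grad = -X.T.dot(y / (1.0 + np.exp(margins))) / n
            z = beta - step * grad
            # soft-thresholding: the prox operator of the L1 penalty
            beta = np.sign(z) * np.maximum(np.abs(z) - step * gamma, 0.0)
        norms.append(np.abs(beta).sum())
    return norms

# synthetic separable data, just for illustration
rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = np.sign(X.dot(np.array([1.0, -1.0, 0.0, 0.0])))
print(l1_logreg_norms(X, y, [0.01, 0.1, 0.5]))  # norms shrink as gamma grows
```

With this objective, a larger gamma always means stronger shrinkage, which is the behavior the CVXPY results later in the thread show.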

Dan Caspi

unread,
Mar 11, 2016, 8:07:19 AM3/11/16
to cvxpy
Thank you, Steven, for your response, and please excuse my delayed reply.
Of course, I will reply with the results, but note that in both examples I specified gamma manually. I suspect the logistic loss function might be off in the CVXPY formula up there. Any thoughts on that, or is that the correct CVXPY way of writing it?

thanks in advance,
Dan

Steven Diamond

unread,
Mar 12, 2016, 4:36:05 PM3/12/16
to cvxpy
Yes, I saw that you specified gamma manually. Your use of the logistic function looks right to me, and it's been used successfully by many people. It's more likely that the regularization differs between the two problems, even though it looks like it should be the same.
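One concrete way to see how the regularization can differ: scikit-learn's documented L1 objective is ||w||_1 + C * (sum of logistic losses), with the intercept left unpenalized, while the CVXPY formulation in this thread averages the loss and penalizes all of beta. So the two only line up when gamma = 1/(C*n). A small numpy check of that algebra (my illustration, not code from the thread):

```python
import numpy as np

def sum_logistic_loss(w, b, X, y):
    # sum_i log(1 + exp(-y_i * (x_i . w + b)))
    return np.sum(np.log1p(np.exp(-y * (X.dot(w) + b))))

rng = np.random.RandomState(0)
X = rng.randn(30, 4)
y = np.sign(rng.randn(30))
w, b = rng.randn(4), 0.3
C, n = 2.0, X.shape[0]

# scikit-learn's documented L1 objective (intercept b is NOT penalized)
obj_sklearn = np.abs(w).sum() + C * sum_logistic_loss(w, b, X, y)

# the thread's CVXPY formulation: mean loss + gamma * ||beta||_1,
# taking gamma = 1/(C*n); note the thread's code also penalizes the
# intercept, which is a second, smaller source of mismatch
gamma = 1.0 / (C * n)
obj_cvxpy = sum_logistic_loss(w, b, X, y) / n + gamma * np.abs(w).sum()

# the two objectives agree up to the positive constant 1/(C*n),
# so (intercept penalty aside) they share the same minimizer
print(np.isclose(obj_cvxpy, obj_sklearn / (C * n)))  # prints True
```

In other words, setting gamma equal to C in the two programs is not an apples-to-apples comparison.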

Dan Caspi

unread,
Mar 15, 2016, 4:45:50 AM3/15/16
to cvxpy
Sorry for the late reply, and thanks for your response. I'll plot the solutions to see what the regularization factor "means" in each of the models (we clearly know what it means in CVXPY).


thanks again for your help, it means a lot - CVXPY is awesome!

Dan

Dan Caspi

unread,
Mar 28, 2016, 6:18:22 AM3/28/16
to cvxpy
Clearly I am new to scikit-learn, but here is what I did:
from cvxpy import *
import numpy as np
from sklearn import linear_model, datasets
import matplotlib.pyplot as plt

iris = datasets.load_iris()
X = iris.data[:, :5]
y = iris.target
y[y>0]=1
y[y==0]=-1


reg_params = [1.0, 0.1, 10, 0.5, 2, 0.25, 4]
coefs1 = []
for c in reg_params:
    clf = linear_model.LogisticRegression(C=1.0, penalty='l1')
    m = clf.fit(X, np.ravel(y))
    coefs = m.coef_
    intercept = m.intercept_
    coefs = np.hstack((intercept.reshape(1, 1), coefs))
    coefs1.append(coefs)



#### now compare coefs with CVXPY

gamma=Parameter(sign='positive')

beta = Variable(X.shape[1]+1)  # extra entry to accommodate the intercept
loss = sum(logistic(vstack(1, X[i,:]).T*beta*-y[i]) for i in range(X.shape[0]))/X.shape[0]  # average logistic loss; a 1 is stacked onto each row for the intercept
penalty = gamma*norm(beta,1)
objective = Minimize(loss + penalty)
p = Problem(objective)

coefs2 = []
for c in reg_params:
    gamma.value = c
    res = p.solve(verbose=False, max_iters=10000)
    coefs2.append(beta.value)



for reg, c1, c2 in zip(reg_params, coefs1, coefs2):
    print "regularization param=" + str(reg)
    print "scikit-learn LogisticRegression - norm of solution: " + str(np.linalg.norm(c1))
    print "cvxpy - norm of solution: " + str(np.linalg.norm(c2))
    print "============================="


Dan Caspi

unread,
Mar 28, 2016, 6:21:15 AM3/28/16
to cvxpy
With the following results, it seems that CVXPY actually makes more sense. I'd say that I have a misunderstanding of what C actually means, even though the documentation says: "Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization."

Anyway, here is the output:


regularization param=1.0
scikit-learn LogisticRegression - norm of solution: 3.80566096219
cvxpy - norm of solution: 0.0926064256608
=============================
regularization param=0.1
scikit-learn LogisticRegression - norm of solution: 3.80368534628
cvxpy - norm of solution: 1.39968956845
=============================
regularization param=10
scikit-learn LogisticRegression - norm of solution: 3.80517942108
cvxpy - norm of solution: 2.87368151163e-13
=============================
regularization param=0.5
scikit-learn LogisticRegression - norm of solution: 3.79877657619
cvxpy - norm of solution: 0.230277128165
=============================
regularization param=2
scikit-learn LogisticRegression - norm of solution: 3.80383768267
cvxpy - norm of solution: 2.98233182005e-12
=============================
regularization param=0.25
scikit-learn LogisticRegression - norm of solution: 3.80094993068
cvxpy - norm of solution: 0.542690976078
=============================
regularization param=4
scikit-learn LogisticRegression - norm of solution: 3.80353468611
cvxpy - norm of solution: 1.45434345533e-12
=============================


I don't feel there is a bug here, so I think this can be marked resolved/closed?

Dan

Steven Diamond

unread,
Mar 28, 2016, 9:56:14 PM3/28/16
to cvxpy
Alright, it's too bad the scikit-learn docs aren't clearer.

Shamus Patry

unread,
Mar 29, 2016, 9:51:09 AM3/29/16
to cvxpy
clf = linear_model.LogisticRegression(C=1.0, penalty='l1')

should be

clf = linear_model.LogisticRegression(C=c, penalty='l1')
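For anyone landing here with a current scikit-learn, a sketch of the corrected loop (note: newer versions removed sklearn.cross_validation and require an L1-capable solver such as liblinear to be named explicitly):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = np.where(iris.target > 0, 1, -1)  # one class vs. the rest, as in the thread

norms = []
for c in [0.01, 0.1, 1.0, 10.0]:
    # C=c here, not the hard-coded C=1.0 that made every fit in the thread identical
    clf = LogisticRegression(C=c, penalty='l1', solver='liblinear')
    clf.fit(X, y)
    norms.append(np.abs(clf.coef_).sum())
print(norms)  # norms grow with C, since C is INVERSE regularization strength
```

With the bug fixed, the scikit-learn norms vary with C instead of staying flat, moving in the opposite direction from CVXPY's gamma, exactly as the documentation's "inverse of regularization strength" wording says.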
