Can't reproduce Test(ll) locally with log-loss on the test set

Michael Pearmain

Mar 7, 2016, 4:27:38 AM
to libFM - Factorization Machines
Hi,

I've been testing libFM through the pywFM package, which is a wrapper around the libFM command-line interface.

My question is about how the predictions relate to the information reported in the console output.

My specific example: if I run libFM with a train and test dataset, the output shows Test(ll) dropping to 0.515385, but if I take the predictions and compute the log-loss against the test labels myself, I get values around 2.XXX.
I would have thought that the predictions and the Test(ll) value should match, or am I misunderstanding the output in some way?

FYI, the labels are 0/1 and the equivalent parameters (I'm running through a wrapper, so the data is converted) would be:
./libFM -task c -train XXX.libfm -test XXX.libfm -dim '1,1,5' -iter 10 -method mcmc -init_stdev 0.1

#Iter=  0   Train=0.666074  Test=0.668503   Test(ll)=0.665911
#Iter=  1   Train=0.693502  Test=0.694918   Test(ll)=0.606683
#Iter=  2   Train=0.707983  Test=0.693256   Test(ll)=0.570645
#Iter=  3   Train=0.730493  Test=0.711274   Test(ll)=0.526459
#Iter=  4   Train=0.731135  Test=0.711274   Test(ll)=0.513271
#Iter=  5   Train=0.692782  Test=0.714686   Test(ll)=0.515833
#Iter=  6   Train=0.702832  Test=0.70419    Test(ll)=0.516339
#Iter=  7   Train=0.698818  Test=0.7097     Test(ll)=0.514831
#Iter=  8   Train=0.709859  Test=0.707076   Test(ll)=0.515032
#Iter=  9   Train=0.714223  Test=0.711624   Test(ll)=0.515385


Full thread on Kaggle:

sre...@libfm.org

Mar 27, 2016, 4:07:27 PM
to libFM - Factorization Machines
You can add the flag "-out filename". This will create an output file with one line for each sample in the test dataset given by "-test". Each line of the output file contains a probability (i.e., a number between 0 and 1) denoting the predicted probability that the test instance is positive. Computing the log-loss over this output should, in your case, give 0.515385.
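
For reference, a minimal sketch of that check might look like the following (the file names are placeholders, and scikit-learn's log_loss uses the natural logarithm):

import numpy as np
from sklearn.metrics import log_loss

# Placeholder paths: the file written by libFM's "-out" flag and your own test labels.
preds = np.loadtxt("predictions.txt")    # one predicted probability per line of the -test file
labels = np.loadtxt("test_labels.txt")   # 0/1 labels, in the same order as the -test file

# Clip extreme predictions so log(0) does not blow up the metric.
eps = 1e-15
preds = np.clip(preds, eps, 1 - eps)

print("log-loss:", log_loss(labels, preds))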

Michael Pearmain

Mar 31, 2016, 6:37:52 AM
to libFM - Factorization Machines, sre...@libfm.org
Thanks for the reply,

I've tested with the -out flag and compared against the test labels (0 and 1 in my case), but I still cannot reproduce the Test(ll) value. Many of the predictions are exactly 1.0 or 0.0, so I clipped them to [1e-15, 1 - 1e-15] before computing the log-loss, but I still don't get anything close to the reported test log-loss.

I'm convinced this is user error and not a bug; however, I cannot fathom where it is.

Again, thanks for the response.

Bharat Prabhakar

Oct 23, 2016, 2:38:05 PM
to libFM - Factorization Machines
Hey Michael,

Just stumbled upon your post. I was wondering if you were finally able to resolve the log-loss discrepancy? I'm asking because I'm facing the same issue myself and have been unable to figure out why it's happening. I've been banging my head against the wall for a while now.

Thanks. 

Michael Pearmain

Oct 23, 2016, 3:42:20 PM
to Bharat Prabhakar, libFM - Factorization Machines
https://github.com/srendle/libfm/issues/21

They are using log10, not the natural log.
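
In other words, a minimal sketch (file names are placeholders; the predictions come from the -out file and the labels are the matching 0/1 targets): a base-10 log-loss should reproduce libFM's Test(ll), while the natural-log version differs from it by a factor of ln(10) ≈ 2.303.

import numpy as np

# Placeholder file names: adjust to your -out predictions and test-label files.
preds = np.loadtxt("predictions.txt")
labels = np.loadtxt("test_labels.txt")

# Clip predictions that are exactly 0.0 or 1.0 to keep the logs finite.
eps = 1e-15
preds = np.clip(preds, eps, 1 - eps)

# Standard (natural-log) log-loss.
ll_natural = -np.mean(labels * np.log(preds) + (1 - labels) * np.log(1 - preds))

# Base-10 log-loss, which is what Test(ll) appears to report (see the issue above).
ll_base10 = -np.mean(labels * np.log10(preds) + (1 - labels) * np.log10(1 - preds))

print(f"natural-log log-loss: {ll_natural:.6f}")
print(f"base-10 log-loss:     {ll_base10:.6f}  (= natural-log value / ln(10))")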