Save Model Coefficient Question


Alexey Renyov

Jun 6, 2016, 11:21:48 AM6/6/16
to libFM - Factorization Machines

I'm currently exploring your package libfm, and I'm stuck on a problem. I'm trying to extract the coefficients from libfm and write my own predictor - that's how I understand all these things much better.
1) I number my coefficients starting from 1.
But when I extract the coefficients, I see the following statistics:

num_rows=2002828 num_values=50070700 num_features=33945 min_target=0 max_target=1

But when I load the saved model coefficients, I correctly see the bias - and then, somehow, 33945 coefficient values.
The problem is that my data doesn't contain a feature with index 33945: if I grep my data I can find feature 33944, but not 33945.
Before feeding the data into the model I prepare a feature map of index ~ value, so this confused me a lot.

The same goes for the matrix V: it has one more row than the number of my features.

Can you clarify what I'm missing?

-----

P.S. I'm using the MCMC optimizer, but save_model works anyway... =) Maybe that could be the problem.

Thanks a lot.

Alexey Renyov

Jun 7, 2016, 3:04:41 AM6/7/16
to libFM - Factorization Machines
Hi again.
I looked into the source and found these lines in Data.h (https://github.com/srendle/libfm/blob/master/src/libfm/src/Data.h#L219-L221):

if (has_feature) {
    num_features++;
}

What is the reason for doing this?

Alexey Renyov

Jun 7, 2016, 3:25:19 AM6/7/16
to libFM - Factorization Machines
Because from the article I see that
w in R^n, where n is the number of features, and
V in R^(n x k).
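For reference, the second-order model from the FM paper is

    \hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j

with w_0 the global bias, w in R^n and V in R^(n x k).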

But I'm getting w in R^(n+1) and V in R^((n+1) x k).

I'm sure there is a reason for this.

Alexey Renyov

Jun 7, 2016, 7:49:51 AM6/7/16
to libFM - Factorization Machines
Found a bug in my code.
Everything looks OK now.

hzzen...@gmail.com

Oct 8, 2016, 11:34:40 PM10/8/16
to libFM - Factorization Machines
I have the same problem as you. Is it a bug in my code? How did you solve it?

Alexey Renyov

Oct 9, 2016, 4:12:19 PM10/9/16
to libFM - Factorization Machines, hzzen...@gmail.com
Hi!
Check that your feature indices start from 0. That was my problem.
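Something like this is what I mean - a rough sketch, assuming your data is in the usual libSVM "label idx:val idx:val ..." format and currently numbered from 1:

import sys

# Rough sketch: shift every feature index down by one so indices start at 0.
for line in sys.stdin:
    parts = line.strip().split()
    label, feats = parts[0], parts[1:]
    shifted = ["%d:%s" % (int(f.split(":")[0]) - 1, f.split(":")[1]) for f in feats]
    print(" ".join([label] + shifted))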

hzzen...@gmail.com

Oct 9, 2016, 9:59:09 PM10/9/16
to libFM - Factorization Machines, hzzen...@gmail.com

Hi!
I made my feature indices start from 0 instead of 1, and now the number of features and the matrix V are right. Thanks a lot.
I used the model to write my own predictor, but the probabilities I get are not the same as libfm's. My code is:
import math
import os
import sys

def main():
    nfactor = int(os.environ.get('nfactor'))
    num_feature = int(os.environ.get('num_feature'))
    weight_name = os.environ.get('weight_name', 'weight')

    with open(weight_name) as weight:
        weights = weight.readlines()

    if len(weights) != (num_feature + num_feature * nfactor + 1):
        print("weights size and num_feature size is wrong!")
        sys.exit(1)

    for line in sys.stdin:
        data = line.strip().split("\t")
        clk = data[0]
        pv = data[1]
        # start from the global bias (last line of the weight file)
        sum_wx = float(weights[len(weights) - 1].split("\t")[1])
        # single features: w_i * x_i
        for i in range(2, len(data)):
            featureid = int(data[i].split(":")[0])
            value = float(data[i].split(":")[1]) if len(data[i].split(":")) > 1 else 1.0
            weight_value = float(weights[featureid].split("\t")[1])
            sum_wx += weight_value * value
        # pairwise interactions, one factor at a time
        for i in range(nfactor):
            sum1 = 0.0
            sum2 = 0.0
            for j in range(2, len(data)):
                featureid = int(data[j].split(":")[0])
                value = float(data[j].split(":")[1]) if len(data[j].split(":")) > 1 else 1.0
                n = num_feature + featureid * nfactor + i
                weight_value = float(weights[n].split("\t")[1])
                sum1 += weight_value * value
                sum2 += (weight_value * value) * (weight_value * value)
            sum_wx += 0.5 * (sum1 * sum1 - sum2)
        prop = round(1.0 / (1.0 + math.exp(-sum_wx)), 8)
        # now print the value we want
        print("%s\t%s" % (prop, clk))

if __name__ == '__main__':
    main()

Is something wrong here? If you can, please send me a copy of your prediction code.
Thanks a lot. My email is: zak...@163.com

Alexey Renyov

Oct 10, 2016, 2:12:21 AM10/10/16
to libFM - Factorization Machines, hzzen...@gmail.com
Hi!
What optimization algorithm do you use?
I'm asking because with MCMC you can't replicate libfm's results: MCMC does not use only the last model state to generate predictions.
Better to try SGD.
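Roughly, MCMC averages the predictions made with each parameter sample drawn during training, something like this sketch of the idea (not libfm's actual code):

import numpy as np

# Sketch of the idea: the MCMC prediction for each test case is the average of
# the predictions made with every sampled parameter state, so the final saved
# parameters alone cannot reproduce it.
def mcmc_prediction(per_sample_predictions):
    # per_sample_predictions: list of prediction arrays, one per MCMC sample
    return np.mean(np.asarray(per_sample_predictions), axis=0)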

hzzen...@gmail.com

Oct 10, 2016, 4:36:00 AM10/10/16
to libFM - Factorization Machines, hzzen...@gmail.com
Hi!
I use the ALS optimizer; my SGD run doesn't work.
When I use SGD I hit a problem like this one: https://groups.google.com/forum/#!topic/libfm/O-nAFfaRxvE
Have you ever experienced the same problem?