Word2Vec / Prod2Vec - Complementarity calculation in Gensim


Tjadi Peeters

Feb 22, 2021, 10:42:03 AM
to Gensim
Hi all,

I am trying to recreate the Complementarity and Exchangeability metrics from this paper (https://arxiv.org/pdf/2005.10402.pdf) by training a Gensim Word2Vec model on baskets of products that were sold together. Here is an example of a possible input product_corpus and the code:

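from gensim.models import Word2Vec

# Toy example of the input format: one list of SKUs per basket.
product_corpus = [['sku a', 'sku b'], ['sku a', 'sku c']]

# min_count=200 keeps only products that occur at least 200 times, so it needs a real corpus, not this toy one.
# `size` is the Gensim 3.x parameter name; Gensim 4.x renamed it to `vector_size`.
model = Word2Vec(product_corpus, window=6, size=100, workers=4, min_count=200, negative=5)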


I am wondering how to compute the following with the Gensim model:

complimentarity_calculation.PNG

Looking at equations 1 and 2, I can see that V and V' are the input and output product vectors:

equation_1_and_2.PNG

Based on https://stackoverflow.com/questions/41162876/get-weight-matrices-from-gensim-word2vec, I found that the input and output vectors are stored in syn0 and syn1neg. I recreated the dot product of these for one product combination:
dot_product.png

This gives me a number > 1, though (which you wouldn't expect for a probability), so I feel like I am making a mistake somewhere. Does anyone know what exactly? Thanks in advance!
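
For readers following along, the dot product described above can be sketched without Gensim at all. The vectors below are made-up toy values, and the names input_vectors, output_vectors and raw_score are hypothetical; with a trained model the two lookups would be filled from the syn0 and syn1neg matrices mentioned above, whose exact attribute paths depend on the Gensim version.

```python
# Made-up toy vectors for illustration only (not from a trained model).
# input_vectors stands in for rows of syn0, output_vectors for rows of syn1neg.
input_vectors = {"sku a": [0.9, 1.2, -0.3], "sku b": [0.4, -0.8, 1.1]}
output_vectors = {"sku a": [0.2, 0.5, 0.7], "sku b": [1.0, 0.9, -0.2]}


def raw_score(a, b):
    """Dot product of the input vector of `a` with the output vector of `b`."""
    return sum(x * y for x, y in zip(input_vectors[a], output_vectors[b]))


score = raw_score("sku a", "sku b")
print(score)  # an unnormalized score, not a probability
```

Nothing bounds such a score to the interval [0, 1]: it grows with the length of the vectors, so a value above 1 is not by itself a sign of a bug.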


kazam...@gmail.com

Feb 22, 2021, 11:15:55 AM
to Gensim
Hi

input_vector^T.dot(output_vector) can exceed 1 because input_vector and output_vector are not normalized.
Equation (4) doesn't say that P(A|B) is equal to v_a^T v_b, only that P(A|B) is proportional to v_a^T v_b.

Masa
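
One common way to turn unnormalized scores into probabilities is a softmax over all candidate products. The sketch below assumes the standard skip-gram form, where the probability is proportional to exp(v_a^T v'_b); whether equation (4) of the paper uses exactly this normalization is an assumption, not something this thread confirms. The toy vectors and the names dot and softmax_probabilities are hypothetical.

```python
import math

# Made-up toy vectors for illustration only; real ones would come from the trained model.
input_vec_a = [0.9, 1.2, -0.3]  # input vector of product A
output_vecs = {                 # output vectors of all candidate products
    "sku b": [1.0, 0.9, -0.2],
    "sku c": [0.1, -0.4, 0.6],
    "sku d": [-0.7, 0.3, 0.2],
}


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def softmax_probabilities(input_vec, candidates):
    """Normalize raw dot-product scores into probabilities that sum to 1."""
    scores = {name: dot(input_vec, vec) for name, vec in candidates.items()}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(s - max_score) for name, s in scores.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


probs = softmax_probabilities(input_vec_a, output_vecs)
for name, p in probs.items():
    print(name, round(p, 3))
```

The raw score for "sku b" is above 1, yet after normalization every value lies strictly between 0 and 1 and the values sum to 1. With a real vocabulary the sum runs over every product, which is the expensive step that negative sampling avoids during training.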

On Tuesday, February 23, 2021 at 0:42:03 UTC+9, tjadi.jes...@gmail.com wrote: