Aggregating via weighted sum

mangled data

Jan 6, 2022, 4:01:01 PM

to Gensim

I have a product catalogue that I am trying to model using gensim word2vec. Each product is laid out as [product name, product description, seller name, etc.]. All fields are plain text for my use case. Each of these fields may carry a different weight (for example, an incoming query might say "laptop from seller X", which means I need to weight the seller name more heavily).

I would appreciate any input on designing this. My approach is below.

#1) Build a separate word2vec model for each field (product name, description, seller name, etc.), so there is one model per field.

#2) For a given input, compute a cosine similarity against each of the models above. Then take a weighted sum of these cosine similarities and use that score to rank the documents.
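To make the two steps concrete, here is a minimal sketch of the scoring in plain Python. It is only an illustration under assumptions: the field names, the weights, and the 2-d toy vectors are all made up, and the vectors stand in for whatever each per-field model would produce (for instance, the mean of `model.wv[token]` over the tokens of a field, which is itself an assumption about how a field gets turned into a single vector).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def weighted_score(query_vecs, product_vecs, weights):
    """Weighted sum of the per-field cosine similarities."""
    return sum(
        weights[field] * cosine_similarity(query_vecs[field], product_vecs[field])
        for field in weights
    )

def rank_products(query_vecs, catalogue, weights):
    """Return (product_id, score) pairs, highest score first."""
    scored = [
        (product_id, weighted_score(query_vecs, vecs, weights))
        for product_id, vecs in catalogue.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights: the seller field counts most for this kind of query.
weights = {"name": 0.3, "description": 0.1, "seller": 0.6}

# Toy vectors standing in for the output of the per-field word2vec models.
query = {"name": [1.0, 0.0], "description": [1.0, 0.0], "seller": [0.0, 1.0]}
catalogue = {
    "A": {"name": [1.0, 0.0], "description": [1.0, 0.0], "seller": [1.0, 0.0]},
    "B": {"name": [0.0, 1.0], "description": [0.0, 1.0], "seller": [0.0, 1.0]},
}

ranking = rank_products(query, catalogue, weights)
```

With these toy numbers, product B matches only on the seller field but still ranks above product A, which matches on name and description, because the seller weight dominates.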

Does this make sense, or are there better approaches? I still have the issue of defining the weights, but even if I somehow arrive at them, I wonder whether the approach itself makes sense. For example, maybe I should use cosine distance instead of cosine similarity in the weighted sum?
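On the distance question, a quick check suggests the choice may not matter for ranking. Assuming cosine distance is defined as d_i = 1 - s_i for field i (the usual convention) and the weights w_i are the same for every product, the symbols below being my own notation:

```latex
\[
D \;=\; \sum_i w_i d_i
  \;=\; \sum_i w_i (1 - s_i)
  \;=\; \sum_i w_i \;-\; \sum_i w_i s_i
  \;=\; W - S
\]
```

Here S is the weighted sum of similarities and W is the sum of the weights, which is constant across products. Under those assumptions, sorting by ascending D gives the same order as sorting by descending S.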

Thanks

Kris
