word2vec support for phrases

Amir H. Jadidinejad

Sep 3, 2014, 4:16:39 AM

to gen...@googlegroups.com

Hi All,

I'm playing with word2vec in gensim to learn more about distributed representation models, and it's interesting. First, I want to thank all the contributors.

I want to calculate the semantic relatedness between two phrases or sentences. What is the easiest way?
I can download the pre-trained GoogleNews model and calculate word similarity, but what about phrases or sentences?
Clearly, the problem is how to map a phrase to an appropriate vector using the word2vec package.

I have seen the following papers in this field, but I'm really looking for an environment in which I can evaluate and learn these models in practice:

Distributed Representations of Sentences and Documents
Distributed Representations of Words and Phrases and their Compositionality

Any comments and suggestions are welcome.

igor.b...@ucdconnect.ie

Sep 4, 2014, 10:50:37 AM

to gen...@googlegroups.com

In the original code, there's a word2phrase preprocessing step you can run, which essentially detects bigrams that co-occur frequently in the text. "language model" becomes "language_model", and training is then performed as usual. You'll now have a vector for the phrase "language model".
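
To make the bigram step concrete, here is a minimal pure-Python sketch of the idea. It is not the original word2phrase code: the function name `merge_bigrams`, the toy corpus, and the `delta` and `threshold` values are assumptions chosen for illustration. The score is the one given in the "Distributed Representations of Words and Phrases and their Compositionality" paper, (count(a b) - delta) / (count(a) * count(b)).

```python
from collections import Counter


def merge_bigrams(sentences, delta=1, threshold=0.2):
    """Join high-scoring bigrams with '_' (simplified, single pass).

    `delta` and `threshold` are illustrative values, not tuned defaults.
    """
    unigrams = Counter(w for s in sentences for w in s)
    bigrams = Counter(pair for s in sentences for pair in zip(s, s[1:]))

    def score(a, b):
        # (count(a b) - delta) / (count(a) * count(b)); delta discounts
        # bigrams that are seen only a handful of times.
        return (bigrams[(a, b)] - delta) / (unigrams[a] * unigrams[b])

    merged = []
    for s in sentences:
        out, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and score(s[i], s[i + 1]) > threshold:
                out.append(s[i] + "_" + s[i + 1])
                i += 2  # skip the second word of the merged bigram
            else:
                out.append(s[i])
                i += 1
        merged.append(out)
    return merged


# Hypothetical toy corpus: each sentence is a list of tokens.
corpus = [
    ["a", "language", "model", "predicts", "words"],
    ["the", "language", "model", "is", "small"],
    ["train", "a", "language", "model"],
]
phrased = merge_bigrams(corpus)
print(phrased[0])
```

The rewritten sentences can then be passed to word2vec training as usual, so that "language_model" is treated as a single token and gets its own vector.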

Sentences can be a bit tricky, depending on their length. You can get far with just an element-wise sum of the word vectors: instead of comparing individual words, you compare the summed vectors of the two sentences. This works well for short sentences, but breaks down for longer ones.
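
The summing approach can be sketched as follows. This is only an illustration under stated assumptions: the 3-dimensional vectors in `toy_vectors` are made up, and the helper names `sentence_vector` and `cosine` are hypothetical. With a real model, the vectors would be looked up in the trained word2vec vocabulary instead.

```python
import math

# Made-up 3-dimensional vectors for illustration only; real vectors would
# come from a trained word2vec model.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "sleeps": [0.1, 0.9, 0.0],
    "naps": [0.2, 0.8, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "market": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens, vectors):
    """Element-wise sum of the vectors of all in-vocabulary tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token of the sentence is in the vocabulary")
    return [sum(dims) for dims in zip(*known)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


base = sentence_vector(["cat", "sleeps"], toy_vectors)
sim_close = cosine(base, sentence_vector(["kitten", "naps"], toy_vectors))
sim_far = cosine(base, sentence_vector(["stock", "market"], toy_vectors))
print(sim_close > sim_far)
```

Because cosine similarity ignores vector length, summing and averaging the word vectors give the same score. Out-of-vocabulary tokens are simply skipped here, which is one possible design choice rather than the only one.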

Adam Smith

Sep 4, 2014, 6:08:01 PM

to gen...@googlegroups.com

Looks like the first implementation of `phrase2vec` was recently released on github: https://github.com/zseymour/phrase2vec/commits/master

Adam

Amir H. Jadidinejad

Sep 4, 2014, 9:16:04 PM

to gen...@googlegroups.com

Dear Friends,

What's the difference between phrase2vec and word2phrase?

Unfortunately, there is no documentation from a practical point of view.

Thanks.

Amir

Radim Řehůřek

Sep 5, 2014, 3:27:50 AM

to gen...@googlegroups.com

On Friday, September 5, 2014 12:08:01 AM UTC+2, Adam Smith wrote:

Looks like the first implementation of `phrase2vec` was recently released on github: https://github.com/zseymour/phrase2vec/commits/master

It was also ported to gensim some time ago, but the pull request is not finished:

https://github.com/piskvorky/gensim/issues/204

Your help is welcome!

Radim

Ramya Y S

Jul 10, 2017, 7:22:39 AM

to gensim

Hi!
Were you able to compare phrases successfully? Also, which is better for this, word2vec or doc2vec? I'd appreciate a quick reply!

Thanks and Regards
