gensim.models.word2vec: Is it input vector or output vector when I use model[#the word of interest#]


gladys0313

Apr 4, 2016, 5:09:45 AM
to gensim
Hi all, just as I said in the subject: when I use gensim.models.word2vec, I know I can use model[#the word of interest#] to check the word vector. However, I don't know whether this is the input vector or the output vector of the word. Is it the input vector when I use the SG model and the output vector when I use the CBOW model? Or something else?
Any idea is appreciated, thanks!

Gordon Mohr

Apr 4, 2016, 11:03:07 PM
to gensim
As per the original Word2Vec papers & word2vec.c implementation, looking up a word returns the vector that contributes to the *inputs* to the neural-network. (In skip-gram, one word's vector is the entire NN input for a single training example; in CBOW, many words' vectors are summed or averaged to form the NN input for a single training example.)
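A tiny numpy sketch of that difference (the vocabulary, seed, and `syn0` array here are made-up stand-ins for illustration, not the actual gensim internals):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"cat": 0, "sat": 1, "on": 2, "mat": 3}  # hypothetical word-to-index mapping
vector_size = 4
syn0 = rng.standard_normal((len(vocab), vector_size))  # input vectors, one row per word

# Skip-gram: one word's input vector is the entire NN input for a training example.
sg_input = syn0[vocab["cat"]]

# CBOW: the context words' input vectors are averaged (or summed) into one NN input.
context = ["cat", "on", "mat"]
cbow_input = syn0[[vocab[w] for w in context]].mean(axis=0)
```

Both `sg_input` and `cbow_input` have shape `(vector_size,)`; the difference is only in how many input rows contribute.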

Some have observed that a Word2Vec model has output representations of words that can also be useful. In particular, <http://research.microsoft.com/apps/pubs/default.aspx?id=260928> suggests such vectors are more reflective of topical similarity, and useful for determining document 'aboutness'. (Separately, <https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/> suggests that another way to get word vectors reflecting broad-topicality, rather than focused-interchangeability, is to use larger context windows.)

Such output-vectors are most clearly identifiable inside the model when using negative-sampling, where each predictable word has its own output-node with `vector_size` weights leading in. There's no formal gensim API for accessing those by string-key, but you could manually pull them from `model.syn1neg`, using the same word-to-index mappings as are used to find the `syn0` vectors. (I don't know any similarly-tidy way to pull such an 'output vector' for individual words from a hierarchical-softmax model. There might be a way to calculate one, but I'm not sure of the practicality/utility of such an exercise.)
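A rough numpy sketch of that manual lookup (the `index2word`/`word2index` mapping and the randomly initialized `syn0`/`syn1neg` arrays are stand-ins for the model's internals; in gensim you'd index `model.syn1neg` using the model's own vocabulary mapping):

```python
import numpy as np

rng = np.random.default_rng(1)
index2word = ["cat", "sat", "on", "mat"]           # stand-in for the model's index-to-word list
word2index = {w: i for i, w in enumerate(index2word)}
vector_size = 4
syn0 = rng.standard_normal((len(index2word), vector_size))     # input vectors (what model[word] returns)
syn1neg = rng.standard_normal((len(index2word), vector_size))  # output vectors (negative-sampling weights)

def output_vector(word):
    # Use the *same* word-to-index mapping as for the syn0 input vectors.
    return syn1neg[word2index[word]]

in_vec = syn0[word2index["mat"]]   # the usual lookup
out_vec = output_vector("mat")     # the manually pulled output vector
```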

- Gordon

gladys0313

Apr 5, 2016, 2:46:59 AM
to gensim
Thanks, Gordon, this is very helpful!

Best
Jing

Bhaskar Mitra

Apr 14, 2016, 12:26:10 AM
to gensim
I am one of the co-authors on the paper you cited. Just want to point out that it's the combination of the input and output embeddings (i.e., the IN-OUT cosine similarity) that produces the more topical notion of relatedness.
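That distinction can be sketched with numpy (the `syn0`/`syn1neg` matrices here are hypothetical stand-ins for the IN and OUT embeddings; the point is which pair of rows goes into the cosine):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(2)
syn0 = rng.standard_normal((4, 8))     # IN (input) vectors
syn1neg = rng.standard_normal((4, 8))  # OUT (output) vectors

# IN-IN similarity (the usual most_similar-style comparison): two input vectors.
in_in = cosine(syn0[0], syn0[1])

# IN-OUT similarity: word i's *input* vector against word j's *output* vector,
# which is the combination said to yield the more topical notion of relatedness.
in_out = cosine(syn0[0], syn1neg[1])
```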

Cheers,
Bhaskar

bob.2...@gmail.com

Oct 12, 2017, 5:40:19 PM
to gensim
Hi Bhaskar,
Does this mean the same word? e.g., CosineSimilarity(input vec of word_i, output vec of word_i)?

Further, how does this cosine similarity from "one word" reflect the topic of the whole paragraph? Is there any intuitive illustration?
Thanks
B