You're correct that the NN-training aims to minimize its prediction-task errors over the course of training. However, neither the original word2vec.c code (upon which the gensim implementation is modeled) nor any other implementation I've seen makes the errors for individual examples, or for a full training epoch, easy to access.
The `neu1e` variable you've noted is the error as it back-propagates to the word-representation layer (`syn0`); the error on the output layer (the predictions) appears a few lines earlier in those methods, as `ga` or `gb` (depending on whether the hierarchical-softmax or negative-sampling branch is taken).
The `score_*_pair` and `score_sentence_*` methods may be of some use to your analysis, as they tally prediction-errors to determine how well new text examples match a trained model's expectations. Note, though, that they're currently only implemented for hierarchical-softmax mode.
- Gordon