How to normalize syn1neg vectors?

mralex...@gmail.com

Mar 9, 2019, 5:04:24 PM

to Gensim

I trained a gensim model with negative sampling, and I want to look at the trained vectors (model.vectors and model.trainables.syn1neg). I can normalize the first by calling the init_sims method, but is there an API I could use to normalize the second? And ideally, after sufficient training, should those two vectors be close?

Gordon Mohr

Mar 9, 2019, 6:09:44 PM

to Gensim

It's not a common need, so there's no API to do that. You could look at the source code for `init_sims()` and mimic what it does to unit-normalize `syn1neg` yourself. You can review the source in your own local gensim installation, or online:

https://github.com/RaRe-Technologies/gensim/blob/029a1338afc1a243e39500b74aad45da46c0057d/gensim/models/keyedvectors.py#L1314

(Note that the varied raw magnitudes in both `vectors` and `syn1neg` are a necessary part of the model's training state during training and may have some interpretive value for some applications, so don't clobber/replace those values unless you're done training and are sure you won't want the raw magnitudes.)
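As a rough illustration, here is a minimal sketch of L2 row normalization with NumPy, which is my understanding of what `init_sims()` does for `vectors`. The helper name `unit_normalize_rows` and the random stand-in array are hypothetical; with a real model you would pass `model.trainables.syn1neg` instead. It returns a new array, so the raw magnitudes are left untouched.

```python
import numpy as np


def unit_normalize_rows(matrix):
    """Return a new float32 array in which every row has L2 norm 1.

    Hypothetical helper, not part of gensim. The input is not modified.
    """
    matrix = np.asarray(matrix, dtype=np.float32)
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    # Guard against division by zero: all-zero rows are left as zeros.
    norms[norms == 0] = 1.0
    return matrix / norms


# Stand-in for model.trainables.syn1neg (assumed shape: vocab_size x vector_size).
rng = np.random.default_rng(0)
syn1neg = rng.normal(size=(5, 4)).astype(np.float32)

# Keep the result in a separate variable rather than overwriting syn1neg.
syn1neg_norm = unit_normalize_rows(syn1neg)
```

Holding the normalized copy in its own variable means you can keep training, or inspect the raw magnitudes, later on.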

It's not inherently the case that the traditional word-vector (the 'input' or 'projection' vector in `vectors`) will become "close" to the negative-sampling-mode output-node-weights vector for that same word (in `syn1neg`) after "sufficient training".
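If you want to check this empirically on your own model, one option is to compute the cosine similarity between each word's row in `vectors` and the same row in `syn1neg`. The sketch below assumes the two arrays have the same shape and that row `i` refers to the same word in both; the helper name `rowwise_cosine` and the random stand-in arrays are hypothetical, and it assumes no all-zero rows.

```python
import numpy as np


def rowwise_cosine(a, b):
    """Cosine similarity between a[i] and b[i] for every row i.

    Hypothetical helper, not part of gensim. Assumes equal shapes
    and no all-zero rows.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    dots = np.einsum("ij,ij->i", a, b)  # per-row dot products
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return dots / denom


# Stand-ins for model.vectors and model.trainables.syn1neg.
rng = np.random.default_rng(1)
vectors = rng.normal(size=(6, 8))
syn1neg = rng.normal(size=(6, 8))

# One similarity value per word, each in the range [-1, 1].
sims = rowwise_cosine(vectors, syn1neg)
```

Looking at the distribution of `sims` over the vocabulary (for example its mean and spread) would show how close the two sets of vectors actually are for a given model.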