Question about the default estimation method of gensim's word2vec skip-gram


Sang Won Han

Oct 4, 2021, 6:54:29 PM
to Gensim
Hi,

I am trying to use word2vec to estimate skip-gram embeddings via NCE (noise contrastive estimation) rather than the conventional negative sampling method, as a recent paper did (https://asistdl.onlinelibrary.wiley.com/doi/full/10.1002/asi.24421?casa_token=uCHp2XQZVV8AAAAA%3Ac7ETNVxnpqe7u9nhLzX7pIDjw5Fuq560ihU3K5tYVDcgQEOJGgXEakRudGwEQaomXnQPVRulw8gF9XeO; the paper is also attached to this conversation). The paper has a replication GitHub repository (https://github.com/sandeepsoni/semantic-progressiveness) that relies mainly on gensim for word2vec, but the repository is not well organized, so I cannot tell how the authors implemented NCE estimation with gensim's word2vec.

The authors just used gensim's word2vec with its default settings, without passing any options, so my question is: what is the default estimation method for gensim's word2vec with skip-gram embeddings? Is it NCE? The documentation only says that there is an option for negative sampling, and that if it is set to 0, no negative sampling is used. But then which estimation method is used?
  • negative (int, optional) – If > 0, negative sampling will be used, the int for negative specifies how many “noise words” should be drawn (usually between 5-20). If set to 0, no negative sampling is used.
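
For reference, here is a minimal sketch of the objective that the "noise words" in this parameter description refer to: the standard skip-gram negative-sampling loss for a single (center, context) pair. It is written in plain Python and is not gensim's code; the function names, the vector dimension, and the choice of 5 noise words are all assumptions made for illustration. As far as I understand, NCE differs from this in that it also includes a correction term based on the noise distribution, which this sketch leaves out.

```python
import math
import random


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(center, context, noise_vectors):
    """Negative-sampling loss for one (center, context) pair.

    The observed pair contributes -log sigmoid(center . context);
    each noise word contributes -log sigmoid(-(center . noise)).
    The number of noise vectors plays the role of `negative`.
    """
    loss = -math.log(sigmoid(dot(center, context)))
    for noise in noise_vectors:
        loss -= math.log(sigmoid(-dot(center, noise)))
    return loss


# Hypothetical toy vectors, purely for illustration.
rng = random.Random(0)
dim = 8


def rand_vec():
    return [rng.uniform(-0.5, 0.5) for _ in range(dim)]


center = rand_vec()
context = rand_vec()
noise = [rand_vec() for _ in range(5)]  # 5 noise words (assumed value)

print(negative_sampling_loss(center, context, noise))
# With an empty noise list, only the positive-pair term remains.
print(negative_sampling_loss(center, context, []))
```

Every noise word adds a positive term to the loss, so a larger number of noise words means more contrastive terms per observed pair.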

Thank you in advance, and I look forward to hearing from you soon!


Best,

Sang Won

Sang Won Han

Oct 4, 2021, 6:56:17 PM
to Gensim
Whoops, I forgot to attach the corresponding paper. Here it is. Thanks!

On Tuesday, October 5, 2021 at 7:54:29 AM UTC+9, Sang Won Han wrote:
Soni, Lerman, Eisenstein_2021(2).pdf