Central word and context

刘士渤

Jul 15, 2024, 7:25:01 PM
to Gensim
Excuse me, if I set `window` large enough to cover the whole document, will Gensim treat every word as the central word, with all the other words as its context?

Gordon Mohr

Jul 16, 2024, 2:16:43 PM
to Gensim
See the documentation for the `window` & `shrink_windows` parameters for ways to arrange a much larger window (up to the full length of the text) in which all words have equal weight.

In particular, if `window` is larger than your longest text (and the `Word2Vec`/`Doc2Vec`/`FastText` models only support texts of up to 10,000 tokens), every word can be considered as context for every other word.

And if `shrink_windows` is turned off (set to `0` or `False`), the default technique of using a random smaller-than-`window` window for each context, which is an efficient way to weight nearer words more, is disabled. Every word within `window` token positions is then treated as an equal part of the context.
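To make the difference concrete, here is a small pure-Python sketch of how (central word, context word) pairs could be formed with and without window shrinking. It is only an illustration of the idea, not Gensim's actual implementation; the function name `context_pairs` and its `rng` argument are made up for this example.

```python
import random


def context_pairs(tokens, window, shrink_windows=True, rng=None):
    """Return (center, context) pairs for one text (illustrative sketch only)."""
    rng = rng or random.Random()
    pairs = []
    for i, center in enumerate(tokens):
        # With shrinking, each center word gets a random effective window
        # between 1 and `window`, so nearer words are sampled more often.
        # Without it, the full `window` is always used.
        effective = rng.randint(1, window) if shrink_windows else window
        start = max(0, i - effective)
        end = min(len(tokens), i + effective + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


tokens = ["the", "cat", "sat", "down"]

# Window covers the whole text and never shrinks:
# every word is paired with every other word (4 tokens x 3 others = 12 pairs).
full = context_pairs(tokens, window=len(tokens), shrink_windows=False)

# Same window with shrinking: some of the more distant pairs are usually skipped.
shrunk = context_pairs(tokens, window=len(tokens), shrink_windows=True,
                       rng=random.Random(0))
```

In the never-shrunk case each word appears as the central word once per pass, with all the other words of the text as its context.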

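Note that such large, never-shrunk windows will result in noticeably longer runtimes, since each central word is trained against every word within `window` positions rather than a random, usually smaller, subset.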
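Putting the two settings together, a setup along these lines should work. This is a rough sketch, assuming a Gensim version recent enough to have `shrink_windows` (I believe 4.1 or later); the helper names `full_context_params` and `train_full_context` are hypothetical, and `min_count=1` is only there so the toy corpus keeps all its words.

```python
def full_context_params(texts):
    """Keyword arguments that let every word see the whole text as context."""
    # A window at least as long as the longest text reaches every position.
    # Per the note above, texts are assumed to stay under 10,000 tokens.
    longest = max(len(tokens) for tokens in texts)
    return {"window": longest, "shrink_windows": False}


def train_full_context(texts, **extra):
    """Train a Word2Vec model with a full-text, never-shrunk window."""
    # Imported here so the helper above can be used without gensim installed.
    from gensim.models import Word2Vec

    return Word2Vec(sentences=texts, min_count=1,
                    **full_context_params(texts), **extra)


texts = [["the", "cat", "sat", "down"], ["dogs", "bark"]]
params = full_context_params(texts)
```

Calling `train_full_context(texts, vector_size=50, sg=1, epochs=5)` would then train a small skip-gram model with these settings; the extra keyword values are arbitrary choices for illustration.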

- Gordon