I'm just starting out on my journey with word2vec. I have a fairly large corpus (~6m lines/sentences) of text that I am using to train a word2vec model.
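For context, this is roughly how I'm training it (a minimal sketch assuming gensim; the file name and parameters are just placeholders):

```python
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# One whitespace-tokenized sentence per line of the corpus file.
sentences = LineSentence("corpus.txt")

# Default CBOW with negative sampling (gensim 4.x parameter names).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=5, workers=4)
model.save("w2v.model")
```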
I have an idea that I can use the model to infer words from a string of untokenized text. For example, given "astringoftext", I should be able to generate "a string of text" as the most likely tokenization of the string. The way I see this working is to evaluate every possible tokenization of the string and keep the one with the best score (a rough sketch of that search is below). For example, "astrin gof te xt" should not score as well as "a string of text".
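In brute-force form, the search I'm imagining is something like this (score_tokens is a hypothetical scoring function, which is the part I'm actually asking about; enumerating every split is exponential, so this is only to illustrate the idea):

```python
def segmentations(s):
    """Yield every possible tokenization of s as a list of substrings."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        for rest in segmentations(s[i:]):
            yield [s[:i]] + rest

def best_tokenization(s, score_tokens):
    # score_tokens: hypothetical function mapping a list of tokens to a score.
    return max(segmentations(s), key=score_tokens)
```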
The way I plan to calculate the score is to compute the probability of every word/token given a context. So, for example, the probability of "string" given the context ["a", "of", "text"], the probability of "string" given ["a", "oftext"], etc., and then keep the tokens with the highest probabilities that form a valid tokenization of the input string.
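If I end up using gensim's Word2Vec (an assumption on my part), predict_output_word() looks like the closest built-in to what I want: as far as I can tell it returns (word, probability) pairs for a bag of context words, though only for models trained with negative sampling and only for in-vocabulary words:

```python
# Hypothetical usage, assuming a trained gensim Word2Vec model called `model`.
# predict_output_word() returns the top-n (word, probability) pairs for the
# given context words, i.e. roughly P("string" | ["a", "of", "text"]).
candidates = model.predict_output_word(["a", "of", "text"], topn=20)
print(candidates)  # e.g. [("string", 0.03), ("sentence", 0.01), ...]
```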
I figure word2vec can help me with this problem because of the way it trains: it calculates the probabilities of words given a context.
However, I'm struggling to work out how to use the trained model to give me the probability of a word given a context. I understand that a word2vec model can identify synonyms, related concepts, and analogies. But I guess my question boils down to: how do I access the trained hidden layer(?) so that I can compute probabilities of potentially previously unseen words, and thus the probability of a word given a context? Is it something like the sketch below? (If that makes sense.)
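Here's my guess at what "using the hidden layer" might look like, assuming a gensim CBOW model trained with negative sampling (so the output weights would live in model.syn1neg); I'm not sure this is the right way to read those weights, which is really what I'm asking:

```python
import numpy as np

context = ["a", "of", "text"]

# CBOW hidden-layer activation: the average of the input vectors of the context words.
h = np.mean([model.wv[w] for w in context], axis=0)

# One raw score per vocabulary word, then a softmax to turn them into probabilities.
logits = model.syn1neg @ h
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Probability of "string" given the context (only works for in-vocabulary words).
p_string = probs[model.wv.key_to_index["string"]]
print(p_string)
```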
Thanks