Using the Gensim library, we can load a saved model and update its vocabulary when new sentences are added. That means that if you save the model, you can continue training it later. I checked this with sample data. Say my vocabulary contains a previously trained word (e.g. "women"). I then take some new sentences and call model.build_vocab(new_sentence, update=True) followed by model.train(new_sentence), and the model is updated. In new_sentence, some words already exist in the previous vocabulary ("women") and some do not ("girl"). After updating the vocabulary, the model contains both the old and the new words. I checked model.wv['women']: its vector changes after the vocabulary update and training on the new sentences. I also get an embedding vector for the new word, i.e. model.wv['girl']. The vectors of all other previously trained words that do not appear in new_sentence are unchanged.
However, I don't understand in depth how online training works internally. I understand the code, but I want to understand how online training works in theory. Does it retrain the model from scratch on both the old and the new training data?

