First off, gensim is beautiful. And I am a newbie. I do know statistics, but I am new to the world of unsupervised learning, particularly topic modelling.
I am running my own tweaked version of ldamodel which plots topic diff against the number of passes, so that I can use trial and error on the number of passes until the model shows some kind of convergence.
Things are working all right, but I have a few questions:
1. In some related posts on model convergence, topic_diff was highlighted as one way to show that the model has converged. As I understand it, it measures how different the topic distribution is after a new chunk has been processed, compared with the one before that chunk. If the topic diff becomes constant or near-constant, the model has converged. This is fine, but I am sure there must be other ways to show convergence than, say, setting the number of passes to 200 and then checking a graph (matplotlib) to find the optimum number of passes for a specific corpus. I heard of VARMAXTER or something (not sure) in one of Radim's replies in other posts, but I couldn't find it anywhere in gensim. Any ideas there would be appreciated.
Note: I am running ldamodel on a corpus of around 4k documents. My idea is to keep adding documents and updating the corpus. However, when the number of documents and the number of passes are small, gensim gives me a warning asking me to increase either the number of passes or the iterations. This is fine, and it is clear from the code as well. Hence I set the number of passes to 200 and then check my plot for convergence.
2. In some replies on a related topic, I saw mention of a log message along the lines of "model converging on xx/xx documents". I have never seen this kind of log in the output of my runs. Does that mean my model is not converging at all? Or has it been removed from the gensim package? I am asking because I could not find a statement like this in any script in the gensim package. Any confirmation here would be appreciated.
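
To make question 1 more concrete: instead of eyeballing the plot, what I have in mind is something like the sketch below. It assumes I already collect one topic diff value per pass in a plain list. The function name first_stable_pass and the tol and patience thresholds are made up by me for illustration; they are not part of gensim.

```python
from typing import Optional, Sequence


def first_stable_pass(
    diffs: Sequence[float], tol: float = 1e-3, patience: int = 3
) -> Optional[int]:
    """Return the 1-based pass at which the topic diff curve flattens out.

    diffs[k] is the topic diff recorded after pass k + 1 (my own bookkeeping,
    not a gensim structure). The curve counts as flat once `patience`
    consecutive pass-to-pass changes are all smaller than `tol`.
    Returns None if that never happens.
    """
    run = 0  # number of consecutive small changes seen so far
    for i in range(1, len(diffs)):
        if abs(diffs[i] - diffs[i - 1]) < tol:
            run += 1
            if run >= patience:
                # The plateau started `patience` changes ago.
                return i - patience + 1
        else:
            run = 0
    return None


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    example = [0.9, 0.5, 0.2, 0.1, 0.1001, 0.1002, 0.1001]
    print(first_stable_pass(example, tol=1e-3, patience=3))
```

This only automates reading the plot, so it still needs a long run to begin with, which is why I am asking whether gensim has a more direct convergence criterion.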