I am not familiar with Python and was hoping for a nudge in the right direction in tracking down a warning I get when trying to generate and evaluate an LdaModel:
warning (from warnings module):
File "/Library/Python/2.7/site-packages/gensim-0.10.3-py2.7-macosx-10.10-intel.egg/gensim/models/ldamodel.py", line 474
perwordbound = self.bound(chunk, subsample_ratio=subsample_ratio) / (subsample_ratio * corpus_words)
RuntimeWarning: divide by zero encountered in double_scalars
I am using the default parameters for LdaModel (chunksize = 2000, etc.), so from looking at the code it seems subsample_ratio can't be 0, given this definition:
subsample_ratio = 1.0 * total_docs / len(chunk)
I assume this means "corpus_words" is somehow evaluating to 0, but with my near-zero knowledge of Python I haven't yet worked out how this line could do that:
corpus_words = sum(cnt for document in chunk for _, cnt in document)
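As far as I can tell, that one-liner is a nested generator expression, and written out as ordinary loops it would look roughly like the sketch below. The function name count_corpus_words and the sample data are made up for illustration and are not part of gensim; I am assuming each document is a list of (token_id, count) pairs.

```python
def count_corpus_words(chunk):
    """Unrolled form of: sum(cnt for document in chunk for _, cnt in document)."""
    total = 0
    for document in chunk:       # assumed: each document is a list of (token_id, count) pairs
        for _, cnt in document:  # ignore the token id, keep only the count
            total += cnt
    return total


# Made-up bag-of-words chunk: three documents, one of them empty.
sample_chunk = [[(0, 2), (3, 1)], [], [(1, 4)]]

unrolled_total = count_corpus_words(sample_chunk)
generator_total = sum(cnt for document in sample_chunk for _, cnt in document)

# A chunk in which every document is empty sums to 0.
empty_total = count_corpus_words([[], []])
```

Both forms should give the same total, and the total is 0 only when no document in the chunk contributes any counts.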
I am not used to this looping syntax, but my best guess is that it is the total count of words across all documents in the current chunk. If that's the case, I would think it would have to encounter a chunk of 2000 empty documents to cause this, which doesn't seem likely.
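One way I could test that guess is to walk the corpus in chunks myself and report any chunk whose word total is 0. This is only a rough sketch: zero_word_chunks is a hypothetical helper of my own, not a gensim function, and it assumes the corpus is an iterable of documents in (token_id, count) form and that chunks are taken consecutively. In this sketch the last chunk can be shorter than chunksize, so it would need far fewer empty documents to sum to 0.

```python
from itertools import islice


def zero_word_chunks(corpus, chunksize=2000):
    """Yield (chunk_index, num_docs) for each chunk whose total word count is 0.

    Hypothetical helper for debugging; it only mimics consecutive chunking.
    """
    iterator = iter(corpus)
    index = 0
    while True:
        chunk = list(islice(iterator, chunksize))  # next `chunksize` documents
        if not chunk:
            break
        corpus_words = sum(cnt for document in chunk for _, cnt in document)
        if corpus_words == 0:
            yield index, len(chunk)
        index += 1


# Made-up corpus: with chunksize=2, the second chunk holds only empty documents.
toy_corpus = [[(0, 1)], [(1, 2)], [], [], [(2, 1)]]
suspect = list(zero_word_chunks(toy_corpus, chunksize=2))
```

On the real corpus I would call it with the default chunksize and look at which chunk indices, if any, come back.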
Thanks
Kevin