...
number_of_words = sum(cnt for document in test_corpus for _, cnt in document) # calculate number of words in test set
for k in range(5, 21, 5):
    model = ldamodel.LdaModel(corpus=training_corpus, id2word=dictionary, num_topics=k)
    perplexity = numpy.exp2(-model.bound(test_corpus) / number_of_words) # calculate perplexity
...
| topics | perplexity |
| --- | --- |
| 5 | 383.145163 |
| 10 | 502.494997 |
| 15 | 675.212526 |
| 20 | 949.664566 |
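
To sanity-check the arithmetic separately from the model, the per-word perplexity formula from the loop can be run on made-up numbers. This is only a sketch: `count_words`, `perplexity_from_bound`, `toy_corpus` and `toy_bound` are hypothetical names and values, not part of gensim, and `2.0 ** x` is used as a plain-Python stand-in for `numpy.exp2(x)`.

```python
def count_words(bow_corpus):
    """Total token count of a bag-of-words corpus of (token_id, count) pairs."""
    return sum(cnt for document in bow_corpus for _, cnt in document)


def perplexity_from_bound(bound, n_words):
    """Per-word perplexity 2 ** (-bound / n_words), mirroring the formula in the loop.

    `bound` stands in for the value returned by model.bound(test_corpus).
    """
    return 2.0 ** (-bound / n_words)


# Hypothetical toy corpus: two documents in gensim's bag-of-words layout.
toy_corpus = [[(0, 2), (1, 1)], [(1, 3), (2, 2)]]
n_words = count_words(toy_corpus)  # 2 + 1 + 3 + 2 = 8 tokens

# Made-up bound value, chosen only so the arithmetic is easy to follow.
toy_bound = -64.0
print(n_words, perplexity_from_bound(toy_bound, n_words))
```

With these numbers the per-word bound is -64 / 8 = -8, so the perplexity is 2^8 = 256. A bound that is lower (more negative) per word gives a higher perplexity.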