LDA perplexity - different number of topics


Christoph Winkler

Nov 12, 2017, 1:42:05 PM
to gensim
Hi,

I would like to compare perplexity values for different numbers of topics. Currently I use a training set and a test set, and I normalize the lower bound of the test set by dividing it by the number of words, so the perplexity values should be comparable across different test sets.
But I've read some posts here saying that gensim's perplexity values cannot be compared across different numbers of topics.


...
# total number of words in the test set
number_of_words = sum(cnt for document in test_corpus for _, cnt in document)

for k in range(5, 21, 5):
    model = ldamodel.LdaModel(corpus=training_corpus, id2word=dictionary, num_topics=k)
    # per-word perplexity from the variational lower bound
    perplexity = numpy.exp2(-model.bound(test_corpus) / number_of_words)
...



topics  perplexity
5       383.145163
10      502.494997
15      675.212526
20      949.664566

The perplexity increases, but as far as I understand it should decrease. Can someone give a concrete example of how to make the results comparable, please?

Ivan Menshikh

Nov 13, 2017, 12:23:29 AM
to gensim
Hi Christoph,

Unfortunately, perplexity isn't a good metric; have a look at the similar thread.
I'd advise you to use a downstream task and evaluate end2end (best solution), or try topic coherence.

Christoph Winkler

Nov 13, 2017, 3:40:00 AM
to gensim
Thank you for your answer. I know about coherence, but I would like to test perplexity (the rate of perplexity change), too. What does end2end mean?

Ivan Menshikh

Nov 13, 2017, 5:08:48 AM
to gensim
Standard use case for topic modeling:
- Fit a vector representation (per-document topic distributions)
- Use this representation as the input of another algorithm that solves a concrete (supervised) task

I suggest evaluating the quality of the final task (instead of the direct quality of the topic model).