Coherence values for LDA topic modeling (u_mass)


alis_andr

May 25, 2021, 8:32:19 AM
to Gensim
Hi,

Could anyone please help me figure out how to interpret coherence values based on the u_mass metric? As far as I could find on the Internet, the range of possible values is -14 to 14 (?). Does a higher u_mass value mean higher coherence? If so, is +14 the highest achievable coherence?

I would also strongly recommend that the developers include this information in the documentation of the gensim package; it seems too important to skip.

ben.r...@gmail.com

May 25, 2021, 2:15:54 PM
to Gensim
A quick Google search shows that while the formulae for these coherence measures are well defined, their interpretations are not consistent. My takeaways are: u_mass is easier to calculate, but c_v is better correlated with the quality of the inferred topics. (And yes, u_mass should be low, c_v should be high.)

As for what value is "good," the best answers I could find are on Stack Overflow:

But if the developer puts these on the website, I'm sure that he'll be deluged by many disagreeing opinions! Maybe he should just refer to this paper and leave it at that.

Austen Mack-Crane

May 26, 2021, 12:17:12 PM
to gen...@googlegroups.com
UMass is an average across word pairs (in a topic) of log(p(wi,wj) / p(wj)). This log term will be non-positive because the probability of two words co-occurring is no greater than the probability of one word alone. So UMass is a nonpositive real number, with some meaningless lower bound pertaining to log(epsilon). UMass is greater-is-better because higher values indicate that words in a topic tend to co-occur.
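
To make that definition concrete, here is a minimal Python sketch of the average. It is not gensim's implementation. It assumes probabilities are estimated as the fraction of documents containing the word(s), that each word is paired with the higher-ranked words before it, and that a small `epsilon` (an illustrative value) is added to the joint probability to avoid log(0). The names `u_mass`, `documents`, and `epsilon` are made up for this example.

```python
import math
from itertools import combinations


def u_mass(topic_words, documents, epsilon=1e-12):
    """Average of log((p(w_i, w_j) + epsilon) / p(w_j)) over word pairs.

    topic_words: top words of one topic, most probable first.
    documents:   list of tokenized documents (lists of strings).
    epsilon:     smoothing constant (assumed value, for illustration only).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    # j indexes the higher-ranked word, i the lower-ranked one (j < i).
    for j, i in combinations(range(len(topic_words)), 2):
        w_j, w_i = topic_words[j], topic_words[i]
        p_j = prob(w_j)
        if p_j == 0:
            continue  # w_j never occurs, so the ratio is undefined; skip the pair
        scores.append(math.log((prob(w_i, w_j) + epsilon) / p_j))

    if not scores:
        raise ValueError("no scorable word pairs for this topic")
    return sum(scores) / len(scores)


# Toy corpus: "cat" and "dog" often co-occur, "cat" and "stock" never do.
docs = [["cat", "dog"], ["cat", "dog"], ["cat", "fish"], ["stock", "bond"]]
print(u_mass(["cat", "dog"], docs))    # closer to 0: more coherent
print(u_mass(["cat", "stock"], docs))  # very negative: driven by log(epsilon)
```

In this toy corpus the first topic scores roughly log(2/3), while the second is dominated by log(epsilon), which is the "meaningless lower bound" mentioned above. With the epsilon term, a pair that always co-occurs can land a hair above zero, so "nonpositive" holds only up to that smoothing.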

C_v is not very useful IMO in the context of determining n_topics. In my own usage and in what I see on the web, it behaves as though lower-is-better -- it sometimes resembles a mirror image of UMass. But the formulation in Röder et al. suggests higher-is-better. UMass or C_npmi seem to convey the same information without this interpretive confusion.


ben.r...@gmail.com

May 27, 2021, 12:33:01 AM
to Gensim
What was I saying! I meant u_mass should be high, and c_v should also be high. In my experience, more preprocessing work (better tokenizing, lemmatizing, NER) also gives me higher c_v. I haven't been measuring u_mass recently. Still, according to [1], c_v correlates best with human ratings. What do you find works best for determining the best k = NumberOfTopics?

[1] Röder, Both and Hinneburg 2015, https://dl.acm.org/doi/abs/10.1145/2684822.2685324

Jonathan Schneider

Jun 2, 2021, 10:43:03 AM
to Gensim
There are some problems associated with c_v as stated by the authors:

It's not recommended to use CV because there are known issues associated with it.

Radim Řehůřek

Jun 3, 2021, 7:11:32 AM
to Gensim
That is interesting – thank you for the links, Jonathan! Please keep us posted.

Radim