Simply put, I ran LDA on a document collection and on subsets of it. Each subset is generated (after the original model has been trained on the complete collection) by filtering out documents whose maximum topic weight is below a certain threshold (sometimes called "low-quality" documents). I tested different threshold values and calculated topic coherence (u_mass and c_v) for the resulting models. Here are the results (x-axis is threshold):
#topics = 10

| threshold | #docs |
| --- | --- |
| 0 | 4095 |
| 0.1 | 4095 |
| 0.2 | 4094 |
| 0.3 | 3865 |
| 0.4 | 3082 |
| 0.5 | 2077 |
| 0.6 | 1337 |
| 0.7 | 780 |

#topics = 30

| threshold | #docs |
| --- | --- |
| 0 | 4095 |
| 0.1 | 4094 |
| 0.2 | 3982 |
| 0.3 | 3070 |
| 0.4 | 1980 |
| 0.5 | 1169 |
| 0.6 | 647 |
| 0.7 | 363 |
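For reference, the filtering step described at the top can be sketched roughly as follows. This is a minimal illustration, not my exact code: `filter_by_max_topic_weight` is a hypothetical helper, and the toy `doc_topics` list stands in for the per-document topic distributions of the original model (assuming gensim, something like `lda.get_document_topics(bow, minimum_probability=0)` for each document).

```python
def filter_by_max_topic_weight(docs, doc_topics, threshold):
    """Keep documents whose largest topic weight is at least `threshold`.

    docs       -- list of documents (any representation)
    doc_topics -- one list of (topic_id, weight) pairs per document
    """
    kept = []
    for doc, topics in zip(docs, doc_topics):
        # A document with no topic entries is treated as weight 0.0.
        max_weight = max((weight for _, weight in topics), default=0.0)
        if max_weight >= threshold:
            kept.append(doc)
    return kept


# Toy stand-in data (hypothetical, only to show the shape of the inputs).
docs = ["doc_a", "doc_b", "doc_c"]
doc_topics = [
    [(0, 0.90), (1, 0.10)],
    [(0, 0.55), (1, 0.45)],
    [(0, 0.35), (1, 0.33), (2, 0.32)],
]

for threshold in (0.0, 0.4, 0.6):
    subset = filter_by_max_topic_weight(docs, doc_topics, threshold)
    print(threshold, len(subset))
    # A new LDA model is then trained on `subset`, and coherence is computed
    # on that model (with gensim, presumably via CoherenceModel using
    # coherence='u_mass' or coherence='c_v').
```

Each row in the tables above corresponds to one such threshold and the number of documents that survive the filter.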
Both settings yield a similar pattern:
- For u_mass, coherence reaches a peak and then trends downward as the threshold increases.
- For c_v, coherence increases monotonically.
I know that several values are supported for the coherence measure: c_v is said to give the best results, while u_mass is faster to compute.
But what exactly are the differences among these values ('u_mass', 'c_v', 'c_uci', and 'c_npmi')?
How can the patterns described above be explained?
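As far as I understand, these measures are built from different word-pair scores over corpus counts: u_mass uses a smoothed log conditional probability based on document co-occurrence, while c_npmi uses normalized pointwise mutual information (and c_v, as I understand it, builds on NPMI computed over a sliding window). The sketch below illustrates only the two pair scores on a toy corpus. It is a simplification under stated assumptions: whole documents are used instead of sliding windows, the smoothing constants are my own choice, and the function names are hypothetical rather than any library's API.

```python
import math


def doc_freq(docs, *words):
    """Number of documents that contain all of the given words."""
    return sum(1 for doc in docs if all(w in doc for w in words))


def umass_pair(docs, w_i, w_j, eps=1.0):
    """u_mass-style score: log((D(w_i, w_j) + eps) / D(w_j)).

    Asymmetric and at most about 0; eps=1 is an assumed smoothing constant.
    """
    return math.log((doc_freq(docs, w_i, w_j) + eps) / doc_freq(docs, w_j))


def npmi_pair(docs, w_i, w_j, eps=1e-12):
    """NPMI score: PMI(w_i, w_j) / -log(P(w_i, w_j)), in roughly [-1, 1].

    Probabilities are document frequencies here (an assumption); window-based
    co-occurrence counts would be used in a fuller implementation.
    """
    n = len(docs)
    p_i = doc_freq(docs, w_i) / n
    p_j = doc_freq(docs, w_j) / n
    p_ij = doc_freq(docs, w_i, w_j) / n
    pmi = math.log((p_ij + eps) / (p_i * p_j))
    return pmi / -math.log(p_ij + eps)


# Toy corpus: each document is a set of tokens.
toy_docs = [
    {"cat", "dog"},
    {"cat", "dog", "fish"},
    {"cat"},
    {"fish"},
]

print("u_mass(dog | cat):", umass_pair(toy_docs, "dog", "cat"))
print("NPMI(cat, dog):   ", npmi_pair(toy_docs, "cat", "dog"))
print("NPMI(cat, fish):  ", npmi_pair(toy_docs, "cat", "fish"))
```

Both scores depend on counts from the reference corpus, so changing which documents remain after filtering changes the statistics that the coherence value is computed from, not only the model.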
Many thanks!