On Feb 20, 8:37 am, PaulR <p...@rudin.co.uk> wrote:
> What does the proportion of documents converging tell me when trying
> to train an LDA model?
The online mini-batch training doesn't wait for complete
convergence. Rather, it iterates only until the variational parameters
stop changing much, where "much" is by default
self.VAR_THRESH==0.001. In degenerate cases this could take a long
time, so there is also a "force switch" that stops inference after
self.VAR_MAXITER==50 iterations, even if gamma is still changing.
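To make the two knobs concrete, here is a toy sketch of the per-document stopping rule (not gensim's actual inference code; `converge_doc` and the fixed-point updates are illustrative stand-ins):

```python
import numpy as np

VAR_THRESH = 0.001   # stop once gamma changes less than this per iteration
VAR_MAXITER = 50     # "force switch": give up after this many iterations

def converge_doc(update_step, gamma0):
    """Iterate a per-document update until gamma stops changing "much",
    or VAR_MAXITER iterations pass. Returns (gamma, converged_flag)."""
    gamma = gamma0
    for _ in range(VAR_MAXITER):
        new_gamma = update_step(gamma)
        if np.mean(np.abs(new_gamma - gamma)) < VAR_THRESH:
            return new_gamma, True  # this document counts as "converged"
        gamma = new_gamma
    return gamma, False  # hit the force switch, counts as "not converged"

# A well-behaved update (a contraction) converges quickly...
gamma, ok = converge_doc(lambda g: 0.5 * g + 1.0, np.array([10.0]))
# ...while a slowly drifting one trips the force switch instead.
gamma2, ok2 = converge_doc(lambda g: g + 0.01, np.array([0.0]))
```

The "651/1000 converged" message is simply counting how many documents in the batch returned with the flag set to True.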
> At the end of a run (200k documents) I see "651/1000 documents
> converged with 50 iterations". Presumably this is not a good thing -
> we should be hoping for 1000/1000 or something close?
Yes.
> Would tweaking some of the parameters help? Does the convergence
> failure indicate something about my data? Would just using more data
> help?
Yes. Yes. And yes. :)
You can 1) increase self.VAR_MAXITER -- just set it to a higher value:
`lda = LdaModel(corpus=None, ..); lda.VAR_MAXITER = 100;
lda.update(corpus)`. The longer your documents are (= the more unique
words they contain), the higher you can set self.VAR_MAXITER. Or 2),
like you say, just give it more training data.
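A quick toy model of option 1), assuming each document's per-iteration change shrinks geometrically at some document-specific rate (slower rates playing the role of longer documents); the function names and rates are made up for illustration:

```python
import numpy as np

def fraction_converged(rates, var_maxiter, var_thresh=0.001, start=1.0):
    """Toy batch: document d's per-iteration parameter change shrinks by
    factor rates[d]. Count how many documents drop below var_thresh
    within var_maxiter iterations -- the "651/1000" number in the log."""
    converged = 0
    for r in rates:
        change = start
        for _ in range(var_maxiter):
            change *= r
            if change < var_thresh:
                converged += 1
                break
    return converged / len(rates)

rng = np.random.default_rng(0)
rates = rng.uniform(0.6, 0.99, size=1000)  # slow rates ~ long documents

frac_50 = fraction_converged(rates, var_maxiter=50)
frac_100 = fraction_converged(rates, var_maxiter=100)
```

Raising the iteration cap lets the slower (longer) documents finish, so `frac_100` comes out higher than `frac_50` -- the same effect you'd hope to see in the real training log.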
Both 1) and 2) actually work in a similar way. Seeing more documents
with a similar structure via 2) is a bit like spending extra time on
the same documents via 1).
But 2) is more flexible: if the slowly converging batch is an
outlier in the overall online data stream, you won't waste so much
time on it.
HTH,
Radim