* http://radimrehurek.com/gensim/ demonstrates LSA on top of BOW,
while the tutorial runs LSA on top of TFIDF:
http://radimrehurek.com/gensim/tut2.html#transformation-interface
Also, that page says "Latent Semantic Indexing, LSI (or sometimes LSA)
transforms documents from either bag-of-words or (preferrably)
TfIdf-weighted space into a latent space of a lower dimensionality."
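To make the two input spaces concrete, here is a minimal plain-Python sketch of what I understand by "BOW" and "TFIDF" vectors. The toy corpus and the helper names (`bow_vector`, `tfidf_vector`) are made up for illustration, and the weighting (tf * log2(N/df), then unit-length normalization) is my assumption and may not match gensim's defaults exactly. It only shows what would be fed to LSI in each case; it does not settle which input is the better one.

```python
import math
from collections import Counter

# Toy corpus, invented for illustration only.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "bird", "flew"],
]

# Document frequency: in how many documents each term occurs.
doc_freq = Counter(term for doc in docs for term in set(doc))


def bow_vector(doc):
    """Bag-of-words: raw term counts."""
    return dict(Counter(doc))


def tfidf_vector(doc, doc_freq, num_docs):
    """TF-IDF (assumed scheme): tf * log2(N/df), scaled to unit length."""
    weights = {
        term: tf * math.log2(num_docs / doc_freq[term])
        for term, tf in Counter(doc).items()
    }
    # Terms that occur in every document get weight 0 and are dropped.
    weights = {term: w for term, w in weights.items() if w > 0}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()} if norm else {}


bow_vectors = [bow_vector(doc) for doc in docs]
tfidf_vectors = [tfidf_vector(doc, doc_freq, len(docs)) for doc in docs]

for bow, tfidf in zip(bow_vectors, tfidf_vectors):
    print(bow, "->", tfidf)
```

In the BOW vectors "the" counts as much as any other word, while in the TFIDF vectors it disappears and the rarer "cat" outweighs the more common "sat". As far as I can tell, in gensim the corresponding steps would be `Dictionary.doc2bow` for the first and `TfidfModel` for the second, with `LsiModel` trained on either corpus.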
* In the discussion "LDA versus LSA for computing document similarities",
more specifically in this post: http://comments.gmane.org/gmane.comp.ai.gensim/659
Radim confirms again that LDA should be run on BOW, despite the
corresponding official example running LDA on top of TFIDF:
http://radimrehurek.com/gensim/wiki.html#latent-dirichlet-allocation
What's the reason?
Other than being somewhat confused by the docs, I guess my main
question is why it's preferred to do LSI on top of TFIDF instead of
on BOW?
Dieter