Generating precision and recall for Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA)


Rooster Tumeng

Jan 20, 2016, 6:05:11 AM
to gensim
Hi Radim,

I am working on a journal paper that evaluates your LSA and LDA implementations for recovering requirements traceability links, using precision, recall, and the F1 measure.

Could you provide any hints on how to do so, similar to what you did at http://radimrehurek.com/data_science_python/ but using LSA and LDA instead? If you could demonstrate how to automate the precision and recall calculation using the same corpus as in Deerwester et al. (1990), "Indexing by Latent Semantic Analysis", I would highly appreciate it.

Regards,
Rooster



Christopher S. Corley

Jan 20, 2016, 9:39:12 AM
to gensim
Hi Rooster,

There are no calculations for those particular measurements built into Gensim. That guide uses implementations found in scikit-learn: http://scikit-learn.org/stable/modules/classes.html#classification-metrics

To collect the data required to use those methods, take a look at the tutorials found on http://radimrehurek.com/gensim/tutorial.html
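As a rough sketch of what that could look like once the data is collected: the snippet below assumes you already have one similarity score per candidate link (for example, from a Gensim similarity query) and a matching list of ground-truth labels. The scores, labels, and the 0.5 threshold are made-up values for illustration only, not output from any real model or corpus.

```python
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical similarity scores for six candidate links (assumed values,
# e.g. cosine similarities returned by a similarity index).
scores = [0.92, 0.75, 0.40, 0.61, 0.15, 0.55]

# Hypothetical ground truth: 1 = the link really exists, 0 = it does not.
y_true = [1, 1, 1, 0, 0, 0]

# Assumed cut-off: a candidate counts as a predicted link if its score
# reaches the threshold. The right value depends on your data.
threshold = 0.5
y_pred = [1 if s >= threshold else 0 for s in scores]

# average="binary" reports the metrics for the positive class (label 1).
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)

print("precision=%.3f recall=%.3f f1=%.3f" % (precision, recall, f1))
```

With these made-up numbers there are 2 true positives, 2 false positives, and 1 false negative. Repeating the calculation over a range of thresholds gives the points of a precision/recall curve.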

Out of curiosity, what are your subject systems?

Chris.


Rooster Tumeng

Jan 20, 2016, 9:48:10 AM
to gensim
Hi Chris,

I am not familiar with using scikit-learn to generate precision and recall. Is there an example based on http://radimrehurek.com/gensim/tutorial.html that generates precision and recall, for understanding purposes?
 
My subject systems are some undergraduate students' systems and previous research projects at my university, namely the SMART Mushroom House Management System and the Robotic Wheelchair System.

Christopher S. Corley

Jan 20, 2016, 11:23:11 AM
to gensim
I suppose an example would depend on the approach and the purpose (i.e., traceability link recovery vs. searching for similar documents).
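For the traceability link recovery case, one common way to frame it is as ranked retrieval: each requirement is a query, the model returns artifacts ordered by similarity, and precision and recall are computed over the top k results. The sketch below is plain Python under that assumption. The helper name `precision_recall_at_k`, the artifact identifiers, and the relevance set are all hypothetical placeholders, not taken from any real system.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall over the top-k entries of a ranked list.

    ranked   -- artifact ids ordered from most to least similar
    relevant -- set of artifact ids that are true links for the query
    """
    top_k = ranked[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / float(len(top_k)) if top_k else 0.0
    recall = hits / float(len(relevant)) if relevant else 0.0
    return precision, recall


# Hypothetical ranking for one requirement (ids are invented).
ranked = ["design_3", "design_1", "code_7", "design_2", "code_4"]

# Hypothetical answer set: artifacts an expert linked to that requirement.
relevant = {"design_1", "design_2", "code_9"}

for k in (1, 3, 5):
    p, r = precision_recall_at_k(ranked, relevant, k)
    print("k=%d precision=%.3f recall=%.3f" % (k, p, r))
```

Averaging these values over all requirements gives a single figure per cut-off. For the similar-document search case, the same function applies with documents in place of artifacts.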