Retrieving gamma and lambda matrices

Joseph Emmens

Oct 27, 2022, 8:30:12 AM
to Gensim
Hello Gensim Team,

Thanks for the amazing package and the support you provide. It is invaluable to us lowly researchers who use your tools.

I am estimating an author-topic model and, after the model has converged, I want to retrieve the final values of the variational parameters: the (num authors * num topics) gamma matrix and the (num topics * num tokens) lambda matrix. (My ultimate goal is to build the entire (d * v * a * k) phi matrix, but given that you use the approach of Lee and Seung, "Algorithms for non-negative matrix factorization", NIPS 2001, I have decided it will be easier to get the gammas and lambdas and calculate phi from the optimised values.)
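A minimal sketch of that last step, assuming the usual variational update in which phi for a word v in document d is proportional to exp(E[log theta_ak]) * exp(E[log beta_kv]) and is normalised jointly over the document's authors a and the topics k. The function name and argument names are hypothetical, and the inputs are assumed to be already exponentiated expectations (2-D lists or arrays), not raw gamma and lambda.

```python
def phi_for_document(exp_elog_theta, exp_elog_beta, author_ids, word_ids):
    """Return phi[v][(a, k)] for one document.

    exp_elog_theta: (num_authors x num_topics), exp(E[log theta])  (assumed input)
    exp_elog_beta:  (num_topics x num_terms),   exp(E[log beta])   (assumed input)
    author_ids:     ids of the authors of this document
    word_ids:       ids of the distinct terms in this document
    """
    num_topics = len(exp_elog_beta)
    phi = {}
    for v in word_ids:
        # Unnormalised weight for every (author, topic) pair of this word.
        weights = {
            (a, k): exp_elog_theta[a][k] * exp_elog_beta[k][v]
            for a in author_ids
            for k in range(num_topics)
        }
        # Small constant guards against division by zero.
        norm = sum(weights.values()) + 1e-100
        phi[v] = {pair: w / norm for pair, w in weights.items()}
    return phi
```

Looping over documents with each document's own author list and word ids would then fill in the full (d * v * a * k) structure, which is sparse because a document only involves its own authors and words.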

Just as a note, I have read the conversation called "difference between lda.expElogbeta and lda.show_topics ?", but that is from 2013 and its content no longer appears to be applicable. If I am wrong about that, my apologies.

My question is simply how can I retrieve both matrices gamma and lambda after convergence? 

I have been playing around with

gamma, stats = model.inference(corpus, dictionary, author2doc, doc2author, rhot, collect_sstats=True)

where I set rhot to the final pre-convergence value listed in the log file. But I get a gamma matrix whose number of rows does not match the number of authors in the corpus. I am loading the entire corpus as one chunk.

Forgive my ignorance, but if there are any materials or previous examples I could go over that would be amazing. 

Thanks again,

Joseph Emmens

Nov 3, 2022, 3:33:18 AM
to Gensim
Hey all,

Just a note in case anyone searches for something similar in the future: the answer is very simple.

Using datapath (from gensim.test.utils) to build a file path, you can save the trained model together with the expElogbeta and gamma matrices using

temp_file = datapath("trained_model")
model.save(temp_file)

This will give you two numpy files, one for each matrix. I only asked about lambda because I wanted to estimate expElogbeta; however, if you want lambda explicitly, check out the get_lambda() and get_Elogbeta() methods of LdaState in the ldamodel module, which the author-topic model imports.
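For anyone who would rather read the matrices straight from the model object in memory, here is a small sketch. It assumes the trained model exposes a gensim-style state object with a gamma attribute and the get_lambda() / get_Elogbeta() methods mentioned above; the helper name is made up for illustration.

```python
def extract_variational_parameters(model):
    """Collect gamma, lambda and E[log beta] from a trained model.

    Assumes `model.state` has a `gamma` attribute and the methods
    `get_lambda()` and `get_Elogbeta()` (as gensim's LdaState does).
    """
    state = model.state
    return {
        "gamma": state.gamma,              # expected shape: (num authors, num topics)
        "lambda": state.get_lambda(),      # expected shape: (num topics, num tokens)
        "Elogbeta": state.get_Elogbeta(),  # E[log beta]; exponentiate for expElogbeta
    }
```

Calling extract_variational_parameters(model) on the trained author-topic model should then return all three in one dictionary, without rerunning inference.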