Hi there,
I have built a document-term matrix myself, without using the dictionary.doc2bow function, in order to train an LDA model. An example of the topic formula I get is the following:
(0, '0.027*"260" + 0.023*"200" + 0.022*"560"'), based on num_words=3.
I would like to know which feature (name) each index represents. I know the 'id2word' parameter would do the trick, but I have no dictionary to assign to it. The problem is that my features contain word combinations, e.g. 'pink rose', and I don't want corpora.Dictionary to treat them as separate words.
So I have a list of feature names corresponding to the column indices of my document-term matrix. I would like LDA to translate the indices in the formula to the feature names automatically, so that if I change num_words to 17, it immediately gives me a formula with names instead of indices.
Is there a way to do this? Like, a custom dictionary in the format of corpora?
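To make the idea concrete, here is a minimal sketch of the kind of mapping I have in mind. The feature names are made up for illustration. As far as I understand, 'id2word' may also accept a plain Python dict of integer ids to strings, but I have not verified that this works with my setup, so please treat the last comment as an assumption.

```python
# Hypothetical feature names; the position in the list is the column
# index in my document-term matrix.
feature_names = ["pink rose", "red tulip", "garden", "white lily"]

# Plain mapping from column index to feature name. Multi-word features
# stay intact because nothing is tokenized here.
id2word = dict(enumerate(feature_names))

print(id2word[0])

# Assumption: this dict could then be passed to the model, e.g.
# LdaModel(corpus, id2word=id2word, num_topics=...), instead of a
# corpora.Dictionary.
```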
If it is not possible within gensim, how can I translate the indices 'manually'? I'm not very skilled in Python, so I hope someone can help me with this.
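By 'manually' I mean something like the sketch below, which only uses the standard library. The function name translate_topic and the feature names are hypothetical; the dict only covers the three indices from my example formula, and I assume every index appears in double quotes as in that output.

```python
import re

# Hypothetical names for the three indices in the example topic.
feature_names = {260: "pink rose", 200: "red tulip", 560: "garden"}

def translate_topic(topic_string, names):
    """Replace each quoted index in a topic string with its feature name."""
    def lookup(match):
        index = int(match.group(1))
        # Keep the original index if no name is known for it.
        return '"{}"'.format(names.get(index, match.group(1)))
    # Only quoted integers are matched, so the weights are left alone.
    return re.sub(r'"(\d+)"', lookup, topic_string)

topic = '0.027*"260" + 0.023*"200" + 0.022*"560"'
print(translate_topic(topic, feature_names))
# 0.027*"pink rose" + 0.023*"red tulip" + 0.022*"garden"
```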
Regards,