I'm using gensim to create text summaries. I would like to extract the TextRank scores that gensim.summarization.summarizer computes during summarization.
This would help to visualize the important blocks of a text.
Unfortunately, these scores are not exposed through an API function. I can only access most_important_docs via the following code:
import gensim.summarization.summarizer
corpus = gensim.summarization.summarizer._build_corpus(sentences)
most_important_docs = gensim.summarization.summarizer.summarize_corpus(corpus, ratio=1)
most_important_docs then contains a list of lists of tuples, which seem to identify words in the corpus, something like this:
<class 'list'>: [[(3, 1), (4, 1)], [(3, 1), (7, 1)], [(3, 1)], [(3, 1)], [(3, 1)], [(3, 1), (5, 1)], [(3, 1), (6, 1)], [(0, 1)], [(1, 1), (2, 1)], [(8, 1)]]
I'm not able to make sense of this encoding or to map it back to the original sentences.
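For reference, the tuples look like gensim's bag-of-words format, where each pair is (token_id, count) and each inner list stands for one sentence. Below is a minimal sketch of mapping such vectors back to sentences, assuming the corpus was built in the same order as the sentence list. The helper name `sentences_in_rank_order`, the sample sentences, and the token ids are hypothetical and not part of gensim. It recovers only the rank order, not the numeric TextRank scores.

```python
def sentences_in_rank_order(sentences, corpus, ranked_docs):
    """Map ranked bag-of-words vectors back to the sentences they came from.

    Assumption: corpus[i] is the vector that was built from sentences[i].
    """
    # Lists are unhashable, so a tuple of the (token_id, count) pairs is the key.
    by_vector = {}
    for sentence, doc in zip(sentences, corpus):
        by_vector.setdefault(tuple(doc), []).append(sentence)

    ordered = []
    seen = set()
    for doc in ranked_docs:
        key = tuple(doc)
        if key in seen:
            continue  # sentences sharing this vector were already emitted
        seen.add(key)
        ordered.extend(by_vector.get(key, []))
    return ordered


# Hypothetical sample data (token ids are made up for illustration).
sentences = ["cats chase mice", "dogs chase cats", "birds sing"]
corpus = [
    [(0, 1), (1, 1), (2, 1)],
    [(0, 1), (1, 1), (3, 1)],
    [(4, 1), (5, 1)],
]
# Pretend the ranking step returned the vectors in this order.
ranked_docs = [corpus[1], corpus[0], corpus[2]]

print(sentences_in_rank_order(sentences, corpus, ranked_docs))
# ['dogs chase cats', 'cats chase mice', 'birds sing']
```

Note that sentences with identical vectors, such as the repeated [(3, 1)] entries above, cannot be told apart by their vector alone, so this sketch returns them together in their original order.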
Would it be possible to expose a function that returns each sentence together with its TextRank score?
Or is there another way to obtain this?
Thanks in advance!
Best regards,
Philip Gillißen