Hi all,
We split the questions into individual sentences. For each sentence, we first retrieve the top-10 Wikipedia articles from all of Wikipedia using TF-IDF scoring. Then, inside these articles, we retrieve the top-10 paragraphs with TF-IDF scoring as candidates. After that, we use TAGME to extract all entities linked to Wikipedia titles for each retrieved paragraph.
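For anyone who wants to experiment with a similar pipeline, here is a minimal sketch of the two-stage TF-IDF retrieval described above. It is not the code used to build the resource. The function names (`tfidf_rank`, `retrieve_candidates`), the smoothed IDF formula, cosine scoring, and the choice to pool paragraphs across the retrieved articles before taking the top 10 are all assumptions made for illustration. The TAGME entity-linking step is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs; a real system would likely do more.
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_rank(query, docs, k=10):
    """Return indices of up to k docs, ranked by TF-IDF cosine similarity."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of docs that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant, not necessarily the one used originally).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        tf = Counter(t for t in tokens if t in idf)
        vec = {t: c * idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    q = vectorize(tokenize(query))
    scores = []
    for i, toks in enumerate(tokenized):
        d = vectorize(toks)
        scores.append((sum(w * d.get(t, 0.0) for t, w in q.items()), i))
    # Highest score first; ties broken by original position.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scores[:k] if score > 0]


def retrieve_candidates(sentence, articles, k_articles=10, k_paragraphs=10):
    """articles maps a title to its list of paragraph strings."""
    titles = list(articles)
    article_texts = [" ".join(articles[t]) for t in titles]
    # Stage 1: rank whole articles against the sentence.
    top_titles = [titles[i] for i in tfidf_rank(sentence, article_texts, k_articles)]
    # Stage 2: rank only the paragraphs inside the retrieved articles.
    pool = [(t, p) for t in top_titles for p in articles[t]]
    top = tfidf_rank(sentence, [p for _, p in pool], k_paragraphs)
    return [pool[i] for i in top]
```

Calling `retrieve_candidates(sentence, articles)` returns a list of `(title, paragraph)` pairs, which would then be passed to an entity linker. Note that in this sketch the stage-two IDF weights are computed over the pooled paragraphs only, not over all of Wikipedia.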
Hopefully this resource will help in building machine-reading-based models!
Best,
Chen