Data Split (Train-easy)

Xanh Ho

May 27, 2020, 11:21:10 AM

to HotpotQA

Dear Authors and everyone,

Thank you for your work.

I have a question related to the data split.

In Section 3 you write: "Specifically, we randomly sampled questions (∼3–10 per Turker) from top-contributing turkers, and categorized all their questions into the train-easy set if an overwhelming percentage in the sample only required reasoning over one of the paragraphs."

I do not fully understand this part. Could you please describe in more detail how you determined whether a sampled question is single-hop?
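
To make the question concrete, here is a minimal Python sketch of how I currently read the quoted procedure. It is only my interpretation, not the authors' code. The function name `split_train_easy`, the `threshold` value (the paper only says "an overwhelming percentage"), and the `is_single_hop` callback are all my own assumptions. The callback stands in for exactly the step I am asking about.

```python
import random
from typing import Callable, Dict, List


def split_train_easy(
    questions_by_turker: Dict[str, List[str]],
    is_single_hop: Callable[[str], bool],  # hypothetical judgment step (the part I am asking about)
    sample_size: int = 10,                 # the paper says ~3-10 questions per Turker
    threshold: float = 0.8,                # assumption: "overwhelming percentage" is not quantified
    seed: int = 0,
) -> List[str]:
    """Return the questions that would go into train-easy under my reading.

    `questions_by_turker` is assumed to contain only top-contributing turkers.
    """
    rng = random.Random(seed)
    train_easy: List[str] = []
    for questions in questions_by_turker.values():
        if not questions:
            continue
        # Randomly sample a few questions from this turker.
        sample = rng.sample(questions, min(sample_size, len(questions)))
        # Share of sampled questions judged to need only one paragraph.
        single_hop_share = sum(is_single_hop(q) for q in sample) / len(sample)
        # If the share is high enough, ALL of the turker's questions are moved.
        if single_hop_share >= threshold:
            train_easy.extend(questions)
    return train_easy
```

If this reading is right, my question is what plays the role of `is_single_hop` in practice (manual inspection, a model, or something else), and roughly what percentage counted as "overwhelming".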