Dear Authors and everyone,
Thank you for your work.
I have a question related to the data split.
You said that "Specifically, we randomly sampled questions (∼3–10 per Turker) from top-contributing turkers, and categorized all their questions into the train-easy set if an overwhelming percentage in the sample only required reasoning over one of the paragraphs." (Section 3)
I do not fully understand this part. Could you please describe in more detail how you determined whether a sampled question is single-hop or not?
Thank you for your time.