Dear Dr. Kovatchev,
Many thanks for your interest and participation! Replies to your questions below:
1a. You may ask as many questions as you like per paragraph. There is probably a sweet spot: each question tells you more about how the model handles that paragraph, but each additional question also makes it harder to come up with new challenging ones. You may not ask duplicate questions or close paraphrases of questions you have already fooled the model with. You are allowed to "skip" paragraphs using the "Switch to next context" button.
1b. Paragraphs are selected at random.
2. Yes, that is correct. You are not allowed to retract questions, so every question you submit from your "DADC-affiliated" account counts.
3. It is also possible (and encouraged) to write about strategies used in tracks 1 and 2. This applies particularly to track 2, but if you identify any interesting research outcomes from track 1, writing about those is encouraged too.
4. No, the answer has to be a single continuous span of text from the paragraph. We plan to expand on this in the future, but for the time being we are sticking with the standard single-span extractive QA setting. For more information on what is and is not acceptable, please refer to:
https://dadcworkshop.github.io/shared-task.html#validation-instructions
5. The base criterion we will use for evaluation is that "a human who reads the question should select the same answer you did". I'm not sure exactly what you mean by multiple questions, but you can definitely ask multi-hop questions, e.g. "Who is the father of the father of X?"
6a. This is the data gathered from all participants. It will be the same evaluation set for all models trained on the track 2 data.
6b. You have to work "blind" - for the actual task we will use part of the track 1 data for validation and model selection. I would suggest that a combination of the SQuAD v1.1 dev set and the AdversarialQA dev set will probably be most representative of the expected question distribution from track 1.
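In case it is useful, here is a rough sketch of how such a combined proxy dev set could be put together. It is only an illustration, not our evaluation pipeline: the function names, the `source` tag, and the toy examples are all made up, and it assumes each example is a SQuAD-style dict with `context`, `question`, and `answers` (parallel `text` and `answer_start` lists). It also drops any example whose answer is not a single continuous span of the paragraph, in line with point 4.

```python
import random


def is_single_span(context, text, start):
    """Return True if `text` is a non-empty contiguous span of `context` at offset `start`."""
    return bool(text) and context[start:start + len(text)] == text


def build_proxy_dev_set(squad_dev, adversarial_dev, seed=0):
    """Merge two lists of SQuAD-style examples into one shuffled proxy dev set.

    Hypothetical helper: keeps only examples whose answers are all single
    contiguous spans, and tags each kept example with the set it came from.
    """
    combined = []
    for source, examples in (("squad_v1.1", squad_dev), ("adversarial_qa", adversarial_dev)):
        for ex in examples:
            answers = ex["answers"]
            pairs = list(zip(answers["text"], answers["answer_start"]))
            if pairs and all(is_single_span(ex["context"], t, s) for t, s in pairs):
                combined.append({**ex, "source": source})
    # Fixed seed so model selection is reproducible across runs.
    random.Random(seed).shuffle(combined)
    return combined


# Toy data standing in for the real dev sets (invented for illustration).
family = "Ann is Bob's mother. Bob is Carl's father."
squad_toy = [
    {"id": "s1", "context": "Paris is the capital of France.",
     "question": "What is the capital of France?",
     "answers": {"text": ["Paris"], "answer_start": [0]}},
]
adversarial_toy = [
    {"id": "a1", "context": family,
     "question": "Who is the mother of the father of Carl?",
     "answers": {"text": ["Ann"], "answer_start": [0]}},
    # Not a continuous span of the paragraph, so it is filtered out.
    {"id": "a2", "context": family,
     "question": "Who are Bob's mother and son?",
     "answers": {"text": ["Ann and Carl"], "answer_start": [0]}},
]

proxy_dev = build_proxy_dev_set(squad_toy, adversarial_toy)
print(sorted(ex["id"] for ex in proxy_dev))
```

With the real data you would pass in the loaded SQuAD v1.1 dev and AdversarialQA dev examples instead of the toy lists. Keeping the `source` tag lets you report scores per source as well as overall.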
Hope this helps and let us know if you have any other questions!
Thanks,
Max