Notes on the end of the evaluation phase and the upcoming post-evaluation phase

yifan jiang

Jan 31, 2024, 4:12:52 PM
to BrainTeaser
Thank you for your participation in the evaluation phase of BrainTeaser. Here are some notes for the end of the evaluation and upcoming post-evaluation phases:

  1. The evaluation phase will end on Jan 31st at 11 p.m. (UTC), and we will save the leaderboard results afterwards. The post-evaluation phase will start once the evaluation phase ends.
  2. We will release a Google form on the leaderboard to gather related information from each team. In the form, each participant can declare whether they used a zero-shot method.
  3. The leaderboard from the evaluation phase will be auto-migrated to the post-evaluation phase and become public.
  4. Participants can score "contrastive runs" that can be included in the analysis in system description papers. In other words, participants can get result feedback (e.g. accuracy) from the leaderboard, as in the practice phase. The paper submission deadline is Feb 19, 2024.
  5. We will release the whole dataset for further error analysis and case analysis. The whole dataset will be available through our EMNLP paper (https://aclanthology.org/2023.emnlp-main.885.pdf) after the evaluation phase. In the original paper, the whole BrainTeaser dataset was designed for evaluation only, and we split the data into train/dev/test to fit the SemEval competition. We kindly suggest that participants read that paper. It contains the details of data construction, evaluation, and error analysis, which can give more insight into the dataset and be helpful for paper writing.

Feel free to contact me or leave a message with any concerns or questions. Good luck!

yifan jiang

Jan 31, 2024, 6:48:09 PM
to BrainTeaser
A follow-up update:

As the auto-migration function on CodaLab is not working well, we have released the leaderboard results of the evaluation phase.

Best,
Yifan Jiang
