Competition data available

yifan jiang

Jan 31, 2024, 10:09:23 PM
to BrainTeaser
Hi,

The dataset has been released at https://github.com/1171-jpg/BrainTeaser as data/BTDATA.zip. Here is a brief introduction to the files contained in that archive.

Note: BTDATA.zip, located in the data folder, contains the data for both subtasks: the sentence puzzle and the word puzzle.

Note: To keep out automatic data crawlers, BTDATA.zip is password-protected. The password is: brainteaser
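For anyone who prefers to unpack the archive from a script, here is a minimal sketch using Python's standard zipfile module. The function name extract_btdata and the paths in the example are only illustrative. It also assumes the archive uses the legacy zip encryption that zipfile can read; if extraction fails, a regular unzip tool with the same password should work.

```python
import zipfile
from pathlib import Path


def extract_btdata(zip_path, out_dir, password="brainteaser"):
    """Extract a password-protected zip and return the names of its members.

    extract_btdata is a hypothetical helper, not part of the BrainTeaser repo.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # zipfile expects the password as bytes, not str.
        archive.extractall(path=out_dir, pwd=password.encode("utf-8"))
        return archive.namelist()


# Example call (assumes the archive was downloaded to data/BTDATA.zip):
# extract_btdata("data/BTDATA.zip", "data/BTDATA")
```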

Note: BrainTeaser was also selected as one of the competitions in SemEval 2024, so we created a train/test split for the SemEval competition. BTDATA.zip contains the following files:

  • SemEval Competition
    • Training Data
      • SP_train.npy (SemEval training data, sentence puzzle)
      • WP_train.npy (SemEval training data, word puzzle)
    • Test Data
      • SP_test.npy (SemEval test data, sentence puzzle)
      • WP_test.npy (SemEval test data, word puzzle)
      • SP_test_answer.npy (answers for the SemEval sentence puzzle test data)
      • WP_test_answer.npy (answers for the SemEval word puzzle test data)
  • EMNLP Zero-Shot Experiment
    • sentence_puzzle.npy (all sentence puzzle data)
    • word_puzzle.npy (all word puzzle data)
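The .npy files listed above can be read with NumPy. Below is a minimal sketch; load_puzzles is an illustrative name, and passing allow_pickle=True is an assumption that the files store Python objects (for example one record per puzzle) rather than plain numeric arrays. Because unpickling can run arbitrary code, only load files obtained from the official repository.

```python
import numpy as np


def load_puzzles(path):
    """Load one .npy file and return its contents as plain Python objects.

    load_puzzles is a hypothetical helper, not part of the BrainTeaser repo.
    """
    # Assumption: the files hold pickled Python objects, so allow_pickle=True
    # is needed. np.load raises ValueError on such files without it.
    data = np.load(path, allow_pickle=True)
    # tolist() converts an object array into a regular Python list.
    return data.tolist()


# Example call (assumes the archive was extracted to data/BTDATA):
# train = load_puzzles("data/BTDATA/SP_train.npy")
# print(len(train), train[0])
```

Printing the first entry is a quick way to see which fields each puzzle record actually contains before writing any training code.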

The results from our EMNLP paper, as reported on GitHub, were obtained on the entire dataset in a zero-shot manner. SemEval-2024 Task 9 uses the same dataset as the EMNLP paper, but participants may train on 80% of it, and we evaluate systems on the remaining 20%. The zero-shot evaluation results can be found at https://aclanthology.org/2023.emnlp-main.885/

Good luck! 
Yifan Jiang