Hi,
Note: The data for the two subtasks (sentence puzzle and word puzzle) are in BTDATA.zip, in the data folder.
Note: To deter automatic data crawlers, BTDATA.zip is password-protected. The password is: brainteaser
Note: The brain teaser task was also selected as one of the competitions in SemEval 2024, so we created a train/test split for the SemEval competition. BTDATA.zip contains the following:
- SemEval Competition
  - Training Data
    - SP_train.npy (SemEval training data)
    - WP_train.npy (SemEval training data)
  - Test Data
    - SP_test.npy (SemEval test data)
    - WP_test.npy (SemEval test data)
    - SP_test_answer.npy (SemEval test data answers)
    - WP_test_answer.npy (SemEval test data answers)
- EMNLP Zero-Shot Experiment
  - sentence_puzzle.npy (all sentence puzzle data)
  - word_puzzle.npy (all word puzzle data)
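For anyone who wants to open the files from Python, here is a minimal sketch of extracting the archive and loading one of the .npy files. The helper names `extract_archive` and `load_puzzles` are hypothetical, and `allow_pickle=True` is an assumption (it is only needed if the arrays store Python objects rather than plain numeric data). Python's `zipfile` module can only decrypt the legacy ZIP encryption, so if the archive uses AES encryption, extraction will fail and a tool such as 7-Zip would be needed instead.

```python
import zipfile
from pathlib import Path

import numpy as np


def extract_archive(zip_path, out_dir, password="brainteaser"):
    """Extract the archive and return the .npy files found under out_dir.

    The password is ignored for members that are not encrypted.
    zipfile supports only legacy ZIP encryption, not AES.
    """
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir, pwd=password.encode("utf-8"))
    return sorted(Path(out_dir).rglob("*.npy"))


def load_puzzles(npy_path):
    """Load one .npy file as a NumPy array.

    Assumption: the file may hold pickled Python objects, so
    allow_pickle=True is passed. Only do this for files you trust,
    because unpickling can execute arbitrary code.
    """
    return np.load(npy_path, allow_pickle=True)
```

A typical call would be `files = extract_archive("BTDATA.zip", "BTDATA")` followed by `data = load_puzzles(files[0])`. Inspecting `data[0]` then shows how a single puzzle is stored.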
The EMNLP paper results reported on GitHub were obtained on the entire dataset in a zero-shot setting. SemEval-2024 Task 9 uses the same dataset as our EMNLP paper, but participants may train on 80% of it, and we evaluate systems on the remaining 20%. The zero-shot evaluation results can be found at https://aclanthology.org/2023.emnlp-main.885/
Good luck!
Yifan Jiang