yifan jiang

Jan 31, 2024, 10:09:23 PM
to BrainTeaser

The dataset has been released under the data/ folder. Here is a brief introduction to all the files in that folder.

Note: The data/ folder holds the data for both subtasks, the sentence puzzle and the word puzzle.

Note: To deter automatic data crawlers, access requires a password: brainteaser

Note: BrainTeaser was also selected as one of the interesting competitions in SemEval 2024, so we created a train/test split for the SemEval competition. The data files are as follows:

  • SemEval Competition
    • Training Data
      • SP_train.npy (SemEval sentence puzzle training data)
      • WP_train.npy (SemEval word puzzle training data)
    • Test Data
      • SP_test.npy (SemEval sentence puzzle test data)
      • WP_test.npy (SemEval word puzzle test data)
      • SP_test_answer.npy (answers for the SemEval sentence puzzle test data)
      • WP_test_answer.npy (answers for the SemEval word puzzle test data)
  • EMNLP Zero-Shot Experiment
    • sentence_puzzle.npy (all sentence puzzle data)
    • word_puzzle.npy (all word puzzle data)
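
For anyone reading these files in Python, here is a minimal sketch of one way to load a .npy file with NumPy. It assumes each file stores a pickled object array (for example, one dict per puzzle), which is why allow_pickle=True is passed; that layout is an assumption, so inspect the first record of the real files before relying on any field. The helper name load_puzzles, the stand-in file SP_demo.npy, and the field names in the demo records are all hypothetical.

```python
import os
import tempfile

import numpy as np


def load_puzzles(path):
    """Load one .npy file and return its records as a plain Python list.

    Assumes the file holds a pickled object array (e.g. one dict per
    puzzle), so allow_pickle=True is required to read it.
    """
    records = np.load(path, allow_pickle=True)
    return records.tolist()


# Stand-in file so the sketch runs without the real data. The field names
# below are hypothetical and are NOT taken from the released files.
demo_records = [
    {"question": "demo question 1", "choices": ["a", "b"], "label": 0},
    {"question": "demo question 2", "choices": ["c", "d"], "label": 1},
]
demo_path = os.path.join(tempfile.mkdtemp(), "SP_demo.npy")
np.save(demo_path, np.array(demo_records, dtype=object), allow_pickle=True)

# Load the file back and look at what a record contains.
puzzles = load_puzzles(demo_path)
print(len(puzzles), "records; first record keys:", sorted(puzzles[0].keys()))
```

To try it on the released data, point load_puzzles at one of the files listed above, such as data/SP_train.npy. Because allow_pickle=True can execute code embedded in a file, only use it on files from a source you trust.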

The results of our EMNLP paper on GitHub were obtained on the entire dataset in a zero-shot manner. SemEval-2024 Task 9 uses the same dataset as our EMNLP paper, but participants may train on 80% of it, and we evaluate systems on the remaining 20%. The zero-shot evaluation result can be found at

Good luck! 
Yifan Jiang
