Secondary test set available: Nahj al-Balagha

Johannes Kiesel

Jan 4, 2023, 11:15:18 AM
Hi all,

We are happy to announce that the second (and secondary) test set for
ValueEval'23 is now available, both in TIRA [1] and (without ground
truth labels) for download on Zenodo [2].

These arguments were compiled and contributed by the team
from and based on the Nahj al-Balagha [3]--thus the argumentation is
very different from the "main" dataset. For example:

Conclusion: Silence is always good
Stance: in favor of
Premise: A wise man's tongue is behind his heart, while a fool's
heart is behind his tongue

We want to encourage you to challenge your approaches and test them
against this dataset (note: if you used Docker submission, this takes
only a minute).

We have already tested the BERT-based approach from our ACL paper, the
1-baseline, and the random baseline, and all score considerably lower,
which highlights the general difficulty of this dataset (see attached

Submission to this secondary test dataset is completely optional.
However, we will discuss results on this secondary test dataset in our
task overview paper to highlight approaches that seem especially robust.

Also, we want to inform you that we detected a minor mistake in the
data: for a few arguments, the stance was written "in favour of" instead
of "in favor of". This also concerns ten arguments in the main/primary
test dataset. We have now fixed this problem (both in TIRA and on Zenodo).
If your approach uses the stance, please run it again. Sorry for
the inconvenience!
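
If you work from a local copy that was downloaded before the fix, you
could also normalize the stance on your side. The following is only a
minimal sketch: the function name, the mapping, and the sample values
are illustrative assumptions and not part of the official task code.

```python
# Hypothetical helper: map the misspelled stance to the corrected spelling.
STANCE_FIXES = {"in favour of": "in favor of"}


def normalize_stance(stance: str) -> str:
    """Return the stance with the known spelling variant corrected."""
    cleaned = stance.strip()
    # Values that are not in the mapping are returned unchanged.
    return STANCE_FIXES.get(cleaned, cleaned)


# Sample values for illustration only ("against" is an assumed label).
stances = ["in favor of", "in favour of", "against"]
normalized = [normalize_stance(s) for s in stances]
print(normalized)
```

Re-downloading the corrected files remains the safer option, since it
guarantees that your input matches the data used for evaluation.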

Finally, we want to thank every team that already submitted. It is great
to see how many of you already have a working system. However, there is
still enough time left to build (another) one, if you start soon.

As a reminder: Submission closes on January 24th.

More information on our web page [4].

That's it for now.

Good luck!
The ValueEval Team


Johannes Kiesel

Bauhaus-Universität Weimar
Bauhausstr. 9a, Room 106
99423 Weimar, Germany

Phone: +49 (0)3643 - 58 3720