ValueEval: Status Report


Johannes Kiesel

Nov 9, 2022, 3:58:46 AM
to valu...@googlegroups.com
Hi everyone!

Since we got some questions in this regard, we would like to share some
information with you on the current status of the task.

Thankfully, several groups have committed to providing us with raw data
for the task. We are pretty excited about this, as it allows us to have
a dataset for this task that is more diverse (in terms of topics and
argumentation) than what we would be able to accomplish alone. However,
it also comes with additional work on our side. Some data has to be
post-processed so that it fits structurally with what we already have.
So we are currently running a pipeline of pre-processing, crowdsourced
annotation, and quality assurance. We expect to have the data ready at
the end of November.

The task will focus on what is referred to as "level 2" or "value
categories". We will also provide "level 1" annotations for the
training set, and, after the deadline, for the test set, but for
SemEval we decided to focus on the 20 value categories.

We expect to open the submission system next week. This task will use
https://www.tira.io/ for submissions. We are in close collaboration with
the organizers of the SemEval Task on Clickbait Spoiling, who are
currently running final stress tests on the system. In short: you will
be able to either submit a run file or a Docker container that runs your
system. We are very excited about the latter, as it allows you to submit
a system that other researchers can then directly employ in their
research. The TIRA team is currently developing a Python library to make
all submitted systems instantly available.
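To give a rough idea of what a Docker-based submission does, here is a
minimal sketch of an entry point that reads the task's arguments and
writes a run file. Note that the file names (`arguments.tsv`,
`run.tsv`), the "Argument ID" column, the placeholder category names,
and the CLI flags are all assumptions for illustration; please check
the task page and TIRA documentation for the real interface.

```python
# Hypothetical sketch of a TIRA-style Docker entry point.
# File names, column layout, and category names are ASSUMPTIONS
# for illustration -- consult the official task page for the
# actual interface.
import argparse
import csv
import os

# Placeholder names; the task defines 20 real value categories.
VALUE_CATEGORIES = [f"Category {i}" for i in range(1, 21)]

def predict(argument_ids):
    """Toy predictor: assigns no value category to any argument."""
    return {arg_id: [0] * len(VALUE_CATEGORIES) for arg_id in argument_ids}

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True)   # dir with arguments.tsv (assumed name)
    parser.add_argument("--output", required=True)  # dir to write the run file into
    args = parser.parse_args()

    # Read the argument IDs from the (assumed) input file.
    with open(os.path.join(args.input, "arguments.tsv"), newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    labels = predict([row["Argument ID"] for row in rows])

    # Write one binary label per value category for each argument.
    with open(os.path.join(args.output, "run.tsv"), "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["Argument ID"] + VALUE_CATEGORIES)
        for arg_id, values in labels.items():
            writer.writerow([arg_id] + values)

if __name__ == "__main__":
    main()
```

The point of packaging this as a Docker image is exactly what the mail
describes: the container bundles the model and its dependencies, so
other researchers can rerun it without any setup.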

So you will soon be able to get familiar with the TIRA system using the
data from our ACL paper (available on our web page), and we will add
the extended training set once it is done. We are planning an "early
bird" submission phase in December: you will be able to submit one run
(or preferably a Docker container!) on the test set and get a final
score. Independent of whether you make the early bird deadline, you
will be able to submit up to 4 (more) runs until January 24th. Since
preparing the dataset is taking a bit longer than expected, we have now
postponed the early bird submission deadline to December 16th.

Thank you all so much for your interest in this task!
Johannes, Milad, Nailia, Maximilian, Henning, and Benno

Johannes Kiesel

Nov 23, 2022, 2:35:39 AM
to valu...@googlegroups.com
Hi everyone!

As the final dataset is nearly complete (but we will use the next week
for quality assurance), we are now opening up the submission system
(TIRA) for the data that is already available. So you can already test
the system.

We added more information on TIRA at https://valueeval.webis.de

But the main facts: TIRA allows you both to upload a file with labels
("run file") and to submit a Docker image (then hosted on TIRA
servers). The latter might sound like extra work, but it will boost the
reproducibility of your approach immensely, and will thus save you time
later when you want to publish your approach.

Those of you who registered with a valid (anonymous) team name (nearly
all registrations) will receive a link to register with TIRA within the
next hour. Please contact me if you do not.

To test the system, I already submitted two naive systems for the
current data (a random guesser and "always yes"). I published the
results on the leaderboard so you can see how it will look after the
competition (team "Aristotle" is used for baseline submissions):

https://www.tira.io/task/valueeval-at-semeval-2023-human-value-detection
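For reference, the two naive baselines could be implemented roughly as
follows. This is only a sketch: the real baselines run inside TIRA, and
the function names and the simplified 20-slot label vectors here are
illustrative assumptions.

```python
# Sketch of the two naive baselines mentioned above: a random
# guesser and an "always yes" labeler over 20 value categories.
# Function names and label representation are illustrative.
import random

NUM_CATEGORIES = 20

def always_yes(argument_ids):
    """Predict every value category for every argument."""
    return {arg_id: [1] * NUM_CATEGORIES for arg_id in argument_ids}

def random_guesser(argument_ids, seed=0):
    """Flip a fair coin for each (argument, category) pair."""
    rng = random.Random(seed)  # fixed seed for reproducible runs
    return {
        arg_id: [rng.randint(0, 1) for _ in range(NUM_CATEGORIES)]
        for arg_id in argument_ids
    }
```

Such baselines are useful as lower bounds: "always yes" trivially
achieves perfect recall at poor precision, so any real system should
clearly beat both on the leaderboard metric.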

Do not be shy about testing the system and providing us with feedback.
Preferably write in the TIRA forum, so that the TIRA developers also
see it:

https://www.tira.io/c/touche


Looking forward to your submissions!
Johannes, Milad, Nailia, Maximilian, Henning, and Benno


--
Johannes Kiesel

Bauhaus-Universität Weimar
Bauhausstr. 9a, Room 106
99423 Weimar, Germany

Phone: +49 (0)3643 - 58 3720

Johannes Kiesel

Nov 23, 2022, 11:39:43 AM
to valu...@googlegroups.com
Hi everyone!

Thank you to all who have already registered with TIRA! I forgot to
mention in the TIRA mail: when you are participating as a team (like
most of you), one of you can register with the invite link you
received. Then please drop me a mail so I can make your account the
team owner (called "group owner" in TIRA), which will allow you to
invite the others.

Regards,
Johannes

Johannes Kiesel

Dec 2, 2022, 11:27:19 AM
to valu...@googlegroups.com
Hi everyone,

Since I know some of you have questions in this regard, I want to give
another status update.

As I said in the last mail, we are taking this week to complete the
dataset and do quality assurance. The annotations from the crowdworkers
are all in by now, but we really want to make sure they are of decent
quality, so I assume we will also need Monday for this. Sorry, I know
you are eagerly waiting for the full dataset, but I hope one more day
of waiting is still fine.

What you can expect:
- We will provide a training and a validation dataset, both fully
labeled. The training dataset will contain about 6500 arguments, the
validation dataset about 1500 arguments.
- You will get the arguments of the test dataset (about 1500 arguments),
but not the labels. We will keep those secret until after the final
submission deadline.
- All datasets will contain a similar proportion of arguments from these
sources:
  - over 80% from the IBM argument quality dataset (the main part of
the ACL dataset)
  - over 10% from discussions from the Conference on the Future of
Europe [1] (completely new)
  - some (old and new) arguments from the group discussion ideas
web page (also already used in the ACL dataset)
- In addition, we are very excited that we received from the language.ml
lab a dataset of 300 arguments extracted from a religious text. Thank
you so much! We will provide this as an additional test dataset so that
you can check whether your approach also works on a very different kind
of argument.

Important: none of the arguments from the acl22 dataset will be in the
test data.

And we do not put hard restrictions on the approaches you can use in
this task. Since we got some questions about whether this or that
approach is okay: if you think an approach makes sense in a real-world
scenario, you may use it in this task.

If you have more questions, please ask.

We will be back with the dataset release announcement on Monday or (very
latest) Tuesday.

Looking forward to your submissions!
Johannes, Milad, Nailia, Maximilian, Henning, and Benno

PS: In case you got interested in this Conference on the Future of
Europe data, there is another upcoming shared task that focuses
exclusively on this data in the context of stance detection:
https://touche.webis.de/clef23/touche23-web/multilingual-stance-classification.html
(thanks also to the organizers of that task for helping us prepare the
data for ValueEval!)

[1] https://futureu.europa.eu/?locale=en

