Hi everyone,
Since I know some of you have questions about the dataset, I want to
give another status update.
As I said in the last mail, we are taking this week to complete the
dataset and to do quality assurance. The annotations from the
crowdworkers are all in by now, but we really want to make sure they
are of decent quality, so I assume we will also need Monday for this.
Sorry, I know you are waiting eagerly for the full dataset, but I hope
one more day of waiting is still fine.
What you can expect:
- We will provide a training and a validation dataset, both fully
labeled. The training dataset will contain about 6500 arguments, the
validation dataset about 1500 arguments.
- You will get the arguments of the test dataset (about 1500 arguments),
but not the labels. We keep those secret until after the final
submission deadline.
- All datasets will contain a similar proportion of arguments from these
sources:
- over 80% from the IBM argument quality dataset (the main part of
the ACL dataset)
- over 10% from discussions of the Conference on the Future of
Europe [1] (completely new)
- and some (old and new) arguments from the group discussion ideas
web page (also already used in the ACL dataset)
- In addition, we are very excited that we received a dataset of 300
arguments extracted from a religious text from the language.ml lab.
Thank you so much! We will provide this as an additional test dataset so
that you can check whether your approach also works on a very different
kind of argument.
Important: none of the arguments from the ACL'22 dataset will be in the
test data.
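If you want to set up your pipeline in advance, loading the splits will
probably look something like the sketch below. Note that the file names
and the ID column used here are placeholders for illustration; the exact
format will be part of the release announcement.

  # Minimal sketch for loading the splits; file and column names are
  # placeholders until the official release.
  import pandas as pd

  arguments = pd.read_csv("arguments-training.tsv", sep="\t")  # assumed name
  labels = pd.read_csv("labels-training.tsv", sep="\t")        # assumed name

  # Training should contain about 6500 arguments, validation about 1500.
  print(len(arguments), "training arguments")

  # Join the arguments with their value labels on the shared ID column.
  data = arguments.merge(labels, on="Argument ID")  # assumed column name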
We also do not place hard restrictions on the approaches you can use in
this task. Since we got some questions about whether a particular
approach is okay: if you think the approach makes sense in a real-world
scenario, you can use it in this task.
If you have more questions, please ask.
We will be back with the dataset release announcement on Monday or (at
the very latest) Tuesday.
Looking forward to your submissions!
Johannes, Milad, Nailia, Maximilian, Henning, and Benno
PS: In case you are interested in the Conference on the Future of
Europe data, there is another upcoming shared task that focuses
exclusively on this data, in the context of stance detection:
https://touche.webis.de/clef23/touche23-web/multilingual-stance-classification.html
(Thanks also to the organizers of that task for helping us prepare the
data for ValueEval!)
[1]
https://futureu.europa.eu/?locale=en
On 23.11.22 08:35, Johannes Kiesel wrote:
> Hi everyone!
>
> As the final dataset is nearly complete (but we will use the next week
> for quality assurance), we are now opening up the submission system
> (TIRA) for the data that is already available. So you can already test
> the system.
>
> We added more information on TIRA at https://valueeval.webis.de
>
> But the main facts: TIRA allows you to both upload a file with labels
> ("run file") and submit a Docker image (then hosted on TIRA servers).
> The latter might sound like extra work, but it will allow you to boost
> the reproducibility of your approach immensely. It will thus save you
> time later when you want to publish your approach.
>
> Those of you who registered with a valid (anonymous) team name (nearly
> all registrations) will receive a link to register with TIRA within the
> next hour. Please contact me if you do not receive one.
>
> To test the system, I already submitted two naive systems for the
> current data (a random guesser and "always yes"). I published the
> results on the leaderboard so you can see how it will look after
> the competition (team "Aristotle" is used for baseline submissions):
>
> https://www.tira.io/task/valueeval-at-semeval-2023-human-value-detection
>
> Do not be shy to test the system and provide us with feedback. It is
> best to write in the TIRA forum, so that the TIRA developers also see
> it:
>
> https://www.tira.io/c/touche