Regards
Paolo
-----
Please consider participating and/or forwarding to appropriate
colleagues and groups.
*****We apologize for the multiple copies of this e-mail*****
CALL FOR PARTICIPATION
IberLEF 2022 Task DETESTS: DETEction and classification of racial
STereotypes in Spanish
This task will take place as part of IberLEF 2022, the 4th Workshop on
Iberian Languages Evaluation Forum at the SEPLN 2022 Conference, which
will be held in A Coruña, Spain, on September 20th.
The aim of the task is to detect and classify stereotypes in comments
posted in Spanish in response to different online news articles
related to immigration. The task is designed in a hierarchical fashion
by chaining two subtasks and allowing participants to either model the
simple binary scenario (a stereotype is present or not) or complete
the entire pipeline by modeling the complex multi-label classification
problem (different types of stereotypes). Both subtasks are described
below:
Subtask 1: Participants must determine whether a comment contains at
least one stereotype (positive example) or none (negative example),
taking into account the full distribution of labels provided by the
annotators, following the learning-with-disagreements proposal. The
gold label of this subtask serves as a proxy to determine the subset
of comments that will be evaluated in the subsequent subtask.
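
As a rough illustration of the "full distribution of labels" in
Subtask 1, the sketch below turns per-annotator binary votes into a
soft label. The number of annotators, the vote format, and the
majority-vote rule for a hard label are assumptions made for this
example only; the announcement does not specify how the gold label is
derived.

```python
def soft_label(votes):
    """Fraction of annotators who marked the comment as containing a stereotype.

    `votes` is a list of 0/1 annotations (hypothetical format).
    """
    if not votes:
        raise ValueError("at least one annotation is required")
    return sum(votes) / len(votes)


def majority_label(votes):
    """Hard label by strict majority vote (an assumption, not the official rule)."""
    return int(soft_label(votes) > 0.5)


# Hypothetical example: three annotators, two of whom see a stereotype.
example_votes = [1, 0, 1]
print(soft_label(example_votes))
print(majority_label(example_votes))
```

A system trained under the learning-with-disagreements paradigm could
use the soft label as its training target instead of the hard label.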
Subtask 2: This subtask consists of determining whether the comment
contains at least one stereotype or none and assigning the comments
marked as positive (with stereotypes) to one or more of ten categories
that present immigrants as: 1) ‘victims of xenophobia’, 2) ‘suffering
victims’, 3) ‘economic resources’, 4) a problem of ‘migration
control’, 5) people with ‘cultural and religious differences’, 6)
people who take ‘benefits’ from our social policy, 7) a problem for
‘public health’, 8) a threat to ‘security’, 9) ‘dehumanization’ and
10) ‘other’ types of stereotypes. Since a comment can contain multiple
stereotypes belonging to different categories, this subtask will be
presented as a multi-label hierarchical classification problem.
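
One possible way to represent the hierarchical multi-label output of
Subtask 2 is a ten-dimensional multi-hot vector that is forced to all
zeros whenever the binary decision is negative. This is only a sketch:
the category strings and the `encode` helper are hypothetical and do
not reflect the official data or submission format.

```python
# Short names for the ten categories listed above (hypothetical identifiers).
CATEGORIES = [
    "victims of xenophobia",
    "suffering victims",
    "economic resources",
    "migration control",
    "cultural and religious differences",
    "benefits",
    "public health",
    "security",
    "dehumanization",
    "other",
]


def encode(has_stereotype, categories):
    """Return a multi-hot vector over CATEGORIES.

    The hierarchy is enforced by returning all zeros when the comment
    was not marked as containing a stereotype.
    """
    if not has_stereotype:
        return [0] * len(CATEGORIES)
    unknown = set(categories) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [int(name in categories) for name in CATEGORIES]


# Hypothetical comment labelled with two categories.
print(encode(True, ["benefits", "security"]))
```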
Although we recommend participating in both subtasks, participants are
allowed to participate in just one of them (e.g., subtask 1).
Teams will be allowed (and encouraged) to submit multiple runs (max. 5).
This task is aimed at participants interested in the detection and
classification of racial, national, or ethnic stereotypes, a relevant
and relatively novel area of research due to its impact on modern
society. Furthermore, the annotated dataset is a valuable resource for
exploratory linguistic analysis, as well as for comparing the
application of deep learning and classical machine learning models on
Spanish stereotyped expressions under the recently introduced
learning-with-disagreements paradigm. Participants will be provided
with the annotations of each individual annotator as well as the gold
standard.
Linguistic resources:
Our DETESTS corpus is made up of two parts – a subset of the
NewsCom-TOX corpus and the StereoCom corpus. Both corpora consist of
comments (at least 50) published in response to manually selected
articles extracted from Spanish online newspapers. The common topic of
all articles is immigration.
The DETESTS corpus consists of 5,628 sentences. We will provide
participants with 70% of the dataset to train their models, while the
remaining 30% will be used to test their models.
To avoid any conflict with the sources of the comments regarding their
intellectual property rights (IPR), the data will be sent privately to
each participant who is interested in the task. The corpus will only
be made available for research purposes.
Important dates (All deadlines are 11:59 PM UTC-12:00):
Training dataset release: March 21, 2022
Test dataset release: April 20, 2022
Systems results: May 16, 2022
Results notification: May 23, 2022
Working papers submission: June 9, 2022
Working papers (peer-)reviewed: June 20, 2022
Camera-ready versions: July 4, 2022
Workshop at IberLEF 2022: September 20, 2022
Task organizers:
Mariona Taulé (Universitat de Barcelona, UB)
Montserrat Nofre (Universitat de Barcelona, UB)
Alejandro Ariza (Universitat de Barcelona, UB)
Wolfgang Schmeisser (Universitat de Barcelona, UB)
Enrique Amigó (Universidad Nacional de Educación a Distancia, UNED)
Paolo Rosso (Universitat Politècnica de València, UPV)
Berta Chulvi (Universitat Politècnica de València, UPV)
Contact:
Contact the organizers by writing to
detests...@gmail.com.
We invite participants to join our Google Group to be kept up to date
with the latest news related to the task.
For more information, please visit our website
detestsiberlef.wixsite.com/detests.