Call for Participation: AmericasNLP 2022 Competition on Speech-to-Text Translation
-----------
The Second AmericasNLP Competition on Speech-to-Text Translation for Indigenous Languages of the Americas is an official NeurIPS 2022 competition aimed at encouraging the development of machine translation (MT) systems for Indigenous languages of the Americas.
Tracks and Tasks
======
The overall goal of the AmericasNLP 2022 competition is to develop new speech-to-text translation technology for Indigenous languages. Participants are invited to submit systems for the following three tasks:
* automatic speech recognition (ASR) for an Indigenous language (Task 1),
* text-to-text translation between an Indigenous language and a high-resource language (Task 2), and
* speech-to-text translation between an Indigenous language and a high-resource language (Task 3, our main task).
Each task has two tracks. The two tracks are treated equally, and the winners of each will be awarded the same prizes. The tracks are as follows:
* external data and pre-trained models are allowed (Track 1).
* only publicly available pre-trained models are allowed (Track 2).
Languages
======
The following language pairs are featured in the NeurIPS–AmericasNLP 2022 competition:
* Bribri–Spanish
* Guaraní–Spanish
* Kotiria–Portuguese
* Wa'ikhana–Portuguese
* Quechua–Spanish
For all pairs (and where applicable), the Indigenous language is the source language, and the high-resource language is the target language.
How?
======
We invite submissions of speech-to-text MT results (as well as results for the subtasks of ASR and text-to-text translation) obtained by systems built for Indigenous languages. We provide the training and evaluation data to the participants. The main metrics of this competition are ChrF (Popović, 2015) for Tasks 2 and 3 and character error rate (CER) for Task 1. Participants can submit results for as many language pairs as they like, but only teams that submit results for all language pairs of a task enter the official ranking for that task. We provide an evaluation script and a baseline MT system to help participants get started quickly:
https://github.com/AmericasNLP/americasnlp2022
If you are interested in this competition, please register here:
https://forms.gle/vFVyEq3SQsjBoiGa6
The submission details will be announced on the competition’s webpage:
http://turing.iimas.unam.mx/americasnlp/st.html
Prizes
======
As long as the best-performing systems beat our baselines, the corresponding teams for each track will be awarded the following prizes:
* Task 1: $500 for the best team
* Task 2: $500 for the best team
* Task 3 (main task): $1000 for the best team, $500 for the second best team, $300 for the third best team
Important Dates
======
Release of pilot data and evaluation script: May 23, 2022
Release of training and development data and baseline systems: June 6, 2022
Release of test input/start of evaluation phase: September 16, 2022
Submission of translations by participants/end of competition: September 30, 2022
Announcements of results: October 4, 2022
Submission of system description papers by the participants: October 14, 2022
Notification of acceptance: October 21, 2022
Camera-ready papers and deadline for competition overview paper: October 31, 2022
Competition track meeting at NeurIPS (virtual event): December 2022
All deadlines will be 11:59 pm UTC -12h ("anywhere on Earth").
Organizers
======
Manuel Mager, Katharina Kann, Abteen Ebrahimi, Arturo Oncevay, Rodolfo Zevallos, Adam Wiemerslage, Pavel Denisov, John E. Ortega, Kristine Stenzel, Aldo Alvarez, Luis Chiruzzo, Rolando Coto-Solano, Hilaria Cruz, Sofía Flores-Solórzano, Ivan Vladimir Meza Ruiz, Alexis Palmer, Ngoc Thang Vu
Contact:
americas.nlp.work...@gmail.com