Call for papers: The Third Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020) @AACL-IJCNLP


Chao-Hong

Jul 4, 2020, 5:21:33 PM
to ml-...@googlegroups.com

The Third Workshop on Technologies for MT of Low Resource Languages (LoResMT 2020)

AACL-IJCNLP Virtual Event

(December 4-7, 2020)

Scope


In the past few years, machine translation (MT) performance has improved significantly. With the development of new techniques such as multilingual translation and transfer learning, the use of MT is no longer a privilege of users of popular languages. Consequently, there has been increasing interest in the community in expanding coverage to more languages with different geographical presence, degree of diffusion and digitalization. However, the goal of increasing MT coverage for more users speaking diverse languages is limited by the fact that MT methods demand huge amounts of data to train quality systems, which poses a major obstacle to building MT systems for low resource languages. Therefore, developing comparable MT systems with relatively small datasets is still highly desirable.


In addition, despite the fast development of MT technologies, MT systems still rely on several NLP tools to pre-process human-generated texts into the forms required as input for MT systems and to post-process the MT output into proper textual forms in the target language. This is especially true when it comes to systems involving low resource languages. These NLP tools include, but are not limited to, several kinds of word tokenizers/de-tokenizers, word segmenters, and morphology analysers. The performance of these tools has a great impact on the quality of the resulting translation. So far there has been only limited discussion of these NLP tools, their methods, their role in training different MT systems, and the extent of their support for the many languages of the world.


The workshop provides a discussion panel for researchers working on MT systems/methods for low resource and under-represented languages in general. We would like to help review the state of MT for low resource languages and define the most important directions. We also solicit papers dedicated to supplementary NLP tools that are used for any language, and especially for low resource languages. Overview papers on these NLP tools are very welcome. It will be beneficial if the evaluations of these tools in research papers include their impact on the quality of MT output.

Topics of Interest

We solicit original research papers, review papers, and position papers on MT research for low resource languages in the workshop. Multilingual and/or cross-lingual NLP tools for low-resource languages are especially welcome. 


- Research and review papers on pre-processing and/or post-processing NLP tools for MT

- Position papers on the development of pre-processing and/or post-processing tools for MT

- Word tokenizers/de-tokenizers for specific languages

- Word/morpheme segmenters for specific languages

- Alignment/Re-ordering tools for specific language pairs

- Use of morphology analyzers and/or morpheme segmenters in MT

- Multilingual/cross-lingual NLP tools for MT

- Re-usability of existing NLP tools for low resource languages

- Corpora creation and curation technologies for low resource languages

- Review of available parallel corpora for low resource languages

- Research and review papers on MT methods for low resource languages

- MT systems/methods (e.g. rule-based, SMT, NMT) for low resource languages

- Pivot MT for low resource languages

- Zero-shot MT for low resource languages

- Fast building of MT systems for low resource languages

- Re-usability of existing MT systems for low resource languages

- Machine translation for language preservation

Important Dates


  • July 3, 2020 – Call for papers released

  • September 11, 2020 – Paper submissions due

  • September 21 - October 9, 2020 – Review period

  • October 23, 2020 – Notification

  • November 6, 2020 – Camera-ready due

  • December – LoResMT workshop

Invited speakers

To be announced...

Shared Tasks

To be announced...

Organizers (listed alphabetically)


Alina Karakanta, Fondazione Bruno Kessler

Atul Kr. Ojha, DSI, National University of Ireland Galway & Panlingua Language Processing LLP

Chao-Hong Liu, Iconic Translation Machines

Jade Abbott, Retro Rabbit

Jonathan Washington, Swarthmore College

Nathaniel Oco, Philippines

Surafel Melaku Lakew, Fondazione Bruno Kessler

Tommi A Pirinen, University of Hamburg

Valentin Malykh, Huawei Noah’s Ark lab and Kazan Federal University

Varvara Logacheva, Skolkovo Institute of Science and Technology

Xiaobing Zhao, Minzu University of China

Paper submission

There are two types of submissions in the workshop. Research, review and position papers should be at least four (4) and at most eight (8) pages long, plus unlimited pages for references. For system demonstration papers, the limit is four (4) pages. Submissions should be formatted according to the official AACL-IJCNLP 2020 style templates (LaTeX, Microsoft Word, Overleaf). Accepted papers will be published on-line in the AACL-IJCNLP 2020 proceedings and will be presented at the conference either orally or as a poster.

Submissions must be anonymised and should be made through the Softconf START conference management system at https://www.softconf.com/aacl-ijcnlp2020/LowResMT .

Scientific papers that have been, or will be, submitted to other venues must be declared as such and must be withdrawn from those venues if accepted and published at LoResMT. The review will be double-blind.


We encourage authors to cite related papers written in ANY language, as long as both the original bibliographic items and their corresponding English translations are provided.


Program committee

  • Ahmad Rashid

  • Alberto Poncelas

  • Alina Karakanta

  • Anoop Kunchukuttan

  • Arturo Oncevay

  • Atul Kr. Ojha

  • Barry Haddow

  • Bogdan Babych

  • Chao-Hong Liu

  • Daria Dzendzik

  • Eva Vanmassenhove

  • Jade Abbott

  • Kalika Bali

  • Koel Dutta Chowdhury

  • Laura Martinus

  • Liangyou Li

  • Majid Latifi

  • Martin Popel

  • Mathias Müller

  • Mehdi Rezagholizade

  • Meghan Dowling

  • Monojit Choudhury

  • Nathaniel Oco

  • Ondrej Bojar

  • Pinkey Nainwani

  • Rico Sennrich

  • Sangjee Dondrub

  • Santanu Pal

  • Shantipriya Parida

  • Surafel Melaku Lakew

  • Tom Kocmi

  • Tommi A Pirinen

  • Tewodros Abebe

  • Thepchai Supnithi

  • Valentin Malykh

  • Vinit Ravishankar

  • Varvara Logacheva

  • Vukosi Marivate

  • Yalemisew Abgaz

Previous editions

LoResMT @ MT Summit 2019

https://sites.google.com/view/loresmt/loresmt-2019 


LoResMT @ AMTA 2018

https://sites.google.com/view/loresmt-2018/

