Call for Participation - CoLI-Dravidian@FIRE 2024: Word-level Code-Mixed Language Identification in Dravidian Languages


Sabur B

Jun 30, 2024, 8:59:57 PM
to Women in Machine Learning

****We apologize for multiple postings of this e-mail**** 

CALL FOR PARTICIPATION 

FIRE 2024 Task - CoLI-Dravidian: Word-level Code-Mixed Language Identification in Dravidian Languages

Held as a shared task at the 16th meeting of the Forum for Information Retrieval Evaluation (FIRE 2024)

December 12-15, 2024. DAIICT, Gandhinagar, India

Website: https://sites.google.com/view/coli-dravidian-2024/datasets?authuser=0

Codalab link: https://codalab.lisn.upsaclay.fr/competitions/19357


Dear All,


We invite researchers and students to participate in CoLI-Dravidian: Word-level Code-Mixed Language Identification in Dravidian Languages, a shared task held at the 16th meeting of the Forum for Information Retrieval Evaluation (FIRE 2024).


Language Identification (LI) involves detecting the language(s) used in a given text, which is a preliminary step for many applications such as sentiment analysis, machine translation, information retrieval, and natural language understanding. In multilingual India, especially among the youth, social media often features code-mixed text, blending local languages with English at various levels. However, this poses significant challenges for LI, particularly when languages are mixed within a single word. Dravidian languages, extensively spoken in southern India, are under-resourced despite their rich morphological structure. These languages face technological challenges, especially in script representation on digital platforms, leading users to prefer Roman or hybrid scripts for communication. This prevalent code-mixing offers vast linguistic data for research yet remains understudied.


To address word-level LI challenges in code-mixed Dravidian languages, we are conducting a shared task that provides code-mixed datasets for four languages (Kannada, Tamil, Malayalam, and Tulu) to encourage the development of advanced LI models.


There will be a real-time leaderboard. Participants may make up to 10 submissions in the training phase and 5 submissions in the testing phase through CodaLab. Each team must select its best submission for ranking.


To download the data and participate, go to: https://codalab.lisn.upsaclay.fr/competitions/19357


Best regards,

The CoLI-Dravidian 2024 Organizing Committee


Important dates


  • 14th June 2024 - Open track websites and training data release

  • 1st July 2024 - Test data release

  • 25th July - Run submission deadline

  • 27th July - Results declared

  • 27th August - Working notes due

  • 10th September - Reviews

  • 30th October - Camera-ready copies of working notes

NOTE: All dates listed here are in the AoE (Anywhere on Earth) time zone.


Organizing Committee


  • Shashirekha Hosahalli Lakshmaiah, Department of Computer Science, Mangalore University, India.

  • Ameeta Agrawal, Department of Computer Science, Portland State University, USA.

  • Fazlourrahman Balouchzahi, CIC, IPN, Mexico.

  • Asha Hegde, Department of Computer Science, Mangalore University, India.

  • Sabur Butt, IFE, Tecnologico de Monterrey, Mexico.

  • Sharal Coelho, Department of Computer Science, Mangalore University, India.

  • Kavya G, Department of Computer Science, Mangalore University, India.

  • Harshitha, Department of Computer Science, Mangalore University, India.

  • Sonith D, Department of Computer Science, Mangalore University, India.



Sabur Butt, Ph.D. (He/Him)
Institute for the Future of Education (IFE)
Tecnológico de Monterrey, Mexico
Address: Av. Eugenio Garza Sada 2501 Sur Tecnológico, 64849 Monterrey, N.L.