📢 Call for Participation – NADI 2025 Shared Task on Multidialectal Arabic Speech Processing
Hosted as part of ArabicNLP 2025, NADI 2025 brings together researchers in speech and language technologies to tackle some of the most pressing challenges in Arabic speech processing across dialects.
With the growing importance of inclusive, dialect-aware AI systems, this shared task offers a structured platform to advance research across three key subtasks, each supported by curated datasets, baseline models, and evaluation tools.
Subtask 1: Spoken Arabic Dialect Identification (ADI)
Objective: Given a short audio clip, predict the spoken Arabic dialect.
This task builds on prior efforts in dialect identification while leveraging recent advances in multilingual speech models (e.g., Whisper, MMS) and robust embedding techniques (e.g., i-vector, x-vector).
Relevance: Dialect ID plays a crucial role in building adaptive ASR systems, conversational agents, and regional NLP pipelines.
Resources: Benchmark dataset, baseline models, and Codabench evaluation.
Subtask 2: Multidialectal Arabic ASR
Objective: Develop Automatic Speech Recognition (ASR) systems capable of transcribing Arabic speech across diverse dialects. Participants will use the Casablanca dataset and are encouraged to explore zero-shot, few-shot, or fine-tuning strategies to build robust models that handle phonetic and dialectal variation.
Relevance: This task supports advancements in generalizable ASR across under-resourced and linguistically diverse varieties of Arabic.
Resources: Labeled training/dev data, blind test set (Codabench), and baseline systems.
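For participants who want a quick local sanity check before submitting to Codabench, here is a minimal sketch of word error rate (WER), a standard ASR metric. This announcement does not name the official metric, so treating WER as the scoring measure is an assumption; the function name and whitespace tokenization are illustrative only, and the official scorer may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple whitespace tokenization with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Calling `word_error_rate(reference_text, system_output)` on a dev-set utterance returns a fraction; lower is better, and values above 1.0 are possible when the hypothesis has many insertions.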
Subtask 3: Diacritic Restoration (DR)
Objective: Restore missing diacritics in Arabic text (and optionally speech) across MSA, Classical Arabic, and dialects.
This task focuses on developing models that generalize beyond MSA to more challenging spoken and code-switched data. Multimodal approaches (speech + text) are encouraged for better supervision.
Relevance: Diacritic restoration improves downstream tasks such as TTS, parsing, and disambiguation in Arabic NLP.
Resources: Annotated test sets, speech/text corpora, and baselines.
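To make the task concrete, the sketch below builds an (undiacritized input, diacritized target) pair from fully diacritized text by stripping the Arabic diacritic marks. It is an illustration only: the function names are hypothetical, and the set of marks removed here (U+064B to U+0652 plus the superscript alef U+0670) is an assumption that may differ from the inventory used in the official data.

```python
# Assumed diacritic inventory: fathatan through sukun (U+064B-U+0652)
# plus superscript alef (U+0670). The official task data may use a different set.
ARABIC_DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Return the text with the assumed Arabic diacritic marks removed."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)


def make_training_pair(diacritized: str) -> tuple[str, str]:
    """Build a (model input, target) pair for diacritic restoration."""
    return strip_diacritics(diacritized), diacritized


# Example: kataba ("he wrote") with a fatha on each letter.
source, target = make_training_pair("\u0643\u064E\u062A\u064E\u0628\u064E")
```

Here `source` holds the bare consonantal skeleton and `target` the original diacritized word; a restoration model is trained or evaluated on recovering `target` from `source`.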
🛠️ What We Provide:
Carefully curated datasets across subtasks
Starter code and tutorials
Codabench evaluation platforms for submission and scoring
Clear task guidelines to support both academic and practical experimentation
Whether your focus is on speech recognition, language identification, or Arabic NLP more broadly, NADI 2025 offers an excellent opportunity to test novel ideas and benchmark your systems on real-world data.
📝 How to Participate
Fill out this form to register: https://forms.gle/WHsyFMtyaewufN7E8
Join the competitions on Codabench (links are on the NADI website and will also be provided after form submission)
Access each dataset on Hugging Face (links are on the NADI website)
🔗 You can find all relevant information at: https://nadi.dlnlp.ai/2025
📨 Contact us: NadiSha...@gmail.com
🧠 Google Group for announcements, Q&A, and discussion: https://groups.google.com/u/4/g/nadi-shared-task-2025
Join us in advancing robust and inclusive Arabic speech technologies.
#NADI2025 #ArabicNLP #ASR #DialectIdentification #DiacriticRestoration #SpeechTechnology #MultidialectalArabic #ArabicSpeech #NLPResearch #SharedTask
--
You received this message because you are subscribed to the Google Groups "SIGARAB: Special Interest Group on Arabic Natural Language Processing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sigarab+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sigarab/e3b2710e-39f4-42df-a470-290ae7062324n%40googlegroups.com.
On Jun 11, 2025, at 5:38 PM, Nate Robinson <n8rro...@gmail.com> wrote:
On Jun 11, 2025, at 10:34 PM, Hanan Aldarmaki <hanan.a...@gmail.com> wrote:
Dr Abdusalam Nwesri,

Associate Professor,
Faculty of Information Technology,
University of Tripoli,
P.O.Box: 5760 Hai Alandalus,
Tripoli - Libya.
Tel: +218922307021
Email: a.nw...@uot.edu.ly