We are pleased to announce the launch of the Nuanced Arabic Dialect Identification (NADI) 2026 shared task! This year NADI will focus on a breadth of spoken language processing tasks in diverse conditions to better understand the challenges facing real-world dialectal Arabic speech processing.
The tasks this year include:
Robust & Mixed-Dialect ASR: Evaluate Arabic ASR systems under challenging acoustic and dialectal conditions: noisy real-world recordings, hidden sub-country variation, and within-utterance code-switching.
Cross-domain Spoken Dialect Identification: Classify spoken Arabic into one of 11 country-level dialects under cross-domain conditions, where the test set differs in domain and source from training and development data.
Dialectal Arabic Text-to-Speech: Generate dialectal Arabic speech from text prompts that reflects the pronunciation, rhythm, and prosodic characteristics of the target dialect—establishing evaluation practice for an area where benchmarks remain scarce.
Automatic Speech Translation: Translate speech in nine Arabic dialects into English, focusing on robustness to dialectal variation, spontaneous speech, and real-world speaking conditions.
Spoken Language Understanding: Extract semantic information directly from Tunisian-dialect speech—beyond literal transcription—through intent recognition and slot filling under spontaneous, possibly code-switched conditions.
The registration deadline is July 20, 2026. For more details and other important dates, please see our website.
NADI 2026 will be presented as part of the ArabicNLP conference, co-located with EMNLP 2026, October 24-29, 2026.
We hope to see you there!
NADI 2026 Organizing Committee