New ACL paper on Arabic dialects


Walid Magdy

Jun 1, 2025, 4:23:57 PM
to SIGARAB: Special Interest Group on Arabic Natural Language Processing

SA Arabic NLP community,

I would like to draw your attention to our paper on Arabic dialects, accepted to the ACL main conference, since I think it is important to anyone working in Arabic NLP and linguistics.

In the paper, we examine some of the most common assumptions about Arabic dialects that have been used in modelling Arabic NLP tasks or building Arabic NLP resources. Examples include the categorisation of dialects at the regional or country level, and the assumption that certain clue words are exclusive to particular dialects.

In our paper “Revisiting Common Assumptions about Arabic Dialects in NLP”, we examine four of these assumptions using a controlled quantitative method. We find that they may not be as accurate as thought, and that modelling some tasks in the field on them may be sub-optimal.

We hope the paper will help NLP researchers form clearer assumptions about Arabic dialects, and that it will be transformative for future NLP tasks that address them.

You can read the preprint at the following link:

https://arxiv.org/pdf/2505.21816

We hope to discuss it further here over email, or in person at ACL in Vienna, inshaAllah.

Walid

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

Mustafa Jarrar

Jul 22, 2025, 2:55:52 AM
to SIGARAB: Special Interest Group on Arabic Natural Language Processing
Thank you for sharing, Walid – it’s great to see dialects becoming increasingly resourced.

We also have an ACL paper presenting a new corpus, Konoos (كنوز), covering 15 dialects across 10 domains (777K tokens) annotated for NER.
Article: https://www.arxiv.org/pdf/2506.12615
Download: https://sina.birzeit.edu/wojood/

Best, Mustafa
