[SIGARAB] [Paper Announcement] AL-QASIDA LLM evaluation for Dialectal Arabic

Nate Robinson

Aug 27, 2025, 12:17:57 PM
to SIGARAB: Special Interest Group on Arabic Natural Language Processing
Hi SIGARAB colleagues and friends,

📣 In case you missed ACL this year, I wanted to announce AL-QASIDA (Analyzing LLM Quality & Accuracy Systematically In Dialectal Arabic), a comprehensive evaluation of LLM Dialectal Arabic proficiency.

🤖 As many of you have experienced, LLMs often struggle to produce Dialectal Arabic (العامية أو اللهجات). As practitioners attempt to mitigate this, new evaluation methods are needed. AL-QASIDA measures proficiency across four axes (dialectal fidelity, understanding, quality, and diglossia) via cross-lingual, monolingual, and translation eval sets. 

🔍 As part of the eval suite we define a new metric, ADI2, which uses logits from NADI (Abdul-Mageed et al., 2024) and ALDi (Keleg et al., 2023) models to measure whether LLM responses are both sufficiently dialectal and in the desired country-level variety.
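
To make the idea of checking both conditions at once concrete, here is a minimal sketch. It is not the paper's definition of ADI2: the thresholding rule, the 0.5 cutoff, and every name in it (`ScoredResponse`, `is_on_target`, `on_target_rate`) are assumptions for illustration only. It takes per-country logits from a NADI-style dialect identification model and a dialectness score in [0, 1] from an ALDi-style model as precomputed inputs, and reports the fraction of responses that are both dialectal enough and predicted as the target country.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class ScoredResponse:
    """Hypothetical container for the classifier outputs of one LLM response."""
    country_logits: Mapping[str, float]  # assumed: one logit per country (NADI-style model)
    dialectness: float                   # assumed: score in [0, 1] (ALDi-style model)


def is_on_target(resp: ScoredResponse, target_country: str,
                 min_dialectness: float = 0.5) -> bool:
    """True if the top-scoring country is the target AND the response is dialectal enough.

    The 0.5 cutoff is an arbitrary illustrative value, not taken from the paper.
    """
    predicted = max(resp.country_logits, key=resp.country_logits.get)
    return predicted == target_country and resp.dialectness >= min_dialectness


def on_target_rate(responses: Sequence[ScoredResponse], target_country: str,
                   min_dialectness: float = 0.5) -> float:
    """Fraction of responses that satisfy both conditions (0.0 for an empty input)."""
    if not responses:
        return 0.0
    hits = sum(is_on_target(r, target_country, min_dialectness) for r in responses)
    return hits / len(responses)


if __name__ == "__main__":
    # Made-up scores, only to show the call shape.
    demo = [
        ScoredResponse({"EGY": 2.0, "SYR": 0.1}, dialectness=0.8),
        ScoredResponse({"EGY": 0.1, "SYR": 1.5}, dialectness=0.9),
    ]
    print(on_target_rate(demo, "EGY"))
```

A hard conjunction like this is only one way to combine the two signals; the paper and the repository are the authority on how ADI2 is actually computed.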

🧑‍💻 AL-QASIDA can be run from https://github.com/JHU-CLSP/al-qasida with help from our tutorial video: https://youtu.be/_BVEitNmtCI 

📄 In our paper we applied AL-QASIDA to evaluate nine LLMs in eight Arabic varieties across seven text genres, with extensive analysis of the results. Check it out here! https://aclanthology.org/2025.findings-acl.1137/

💡 Feel free to reach out if you have questions about using AL-QASIDA or ideas about how to improve it!

Best,
--
Nate Robinson
n8rro...@gmail.com

Zaid Alyafeai

Aug 27, 2025, 4:22:21 PM
to Nate Robinson, SIGARAB: Special Interest Group on Arabic Natural Language Processing
Nice video. One suggestion: don't use conda; it is very slow and takes up too much space. I installed all the packages in a few seconds using uv with this pyproject.toml:
[project]
name = "al-qasida"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = [
    "datasets==3.3.2",
    "matplotlib",
    "scikit-learn",
    "scipy==1.15.2",
    "transformers==4.50.0",
    "sacrebleu==2.5.1",
    "tiktoken==0.9.0",
    "sentencepiece==0.2.0",
    "google",
    "google-api-python-client",
    "torch==2.6.0",
    "pandas==2.2.3",
    "numpy==1.26.4",
]

Zaid 

Kareem Darwish (‫توجيه العقول‬‎)

Sep 3, 2025, 8:48:08 AM
to SIGARAB: Special Interest Group on Arabic Natural Language Processing
Please consider keeping part of your dataset private and contributing it to the BALSAM consortium, a community-wide LLM evaluation effort. I can give you more information if you are interested.
Kareem
