Dear colleagues,
I would like to express how proud I am of the strong participation and visibility of the Arabic NLP community at LREC 2026.
This year’s contributions covered a remarkable range of topics, including speech technologies, dialect processing, multimodal reasoning, morphology, lexicons, essay scoring, translation, cultural QA, and Arabic benchmarks for LLMs. The accepted papers and workshops reflect both the depth and diversity of our growing community. Below is a summary list of the main conference papers that I was able to identify. If I missed any, please respond to this email to let us know about your work.
It was especially exciting to see the continued success of the OSACT workshop series, as well as the organization of Nakba NLP 2026. Equally inspiring was the participation of many members of the Arabic NLP community in the main conference and in workshops on topics other than Arabic NLP.
Congratulations to everyone who participated in making this happen. The Arabic NLP community continues to make an important and visible impact on the global NLP landscape.
For those who will be in Palma, let's get together!
Best regards,
ADAB: Arabic Dataset for Automated Politeness Benchmarking - a Large-Scale Resource for Computational Sociopragmatics — Hend Al-Khalifa, Nadia Ghezaiel, Maria Bounnit, Hend Hamed Alhazmi, Noof Abdullah Alfear, Reem Fahad Alqifari, Ameera Masoud Almasoud, and Sharefah Ahmed Al-Ghamdi
Arabic ChartSumm: An English-to-Arabic Benchmark for Metadata-to-Text Summarization — Passant Elchafei and Amany Fashwan
Efficient Adaptation of English Language Models for Morphologically Rich and Underrepresented Languages: The Case of Arabic — Ahmed Samy Eldamaty, Mohamed Maher Zenhom Abdelrahman, Mohamed Mostafa Ibrahim Elbehery, Mariam Ashraf, and Radwa Elshawi
Are LLMs Good Text Diacritizers? An Arabic and Yoruba Case Study — Hawau Olamide Toyin, Samar Mohamed Magdy, and Hanan Aldarmaki
Corruption-Based Data Augmentation for Arabic Essay Scoring: A Preliminary Study on the Organization Trait — May Saed Bashendy and Tamer Elsayed
Ramsa: A Large Sociolinguistically Rich Emirati Arabic Speech Corpus for ASR and TTS — Rania Al-Sabbagh
AraHopeCorpus: Annotation Guidelines and Dataset for Hope Speech in Arabic Social Media Crisis Discourse — Esra'a Ahmad Sharqawi and Wajdi Zaghouani
Audience Engagement with Arabic Women's Social Empowerment and Wellbeing: A Decadal Corpus — Wajdi Zaghouani, Mabrouka Bessghaier, Md. Rafiul Biswas, and Shimaa Amer Ibrahim
Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse — Aisha Ali Al-Athba and Wajdi Zaghouani
ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination — Wajdi Zaghouani, Shimaa Amer Ibrahim, Mabrouka Bessghaier, and Houda Bouamor
ArPoMeme: An Annotated Arabic Multimodal Dataset for Political Ideology and Polarization — Wajdi Zaghouani, Kais Attia, Md. Rafiul Biswas, and Fadhl Eryani
JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media — Wajdi Zaghouani, Shimaa Amer Ibrahim, Mabrouka Bessghaier, and Houda Bouamor
Structured Prompting for Arabic Essay Proficiency: A Trait-Centric Evaluation Approach — Salim Al Mandhari, Hieu Pham Dinh, Mo El-Haj, and Paul Rayson
TDMulti: A Tunisian Dialect-Modern Standard Arabic Multitask Corpus with a Context-Aware Cross-Attention BERT Model — Roua Torjmen and Kais Haddar
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark — Sara Ghaboura, Shubham Patle, Ketan More, Wafa Hamad Mohamed Alghallabi, Omkar Thawakar, Jorma Laaksonen, Hisham Cholakkal, Salman Khan, and Rao Anwer
Benchmarking Arabic Authorship Attribution and Style Transfer with Large Language Models — Injy Hamed, Bashar Alhafni, Nizar Habash, and Thamar Solorio
A Comprehensive Full-Form Lexicon for Arabic NLP and Speech Technology — Yannis Haralambous and Jack Halpern
DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models — Malik H. Altakrori, Nizar Habash, Teresa Lynn, Younes Samih, Abed Alhakim Freihat, Kirill Chirkunov, Muhammed AbuOdeh, Radu Florian, Preslav Nakov, and Alham Fikri Aji
WhiteHouse: Translation of the Casablanca Corpus for Multi-dialectal Arabic Speech Translation — Fethi Bougares, Salima Mdhaffar, and Yannick Estève
Mu'jam Arriyadh: A Comprehensive Lexicon for Contemporary Arabic Language — Afrah A. Altamimi, Abdulrahman Alosaimy, Halah Munif Alharbi, Hawra Aljasim, Muneera Alhoshan, Amal Almazrua, Hanan Alharbi, Abdulrahman Saeed Alshehri, Bayan M. Almuqhim, Maryam H. Algarny, Yahya A. Asiri, Abdullah I. Alharbi, Saleh Zaidan Albalawi, Fawziah Mohammed Asiri, Sara Ali Alhifthi, and Abdullah Alfaifi
Saudi ASWAT: A Large-Scale Corpus of Spontaneous Saudi Arabic Speech — Abdullah I. Alharbi, Afrah A. Altamimi, Muneera Alhoshan, Amal Almazrua, Halah Munif Alharbi, Bayan M. Almuqhim, Hawra Aljasim, Abdulrahman Alosaimy, Yahya A. Asiri, and Abdullah Alfaifi
A Bilingual Bimodal Benchmark for Arabic-English NLP across Grammatical Correction, Essay Scoring, Morphological Tagging, and Speech Recognition — Bashar Alhafni, Injy Hamed, Fadhl Eryani, David Palfreyman, and Nizar Habash
A Large and Balanced Multi-Domain Arabic Corpus Annotated for Morphology, Syntax, and Readability — Khalid N. Elmadani, Adel Mahmoud Wizani, Hanada Taha Thomure, and Nizar Habash
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants — Hunzalah Hassan Bhatti and Firoj Alam
Morphemes without Borders: Evaluating Root–Pattern Morphology in Arabic Tokenizers and LLMs — Yara Yousif Alakeel, Chatrine Qwaider, Hanan Aldarmaki, and Sawsan Alqahtani
Masrad: Arabic Terminology Management Corpora with Semi-Automatic Construction — Mahdi Nasser, Laura Sayah, and Fadi Zaraket
--
You received this message because you are subscribed to the Google Groups "SIGARAB: Special Interest Group on Arabic Natural Language Processing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sigarab+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sigarab/CAFfBGVnk6dMq35vv%3DAfPkXtjesjkbR-56m6%2BBGivGpG5k-3yjQ%40mail.gmail.com.
--
Dear Nizar,
Thank you very much for this wonderful message and for highlighting the strong presence of the Arabic NLP community at LREC 2026.
It is truly encouraging to see Arabic NLP continuing to grow in visibility across LREC, ACL, and other major NLP venues. The breadth of contributions this year clearly shows how active and diverse the community has become, from core linguistic resources and dialect processing to multimodal models, retrieval, speech, translation, and Arabic LLM evaluation. We are really proud to be part of this momentum.
I would also like to share some of our recent work from the NAMAA Community team, which will appear through NakbaNLP and OSACT7 at LREC 2026:
The NAMAA team has been working actively on several Arabic NLP and Arabic multimodal AI directions, and we would be very happy for colleagues in the community to check out the work, share feedback, and explore possible collaborations.
Congratulations again to everyone contributing to this exciting progress. Looking forward to seeing many of you in Palma.
Best regards,
Omer Nacar
--
Thank you Nizar and everyone. Indeed, this looks like a very productive LREC 2026 for the Arabic NLP community.
From our side, MARSAD Lab will be participating in both the LREC main conference and associated workshops with contributions spanning multilingual NLP, computational social science, LLM safety, political discourse, affective computing, low-resource languages, and responsible AI (all papers are listed below). I look forward to meeting many of you attending LREC this year.
• Zaghouani, W.; Biswas, M. R.; Bessghaier, M.; Ibrahim, S. A.; & Mikros, G. (2026). ClimateChat-300K: A Multi-Modal Facebook Dataset for Understanding Diverse Perspectives in Climate Communication. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Sharqawi, E. A.; & Zaghouani, W. (2026). AraHopeCorpus: Annotation Guidelines and Dataset for Hope Speech in Arabic Social Media Crisis Discourse. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Zaghouani, W.; Bessghaier, M.; Biswas, M. R.; & Ibrahim, S. A. (2026). Audience Engagement with Arabic Women’s Social Empowerment and Wellbeing: A Decadal Corpus. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Ali Al-Athba, A.; & Zaghouani, W. (2026). Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Zaghouani, W.; Ibrahim, S. A.; & Bouamor, H. (2026). ArabDiscrim: A Decade-Long Arabic Facebook Corpus on Racism and Discrimination. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Zaghouani, W.; Attia, K.; Biswas, M. R.; & Eryani, F. (2026). ArPoMeme: An Annotated Arabic Multimodal Dataset for Political Ideology and Polarization. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Zaghouani, W.; Ibrahim, S. A.; Bouamor, H.; & Bessghaier, M. (2026). JobArabi: An Arabic Corpus and Analysis of Job Announcements from Social Media. In Proceedings of the 15th Language Resources and Evaluation Conference (LREC 2026).
• Wajdi Zaghouani, Kholoud Khalil Aldous and Isra Fejzullaj. AlbanianLLMSafety: A Safety Evaluation Dataset for Large Language Models in Albanian. In Proceedings of the SIGUL 2026 Workshop (Special Interest Group on Under-resourced Languages) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani, Shimaa Amer Ibrahim, Aruzhan Muratbek, Olzhasbek Zhakenov and Adiya Akhmetzhanova. KZ-SafetyPrompts: A Kazakh Safety Evaluation Prompt Dataset for Large Language Models. In Proceedings of the SIGUL 2026 Joint Workshop with ELE, EURALI and DCLRL at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani, Kholoud Khalil Aldous and Yicheng Gao. Beyond English and Evasion: A Human-Annotated Multi-Domain Benchmark for High-Stakes LLM Safety Evaluation in Chinese. In Proceedings of the RESOURCEFUL 2026 Workshop at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani, Mabrouka Bessghaier and Kais Attia. Nakba Discourse 2025: A Bilingual Social Media Dataset for Collective Trauma Analysis. In Proceedings of Nakba-NLP 2026: The 2nd International Workshop on Nakba Narratives as Language Resources at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Kholoud Khalil Aldous, Md Rafiul Biswas, Mabrouka Bessghaier, Shimaa Amer Ibrahim, Kais Attia and Wajdi Zaghouani. StanceNakba Shared Task: Actor and Topic-Aware Stance Detection in Public Discourse. In Proceedings of Nakba-NLP 2026: The 2nd International Workshop on Nakba Narratives as Language Resources at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Ashhadul Islam, Md Rafiul Biswas, Samir Brahim Belhaouari and Wajdi Zaghouani. Pushing Boundaries at NakbaVirality: Recursive Prompt Improvement for Multimodal Virality Classification. In Proceedings of Nakba-NLP 2026: The 2nd International Workshop on Nakba Narratives as Language Resources at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Accountable Human-AI Deliberation with LLMs: Scaling Collective Intelligence through Symbiotic Scaffolding. In Proceedings of the 2nd Workshop on Language-driven Deliberation Technology (DELITE 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Beyond the Black Box: Ethical and Theoretical Grounding in Affective Computing. In Proceedings of the Workshop on Computational Affective Science (CAS 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Grounding Information Disorder in NLP: A Theoretical and Operational Framework. In Proceedings of the Workshop on Information Disorder (INDOR 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Cultural Adaptation in Large Language Models for Political Discourse. In Proceedings of the Political Natural Language Processing Workshop (Political NLP 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Toward Cognitive Alignment in Large Language Models: Integrating Linguistic Theory and Human Data. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. Toward Responsible and Epistemically Grounded Multilingual LLMs for Computational Social Science and Humanities. In Proceedings of the Workshop on Large Language Models for Social Sciences and Humanities (LLM4SSH 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. The Generator–Eraser Paradox: Community Guidelines for Responsible LLM-Assisted Dialect Resource Creation. In Proceedings of the Dialect Resources Workshop (DialRes 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
• Wajdi Zaghouani. High Resource Bias in AI-Driven Neology: Structural Inequality in Lexical Innovation. In Proceedings of the Workshop on Neology and Large Language Models (NeoLLM 2026) at the Language Resources and Evaluation Conference (LREC 2026). Palma de Mallorca, Spain, 11–16 May 2026.
----
Wajdi Zaghouani, Ph.D.
Associate Professor,
Communication Program
Northwestern Qatar | Education City
T +974 4454 5232 | M +974 3345 4992