Masader Form


Zaid Alyafeai

Oct 18, 2025, 8:15:00 AM

to SIGARAB: Special Interest Group on Arabic Natural Language Processing

With the ArabicNLP 2025 conference approaching and many new datasets being published, we are excited to announce Masader Form, a new way to add datasets to Masader. Instead of manual annotation, we rely on a semi-supervised approach in which an LLM extracts the metadata. After submission, the metadata is pushed directly to our GitHub repository, where it can be easily reviewed. We encourage all authors to submit their datasets through the form to make them easily accessible to the research community. Masader now has more than 730 datasets, and we aim to reach 1,000 by the end of the year.

PS: This approach is based on our new research, MOLE and MeXtract.

Zaid
