We are thrilled to announce the first official release of the Arabic ChainBank, the latest tool developed by CAMeL Lab at New York University Abu Dhabi.
The Arabic ChainBank is a derivational resource for Modern Standard Arabic (MSA). It is designed to systematically link all derivatives belonging to the same derivational family in a sequential manner (chain), starting from the root and progressing through each derived form.
Explore Arabic ChainBank here
We would be happy to hear your thoughts, feedback, and suggestions; your input will help us improve and grow the resource in future releases.
Best wishes,
Reham Marzouk
Hi Reham,
Thanks for sharing. This is indeed a very useful resource, and I appreciate you releasing it!
Until now I was using Otakar Smrž's ElixirFM for this purpose, but it seems to be no longer maintained ... maybe because the core is written in Haskell, which is not widely spoken. :-) Almost 10 years ago, when I was studying Arabic and struggled to grasp the derivational morphology, I used (parts of) the ElixirFM data to populate a graph database, modelling the derivational chains. It never got beyond an alpha version, though: https://shabaka.muraija.org/tx/search?q=إبداعي

Just to get an idea, you can try جمهورية or لامتناهي or ز خ ر ف (as graph).
I'm not writing this mail to say "your stuff is cool, but look at my stuff"... :-), but because I was wondering - without having read your paper in detail yet - if the ElixirFM data could be useful to smoke test ChainBank. ElixirFM is about morphology, so edges do not have semantic labels, but simply generating a list of edges known to Elixir but not to ChainBank might be useful for debugging.
In case you want to try it out, feel free to use a version of the ElixirFM data that I prepared to generate "morphological variants of collocations" for Muraija. It is adapted to camel_morph's diacritization conventions: el-khair-camel.derivations.json
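
A minimal sketch of that comparison, assuming both resources have already been flattened to (source, target) word pairs. The loaders are left out because neither the ChainBank export format nor the layout of the JSON file is described here. The names `strip_diacritics` and `missing_edges` are hypothetical, and the toy edges are illustrative only, not taken from either resource. Stripping diacritics is a lossy shortcut that can conflate distinct lemmas, so it is only meant for a first pass:

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Drop combining marks (Unicode category Mn), e.g. short vowels and shadda."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def missing_edges(reference, candidate, normalize=strip_diacritics):
    """Return the edges found in `reference` but not in `candidate`.

    Both arguments are iterables of (source, target) pairs. Words are
    normalized before comparison so diacritization differences are ignored.
    """
    known = {(normalize(s), normalize(t)) for s, t in candidate}
    return sorted(
        (s, t)
        for s, t in set(reference)
        if (normalize(s), normalize(t)) not in known
    )


# Toy data, made up for illustration.
elixir_edges = [("جمهور", "جمهوري"), ("جمهوري", "جمهورية")]
chainbank_edges = [("جُمْهُور", "جُمْهُورِيّ")]

for source, target in missing_edges(elixir_edges, chainbank_edges):
    print(source, "->", target)
```

The resulting list would only be a set of candidates to inspect by hand, since an edge missing from one resource may simply reflect a different analysis rather than a gap.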

That's my five cents for the moment. Unfortunately, I have to leave now before turning this email into something more elaborate :-)
Best,
Mirko
--
You received this message because you are subscribed to the Google Groups "SIGARAB: Special Interest Group on Arabic Natural Language Processing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sigarab+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sigarab/CABZ1H3nqoeYzeWDrVZYX3CcWKG1g4vr1WOZ2gC3TX2dMu%3DbLuw%40mail.gmail.com.
Thank you very much for your interest in our new model, "Arabic ChainBank". I must also say that I truly appreciate your contributions and always enjoy following your valuable discussions on SIGARAB.
Let me briefly share a few points about the Arabic ChainBank. At first glance, it may appear to be a morphological model similar to existing models that follow the root-and-pattern approach for derivation. However, the Arabic ChainBank introduces a unique perspective, positioning itself as a morphosemantic model.
The distinctiveness of the Arabic ChainBank lies in how it conceptualizes form and meaning together. It captures the derivational behavior of Arabic by organizing derivatives into chains, where each chain forms a path from the root to the most complex derivational form. This structure reflects not only the morphological relations between words but also the semantic specification and shift that occur as words are derived.
In this sense, the ChainBank provides deeper linguistic insight that cannot be revealed without such a structured organization. It also highlights phenomena like the interaction between derivation and inflection, and the role of affixation in shaping derivational meaning.
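
As a rough illustration of the chain idea only (this is not the ChainBank's actual data format), a derivational family can be pictured as a tree rooted at the root, with each root-to-leaf path being one chain. The class `Node`, the function `chains`, the relation labels, and the example family are all assumptions made for this sketch:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    form: str
    relation: str = ""  # hypothetical label on the edge from the parent
    children: list["Node"] = field(default_factory=list)

    def add(self, form: str, relation: str) -> "Node":
        """Attach a derived form under this node and return it."""
        child = Node(form, relation)
        self.children.append(child)
        return child


def chains(node: Node, prefix: tuple = ()):
    """Yield every path from the given node down to a leaf."""
    path = prefix + (node.form,)
    if not node.children:
        yield path
    for child in node.children:
        yield from chains(child, path)


# Illustrative family; the labels are placeholders, not ChainBank's inventory.
root = Node("ج م ه ر")
noun = root.add("جمهور", "noun")
adjective = noun.add("جمهوري", "relational adjective")
adjective.add("جمهورية", "abstract noun")

for chain in chains(root):
    print(" -> ".join(chain))
```

Storing the relation on each edge is what would let such a structure carry semantic information in addition to the purely morphological link.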
Using other resources to test the ChainBank, even those that cover only morphology, such as ElixirFM, is an excellent idea and an essential part of the process. It is useful not only for debugging but also for checking the completeness of the chains.
Thank you again for your valuable feedback and the shared data. The project is still in its early stages and will benefit from further studies to support its continued development and improvement.

Assalamu alaikum,
This is more of a design/linguistic question: why did you choose to establish the three-letter root as the core from which all derivations follow, instead of the masdar?
Shukran,
-- Find me at: https://www.kentoseth.com https://fosstodon.org/web/@kentoseth
Linguistically, we understand المصدر (verbal nouns) to be related to verbs, which themselves are derived from roots. If you start with verbs or verbal nouns, you will have a hard time initiating derivations for words like شجرة, ورقة, أسنان, and طريف.