Vidyut 0.3 release


Arun

Jan 7, 2025, 2:07:26 AM
to sanskrit-programmers
Vidyut is a Sanskrit toolkit written in Rust. Version 0.3 of our Python bindings brings powerful new functionality to Sanskrit programmers, particularly around generating and querying Sanskrit words.

Vidyut 0.3 is not a final product. Instead, it is another big step toward providing reliable digital infrastructure for all Sanskrit programs. There are a lot of exciting features on our roadmap, and I hope to ship them to you soon.


vidyut.prakriya, a Sanskrit word generator

vidyut.prakriya, which powers many of the derivations on ashtadhyayi.com, is an interface to the Ashtadhyayi that creates words along with their derivations. It provides excellent support for tinantas, krdantas, subantas, and taddhitantas and partial support for samasas and accent.

Vidyut 0.3 brings a greatly improved and expanded API along with hundreds of small bug fixes. It also integrates a variety of performance improvements that make word generation screamingly fast. (On my laptop, I can generate all 356,961 kartari-tinantas in around 2 seconds.)
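As a rough back-of-the-envelope check, the sketch below converts the figures quoted above into throughput. The timing is approximate ("around 2 seconds") and comes from a single laptop, so treat the result as an order-of-magnitude estimate and not a benchmark.

```python
# Figures quoted in the announcement; the timing is approximate.
num_words = 356_961
elapsed_seconds = 2.0

# Throughput and average cost per generated word.
words_per_second = num_words / elapsed_seconds
microseconds_per_word = elapsed_seconds / num_words * 1_000_000

print(f"{words_per_second:,.0f} words/second")
print(f"{microseconds_per_word:.1f} microseconds/word")
```

In other words, each word and its derivation takes on the order of a few microseconds.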

Future releases will add more rules and stronger support for accent rules.

vidyut.kosha, a morphological store

vidyut.kosha provides a space-efficient morphological dictionary that stores close to 100 million Sanskrit words at less than one byte per word. Our kosha includes:
- all dhatus from the Dhatupatha on Ashtadhyayi.com
- all combinations of upasarga + dhatu from the Upasargarthacandrika
- various combinations of sanadi pratyayas, including णिच्, सन्, यङ्, यङ्-लुक्, णिच्-सन्, and सन्-णिच्
- all combinations of (prayoga, lakara, purusha, vacana) for these dhatus
- all combinations of (krt, linga, vibhakti, vacana) for these dhatus, including स्य-शतृ, स्य-शानच्, and यक्-शानच्
- various pratipadikas and avyayas scraped from Sanskrit dictionary files, inflected for (linga, vibhakti, vacana)
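
To give a feel for how quickly the (prayoga, lakara, purusha, vacana) combinations in the list above multiply, here is a minimal sketch that enumerates them with the standard library. The category labels are the traditional names written as plain strings for illustration; they are an assumption, not vidyut's own enum values, and the real number of forms per dhatu may differ because one slot can yield several forms or none.

```python
from itertools import product

# Traditional category names as plain strings (illustrative assumption,
# not vidyut's API).
prayogas = ["kartari", "karmani", "bhave"]
lakaras = [
    "lat", "lit", "lut", "lrt", "let", "lot",
    "lan", "vidhi-lin", "ashir-lin", "lun", "lrn",
]
purushas = ["prathama", "madhyama", "uttama"]
vacanas = ["eka", "dvi", "bahu"]

# Every (prayoga, lakara, purusha, vacana) slot for a single dhatu.
tinanta_slots = list(product(prayogas, lakaras, purushas, vacanas))

print(len(tinanta_slots))
print(tinanta_slots[0])
```

Multiplying these slots by thousands of dhatus, upasarga combinations, and sanadi pratyayas is what pushes the total into the tens of millions.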

Future releases will add more words, more data, and more API options.

vidyut.lipi, a new transliterator

vidyut.lipi aims to provide the correctness of Aksharamukha with the ease of use of indic_transliteration. It has a robust test suite and handles various edge cases, such as Unicode normalization, Grantha numeral transliteration, and the ISO ':' separator.
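
To illustrate why Unicode normalization counts as an edge case, the sketch below uses only Python's standard `unicodedata` module. It does not show how vidyut.lipi works internally; it only shows that the same Devanagari letter can arrive as two different code point sequences, which a transliterator has to treat as equal.

```python
import unicodedata

# The same letter encoded two ways.
precomposed = "\u0958"        # DEVANAGARI LETTER QA as one code point
decomposed = "\u0915\u093c"   # KA followed by the NUKTA sign

# The raw strings are not equal, so a naive lookup table would miss one of them.
raw_equal = precomposed == decomposed

# After normalization, both inputs map to the same sequence.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
normalized_equal = nfc_a == nfc_b

print(raw_equal, normalized_equal)
```

Normalizing input before mapping characters is one common way to make both spellings behave identically.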

Future releases will continue to improve quality. There are many transliterators available today, but we hope to create a best-in-class transliterator available throughout the stack.

vidyut.chandas, a metrical classifier

vidyut.chandas classifies a variety of Sanskrit meters. It is not quite state-of-the-art, but it is easy to use and is included as an extra in this release.

Future releases will match the best-in-class solutions available today.

Regressions in vidyut.cheda

To avoid getting stuck in an even longer development cycle, we have shipped this release despite some quality regressions in vidyut.cheda, which we will address in our next release as time allows.

Now that neural segmentation engines are readily available, we think that vidyut.cheda is less important and are considering deprecating it.

Arun

Arun

Jan 12, 2025, 4:53:18 PM
to sanskrit-programmers
I've just released Vidyut 0.3.1, which fixes a variety of minor bugs filed over the past week.

Vidyut uses semantic versioning, so 0.3.1 is compatible with the data from version 0.3.0.

Thanks especially to Vishvas for filing so many bug reports and feature requests and to Shreevatsa for feedback on the documentation.


Arun

Arun

Jan 22, 2025, 2:29:32 AM
to sanskrit-programmers
I've just released Vidyut 0.4.0, which adds many more entries to the Kosha and also adds some new helper methods.

Vidyut uses semantic versioning, so please use the new 0.4.0 data release. Older data files will not work with this release.
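
The compatibility rule implied by the two release notes in this thread (0.3.1 works with 0.3.0 data, while 0.4.0 needs new data) can be sketched as a small check. `data_is_compatible` is a hypothetical helper written for illustration, not a function that vidyut provides, and the rule it encodes (same major and minor version) is an assumption inferred from these two releases.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def data_is_compatible(library_version: str, data_version: str) -> bool:
    """Hypothetical rule: data files match if major and minor versions agree.

    For 0.x releases, a minor version bump is treated as a breaking change.
    """
    return parse_version(library_version)[:2] == parse_version(data_version)[:2]


print(data_is_compatible("0.3.1", "0.3.0"))
print(data_is_compatible("0.4.0", "0.3.1"))
```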

Thanks to Vikram B for his code contributions. Thanks also to Neelesh B for sharing Ashtadhyayi data, Avinash V for uncovering a major kosha bug, and Vishvas for continuing to file bugs and feature requests.

Aditya Kumar

Aug 7, 2025, 1:28:40 AM
to sanskrit-programmers
Arun ji, is the Vidyut kosha generated through vidyut.prakriya, or is it collected from real prayoga in a corpus? My reason for asking: can the prakriya be tested against the kosha?

Avinash L Varna

Aug 8, 2025, 7:41:58 PM
to sanskrit-p...@googlegroups.com
My understanding is that vidyut.prakriya is used to generate the kosha data provided with the release (see the generation code: https://github.com/ambuda-org/vidyut/blob/main/vidyut-data/src/bin/create_kosha.rs). So we couldn't test the prakriya by taking an entry in the kosha and checking whether the prakriya outputs the same form, since that is how the entry was generated in the first place. We could, however, use the kosha to validate the prakriya and the generation process indirectly: if an invalid form is in the kosha, then either the data input to the prakriya or the prakriya itself may need an update so that the invalid form is no longer generated.
(Arun or others more familiar with vidyut - feel free to correct anything)
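
The reasoning above can be sketched with toy data. Everything here is hypothetical: `toy_generator` stands in for a word generator, the forms are written in SLP1, and "Bavatanti" is a deliberately made-up invalid form planted to play the role of a generator bug. None of this is vidyut's actual API or data.

```python
def toy_generator(stem: str) -> set[str]:
    """Stand-in for a word generator; the last ending is a planted bug."""
    return {stem + ending for ending in ("ti", "taH", "nti", "tanti")}


# The kosha is built from the generator's own output.
kosha = toy_generator("Bava")

# Circular check: regenerating and looking up in the kosha always passes,
# so it cannot reveal the planted bug.
circular_ok = toy_generator("Bava") <= kosha

# Indirect check: compare kosha entries against forms from an independent
# source (here, a tiny hand-reviewed list).
independent_valid_forms = {"Bavati", "BavataH", "Bavanti"}
suspect_forms = kosha - independent_valid_forms

print(circular_ok, sorted(suspect_forms))
```

Any form that surfaces in `suspect_forms` points back to either the generator's input data or the generator itself.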
