Database of Sanskrit Verb forms


dhaval patel

Jun 24, 2016, 12:32:47 PM
to indo...@list.indology.info, bvpar...@googlegroups.com, samskrita, sanskrit-p...@googlegroups.com
Apologies for cross posting
---------------------------------------

Respected scholars,

As a result of our work on the tiGanta generation tool over the past few years,
we present the following database of generated Sanskrit verb forms for the Sanskrit NLP community.

www.sanskritworld.in/sanskrittool/SanskritVerb/generatedforms/verbforms.tar.gz

The file is in XML format, and a typical line looks like this -

<f form="Bavati"><root name="BU" num="01.0001"/><law/><tip/></f>

The data set has a total of 267797 entries.
It covers verb forms for around 2240 verbs.
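
For readers who want to load the file, here is a minimal Python sketch that parses one line of this shape with the standard library. It assumes every entry follows the sample above: an `f` element with a `form` attribute, a `root` child carrying `name` and `num`, and then empty child elements (`law` and `tip` in the sample, which look like the lakAra and the suffix). The function name `parse_form_line` is made up for illustration and is not part of the project code.

```python
import xml.etree.ElementTree as ET


def parse_form_line(line):
    """Parse one <f> element into (form, root_name, root_number, tags).

    Assumption: the layout matches the single sample line in the post.
    """
    elem = ET.fromstring(line.strip())
    root = elem.find("root")
    # Every child other than <root> is collected as a tag name.
    # In the sample these are <law/> and <tip/>.
    tags = [child.tag for child in elem if child.tag != "root"]
    return elem.get("form"), root.get("name"), root.get("num"), tags


sample = '<f form="Bavati"><root name="BU" num="01.0001"/><law/><tip/></f>'
print(parse_form_line(sample))
# prints ('Bavati', 'BU', '01.0001', ['law', 'tip'])
```

Parsing line by line keeps memory use low for a file of this size, but it only works if each entry really sits on its own line, which is an assumption here.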

Project code page -
https://github.com/drdhaval2785/SanskritVerb/
Project testing page -
http://www.sanskritworld.in/sanskrittool/SanskritVerb/tiGanta.html
N.B. - The code also shows the applicable Paninian rules and their effect on the verb forms, step by step.

Current version -
v1.10.0 Date 24 June 2016
Authors -
Dr. Dhaval Patel and Dr. Shivakumari Katuri.


Acknowledgements -
1. Prof. Amba Kulkarni of Univ. of Hyderabad for allowing us access to her database of verb forms and various dhAtuvRttis.
2. Prof. Gerard Huet of INRIA for allowing us access to his database of verbforms.
We compared our results against these two existing databases and made the necessary corrections where there were evident errors.

--
Dr. Dhaval Patel, I.A.S
Collector and District Magistrate, Anand

Nityanand Misra

Jun 24, 2016, 9:21:45 PM
to भारतीयविद्वत्परिषत्, sams...@googlegroups.com

Bravo Dr. Patel. Great work as you always do. Will download when I get some time and comment on this. 

Can you please confirm what all senses are covered (प्रकृत्यां, सनि, णिचि, यङि,यङ्लुकि, etc)

Also, is there any estimate of the accuracy of the generated verb forms (e.g. 95% or 99%) or have they been cross-checked against a source like Dhaturupanandini?

dhaval patel

Jun 25, 2016, 2:20:35 AM
to samskrita, bvpar...@googlegroups.com

> Can you please confirm what all senses are covered (प्रकृत्यां, सनि, णिचि, यङि,यङ्लुकि, etc)

Right now it is only प्रकृत्याम्. The code can and does generate सनादि forms, but as they have not been checked, they are not included in the database. They can be added after validation, which may take 2-3 months or so.

> Also, is there any estimate of the accuracy of the generated verb forms (e.g. 95% or 99%) or have they been cross-checked against a source like Dhaturupanandini?

This is a tricky question to answer. The methodology was to compare the generated verb forms against the combined database of UoHyd and INRIA.
Forms missing from that database that seemed OK to me were put in https://github.com/drdhaval2785/SanskritVerb/blob/master/Data/okforms.txt, and forms that were wrong but that we decided not to handle for now were put in https://github.com/drdhaval2785/SanskritVerb/blob/master/Data/notnow.txt.
The whole script was then rerun repeatedly until there were no generated forms absent from all of UoHyd, INRIA, okforms and notnow.

The logic presumes that no two data entry operators or algorithms created the same wrong forms.
For specific verbs, I also took a cursory look at the verb commentaries.
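
The check described above boils down to a set difference, which can be sketched roughly as follows. This is only an illustration and not the project's actual script: the function name `unverified_forms` is hypothetical, and the forms below are toy data rather than the contents of the real files.

```python
def unverified_forms(generated, *reference_sets):
    """Return generated forms that appear in none of the reference sets."""
    known = set().union(*reference_sets)
    return sorted(set(generated) - known)


# Toy data for illustration only (not taken from the real databases).
generated = ["Bavati", "BavataH", "Bavanti", "BavatiX"]
uohyd = {"Bavati"}
inria = {"Bavati", "BavataH"}
okforms = {"Bavanti"}
notnow = set()

leftover = unverified_forms(generated, uohyd, inria, okforms, notnow)
print(leftover)
# prints ['BavatiX']
```

Each rerun would end with either a code fix or a new entry in okforms or notnow, and the loop stops when the leftover list is empty.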

Not sure about the accuracy, though. As the universe is large, a random sample of 1% could be tested and the result extrapolated. That is not on the wishlist as of now, though.
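
If someone wants to try the sampling idea, a rough Python sketch could look like this. It assumes the sampled forms are then checked by hand, and it extrapolates with a simple normal-approximation interval, which is my choice for illustration and not something the project uses. The names `draw_sample` and `estimate_accuracy` are hypothetical.

```python
import math
import random


def draw_sample(forms, fraction=0.01, seed=0):
    """Pick a reproducible random sample of roughly `fraction` of the forms."""
    forms = list(forms)
    rng = random.Random(seed)
    k = max(1, round(len(forms) * fraction))
    return rng.sample(forms, k)


def estimate_accuracy(n_checked, n_correct, z=1.96):
    """Return (estimate, lower, upper) for the share of correct forms.

    Uses the normal approximation to the binomial; z=1.96 gives a
    roughly 95% interval.
    """
    p = n_correct / n_checked
    half_width = z * math.sqrt(p * (1 - p) / n_checked)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Stand-in for the 267797 entries: integers instead of real verb forms.
sample = draw_sample(range(267797))
print(len(sample))
```

With 267797 entries, a 1% sample is about 2678 forms, so the manual checking effort is still considerable.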

dhaval patel

Jun 25, 2016, 9:33:47 AM
to bvpar...@googlegroups.com, samskrita, indo...@list.indology.info, sanskrit-p...@googlegroups.com
As requested by members, a CSV file in Devanagari is now available, in the format
verbform,verb,lakAra,suffix,verbnumber
e.g. अंसयति,अंस,लट्,तिप्,10.0460.

See www.sanskritworld.in/public/sanskrittool/SanskritVerb/generatedforms/verbformsdeva.tar.gz

The verb number is deliberately left in Roman digits rather than converted to Devanagari, so that it remains easy to handle by machine.
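
A minimal Python sketch for reading rows in this format with the standard `csv` module is given below. It assumes the file has no header row and that the trailing full stop in the example above is sentence punctuation rather than data. Splitting the verb number at the dot into two parts is also an assumption about its meaning. The function name `read_verb_rows` is made up for illustration.

```python
import csv
import io

# Column names as listed in the post.
FIELDS = ["verbform", "verb", "lakAra", "suffix", "verbnumber"]


def read_verb_rows(text):
    """Parse CSV text into a list of dicts keyed by the five column names."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    return list(reader)


rows = read_verb_rows("अंसयति,अंस,लट्,तिप्,10.0460\n")
first = rows[0]
# Assumption: the part before the dot and the part after it are separate keys.
prefix, serial = first["verbnumber"].split(".")
print(first["verbform"], first["verb"], prefix, serial)
```

Because the number stays in ASCII digits, it can be split or sorted without any script conversion.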

Taff Rivers

Jun 25, 2016, 11:51:55 AM
to samskrita, indo...@list.indology.info, bvpar...@googlegroups.com, sanskrit-p...@googlegroups.com, Eddie Hadley
Dr,

  Are you aware that this information is still available, well provisioned and highly cross-referenced, in the golden olde Ganakashtadhyayi project?

It comes in various combinations of Devanagari and Roman diacritics, albeit in legacy fonts, which are available for download.

It's old, but even so it runs well under Windows 10.

The site hasn't been maintained for years, but it would at least serve as a model for modernising.

As a programmer, I have enquired about the availability of the source code, but... even the forum is defunct.


Taff

  The enclosed screen sample illustrates the style.
dhatupatha.png

dhaval patel

Jun 25, 2016, 12:35:41 PM
to samskrita

Yes, I am aware of the Ganakashtadhyayi software.

विश्वासो वासुकिजः (Vishvas Vasuki)

Jun 27, 2016, 12:08:49 PM
to sanskrit-programmers, bvpar...@googlegroups.com, samskrita, dhaval patel
(-indology, since I'm not a member there)

We also have stardict files corresponding to shrI dhaval's generated output.

Example output:

aharaH

हृञ् (हरणे, भ्वादिः, उ०, अनिट्, लङ्)
 अहरत् अहरताम् अहरन्
 अहरः अहरतम् अहरत
 अहरम् अहराव अहराम
 अहरत अहरेताम् अहरन्त
 अहरथाः अहरेथाम् अहरध्वम्
 अहरे अहरावहि अहरामहि

Dictionary file linked here: https://github.com/sanskrit-coders/stardict-sanskrit/tree/master/sa-vyAkaraNa/tars (look for and click on dhaval-tiNanta, then download the raw file)


Tips on how to access these (and other) dictionaries on your favorite mobile or stationary device: https://sites.google.com/site/sanskritcode/dictionaries#TOC-How-to-install-and-use-dictionaries-on-your-device-


--
Vishvas /विश्वासः
