vRtta-database


विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 12:54:17 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 9:01 AM, G S S Murthy <murt...@gmail.com> wrote:
If you want to develop an open-source program for identifying vruttas, I suggest that instead of manually building up a dictionary of vruttas, we develop a program that builds up a database of vruttas. Every time you input a verse, the program scans it and checks whether it matches an entry in its database; if it does not, it gives you the option of adding it to the database under a name of your choice.
Murthy  
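The lookup-or-add loop described above could be sketched roughly as follows. This is only an illustration: the class and function names are hypothetical, and the verse is assumed to have been scanned already into a laghu/guru string such as "LGLG" (the scanning step itself is not shown).

```python
from typing import Dict, Optional


class VrttaStore:
    """Hypothetical in-memory store mapping a laghu/guru pattern to a name."""

    def __init__(self) -> None:
        self._by_pattern: Dict[str, str] = {}

    def lookup(self, pattern: str) -> Optional[str]:
        # Return the stored name for this pattern, or None if it is unknown.
        return self._by_pattern.get(pattern)

    def add(self, pattern: str, name: str) -> None:
        # Refuse to silently overwrite an existing entry.
        if pattern in self._by_pattern:
            raise ValueError("pattern already present: " + pattern)
        self._by_pattern[pattern] = name


def identify_or_add(store: VrttaStore, pattern: str,
                    name_if_new: Optional[str] = None) -> Optional[str]:
    """Return the known name; otherwise add the pattern if a name is offered."""
    found = store.lookup(pattern)
    if found is not None:
        return found
    if name_if_new is not None:
        store.add(pattern, name_if_new)
    return name_if_new
```

A real tool would ask the user for the name interactively; here it is simply passed in as an argument.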


But the problem currently is not a paucity of vRtta-s in the database - thanks to Ananda Mishra, we do not have that problem.

Now that we have collected the vRtta definitions mechanically, some manual curation is required. We need to record:
  • examples (esp from popular verses), 
  • lakShaNa-s, 
  • sUtra-s, 
  • yati-sthAna-s (caesura), 
  • Its jAti(s)
  • link to recitation styles, 
  • classify based on the gati (druta or vilambita or mishra)
  • identify whether it sticks to a regular tAla
  • a measure of its popularity
  • notes about the bhAva which goes naturally with it
  • notes about relationship with other metres.
  • ease of composition
  • memorability

A Chandas-identifying program can interact with this database, which will grow increasingly rich for the benefit of all sahRdaya-s. For its basic function of identification, it would concern itself only with the definition.
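One possible shape for a record holding the fields listed above, as a rough sketch only. The field names, types and the "LG" encoding of the definition are assumptions made for illustration, not the layout of the actual spreadsheet.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VrttaRecord:
    """Hypothetical record for one vRtta; every field name is illustrative."""
    name: str
    definition: str                      # laghu/guru pattern, e.g. "LGLG"
    examples: List[str] = field(default_factory=list)
    lakshanas: List[str] = field(default_factory=list)
    sutras: List[str] = field(default_factory=list)
    yati_sthanas: List[int] = field(default_factory=list)
    jatis: List[str] = field(default_factory=list)
    recitation_links: List[str] = field(default_factory=list)
    gati: Optional[str] = None           # "druta", "vilambita" or "mishra"
    regular_tala: Optional[bool] = None
    popularity: Optional[float] = None
    bhava_notes: str = ""
    related_metres: List[str] = field(default_factory=list)
    ease_of_composition: Optional[float] = None
    memorability: Optional[float] = None


def identification_key(record: VrttaRecord) -> str:
    # The identifier only needs the definition; the rest is curated metadata.
    return record.definition
```

Keeping the definition separate from the curated fields means the identifier keeps working while the manual annotation is still incomplete.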

Such a rich database can easily be converted to stardict dictionaries for easy reference on phones and all sorts of devices, used to make flashcards, etc., besides being valuable in itself to the verse maker.

With this in mind I have seeded https://docs.google.com/spreadsheet/ccc?key=0Al_QBT-hoqqVdDhjNVRMTXdsdDVTZG9kcDIwVnhhN0E&pli=1#gid=7 with some vRtta-s. The sheet named छन्दस् had my collection prior to the addition of the Ananda Mishra dump (the सम, अर्धसम and विषम sheets), and I hope to transfer information from there to the newly added sheets.

If you would like to contribute to the creation of this database, please let me know.

Note:
1. Marking the yati-sthAna is the most immediate need. Vague or loose recommendations are put within parentheses or brackets, like: (४, ५).
2. The द दा द दा way of marking mAtrA-s is preferred over ल गु ल गु or LGLG for the simple reason that this is how poets often remember verse structure. I've been using a । to denote the yati-sthAna in definitions in the छन्दस् spreadsheet, but I am not particular about it; I am OK with switching to -- as a yati marker. All of that can be inserted mechanically if we just write down the yati-sthAna as a number.
3. One easy way of quickly gathering the yati-sthAna-s is by looking at आप्टेकोशः (Apte's dictionary).
4. shrI-dhavala had shared a digitized vRtta-ratnAkara commentary; one can copy and paste examples and lakShaNa-s from it.
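Since the choice between द दा and LGLG in note 2 is only a matter of display, the two can be converted mechanically. A minimal sketch, assuming patterns are stored as strings of "L" and "G" and that the द दा form is written with spaces between syllables; the function names are hypothetical.

```python
# Map between the machine-friendly LG notation and the द दा notation.
LG_TO_DA = {"L": "द", "G": "दा"}
DA_TO_LG = {da: lg for lg, da in LG_TO_DA.items()}


def lg_to_da(pattern: str) -> str:
    """Convert e.g. "LGLG" to "द दा द दा"."""
    return " ".join(LG_TO_DA[ch] for ch in pattern)


def da_to_lg(text: str) -> str:
    """Convert a space-separated द दा string back to LG notation."""
    return "".join(DA_TO_LG[token] for token in text.split())
```

With this, curators could type whichever form they find easier to remember, and programs could store a single canonical form.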


--
Vishvas /विश्वासः

Mārcis Gasūns

Oct 26, 2013, 2:35:22 PM
to sanskrit-p...@googlegroups.com


On Saturday, 26 October 2013 20:54:17 UTC+4, विश्वासो वासुकिजः wrote:
But the problem currently is not a paucity of vRtta-s in the database - thanks to Ananda Mishra, we do not have that problem.
Yes, he solved it.
 

Now that we have collected the vRtta definitions mechanically, some manual curation is required. We need to record:
  • examples (esp from popular verses), 
This should not be manual. We need to loop all GRETIL texts and get statistics and hundreds of samples for popular metres.
 
  • lakShaNa-s, 
  • sUtra-s, 
  • yati-sthAna-s (caesura), 
  • Its jAti(s)
  • link to recitation styles, 
  • classify based on the gati (druta or vilambita or mishra)
Never seen this classification, but interesting. 
  • identify whether it sticks to a regular tAla
  • a measure of its popularity
  • notes about the bhAva which goes naturally with it
  • notes about relationship with other metres.
There must be some smart books on this topic. 
  • ease of composition
Popularity is not the same? 
  • memorability
How do you measure it? 
A Chandas-identifying program can interact with this database, which will grow increasingly rich for the benefit of all sahRdaya-s. For its basic function of identification, it would concern itself only with the definition.

Such a rich database can easily be converted to stardict dictionaries for easy reference on phones and all sorts of devices, used to make flashcards, etc., besides being valuable in itself to the verse maker.
Stardict does seem to be a dinosaur. Consider http://sandic.ru/en/downloads
 

With this in mind I have seeded https://docs.google.com/spreadsheet/ccc?key=0Al_QBT-hoqqVdDhjNVRMTXdsdDVTZG9kcDIwVnhhN0E&pli=1#gid=7 with some vRtta-s. The sheet named छन्दस् had my collection prior to the addition of the Ananda Mishra dump (the सम, अर्धसम and विषम sheets), and I hope to transfer information from there to the newly added sheets.
The file needs more comments. As it stands it is a great beginning, but I can hardly imagine how to use it in real-life conditions.
 

If you would like to contribute to the creation of this database, please let me know.

Note:
1. Marking the yati-sthAna is the most immediate need. Vague or loose recommendations are put within parentheses or brackets, like: (४, ५).
2. The द दा द दा way of marking mAtrA-s is preferred over ल गु ल गु or LGLG for the simple reason that this is how poets often remember verse structure.
Can anyone confirm this? It does not make sense to argue, because it is an easy fix anyway.
 
I've been using a । to denote the yati-sthAna in definitions in the छन्दस् spreadsheet, but I am not particular about it; I am OK with switching to -- as a yati marker. All of that can be inserted mechanically if we just write down the yati-sthAna as a number.
Vishvas, what kind of numbers are used?
 
3. One easy way of quickly gathering the yati-sthAna-s is by looking at  आप्टेकोशः .
http://dsal.uchicago.edu/dictionaries/apte/ ? Scraping can be done once again. 
4. shrI-dhavala had shared digitized vRtta-ratnAkara commentary - one can copy paste examples and laxaNa-s thence.
GRETIL would yield a huge number of examples, but popular verses will be mixed with unknown ones.

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 2:52:33 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 11:35 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
On Saturday, 26 October 2013 20:54:17 UTC+4, विश्वासो वासुकिजः wrote:
But the problem currently is not a paucity of vRtta-s in the database - thanks to Ananda Mishra, we do not have that problem.
Yes, he solved it.
To a large extent.
 
 

Now that we have collected the vRtta definitions mechanically, some manual curation is required. We need to record:
  • examples (esp from popular verses), 
This should not be manual. We need to loop all GRETIL texts and get statistics and hundreds of samples for popular metres.

That is a fair idea. But ideally, we want to use popular and beautiful verses as examples -- not obscure and difficult ones.
 
  • classify based on the gati (druta or vilambita or mishra) 
Never seen this classification, but interesting. 
 
 
  • notes about relationship with other metres.
There must be some smart books on this topic. 
Smart books? There are excellent commentaries to वृत्तरत्नाकर if that is what you mean.
 
  • ease of composition
Popularity is not the same? 
No. There is some correlation, but there are plenty of obscure metres which are easy to compose.
 
  • memorability
How do you measure it? 
By distance to popular metres and by simplicity of the gaNa or mAtrA structure, for example.
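"Distance to popular metres" is not defined further in this thread. One way to make it concrete, offered purely as an assumption, is the edit distance between laghu/guru patterns written as strings of "L" and "G"; the function names below are hypothetical.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two laghu/guru patterns."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete a syllable
                           cur[j - 1] + 1,             # insert a syllable
                           prev[j - 1] + (ca != cb)))  # change laghu <-> guru
        prev = cur
    return prev[-1]


def distance_to_nearest(pattern: str, popular: Iterable[str]) -> int:
    """Smallest edit distance from pattern to any popular pattern."""
    return min(edit_distance(pattern, p) for p in popular)
```

A smaller value would suggest a metre that differs from a familiar one by only a syllable or two, which is one plausible proxy for memorability.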
 
A Chandas-identifying program can interact with this database, which will grow increasingly rich for the benefit of all sahRdaya-s. For its basic function of identification, it would concern itself only with the definition.

Such a rich database can easily be converted to stardict dictionaries for easy reference on phones and all sorts of devices, used to make flashcards, etc., besides being valuable in itself to the verse maker.
Stardict does seem to be a dinosaur. Consider http://sandic.ru/en/downloads

Stardict does have drawbacks (lack of linking ability), but sandic is more of a dinosaur, or maybe a platypus. It won't run on a mobile phone without an internet connection, will it?
 
 

With this in mind I have seeded https://docs.google.com/spreadsheet/ccc?key=0Al_QBT-hoqqVdDhjNVRMTXdsdDVTZG9kcDIwVnhhN0E&pli=1#gid=7 with some vRtta-s. The sheet named छन्दस् had my collection prior to the addition of the Ananda Mishra dump (the सम, अर्धसम and विषम sheets), and I hope to transfer information from there to the newly added sheets.
The file needs more comments. As it stands it is a great beginning, but I can hardly imagine how to use it in real-life conditions.
The file as it stands is only useful to:
a] programs [metre recognizer, dictionary maker]
b] composers.
 
Others will understandably fail to see its use.
 
 
I've been using a । to denote the yati-sthAna in definitions in the छन्दस् spreadsheet, but I am not particular about it; I am OK with switching to -- as a yati marker. All of that can be inserted mechanically if we just write down the yati-sthAna as a number.
Vishvas, what kind of numbers are used?
DevanAgarI. But I am fine with English numbers if it is more convenient for the machines. 
 
3. One easy way of quickly gathering the yati-sthAna-s is by looking at  आप्टेकोशः .
http://dsal.uchicago.edu/dictionaries/apte/ ? Scraping can be done once again. 
Nope - that only contains the Sanskrit-English version, and probably does not include the appendices. That is known to be inferior to his sanskrit-hindI dictionary, anyway.
 

Mārcis Gasūns

Oct 26, 2013, 4:16:56 PM
to sanskrit-p...@googlegroups.com


On Saturday, 26 October 2013 22:52:33 UTC+4, विश्वासो वासुकिजः wrote:

On Sat, Oct 26, 2013 at 11:35 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
On Saturday, 26 October 2013 20:54:17 UTC+4, विश्वासो वासुकिजः wrote:
But the problem currently is not a paucity of vRtta-s in the database - thanks to Ananda Mishra, we do not have that problem.
Yes, he solved it.
To a large extent.
For the rest of our lives we will strive for perfection and never attain it.
 

Now that we have collected the vRtta definitions mechanically, some manual curation is required. We need to record:
  • examples (esp from popular verses), 
This should not be manual. We need to loop all GRETIL texts and get statistics and hundreds of samples for popular metres.

That is a fair idea. But ideally, we want to use popular and beautiful verses as examples -- not obscure and difficult ones.
What I mean is that once we have looped over GRETIL, there will be plenty to choose from, and the verses could be marked manually as obscure or popular, once and for all. New files can then be compiled based on that.
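The GRETIL pass could be sketched along these lines. This is only an outline under stated assumptions: the verses are taken to be already scanned into (text, pattern) pairs, the dictionary of known patterns is hypothetical, and nothing here reads actual GRETIL files.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def collect_statistics(
    scanned_verses: Iterable[Tuple[str, str]],
    known: Dict[str, str],
    max_samples: int = 100,
) -> Tuple[Counter, Dict[str, List[str]]]:
    """Count metres and keep up to max_samples sample verses for each.

    scanned_verses yields (verse_text, laghu_guru_pattern) pairs;
    known maps a pattern to a metre name.
    """
    counts: Counter = Counter()
    samples: Dict[str, List[str]] = defaultdict(list)
    for text, pattern in scanned_verses:
        name = known.get(pattern, "unknown")
        counts[name] += 1
        if len(samples[name]) < max_samples:
            samples[name].append(text)
    return counts, dict(samples)
```

The collected samples are what a human would then mark as popular or obscure; the counts give the frequency statistics.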
 

  • classify based on the gati (druta or vilambita or mishra) 
Never seen this classification, but interesting. 
Thanks, there is a good explanation of many metres there.
 
  • notes about relationship with other metres.
There must be some smart books on this topic. 
Smart books? There are excellent commentaries to वृत्तरत्नाकर if that is what you mean.
Never seen them, but good to know the name.
 
  • ease of composition
Popularity is not the same? 
No. There is some correlation, but there are plenty of obscure metres which are easy to compose.
The correlation should be documented, I guess. There must be some logic in it.
 
  • memorability
How do you measure it? 
By distance to popular metres and by simplicity of the gaNa or mAtrA structure, for example.
What is the formula for mAtrA simplicity?
 

Stardict does have drawbacks (lack of linking ability), but sandic is more of a dinosaur, or maybe a platypus. It won't run on a mobile phone without an internet connection, will it?
No, no need for internet. It's a local SQLite database. It can run on Ubuntu or Android; it does not matter.
 
With this in mind I have seeded https://docs.google.com/spreadsheet/ccc?key=0Al_QBT-hoqqVdDhjNVRMTXdsdDVTZG9kcDIwVnhhN0E&pli=1#gid=7 with some vRtta-s. The sheet named छन्दस् had my collection prior to the addition of the Ananda Mishra dump (the सम, अर्धसम and विषम sheets), and I hope to transfer information from there to the newly added sheets.
The file needs more comments. As it stands it is a great beginning, but I can hardly imagine how to use it in real-life conditions.
The file as it stands is only useful to:
a] programs [metre recognizer, dictionary maker]
How can it help a dictionary maker?
  
I've been using a । to denote the yati-sthAna in definitions in the छन्दस् spreadsheet, but I am not particular about it; I am OK with switching to -- as a yati marker. All of that can be inserted mechanically if we just write down the yati-sthAna as a number.
Vishvas, what kind of numbers are used?
DevanAgarI. But I am fine with English numbers if it is more convenient for the machines. 
I mean how many of them are needed?
 
3. One easy way of quickly gathering the yati-sthAna-s is by looking at  आप्टेकोशः .
http://dsal.uchicago.edu/dictionaries/apte/ ? Scraping can be done once again. 
Nope - that only contains the Sanskrit-English version, and probably does not include the appendices. That is known to be inferior to his sanskrit-hindI dictionary, anyway.
So you mean the Sanskrit-Hindi version appendices, right?

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 4:33:00 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 1:16 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
With this in mind I have seeded https://docs.google.com/spreadsheet/ccc?key=0Al_QBT-hoqqVdDhjNVRMTXdsdDVTZG9kcDIwVnhhN0E&pli=1#gid=7 with some vRtta-s. The sheet named छन्दस् had my collection prior to the addition of the Ananda Mishra dump (the सम, अर्धसम and विषम sheets), and I hope to transfer information from there to the newly added sheets.
The file needs more comments. As it stands it is a great beginning, but I can hardly imagine how to use it in real-life conditions.
The file as it stands is only useful to:
a] programs [metre recognizer, dictionary maker]
How can it help a dictionary maker?
  
 
You can take a CSV file and convert it to a standard dictionary format that various dictionary programs recognize, e.g. using a program that comes with the stardict-tools package.
 
I've been using a । to denote the yati-sthAna in definitions in the छन्दस् spreadsheet, but I am not particular about it; I am OK with switching to -- as a yati marker. All of that can be inserted mechanically if we just write down the yati-sthAna as a number.
Vishvas, what kind of numbers are used?
DevanAgarI. But I am fine with English numbers if it is more convenient for the machines. 
I mean how many of them are needed?

One pAda can have many yati-sthAna-s (besides the often default one at the end of the pAda) - so one may write ४ ६ ७ to indicate yati-sthAna-s in mandAkrAntA. Sometimes, there is ambiguity as to where the yati lies - so one may write ४/५.
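A small sketch of how a program might read this notation, assuming spaces separate successive yati-s and a slash separates alternatives for an ambiguous one. The function name is hypothetical. Python's built-in int() accepts DevanAgarI digits as well as ASCII ones, so either kind of numeral works unchanged.

```python
from typing import List


def parse_yati(spec: str) -> List[List[int]]:
    """Parse e.g. "४ ६ ७" or "४/५" into a list of alternatives per yati."""
    return [[int(alt) for alt in token.split("/")] for token in spec.split()]
```

Here `parse_yati("४ ६ ७")` gives one single-element list per yati, while `parse_yati("४/५")` gives one yati with two candidate values.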
 
3. One easy way of quickly gathering the yati-sthAna-s is by looking at  आप्टेकोशः .
http://dsal.uchicago.edu/dictionaries/apte/ ? Scraping can be done once again. 
Nope - that only contains the Sanskrit-English version, and probably does not include the appendices. That is known to be inferior to his sanskrit-hindI dictionary, anyway.
So you mean the Sanskrit-Hindi version appendices, right?
The skt-eng dictionary is inferior to the skt-hindi one in many respects (eg: it does not show derivation of words). Not sure how the appendices compare.

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 4:34:07 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 1:16 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
There must be some smart books on this topic. 
Smart books? There are excellent commentaries to वृत्तरत्नाकर if that is what you mean.
Never seen them, but good to know the name.

You can try some linked here.

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 4:40:19 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 1:16 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
  • memorability
How do you measure it? 
By distance to popular metres and by simplicity of the gaNa or mAtrA structure, for example.
What is the formula for mAtrA simplicity?
 
Among anuShTubh variety
दा दा दा दा दा दा दा दा has the greatest mAtrA simplicity. Just the same mAtrA repeating again and again.

दा द दा द दा द दा द is next in the scale of mAtrA simplicity, as is द दा द दा द दा द दा.

And so on.
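No formula is given here, but one way to reproduce this ordering, offered only as an assumption, is to score a pattern by the length of its shortest repeating unit: the shorter the unit, the simpler the pattern. Patterns are written as strings of "L" and "G"; the function name is hypothetical.

```python
def smallest_period(pattern: str) -> int:
    """Length of the shortest unit whose repetition yields the pattern."""
    n = len(pattern)
    for p in range(1, n + 1):
        # p is a period if every syllable equals the one p places before it.
        if all(pattern[i] == pattern[i - p] for i in range(p, n)):
            return p
    return n  # only reached for the empty pattern
```

Under this measure eight guru-s in a row score 1, both alternating patterns score 2, and less regular patterns score higher, which matches the ranking described above.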
 

Stardict does have drawbacks (lack of linking ability), but sandic is more of a dinosaur, or maybe a platypus. It won't run on a mobile phone without an internet connection, will it?
No, no need for internet. It's a local SQLite database. It can run on Ubuntu or Android; it does not matter.
Is there an Android app? If not, you should consider whether it is worth developing one, or whether one might rather focus on producing dictionaries that work with other existing, highly developed programs.

The key is interoperability. One should build applications which work well with others. Working well in isolation is good, but not excellent. It sucks that iPhone users cannot charge their phones with a standard USB cable, for example.

Mārcis Gasūns

Oct 26, 2013, 4:42:03 PM
to sanskrit-p...@googlegroups.com


On Sunday, 27 October 2013 00:33:00 UTC+4, विश्वासो वासुकिजः wrote:

On Sat, Oct 26, 2013 at 1:16 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
How can it help a dictionary maker?
   
You can take a CSV file and convert it to a standard dictionary format that various dictionary programs recognize, e.g. using a program that comes with the stardict-tools package.
Yes, a reference file one can get, but it's not a full dictionary, only part of one...
 
I mean how many of them are needed?

One pAda can have many yati-sthAna-s (besides the often default one at the end of the pAda) - so one may write ४ ६ ७ to indicate yati-sthAna-s in mandAkrAntA. Sometimes, there is ambiguity as to where the yati lies - so one may write ४/५.
So 4 different types? 

 
So you mean the Sanskrit-Hindi version appendices, right?
The skt-eng dictionary is inferior to the skt-hindi one in many respects (eg: it does not show derivation of words).
I did not know about the derivations. I miss them in the Apte I know. Does the Hindi Apte give derivations for all words?
 
Not sure how the appendices compare..

Me neither. But now I want to order the Hindi version of the dictionary :)

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 4:55:25 PM
to sanskrit-p...@googlegroups.com
On Sat, Oct 26, 2013 at 1:42 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
On Sunday, 27 October 2013 00:33:00 UTC+4, विश्वासो वासुकिजः wrote:

On Sat, Oct 26, 2013 at 1:16 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
How can it help a dictionary maker?
   
You can take a CSV file and convert it to a standard dictionary format that various dictionary programs recognize, e.g. using a program that comes with the stardict-tools package.
Yes, a reference file one can get, but it's not a full dictionary, only part of one...

If you supply the metre definitions, you don't get a dictionary of nyAya-s if that is what you mean. But that does not matter if your dictionary program can check n dictionaries for every query.
 
 
I mean how many of them are needed?

One pAda can have many yati-sthAna-s (besides the often default one at the end of the pAda) - so one may write ४ ६ ७ to indicate yati-sthAna-s in mandAkrAntA. Sometimes, there is ambiguity as to where the yati lies - so one may write ४/५.
So 4 different types? 
Types? ४ ६ ७ indicates that a pause or change in gait is observed after 4, 4+6 and 4+6+7 letters while composing or reciting mandAkrAntA.
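To make the arithmetic concrete: the segment lengths 4, 6, 7 give yati positions 4, 10 and 17, and a marker can then be inserted mechanically. A minimal sketch with hypothetical function names, using the standard 17-syllable mandAkrAntA pattern written as a string of "L" and "G". No marker is added after the final segment, since the pAda ends there anyway.

```python
from itertools import accumulate
from typing import List


def yati_positions(segments: List[int]) -> List[int]:
    """Turn segment lengths such as [4, 6, 7] into positions [4, 10, 17]."""
    return list(accumulate(segments))


def mark_yati(pattern: str, segments: List[int], marker: str = "।") -> str:
    """Insert the marker between the segments of a laghu/guru pattern."""
    positions = yati_positions(segments)
    if positions and positions[-1] > len(pattern):
        raise ValueError("segments are longer than the pattern")
    parts = []
    start = 0
    for end in positions:
        parts.append(pattern[start:end])
        start = end
    if start < len(pattern):
        parts.append(pattern[start:])  # syllables after the last yati
    return (" " + marker + " ").join(parts)
```

Switching between । and -- as the marker is then just a matter of changing the `marker` argument.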
 

 
So you mean the Sanskrit-Hindi version appendices, right?
The skt-eng dictionary is inferior to the skt-hindi one in many respects (eg: it does not show derivation of words).
I did not know about the derivations. I miss them in the Apte I know. Does the Hindi Apte give derivations for all words?
yes.
 

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 26, 2013, 6:27:58 PM
to sanskrit-p...@googlegroups.com

On Sat, Oct 26, 2013 at 1:55 PM, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
Does the Hindi Apte give derivations for all words?
yes.

- sorry I don't know about "all". But you can check in the link provided earlier in the thread.

Usha Sanka

Oct 27, 2013, 2:50:18 AM
to sanskrit-p...@googlegroups.com
Namaste,
Derivations for all words are given in Apte's "The Practical Sanskrit-English Dictionary".
The Students' Edition does not have them, for obvious reasons.
So you cannot say the Hindi one is better than the English one.
I always use the English one for all my purposes.
-vinItA
Usha

