priya mitrANi
Taking from the other mail from Vishvas:
#5. Tools to identify metre (1, 2).
--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
#--------------------------------------------------------#
$description = "Syntax: sscan [options] [file...]

Sscan produces a metrical analysis of a Sanskrit text in the CSX
encoding. Heavy syllables are represented in the output by \"-\",
light syllables by \"u\". The program understands the conventions
regarding comments, prose passages, \"X uvāca\"-type lines and
non-anuṣṭubh lines employed in the CSX versions of the Mahābhārata
and Rāmāyaṇa; it can also cope with the Alsdorf and metrically
emended versions of the RV text. It should produce good results on
other texts too, provided that they are laid out in a fairly normal
way.

-h option prints this help.
";
#--------------------------------------------------------#
Welcome, Murthy sir.
Thanks Marcis - I am collecting all such code in https://github.com/vvasuki/sanskritnlp/tree/master/src/main .
I decided to try my hand at coding this.
A rough version wasn't too hard to come up with (just required time and effort).
It is still very far from done, but the current version, in case anyone would like a preliminary look, is at http://sanskritmetres.appspot.com/ and source code at https://github.com/shreevatsa/sanskrit/tree/metrical-scan .
The server has either erred or is incapable of performing the requested operation.
Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~sanskritmetres/1.371081553055822859/request_handler.py", line 54, in post
    sscan.IdentifyFromLines(input_verse.split('\n'))
  File "/base/data/home/apps/s~sanskritmetres/1.371081553055822859/sscan.py", line 233, in IdentifyFromLines
    pattern_lines.append(MetricalPattern(line))
  File "/base/data/home/apps/s~sanskritmetres/1.371081553055822859/sscan.py", line 54, in MetricalPattern
    assert CheckHK(text)
AssertionError
The Python script that does the actual work (and can be run from the commandline if desired, or imported as a library from other Python code) is https://github.com/shreevatsa/sanskrit/blob/metrical-scan/sscan.py .
To use it, go to http://sanskritmetres.appspot.com/ , type a Sanskrit verse in the box (in Harvard-Kyoto convention), and click on the button. If your verse was in one of the known metres, it will (I hope) get recognized.
There is still a lot to do: the UI/frontend (what you see on the website) I have almost not worked on at all yet; it currently contains only just under 40 of the popular (and some not so popular) metres; there are some issues around dealing with the (rare) cases where the syllable at the end of a line is *intended* to be laghu instead of guru; the output can stand to be improved a lot; it will be useful (and simple) to support input transliteration schemes other than Harvard-Kyoto; there are some obvious performance improvements crying out to be done; the code can do with some refactoring, etc.
(But it already has one feature that I've often felt the absence of in sanskrit.sai.uni-heidelberg.de/Chanda/ -- if you type a verse in which some lines are in correct metre and some are off, there is a chance that this script will still recognize the conforming lines.
As an example of the usefulness of having a script like this locally: running the script on the GRETIL text of Meghaduta uncovered 23 errors in the text (detected by the metre being incorrect), which I've notified the GRETIL maintainer about.)
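The per-line recognition and the typo hunting described above can be sketched roughly as follows. This is only an illustration, not the code in sscan.py: the table KNOWN_METRES, the function identify_lines, and the G/L string encoding are assumptions made for this example. Each line is looked up on its own, the majority metre is taken as the intended one, and the lines that disagree are reported as suspects.

```python
from collections import Counter

# Illustrative table only (hypothetical name); G = guru, L = laghu.
KNOWN_METRES = {
    "GGGLLGLGLLLGGGLGGLG": "zArdUlavikrIDita",
    "GGGGLLLLLGGLGGLGG": "mandAkrAntA",
}

def identify_lines(patterns, known=KNOWN_METRES):
    """Look up each line separately; return (majority metre, suspect line numbers)."""
    names = [known.get(p) for p in patterns]
    counts = Counter(name for name in names if name is not None)
    if not counts:
        # Nothing matched at all: every line is a suspect.
        return None, list(range(1, len(names) + 1))
    majority = counts.most_common(1)[0][0]
    # 1-based numbers of the lines that do not fit the majority metre.
    suspects = [i + 1 for i, name in enumerate(names) if name != majority]
    return majority, suspects
```

With three conforming lines and one line that is a syllable short, the call returns the metre of the three good lines plus the number of the odd line, which is the kind of signal that points at a likely typo in a text.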
I've sent this update about this tool to this mailing list as it seems to have a relatively small membership but useful comments (or commits) might be forthcoming; please share the link widely (if you wish) only when it's in a somewhat usable state. :-)
The Python script that does the actual work (and can be run from the commandline if desired, or imported as a library from other Python code) is https://github.com/shreevatsa/sanskrit/blob/metrical-scan/sscan.py .
How do you work with Python on localhost?
To use it, go to http://sanskritmetres.appspot.com/ , type a Sanskrit verse in the box (in Harvard-Kyoto convention), and click on the button. If your verse was in one of the known metres, it will (I hope) get recognized.
svAgataM devakIputra svAgataM te dhanaJjaya|
priyaM me darzanaM vADhaM yuvayornarasiMhayoH||
crashed it as per.
We are obliged, Shreevatsa! (उपकृतास् स्मः श्रीवत्स!)
Suggestions inline:
On Mon, Oct 21, 2013 at 1:22 PM, Shreevatsa R <shree...@gmail.com> wrote:
There is still a lot to do: the UI/frontend (what you see on the website) I have almost not worked on at all yet; it currently contains only just under 40 of the popular (and some not so popular) metres;
Could you get the code to read sama/vishama/ardhasama vRtta definitions from a simple csv (separating it out from the code)?
Could you just scrape http://sanskrit.sai.uni-heidelberg.de/Chanda/HTML/list_all.html to get all the 1352 metres listed there?
That's brilliant!
As an example of the usefulness of having a script like this locally: running the script on the GRETIL text of Meghaduta uncovered 23 errors in the text (detected by the metre being incorrect), which I've notified the GRETIL maintainer about.)
Could you just scrape http://sanskrit.sai.uni-heidelberg.de/Chanda/HTML/list_all.html to get all the 1352 metres listed there?
Yes, I thought of that, but I'm not sure if there are any copyright concerns around it. So I'm reluctant to do that without consent from "© Copyright 2006-07 Anand Mishra."
Anyway in practice the common metres seem to suffice (and are also what one would mostly encounter), so I'm not too keen on expanding to an exhaustive (i.e. theoretical, as found in large works on prosody) list.
Could you get the code to read sama/vishama/ardhasama vRtta definitions from a simple csv (separating it out from the code)?
Yes, I plan to do that soon -- either csv, or json, or some such simple format that is easier for anyone to edit.
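For illustration, reading metre definitions from a CSV file could look something like the sketch below. The column names (name, type, pattern), the G/L encoding, and the use of "/" to separate the pādas of an ardhasama or vishama metre are all assumptions made for this example; the thread does not fix a format.

```python
import csv
import io

# Hypothetical file contents; one metre per row.
SAMPLE_CSV = """name,type,pattern
zArdUlavikrIDita,sama,GGGLLGLGLLLGGGLGGLG
mandAkrAntA,sama,GGGGLLLLLGGLGGLGG
"""

def load_metres(fileobj):
    """Map a tuple of pāda patterns to (name, type)."""
    metres = {}
    for row in csv.DictReader(fileobj):
        # A sama metre has one pattern; others list their pādas with "/".
        padas = tuple(p.strip() for p in row["pattern"].split("/"))
        for pada in padas:
            if not pada or set(pada) - {"G", "L"}:
                raise ValueError(
                    f"bad pattern for {row['name']}: {row['pattern']!r}")
        metres[padas] = (row["name"].strip(), row["type"].strip())
    return metres

metres = load_metres(io.StringIO(SAMPLE_CSV))
```

Keeping the definitions in a data file like this means that adding a metre is a one-line edit that needs no Python knowledge, and the same loader would work for a list scraped or typed in by volunteers.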
# A short vowel followed by a consonant is a laghu
text = re.sub(short_vowel + consonant + '*', '-', text)
This is not correct. It is rather guru as a rule. Only in certain cases is it optionally laghu. So it is better to match it as both laghu and guru, to see whether the input text matches any pattern.
I am attaching herewith the relevant portion of kedArabhaTTa's vRttaratnAkara, along with a commentary by sulhaNa (sukavihRdayAnandinI).
गुरुलघुपरिज्ञानार्थमाह ।
सानुस्वारो विसर्गान्तो दीर्घो युक्तपरश्च यः ।
वा पादान्तस्त्वसौ ग्वक्रो ज्ञेयोऽन्यो मातृको लृजुः ॥ ९ इति ।
सहानुस्वारेण वर्तत इति सानुस्वारः । विसर्गान्तः सविसर्गः । दीर्घो द्विमात्रः । युक्तपरः संयोगपरो यो भवति । चकाराद्व्यंजनांतोऽपि गृह्यते । परिमिताक्षरमात्रो गणारचितो वक्ष्यमाणलक्षणो वृत्तस्य चतुर्थांशः पादस्तस्यान्ते वर्तमानो लघुरपि विभाषया गुरुः स्यात् । स च कविसमयव्यवहारात् द्वितीयचतुर्थयोरेव पादयोरन्ते वेदितव्यः । यथा ।
प्रायः समासन्नपराभवानां धियो विपर्यस्ततमा भवन्ति ।
असंभवे हेममयस्य जन्तोस्तथाऽपि रामो लुलुभे मृगाय ॥ इति ।
तथा च ।
‘श्रियः पतिः[1] श्रीमति शासितुं जगज्जगन्निवासो वसुदेवसद्मनि’[2] इत्यादि दृष्टव्यम् ।
स च प्रस्तारे वक्रः स्थाप्य लघुलक्षणमाह । ज्ञेयोऽन्यो मातृको लृजुः । अनुस्वरादिरहितो अन्यो मातृको एकमात्रो वर्णो लघुर्भवति । स च प्रस्तारे ऋजुः सरलः । ९
युक्तपराश्च य इत्यनेन प्राप्ते गुरुत्वे अपवादमाह ।
पदादाविह[3] वर्णस्य संयोगः क्रमसंज्ञिकः ।
पुरःस्थितेन[4] तेन स्याल्लघुताऽपि क्वचिद्गुरोः ॥ १०
विभक्त्यंतं पदं तस्य पदस्यादौ वर्तमानो यो वर्णस्तस्य संयोगः । स इह शास्त्रे क्रमसंज्ञो ज्ञेयः । तेन क्रमेण पुरोवर्तिना प्राक्पदांते वर्तमानस्य प्राप्तगुरुभावस्यापि लघुता स्यात् । क्वचिल्लक्षानुरोधेन । ननु क एषः[5] क्रमो नाम संयोग उच्यते । पूर्वाचार्याणां पिंगलनागप्रभृतीनां कालिदासादीनां च कवीनां समयः[6] परिगृहीतः । संयोगः क्रमसंयोगः । १०
तत्र ग्रसंयोगेन[7] यथा । इदमस्योदाहरणम् ।
तरुणं सर्षपशाकं नवौदनं पिच्छलानि च दधीनि ।
अल्पव्ययेन सुंदरि ग्राम्यजनो मिष्टमश्नाति[8] ॥ ११
ह्रसंयोगेन यथा ।
तव ह्रियापह्रिया मम ह्रीरभूत् शशिगृहेऽपि हृतं न धृता ततः ।
वहलभ्रामरमेषकतामसम् मम प्रिये क्व स येष्यति तत्पुनः ॥ इति
निद्रव्यो ह्रियमेति ह्रीपरिगतः प्रभ्रश्यते तेजसः[9]
निस्तेजः[10] परिभूयते परिभवान्निर्वेदमागच्छति ।
निर्विण्णः शुचमेति शोकविवशो बुद्ध्याः[11] परिभ्रश्यते
निर्बुद्धिः क्षयमेत्यहो निधनता सर्वापदामास्पदम् ॥[12]
ममैव ते हृते[13] । यथा ।
स्नेहाद्गेहाद्भुजगतनयालोककौतूहलेन स्थूलोत्तुंगस्तनभरलसन्मध्यभंगानपेक्षाः ।
पौरा नार्यस्तरलनयनानन्दमुत्पादयन्त्यो धावन्ति स्म द्रुतमपह्रियः स्रंसमानोत्तरीयाः ॥ इति ।
बोधप्रदीपेऽपि यथा ।
यज्ञैर्येषां प्रतिपदमियं मण्डिता भूतधात्री
निर्जित्यैतद्भुवनवलयं यैः प्रदत्तं द्विजेभ्यः[14] ।
तेऽप्येतस्मिन् गुरुभवहृदे[15] बुद्बुदस्तम्भलीलं[16]
धृत्वा धृत्वा सपदि विलयं भूभुजः संप्रयाताः ॥
शिशुपालवधे यथा ।
प्राप्तनाभिहृदमज्जनमाशु[17] प्रस्थितं निवसनग्रहणाय ।[18] इति ।
भ्रसंयोगेन[19] यथा ।
शशिमुखि भ्रमरोऽयं पद्मबुद्ध्याऽऽननं ते ।
समभिलषति पातुं त्यक्तवल्लीप्रसूनः ॥
तथा च ।
भ्रमति भ्रमरमारीकानने विप्रमुक्ते । इत्यादि ।
वरतरुकुसुमेषु[20] व्योमगंगाम्बुजेषु त्रिदशकरिकटेषु स्वर्वधूकुंतलेषु ।
स्थितशयितविबुद्धप्रीतिकस्ते[21] भ्रमोऽयं भ्रमसि भ्रमर येन त्वं मुधा केतकेषु ॥
पदादाविति किम् । अन्यत्र मा भूत् ।
ग्रसंयोगेन यथा ।
असमग्रविलोकनेन किं ते दयितं पश्य वरोरु[22] निर्विशंका ।
न हि जातु कुशाग्रपीतमम्भः सुचिरेणापि करोत्यपेततृष्णम् ॥ इति
ह्रसंयोगेन यथा ।
‘आजह्रतुस्तच्चरणौ पृथिव्यामि’ति[23] ।
तथा च ।
प्रद्योतस्य प्रियदुहितरं वत्सराजोऽत्र जह्रे[24] ।
भ्रसंयोगेन यथा ।
कुन्दावदातैर्भवतो यशोभिः शुभ्रीकृतं किं परमारवीर ।
अद्यापि यद्बिभ्रति कालिमानमरातिनारीवदनोत्पलानि ॥ इत्यादि ।
क्वचिदिति किम् । सर्वत्र मा भूत् ।
ग्रसंयोगेन यथा ।
‘मही पादघाताद्व्रजति सहसा संशयपदम् ।
पदं विष्णोर्भ्राम्यद्भुजपरिघरुग्णग्रहणमि’ति[25] ॥
भ्रसंयोगेन यथा ।
‘तत्र भ्रमत्येव मुधा षडंध्रिरि’ति[26] ।
केचित्पादादाविति मन्यन्ते । तदसंगतम् । सूत्रोदाहरणयोर्घटनाभावात् ।
तथापि ।
तरुणं सर्षपशाकं नवोदनं पिच्छलानि दधीनि ।
अल्पव्ययेन सुंदरि ग्राम्यजनो मिष्टमश्नाति[27] । इत्युदाहरणमार्यया प्रदर्शितम् । आर्यायां पादव्यवस्था नास्ति । पूर्वार्धोत्तरार्धग्रहणात् पूर्वार्धोत्तरार्धमित्यार्यालक्षणं कुर्वाणो ग्रन्थकार एवं ज्ञापयति । तावदार्यायां पादव्यवस्था नास्ति पैंगलीयसूत्रपाठाच्च । स्वराऽर्धञ्चार्यार्धमिति[28] । तस्मात्पदादाविति पाठः । श्रेयानित्यलमिति प्रसंगेन ॥ ११
Great work.
# A short vowel followed by a consonant is a laghu
text = re.sub(short_vowel + consonant + '*', '-', text)
This is not correct. It is rather Guru as a rule.
निद्रव्यो ह्रियमेति ह्रीपरिगतः प्रभ्रश्यते तेजसः
Here, the ति preceding ह्री is mandated to be guru by rule 5, but by rule 7 it remains laghu.
If it is treated as guru - GGGLLGGGLLLG-GGLGGLG - it doesn't correspond to the zArdUlavikrIDita meter.
If it is treated as laghu - GGGLLGLGLLLG-GGLGGLG - it corresponds to the meter.
This is optional.
Also noted down in the issues section at https://github.com/shreevatsa/sanskrit/issues/1
Now for the actual issue. [BTW, where you say Rule 5 in your email (from Rule 7 onwards), I guess you mean Rule 4.]
To summarize, the issue is that according to Kedārabhaṭṭa, a short vowel followed by a consonant cluster (i.e., multiple consonants), although guru by default, can *optionally* be treated as laghu if that consonant cluster happens to be the beginning of a new word.
I've treated it as guru always, and not allowed for this option.
It is definitely an interesting feature to add, but before implementing a change like this (which allows a single line to have multiple metrical patterns), I'd like to first check how commonly this exception occurs in classical works, to see whether it's critically needed. I've heard that later prosodists allowed some modifications influenced by Prākṛta prosody, which are not universally accepted by other Sanskrit scholars.
(E.g. optionally allowing some consonant+repha clusters, like प्र, to be treated as single consonants, which is coincidentally what's happening here too.)
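To make the option concrete, here is a minimal sketch of a scanner that marks such syllables as ambiguous and then expands the line into every candidate pattern. It is not the approach taken in sscan.py: the names TOKEN, scan_hk and expand are invented for this example, the tokenizer covers only a subset of Harvard-Kyoto, and the optional guru at the end of a pāda is not handled.

```python
import itertools
import re

# Tokenizer for a subset of Harvard-Kyoto; unmatched characters are skipped.
TOKEN = re.compile(
    r"(?P<vowel>ai|au|RR|[aAiIuUReo])"
    r"|(?P<cons>[kgcjTDtdpb]h|[kgGcjJTDNtdnpbmyrlvzSsh])"
    r"|(?P<mark>[MH])"
    r"|(?P<space>\s+)"
)
SHORT_VOWELS = {"a", "i", "u", "R"}

def scan_hk(line):
    """Return one symbol per syllable: G, L, or ? (guru by default, optionally laghu)."""
    tokens = [(m.lastgroup, m.group()) for m in TOKEN.finditer(line)]
    weights = []
    for i, (kind, text) in enumerate(tokens):
        if kind != "vowel":
            continue
        if text not in SHORT_VOWELS:
            weights.append("G")  # long vowel or diphthong
            continue
        in_word = next_word = 0
        has_mark = seen_space = found_vowel = False
        # Inspect everything between this vowel and the next one.
        for k, _ in tokens[i + 1:]:
            if k == "vowel":
                found_vowel = True
                break
            if k == "mark":
                has_mark = True  # anusvara or visarga
            elif k == "space":
                seen_space = True
            elif seen_space:
                next_word += 1
            else:
                in_word += 1
        total = in_word + next_word
        if has_mark or (total >= 1 and not found_vowel):
            weights.append("G")
        elif total >= 2:
            # A cluster lying wholly at the start of the next word is the optional case.
            weights.append("?" if in_word == 0 else "G")
        else:
            weights.append("L")
    return "".join(weights)

def expand(pattern):
    """List every G/L reading of a pattern that contains ? symbols."""
    choices = [("G", "L") if c == "?" else (c,) for c in pattern]
    return ["".join(p) for p in itertools.product(*choices)]
```

For the line nidravyo hriyameti hrIparigataH prabhrazyate tejasaH, only the ti before hrI comes out as ambiguous, so expand yields two readings, and the laghu reading is the GGGLLGLGLLLG-GGLGGLG pattern quoted earlier in the thread (written without the hyphen). The cost is that a line with n ambiguous syllables produces 2^n candidates, which is one reason to check first how often the exception really occurs.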
निद्रव्यो ह्रियमेति ह्रीपरिगतः प्रभ्रश्यते तेजसः
Here, the ति preceding ह्री is mandated to be guru by rule 5, but by rule 7 it remains laghu.
If it is treated as guru - GGGLLGGGLLLG-GGLGGLG - it doesn't correspond to the zArdUlavikrIDita meter.
If it is treated as laghu - GGGLLGLGLLLG-GGLGGLG - it corresponds to the meter.
This is optional.
For what it's worth, I just checked that http://sanskrit.sai.uni-heidelberg.de/Chanda/ doesn't implement this option either. With the following text as input:
nidravyo hriyameti hrIparigataH prabhrazyate tejasaH
nidravyo hriyameti hrIparigataH prabhrazyate tejasaH
nidravyo hriyameti hrIparigataH prabhrazyate tejasaH
nidravyo hriyameti hrIparigataH prabhrazyate tejasaH
no metre is recognized, though with hrIparigataH changed to hIparigataH, it is correctly recognized as शार्दूलविक्रीडितम् .
Also noted down in the issues section at https://github.com/shreevatsa/sanskrit/issues/1
Thanks for creating this; let's continue the discussion of this issue there.
I am attaching herewith Relevant portion of kedArabhaTTa's vRttaratnAkara - along with a commentary by sulhaNa (sukavihRdayAnandinI).
PFA the commentary.
Osmania University had published it very long back.
This is the one I edited some time ago.
The issue which faces us is that of the database of the meters.
I have tried to supply 100+ samavRtta meters which are enumerated in the vRttaratnAkara (in the syntax of the python code).
Marcis has sent the list of the meters in https://www.dropbox.com/s/8a6t2kqwkyangu9/HTML.zip
To input them, it would need collaboration of all the participants.
It is a good idea to share the work. Anybody willing to volunteer may come up.
I take this liberty to post without consulting the original author of the code, but hope that he will agree to it.
On Wed, Oct 23, 2013 at 12:19 PM, dhaval patel <drdhav...@gmail.com> wrote:
To input them, it would need collaboration of all the participants. It is a good idea to share the work. Anybody willing to volunteer may come up.
Someone ought to make a machine do it - manual work is not really necessary here;
but if people do it I will happily use it. :-)
--
--
Vishvas /विश्वासः
fwiw - here is the code : https://github.com/vvasuki/sanskritnlp/tree/master/src/main/python/scrape_chandas
On the basis of Vishvas's work, please find attached the complete list of Chandas for your code in the syntax needed.
http://simplesanskrit.blogspot.com/
If you want to develop an open-source program for identifying vruttas, I suggest that instead of manually building up a dictionary of vruttas, we develop a program that builds up a database of vruttas. Every time you input a verse, the program scans it and checks whether it matches anything in its database; if it does not, it gives you the option of adding the pattern to the database under a name of your choice.
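The suggestion above amounts to a small look-up-or-learn loop, which might be sketched like this. The class MetreDatabase, the function identify_or_learn, and the ask_name callback are hypothetical names invented for this example; a real tool would also need persistent storage and some review of user-supplied names.

```python
class MetreDatabase:
    """In-memory store mapping a G/L pattern to a metre name."""

    def __init__(self):
        self._names = {}

    def lookup(self, pattern):
        return self._names.get(pattern)

    def add(self, pattern, name):
        self._names[pattern] = name

def identify_or_learn(db, pattern, ask_name):
    """Return the stored name, or ask the user for one and remember it."""
    name = db.lookup(pattern)
    if name is None:
        # ask_name is any callable, e.g. a wrapper around input().
        name = ask_name(pattern)
        if name:
            db.add(pattern, name)
    return name or None
```

The first verse in an unknown metre triggers the callback; later verses with the same pattern are recognized without asking again. If the callback returns an empty string, nothing is stored and the pattern stays unknown.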
I guess Apte had them all.
Correct me if I am wrong.
Note that Anand's collection (though it has great coverage) has many things missing ( eg: mAtrA-Chandas, vaitAlIya-s, *vipula varieties of anuShTubh, some metres I found elsewhere ).
Any progress on adding additional meters in the database ?
On Sun, Nov 3, 2013 at 6:35 AM, dhaval patel <drdhav...@gmail.com> wrote:
Any progress on adding additional meters in the database ?
Caveat: adding 1k lines of code to add 1k metres is the wrong approach.
It's better than nothing. But it seems even that is stuck. And it would be enough to start working with.
How can an online document that changes every few days be used?
It can be used as a reference while improving it, but there is no need to count on it for now. It's a deep dive in a different direction. It is the next step, not the one required now. First we make it open source.
Then we can crowdsource it. What's your take on it?
Sorry for the delay in replying; I got busy with other work / life stuff.
An update:
Yes, we have a long list of metres, generated by both Dhaval and Vishvas, and it would be a simple matter to add them. I also agree that it's not good to have them be stored as lines of code (instead of data), but it's easy to reprocess the above lists into some "data" format.
I don't yet see it as urgent though, so I'm waiting until the code is in some "clean enough" state, so that the data can be plugged in the right place and in the right format.
As a user, the first hindrance I encountered was not the absence of obscure metres, but the fact that I had to convert input into Harvard-Kyoto before using it. This was not a problem when I was typing out a verse from scratch, but it was when I was using a verse from elsewhere. Of course there are many transliteration tools online, but still this extra step was a barrier to usage. So, I spent some time on that and fixed that first. I've pushed to github, and also deployed on the website, a version that recognizes input whether you enter it in Devanagari, IAST, HK, or ITRANS. It does the detection automatically (if there are Devanagari characters in the input, use Devanagari; if there are diacritical marks like āīūṛṅñṭḍṇśṣ use IAST, etc.).
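The detection order described above could be sketched as follows. This is an assumption-laden illustration rather than the deployed code: the function name detect_scheme and the returned labels are invented here, and since the message does not say how HK is told apart from ITRANS, the sketch simply lumps both together as "ascii".

```python
import unicodedata

# Lower-case IAST letters that carry diacritical marks.
IAST_MARKS = set("āīūṛṝḷṅñṭḍṇśṣḥṁṃ")

def detect_scheme(text):
    """Guess the input scheme: 'devanagari', 'iast', or 'ascii' (HK or ITRANS)."""
    # The Devanagari Unicode block is U+0900 to U+097F.
    if any("\u0900" <= ch <= "\u097f" for ch in text):
        return "devanagari"
    # Normalize so that letter + combining mark compares equal to the precomposed letter.
    lowered = unicodedata.normalize("NFC", text).lower()
    if any(ch in IAST_MARKS for ch in lowered):
        return "iast"
    return "ascii"
```

Checking for Devanagari first matters for mixed input, such as a Devanagari verse followed by a romanized reference, which would otherwise be misread as IAST or ASCII.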
Try it out, and if you encounter any errors, please inform me (either over email or on the github "Issues" page), along with the input that caused the error.
The next thing I plan to do soon, now that it handles IAST natively, is to go over GRETIL texts. This should stress-test the system, and uncover a few bugs. After the basic metre-detection system has had its major issues ironed out, it would be a good time to add the large list of metres.
Take a look at https://docs.google.com/document/d/1z-AQUfFWFfUMLJiNUrNe7S5atP5Wid3ay90LmGlkaQg/edit#heading=h.ak9mx3q9v65n . Śloka is recognized in 1/3 of cases. Some verses were identified correctly even with a lot of junk HTML code in the input, but easy cases were missed.
Try it out, and if you encounter any errors, please inform me (either over email or on the github "Issues" page), along with the input that caused the error.
No error messages, just "Metre unknown".
If I want to test several metres, I have to press the back button all the time. No good. Keep the scanning window always at the top, so that I can test many metres without pressing the browser back button at all.
The next thing I plan to do soon, now that it handles IAST natively, is to go over GRETIL texts. This should stress-test the system, and uncover a few bugs. After the basic metre-detection system has had its major issues ironed out, it would be a good time to add the large list of metres.
It's not yet ready for GRETIL. See my google docs. It's a good idea, but the code is not yet ready.
Thanks for the detailed testing.
I have left comments on that document. Most of the issues were because of messy input, though the code now handles it to some extent:
(1) An HTML <BR> being included in the input (it would be read as letters, though fortunately it usually doesn't affect the metre). The code now strips these out.
(2) Trailing stuff like "।। म्जैन्य्१.०.८ ।।" being in the input (I've now made it ignore anything after // or || or two dandas or a double-danda).
(3) Leading stuff like ५.००१.००८अ and ५.००१.००८च् (transliterated from 'a' and 'c', I guess) -- this is not worth handling in code. Just clean up the input first.
(4) When the whole verse does not fit the metre, the per-pāda identification usually helps. In light of the fact that śloka is often written in two lines instead of four, I've made it recognize a single half of a śloka (and other metres) as well.
(5) Many of the verses are actually not in proper metre. I think that śloka, being the bread and butter of epic poets, admits in practice a far greater degree of variation than the formal rules say. "pañcamaṃ laghu sarvatra" is not followed sarvatra (everywhere). This is perhaps to be expected from the workhorse metre that is the mainstay of epic works, somewhat like the allowed variation in English metres. So śloka is actually one of the harder metres to recognize by computer (how liberal do you want to get?), though fortunately it's one of the easier ones for humans to recognize.
Anyway, you should see some improvements now.
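Points (1) and (2) above can be sketched as a small clean-up step applied to each input line before scanning. The function clean_line and the two regular expressions are assumptions written for this illustration; they are not taken from the actual code.

```python
import re

# (1) HTML line breaks such as <BR>, <br> or <br/>.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
# (2) End-of-verse markers: //, ||, two dandas, or a double-danda.
VERSE_END = re.compile(r"//|\|\||।\s*।|॥")

def clean_line(line):
    """Strip <BR> tags and drop everything from the first end-of-verse marker on."""
    line = BR_TAG.sub(" ", line)
    line = VERSE_END.split(line, maxsplit=1)[0]
    return line.strip()
```

A single | or a single danda is left alone, since it commonly marks the end of a half-verse rather than the start of a trailing reference. Leading reference numbers, as in point (3), are deliberately not touched here.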
Unknown characters are ignored: ṁ
ā ī ū ṛ ṝ ḷ ṅ ñ ṭ ḍ ṇ ś ṣ ḥ ṁ ṃ Ā Ī Ū Ṛ Ṝ Ḻ Ṅ Ñ Ṭ Ḍ Ṇ Ś Ṣ Ḥ Ṁ
Unknown characters are ignored: ‘ for ānato ‘smi = please add different variants for avagrahas.
Yes, examples of that would also be welcome. You can email them to me privately, or put them publicly on the Github issues page, to avoid spamming this list. (By the way, if anyone thinks I'm sending too many emails, I'll cut down.)
Have implemented it.
I ran it over some more GRETIL texts (see read_gretil.py), and it uncovered even more typos in GRETIL. So yes, while the code is perhaps not yet ready, it's looking as if GRETIL is not yet ready too. :-)
Rightly said.
While coding my Java program I noticed this and delved into it deeper, and the result was a study which was published in the Annals of the Bhandarkar Oriental Research Institute, Vol. 84, 2003, pages 101-115. A PDF version of the same is attached for those who have the time and the inclination to go through it.
My algorithm in the Java app, which has been uploaded by Vishvas at his GitHub page, takes account of the findings of this paper.
PFA whatever I had in this context. There are some Word files, some PDF, Excel etc.
Dear Dhaval, I have added the original text to Wikisource for the benefit of others - https://sa.wikisource.org/wiki/%E0%A4%B8%E0%A5%81%E0%A4%96%E0%A4%B5%E0%A4%BF%E0%A4%B9%E0%A5%83%E0%A4%A6%E0%A4%AF%E0%A4%BE%E0%A4%A8%E0%A4%82%E0%A4%A6%E0%A4%A8%E0%A5%80 .
Vishvas, how does one read ।ऽ।?
Please correct the heading to सुकविहृदयानन्दिनी instead of सुखवि....
Is it a mistake in the definition? Is everything read, instead of only the first 2, as below?
Definition is fine. It is treating the last letter in the pAda (the only letter in this case) as guru :-)
Definition is fine. It is treating the last letter in the pAda (the only letter in this case) as guru :-)
1) 40 metres right now. Why not copy paste the other 1000?
I have also placed the doc and xls files on archive.org - https://archive.org/details/SukavihRdayAnandinI
There is a tool available for meter identification here: http://sanskritlibrary.org:8080/mitweb/
I have published my source code through Mr Vishvas's GitHub. Thanks.
Murthy
priya mitrANi
Taking from the other mail from Vishvas:
#5. Tools to identify metre(1, 2).
A few years ago I did a lot of groundwork on a program that would identify a metre (I had my own scheme of encoding, which was very easy to parse but maybe not a comprehensive scheme). I had taken all the metres from the appendix of the Sanskrit-English dictionary (VS Apte). It is in Java, but I will soon be porting it to Groovy or Scala, as the Java syntax is very elaborate.
I recently found a site, via the samskrita Google group, that already does this identification and has a database of about 1000+ metres. I don't think it's exposed as an API. One of the things I would like to keep as a goal is that, whatever the functionality of a project, it should be exposed as a webservice or have an API so that it is easily interfaced. I will probably work towards that.
"In rare cases above source code for Sanskrit tools are available; but they are mostly not open-source; and there is quite a bit of duplication of effort; the boundless-sharing culture is mostly absent. Besides the limitations noted above, what is conspicuously missing from the above are tools directed at meeting important needs of the popular spoken Sanskrit movement, especially as we increasingly interact with information through computers and the internet."
Agreed. For popularizing spoken samskrita, I have been mulling over Khan Academy type lessons in samskrita: simple and short, under 10 minutes.
KhanAcademy style short videos would be ideal!
Also, I recommend using Javascript, putting up the code on Github, and having an online version where people can input their shlokas. It might get more traction due to the lower barrier to entry!
Thanks for these inputs. I have suddenly started getting this chain in my Gmail.
Murthy
On Wed, Apr 26, 2017 at 6:53 AM, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
2017-04-25 11:15 GMT-07:00 Ramanathan Sharma <heyram...@gmail.com>:
KhanAcademy style short videos would be ideal!
For what? Composition? Like this?
Or recitation? Like this?
- An introduction to metres by Shatavadhani Ganesh, in an English video.
Also, I recommend using Javascript and putting up the code in Github and having a online version where people can input their sholkas. It might get more traction due to less barrier to entry!
Did you even read this thread, sir? You're talking about something like this http://sanskritmetres.appspot.com/ ?
----
Vishvas /विश्वासः
--
My web site: http://murthygss.tripod.com/index.htm
and also my Sanskrit blog :
Thank you, Sir, for all the inputs. I have browsed through them. Long ago I developed a program that identifies vruttas. I think it was you who uploaded its source code to your site, as I am not familiar with GitHub. The program goes beyond merely naming the metre, and I now need help converting it into a self-contained executable.
Regards,
Murthy
On Mon, May 14, 2018 at 8:12 PM, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
This new thread on BVP is pertinent: https://groups.google.com/d/msg/bvparishat/vRn9zywhE1o/3sdVawjWBgAJ
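For anyone following along, the core of a vrutta identifier can be sketched in a few lines: mark each syllable of a pada as guru (G) or laghu (L), then look the pattern up in a table of known metres. The Python sketch below is only an illustration of that idea; it is not the code of Murthy's program nor of the sanskritmetres.appspot.com tool. It assumes IAST input, and the names scan, identify and METRES are hypothetical, as is the tiny three-entry table. The last syllable of a pada is treated as free, and vowel hiatus (a followed directly by i or u) is not handled.

```python
import re
import unicodedata


def nfc(text: str) -> str:
    """Normalise to NFC so letters such as "ā" are single code points."""
    return unicodedata.normalize("NFC", text)


LONG_VOWELS = {nfc(v) for v in ("ā", "ī", "ū", "ṝ", "e", "ai", "o", "au")}
SHORT_VOWELS = {nfc(v) for v in ("a", "i", "u", "ṛ", "ḷ")}
VOWELS = LONG_VOWELS | SHORT_VOWELS
MARKS = {nfc("ṃ"), nfc("ṁ"), nfc("ḥ")}  # anusvara (two spellings), visarga

# Longer alternatives come first so that "ai" beats "a" and "kh" beats "k".
TOKEN = re.compile(nfc(
    r"ai|au|[aāiīuūṛṝḷeo]"            # vowels
    r"|[kgcjṭḍtdpb]h"                 # aspirated stops count as one consonant
    r"|[kgṅcjñṭḍṇtdnpbmyrlvśṣsh]"     # plain consonants
    r"|[ṃṁḥ]"                         # anusvara, visarga
))

# Hypothetical, deliberately tiny table: one pada per metre, G = guru, L = laghu.
METRES = {
    "indravajrā": "GGLGGLLGLGG",
    "upendravajrā": "LGLGGLLGLGG",
    "mandākrāntā": "GGGGLLLLLGGLGGLGG",
}


def scan(pada: str) -> str:
    """Return the guru/laghu pattern of one pada given in IAST."""
    tokens = TOKEN.findall(nfc(pada.lower()))
    weights = []
    for i, tok in enumerate(tokens):
        if tok not in VOWELS:
            continue
        # Everything between this vowel and the next one (spaces are ignored).
        following = []
        for nxt in tokens[i + 1:]:
            if nxt in VOWELS:
                break
            following.append(nxt)
        heavy = (
            tok in LONG_VOWELS
            or any(t in MARKS for t in following)
            or len(following) >= 2  # conjunct consonant after the vowel
        )
        weights.append("G" if heavy else "L")
    return "".join(weights)


def identify(pada: str):
    """Return the name of the matching metre, or None if nothing matches."""
    pattern = scan(pada)
    for name, expected in METRES.items():
        # The final syllable is not compared: it may be light or heavy.
        if len(pattern) == len(expected) and pattern[:-1] == expected[:-1]:
            return name
    return None
```

For example, scan("kaścit kāntāvirahaguruṇā svādhikārāt pramattaḥ") gives GGGGLLLLLGGLGGLGG, and identify on the same pada returns mandākrāntā. A real tool would need a much larger table, support for Devanagari and other input schemes, and handling of ardhasama, vishama and matra-based metres, none of which this sketch attempts.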