python transliteration module


विश्वासो वासुकिजः (Vishvas Vasuki)

Mar 16, 2017, 11:17:41 AM
to Sai सायिः साङ्गणकविद्वान् Susarla, sanskrit-programmers, aruNaprasAda
shrI sAyi (cc-ed) seeks a python transliteration module (supporting a wide variety of scripts and transliteration schemes) so that he can (for various purposes) create a web API for transliteration. I am aware of

sanscript.js

and

sanscript.php 

(besides my own scala rendering). Is there a good python alternative as well?
--
Vishvas /विश्वासः

dhaval patel

Mar 16, 2017, 11:41:09 AM
to sanskrit-p...@googlegroups.com
https://github.com/drdhaval2785/siddhantakaumudi/blob/master/transcoder.py is the most robust I have come across. 
data/transcoder folder holds XMLs with transliteration tables.

Written by Jim Funderburk.

transcoder_processString is the function I use.
transcoder_processString('Davala','slp1','deva') converts 'Davala' from SLP1 to Devanagari.

Shreevatsa R

Mar 16, 2017, 3:21:57 PM
to sanskrit-programmers, Michael Bykov, Sai Susarla, Sanskrit Questions, dhaval patel
One thing I would suggest:
I remember Michael Bykov proposed a while ago (https://groups.google.com/d/msg/sanskrit-programmers/YMMTS5ACCQs/EYrRdOAvAwAJ) that we should have tests for common tasks. I don't know if anyone's implemented the idea, but we should pool together the tests in sanscript.js / sanscript.php / Funderburk's transcoder / etc., so that we can evaluate to what extent a particular transliteration solution satisfies the needs. It would also help us achieve clarity on what exactly transliteration is meant to do, what are flexible design choices (e.g. how to indicate strings that should not be transliterated) and what are common requirements.
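A pooled test suite of this kind could be as simple as a table of (input, source scheme, target scheme, expected output) rows that every implementation is run against. The sketch below is only an illustration: the case table, the `run_cases` helper, and the toy `toy_transliterate` function (which covers just a few Harvard-Kyoto letters) are hypothetical and are not taken from sanscript.js, sanscript.php, or Funderburk's transcoder.

```python
from typing import Callable, List, NamedTuple


class Case(NamedTuple):
    text: str
    source: str
    target: str
    expected: str


# Hypothetical pooled cases; a real pool would be gathered from existing projects.
CASES: List[Case] = [
    Case("a", "hk", "iast", "a"),
    Case("A", "hk", "iast", "ā"),
    Case("rAma", "hk", "iast", "rāma"),
    Case("ziva", "hk", "iast", "śiva"),
]

Transliterator = Callable[[str, str, str], str]


def run_cases(transliterate: Transliterator, cases: List[Case]) -> List[Case]:
    """Return the cases that the given implementation fails."""
    failures = []
    for case in cases:
        if transliterate(case.text, case.source, case.target) != case.expected:
            failures.append(case)
    return failures


# Toy stand-in for a real implementation: HK -> IAST for a few letters only.
_HK_TO_IAST = {"A": "ā", "I": "ī", "U": "ū", "z": "ś", "S": "ṣ"}


def toy_transliterate(text: str, source: str, target: str) -> str:
    if (source, target) != ("hk", "iast"):
        raise ValueError("toy implementation only supports hk -> iast")
    return "".join(_HK_TO_IAST.get(ch, ch) for ch in text)


failures = run_cases(toy_transliterate, CASES)
print(f"{len(CASES) - len(failures)} passed, {len(failures)} failed")
```

Any real transliterator can be plugged in by wrapping it in a function with the same three-argument signature, so the same table can be used to compare implementations.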


--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-programmers+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

learnsanskrit.org

Mar 22, 2017, 12:20:16 AM
to Shreevatsa R, sanskrit-programmers, Michael Bykov, Sai Susarla, dhaval patel
Re-sending since I wasn't a member of sanskrit-programmers under this account.

~

I have an old transliterator that might be useful:



I've been inactive with Sanskrit stuff for a few months now (having switched to meditation instead), but let me know if I can help further.

On Tue, Mar 21, 2017 at 9:18 PM, learnsanskrit.org <questions...@gmail.com> wrote:
I have an old transliterator with tests that might be useful:



विश्वासो वासुकिजः (Vishvas Vasuki)

Jun 4, 2017, 12:53:49 AM
to sanskrit-programmers, Shreevatsa R, Michael Bykov, Sai Susarla, dhaval patel, aruNaprasAda, Arun Prasad
namaste folks, 

I've put arun's code into a convenient pip package. Replicating details from https://pypi.python.org/pypi/indic-transliteration below:

indic-transliteration 1.1.0

Transliteration tools to convert text in one indic script encoding to another


Indic transliteration tools

Intro

For detailed examples and help, please see individual module files in this package.

Transliteration

from indic_transliteration import sanscript
output = sanscript.transliterate('idam adbhutam', sanscript.HK, sanscript.DEVANAGARI)

Script detection

detect.py automatically detects a string’s transliteration scheme:

from indic_transliteration import detect
detect.detect('pitRRIn') == detect.Scheme.ITRANS
detect.detect('pitRRn') == detect.Scheme.HK

For contributors

Contact

Have a problem or question? Please head to github.

Packaging

  • ~/.pypirc should have your pypi login credentials.

    python setup.py bdist_wheel
    twine upload dist/*
    
 
File: indic_transliteration-1.1.0-py2.py3-none-any.whl (md5)
Type: Python Wheel
Py Version: py2.py3
Uploaded on: 2017-06-04
Size: 12KB



विश्वासो वासुकिजः (Vishvas Vasuki)

Jun 4, 2017, 1:00:51 AM
to sanskrit-programmers, Shreevatsa R, Michael Bykov, Sai Susarla, dhaval patel, Arun Prasad
2017-03-21 21:20 GMT-07:00 learnsanskrit.org <questions...@gmail.com>:
I've been inactive with Sanskrit stuff for a few months now (having switched to meditation instead), but let me know if I can help further.

Hey aruN, great to hear of your new avenue for achieving peace and happiness. Moving beyond the individual level, we (1) miss your contributions, so please reactivate yourself if you can!

(1) Poetry and beauty apart, I especially find myself motivated by the critical role of Sanskrit (matched only by ritual) in its use by the brAhmaNa varNa in conserving and reinforcing the sagely Indo-Aryan ideals (esp. those pertaining to dharma).

विश्वासो वासुकिजः (Vishvas Vasuki)

Jul 18, 2021, 7:38:00 AM
to sanskrit-programmers, Michael Bykov, Shreevatsa R श्रीवत्सो गणितज्ञः, Arun Prasad
I've consolidated the transliteration modules I've come to maintain here: https://github.com/indic-transliteration

Pursuing transliteration uniformity irrespective of programming language, we now have:

common_maps: https://github.com/indic-transliteration/common_maps (shared currently between py and js)

and 

common_tests: https://github.com/indic-transliteration/common_tests (used by py; not too difficult to adopt in scala, but I'm stuck with js, see https://github.com/indic-transliteration/sanscript.js/issues/38; help welcome)

Hope those obsessive hours spent doing the above will prove worthwhile (in terms of time saved at least) :-D
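To illustrate the idea of language-neutral maps and tests, here is a minimal sketch in which both the letter table and the test cases live in one JSON document that any implementation could read. The JSON layout, the `build_transliterator` helper, and the tiny Harvard-Kyoto to IAST table are assumptions made for this example; they do not reflect the actual file formats of common_maps or common_tests.

```python
import json

# Hypothetical shared data, inlined as a string for the example. In practice it
# would live in files that each language's implementation loads.
SHARED_JSON = """
{
  "map": {
    "from": "hk",
    "to": "iast",
    "letters": {"A": "ā", "I": "ī", "U": "ū", "R": "ṛ", "RR": "ṝ",
                "z": "ś", "S": "ṣ", "N": "ṇ"}
  },
  "tests": [
    {"input": "gItA", "expected": "gītā"},
    {"input": "kRSNa", "expected": "kṛṣṇa"},
    {"input": "pitRRn", "expected": "pitṝn"}
  ]
}
"""


def build_transliterator(letters):
    """Build a longest-match-first transliterator from a letter table."""
    keys = sorted(letters, key=len, reverse=True)

    def transliterate(text):
        out = []
        i = 0
        while i < len(text):
            for key in keys:
                if text.startswith(key, i):
                    out.append(letters[key])
                    i += len(key)
                    break
            else:
                # No table entry starts here: copy the character unchanged.
                out.append(text[i])
                i += 1
        return "".join(out)

    return transliterate


data = json.loads(SHARED_JSON)
transliterate = build_transliterator(data["map"]["letters"])
results = [(t["input"], transliterate(t["input"]) == t["expected"]) for t in data["tests"]]
for text, ok in results:
    print(text, "ok" if ok else "FAILED")
```

Because neither the table nor the cases contain any Python-specific code, a JavaScript or Scala implementation could load the same data and be checked against the same expectations.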





Shreevatsa R

Jul 19, 2021, 2:31:50 PM
to विश्वासो वासुकिजः (Vishvas Vasuki), sanskrit-programmers, Michael Bykov, Arun Prasad
This is great, thanks for doing this!

This should be really helpful in the future. I haven't looked at the individual tests yet but glanced at one of the files, and it's nice to see the (what I've heard of as) "table-driven tests". Hope the JS gets figured out as well; then all the systems would have a better chance of being in sync. Of course some of the systems may not yet have implemented some features (some "extra" characters, say), or do things differently (temporarily turning off transliteration with ### vs something else, for example) but presumably these too can be encoded somehow into the tests.
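One way such differences might be encoded is to tag each test row with the features it requires and to let each implementation declare which features it supports and which marker it uses to switch transliteration off. The sketch below is hypothetical throughout: the feature names, the `{T}` placeholder for the toggle marker, and the toy transliterator (which merely uppercases text outside the markers) are illustrative assumptions, not the behaviour of any existing library.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple


class Case(NamedTuple):
    text: str
    expected: str
    requires: FrozenSet[str] = frozenset()


class Implementation(NamedTuple):
    name: str
    transliterate: Callable[[str], str]
    features: FrozenSet[str]
    toggle: str  # marker that switches transliteration off and back on


def make_toy(toggle: str) -> Callable[[str], str]:
    """Toy transliterator: uppercases text, except between toggle markers."""
    def transliterate(text: str) -> str:
        parts = text.split(toggle)
        # Even-indexed parts lie outside the markers, odd-indexed parts inside.
        return "".join(p.upper() if i % 2 == 0 else p for i, p in enumerate(parts))
    return transliterate


# "{T}" stands for whatever toggle marker the implementation under test uses.
CASES: List[Case] = [
    Case("rama", "RAMA"),
    Case("rama {T}http{T} sita", "RAMA http SITA", frozenset({"toggle"})),
    Case("om", "OM", frozenset({"extra-characters"})),
]


def run(impl: Implementation, cases: List[Case]) -> Dict[str, int]:
    counts = {"passed": 0, "failed": 0, "skipped": 0}
    for case in cases:
        if not case.requires <= impl.features:
            counts["skipped"] += 1  # feature not implemented: skip, don't fail
            continue
        text = case.text.replace("{T}", impl.toggle)
        ok = impl.transliterate(text) == case.expected
        counts["passed" if ok else "failed"] += 1
    return counts


impl_a = Implementation("a", make_toy("###"), frozenset({"toggle"}), "###")
impl_b = Implementation("b", make_toy("##"), frozenset({"toggle", "extra-characters"}), "##")
for impl in (impl_a, impl_b):
    print(impl.name, run(impl, CASES))
```

With this arrangement a missing feature shows up as a skipped row rather than a failure, and a different toggle marker does not require a separate copy of the test table.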

विश्वासो वासुकिजः (Vishvas Vasuki)

Jul 19, 2021, 8:39:40 PM
to Shreevatsa R, sanskrit-programmers, Michael Bykov, Arun Prasad
On Tue, Jul 20, 2021 at 12:01 AM Shreevatsa R <shree...@gmail.com> wrote:
This is great, thanks for doing this!

This should be really helpful in the future. I haven't looked at the individual tests yet but glanced at one of the files, and it's nice to see the (what I've heard of as) "table-driven tests". Hope the JS gets figured out as well; then all the systems would have a better chance of being in sync. Of course some of the systems may not yet have implemented some features (some "extra" characters, say), or do things differently (temporarily turning off transliteration with ### vs something else, for example) but presumably these too can be encoded somehow into the tests.
