Sanskrit Sandhi using Pure Python

shantanu oak

May 21, 2023, 6:18:07 AM
to sanskrit-programmers
Hi,
I have developed a denormalized Sanskrit sandhi builder in pure Python. It uses a simple for loop to generate a lookup dictionary from each Panini sutra. It is a work in progress, and feedback is appreciated.

https://github.com/shantanuo/sandhi

Run all the cells in the notebook, then test your word at the end of the script. For example:

sandhi_builder('पितृ उपदेश')
#returns {'पित्रुपदेश'}
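The "for loop generates the dictionary" idea can be sketched roughly as follows. This is only an illustration for one rule (yaṇ sandhi), not the code from the repository: the names IK_TO_SEMIVOWEL, VOWEL_TO_MATRA, SAME_CLASS and build_yan_table are made up here, and the real notebook may organise its table differently.

```python
# Hypothetical sketch: denormalize one rule (yan sandhi) into a lookup table.
VIRAMA = "\u094d"  # Devanagari virama (halant)

# Dependent (matra) form of each "ik" vowel and the semivowel that replaces it.
IK_TO_SEMIVOWEL = {
    "ि": "य", "ी": "य",  # i, ii -> y
    "ु": "व", "ू": "व",  # u, uu -> v
    "ृ": "र",            # vocalic r -> r
}

# Independent vowel that starts the second word, and the matra it becomes
# once it attaches to the semivowel ("" means the inherent a).
VOWEL_TO_MATRA = {
    "अ": "", "आ": "ा", "इ": "ि", "ई": "ी", "उ": "ु", "ऊ": "ू",
    "ऋ": "ृ", "ए": "े", "ऐ": "ै", "ओ": "ो", "औ": "ौ",
}

# Vowels of the same class are skipped: a different rule applies to them.
SAME_CLASS = {"ि": "इई", "ी": "इई", "ु": "उऊ", "ू": "उऊ", "ृ": "ऋ"}


def build_yan_table():
    """Return {(last_char, first_char): replacement} for every yan pair."""
    table = {}
    for ik, semivowel in IK_TO_SEMIVOWEL.items():
        for vowel, matra in VOWEL_TO_MATRA.items():
            if vowel in SAME_CLASS[ik]:
                continue
            table[(ik, vowel)] = VIRAMA + semivowel + matra
    return table


YAN = build_yan_table()
print(len(YAN))  # 46 entries
```

Every combination is stored explicitly, so joining two words later needs only a dictionary lookup instead of rule logic.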

If you do not want to use Python, look up the last character of the first word and the first character of the second word in the index file. For example, for the sandhi of यज् + न, look for ज् न in the index file.

!grep 'ज् न' sandhi_code_out.txt
# ज् न ज्ञ 2.1.1 श्चुत्व

You will get ज्ञ and hence your sandhi word will be य + ज्ञ = यज्ञ
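The same lookup can be done in Python instead of grep. The sketch below assumes each index line has five whitespace-separated fields (ending of the first word, start of the second word, replacement, rule number, rule name), which is only inferred from the single line shown above; parse_index and join_words are hypothetical helper names, not functions from the repository.

```python
# One sample line copied from the grep output above; the real file has many more.
SAMPLE_INDEX = "ज् न ज्ञ 2.1.1 श्चुत्व\n"


def parse_index(text):
    """Parse index lines into {(left, right): (result, rule_no, rule_name)}."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:  # skip lines that do not match the assumed layout
            continue
        left, right, result, rule_no, rule_name = fields
        table[(left, right)] = (result, rule_no, rule_name)
    return table


def join_words(first, second, table):
    """Join two words using the first matching table entry, or return None."""
    for (left, right), (result, _rule_no, _rule_name) in table.items():
        if first.endswith(left) and second.startswith(right):
            return first[: len(first) - len(left)] + result + second[len(right):]
    return None


table = parse_index(SAMPLE_INDEX)
print(join_words("यज्", "न", table))  # यज्ञ
```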

This is a poor man's sandhi builder. For a richer experience, you can visit:

https://sanskrit.uohyd.ac.in/scl/#

-- Shantanu

विश्वासो वासुकिजः (Vishvas Vasuki)

May 21, 2023, 6:56:49 AM
to sanskrit-p...@googlegroups.com
Make it a package that I can install with pip, and provide some usage examples in the README.

--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sanskrit-programmers/731ba55f-7635-452e-b40f-7c9f39311bf4n%40googlegroups.com.


--
Vishvas /विश्वासः

shantanu oak

May 23, 2023, 12:15:16 AM
to Rajeshwari Godbolé, sanskrit-p...@googlegroups.com
Hi,
I am not an expert either. I searched Google for each sutra and wrote the code from what I found. During my research I saw so many incorrect words that I no longer trust Google search.

I will not be surprised if people find bugs in the script.
But the beauty is that anyone can make changes and correct it.

To re-check this example, I used the Linux grep command and got this result:

# grep 'ृ उ' sandhi_code_out.txt
 ृ उ ् रु 1.3.3 यण

It means this is 'यण' sandhi, and the sutra is 'इको यणचि'. The explanation that I found on the net is:
# If ऋ is followed by a vowel, ऋ is replaced by र

You will get the word पित्रोपदेश if you run this:
sandhi_builder('पित्र उपदेश') # or पितरोपदेश for 'पितर उपदेश'
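Why the two inputs give different results can be shown with a tiny sketch. It hard-codes just two rules and is not how sandhi_builder works internally (that is an assumption-free statement only about this sketch; toy_join is a made-up name): a word ending in the ऋ matra takes the yaṇ entry, while a word ending in a bare consonant letter carries an inherent अ, so अ + उ becomes ओ (guṇa).

```python
VIRAMA = "\u094d"  # Devanagari virama (halant)


def toy_join(first, second):
    """Join two words using only two hard-coded rules (illustration only)."""
    if first.endswith("ृ") and second.startswith("उ"):
        # yan: vocalic r + u -> ru (the matra is replaced by virama + ra + u matra)
        return first[:-1] + VIRAMA + "रु" + second[1:]
    if "क" <= first[-1] <= "ह" and second.startswith("उ"):
        # A bare consonant letter ends in inherent a; guna: a + u -> o
        return first + "ो" + second[1:]
    return first + " " + second  # no rule matched, leave the words apart


print(toy_join("पितृ", "उपदेश"))   # पित्रुपदेश
print(toy_join("पित्र", "उपदेश"))  # पित्रोपदेश
print(toy_join("पितर", "उपदेश"))   # पितरोपदेश
```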

Making a Python package for this is a good idea, but I do not know much about packaging. Any help will be appreciated.

-- Shantanu


On Mon, May 22, 2023 at 8:07 PM Rajeshwari Godbolé <rgod...@gmail.com> wrote:
This looks interesting! One question about the example (I'm not an expert so pardon me if this is incorrect):

sandhi_builder('पितृ उपदेश')
#returns {'पित्रुपदेश'} -- should this not be पित्रोपदेश?

Thanks,

Rajeshwari



--

विश्वासो वासुकिजः (Vishvas Vasuki)

May 23, 2023, 12:23:55 AM
to sanskrit-p...@googlegroups.com, Rajeshwari Godbolé
On Tue, 23 May 2023 at 09:45, shantanu oak <shanta...@gmail.com> wrote:


Making a python package for this is a good idea. But I do not know much about that. Any help will be appreciated.


Check out these files, and how the code is placed in the repo.

Just imitate that. Shouldn't take long for you to figure out using that as an example.

 

On Mon, May 22, 2023 at 8:07 PM Rajeshwari Godbolé <rgod...@gmail.com> wrote:
This looks interesting! One question about the example (I'm not an expert so pardon me if this is incorrect):

sandhi_builder('पितृ उपदेश')
#returns {'पित्रुपदेश'} -- should this not be पित्रोपदेश?


पित्रुपदेश is correct.

 

Thanks,

Rajeshwari




Hrishikesh Terdalkar

May 23, 2023, 1:15:41 AM
to sanskrit-p...@googlegroups.com

shantanu oak

Aug 14, 2023, 5:44:38 AM
to sanskrit-programmers
Here is API access to the sandhi code. Type the words you want to join after the question mark.

https://2ku5vw336655hiomtcogv4sopm0bpqdo.lambda-url.us-east-1.on.aws/?

If you type "कर्मणि एव अधिकारः ते" you will get back "कर्मण्येवाधिकारस्ते". 

Please test it and let me know the cases where it fails. Any developer can easily build an app for this :)
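For developers, a request URL for the API above could be built like this. It is a minimal sketch that only assumes what the post says (the words go after the question mark); percent-encoding the text with urllib.parse.quote is my assumption about how to send Devanagari safely, and the response format is not described in the thread, so the sketch stops at building the URL. build_request_url is a hypothetical helper name.

```python
from urllib.parse import quote, unquote

# Base URL copied from the post above, including the trailing question mark.
BASE_URL = "https://2ku5vw336655hiomtcogv4sopm0bpqdo.lambda-url.us-east-1.on.aws/?"


def build_request_url(words):
    """Append the percent-encoded words after the question mark."""
    return BASE_URL + quote(words)


url = build_request_url("कर्मणि एव अधिकारः ते")
print(url)
# The URL could then be fetched with urllib.request.urlopen(url); how the
# response body is encoded is not stated in the thread.
```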

-- Shantanu

shantanu oak

Aug 21, 2023, 12:39:48 AM
to sanskrit-programmers
The Sanskrit Sandhi Android app is available here:

https://play.google.com/store/apps/details?id=com.myapp.marathispellcheckandsanskritsandhi

It includes Marathi spell check as well.

-- Shantanu

Akshay B

Nov 15, 2023, 7:57:43 PM
to sanskrit-programmers
Hi Shantanu

Akshay Bapaye here. Can you share your email address with me? I would like to connect with you personally.

Thanks.

shantanu oak

Feb 5, 2024, 10:49:43 PM
to sanskrit-programmers
If you use the Telegram app on your mobile, add "SanskritSandhibot" to your contacts.

https://t.me/SanskritSandhibot

Message it any two or more words, e.g. "कर्मणि एव अधिकारः ते", and you will get back the sandhi, e.g. "कर्मण्येवाधिकारस्ते".

-- Shantanu

venkata raman

Feb 24, 2024, 9:40:08 AM
to sanskrit-programmers
Is there a similar API that does the reverse, i.e. sandhi splitting? For example, given "कर्मण्येवाधिकारस्ते", the API would return "कर्मणि एव अधिकारः ते".

shantanu oak

Mar 19, 2024, 4:22:05 AM
to sanskrit-programmers
Use "SandhiSplitBot" on Telegram if you need to split.

https://t.me/SandhiSplitBot
If you type कर्मण्येवाधिकारस्ते, the bot will reply कर्मणि एव अधिकारः ते

Screenshot: https://kagapa.s3.ap-south-1.amazonaws.com/spellcheck/app/sandhi_split.jpg

There are two bots on Telegram: one joins words, and the other splits them.

SanskritSandhibot
If you type गण ईश उत्सव, the bot will reply गणेशोत्सव

Screenshot: https://kagapa.s3.ap-south-1.amazonaws.com/spellcheck/app/sandhi_join.jpg

kenp

Mar 22, 2024, 11:24:32 AM
to sanskrit-programmers

shantanu oak

Mar 24, 2024, 8:52:59 AM
to sanskrit-programmers
I think a browser (desktop) version is not possible: even if it is based on Hunspell, the sandhi joiner and the splitter are part of a complex cloud setup. You can add the bot called "SanskritOneBot" to your Telegram contacts. It is called "one" because it can do spell check, sandhi, and splitting!
It checks spelling using Hunspell. If you type a single word, it also tries to split it (संधि विच्छेद). If you type 2 or more (up to 19) words, it tries to merge them based on Panini sutras.
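The choice between splitting and joining described above can be pictured as a small dispatcher. This is only a guess at the shape of the logic, not the bot's actual code; choose_action and the returned labels are made-up names, and what the bot does with an empty message or more than 19 words is not stated in the post.

```python
MAX_WORDS_TO_JOIN = 19  # upper limit mentioned in the post


def choose_action(message):
    """Decide what to do with a message, based only on its word count."""
    words = message.split()
    if not words:
        return "ignore"            # assumption: nothing to process
    if len(words) == 1:
        return "spellcheck_and_split"
    if len(words) <= MAX_WORDS_TO_JOIN:
        return "join"
    return "too_many_words"        # assumption: behaviour not described


print(choose_action("कर्मण्येवाधिकारस्ते"))   # spellcheck_and_split
print(choose_action("कर्मणि एव अधिकारः ते"))  # join
```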

screenshot:
https://kagapa.s3.ap-south-1.amazonaws.com/spellcheck/app/telegram_sansone.jpg

The spell checker splits (संधि विच्छेद) each word, and if all the parts are found in the corpus, the word is considered correct. In the screenshot, "संक्षिप्तपरिचयं" is marked as incorrect. That is because although "संक्षिप्त" is in the database, परिचयं is not (परिचयम् is included, however). I am not an expert, and feedback is appreciated!
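The "all parts must be in the corpus" check can be sketched as below. The splitter and the corpus are toy stand-ins (TOY_SPLITS, toy_split and TOY_CORPUS are hypothetical; the real bot uses its own splitter and a Hunspell-based word list), so this only illustrates the decision rule from the example above.

```python
def is_spelled_correctly(word, split_word, corpus):
    """A word is accepted only if every part from the splitter is in the corpus."""
    parts = split_word(word)
    return len(parts) > 0 and all(part in corpus for part in parts)


# Toy stand-ins for the real splitter and corpus.
TOY_SPLITS = {
    "संक्षिप्तपरिचयं": ["संक्षिप्त", "परिचयं"],
    "संक्षिप्तपरिचयम्": ["संक्षिप्त", "परिचयम्"],
}
TOY_CORPUS = {"संक्षिप्त", "परिचयम्"}


def toy_split(word):
    """Return the known split, or the word itself when no split is known."""
    return TOY_SPLITS.get(word, [word])


print(is_spelled_correctly("संक्षिप्तपरिचयं", toy_split, TOY_CORPUS))   # False
print(is_spelled_correctly("संक्षिप्तपरिचयम्", toy_split, TOY_CORPUS))  # True
```

With this rule, a word is rejected as soon as one part is missing, which is why adding परिचयं to the corpus would be enough to accept the word in the screenshot.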