Sanskrit Sorting


Mārcis Gasūns

Sep 20, 2013, 8:51:14 AM
to sanskrit-p...@googlegroups.com
Namaste

How do you sort Devanagari text? MS Excel's sorting gets it wrong. Is there a PHP or VBA script that can sort Sanskrit in the order used in Apte's dictionary?
zwxindy.pdf

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 20, 2013, 12:05:20 PM
to sanskrit-p...@googlegroups.com


On Fri, Sep 20, 2013 at 5:51 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
Namaste

How do you sort Devanagari text? MS Excel sorting is going wrong. Is there a PHP or VBA script who can make Sanskrit order like in Apte's dictionary?

--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
--
Vishvas /विश्वासः

Mārcis Gasūns

Sep 20, 2013, 3:19:02 PM
to sanskrit-p...@googlegroups.com


On Friday, 20 September 2013 20:05:20 UTC+4, विश्वासो वासुकिजः wrote:

Not exactly.
It should be
  • ह्वल्
  • ह्वला
but it is
  • ह्वला
  • ह्वल्
instead. Shorter words should come first; see the screenshot.
Excel sorts as:
  • ह्वरस्
  • ह्वर्
  • ह्वला
  • ह्वल्
  • ह्वा
  • ह्वान
  • ह्वार
  • ह्वारय्
In the book (and how it should be):
  • ह्वर्
  • ह्वरस्
  • ह्वल्
  • ह्वला
  • ह्वा
  • ह्वान
  • ह्वार
  • ह्वारय्
So no, Google Docs is as miserable as MS Office. I only wonder why I'm the only one to notice that. See pages 24, 25, 27 and 29 of the .pdf.
hval.jpg

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 20, 2013, 4:39:20 PM
to sanskrit-p...@googlegroups.com
On Fri, Sep 20, 2013 at 12:19 PM, Mārcis Gasūns <gas...@gmail.com> wrote:
In the book (and how it should be):
So no, Google Docs are as miserable as MS Office. I wonder only why I'm the only one to notice that. See .pdf page 24, 25, 27, 29.

OK, I now see what you want. Perhaps this explains the divergence of expectations:
Most Hindus are fine with Unicode sorting, as long as the pattern is clear. They don't have this expectation of "how it should be", perhaps because sorting is a recent convenience and they have not formed fixed opinions beyond the axaramAlA. I myself keep several spreadsheets and sort rows occasionally without being bothered by the order.

Ambarish Sridharanarayanan

Sep 20, 2013, 7:23:03 PM
to sanskrit-p...@googlegroups.com
My expectation is exactly the same as Mārcis's; I've been too lazy to do anything about it, but you *can* create a custom collation in Excel.

The long-term "fix" is to talk to Unicode about Sanskrit collations. For a start, you can ask on the Unicode Indic mailing list.

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 20, 2013, 9:20:23 PM
to sanskrit-p...@googlegroups.com
On Fri, Sep 20, 2013 at 5:51 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
Is there a PHP or VBA script who can make Sanskrit order like in Apte's dictionary?

Mārcis Gasūns

Sep 21, 2013, 1:35:19 AM
to sanskrit-p...@googlegroups.com
Sorting is not easy. Or rather, it is easy in principle, but nobody has implemented it. When you write a PhD in Hindi, how do you order the bibliography?


On Saturday, 21 September 2013 03:23:03 UTC+4, Ambarish Sridharanarayanan wrote:


My expectation is exactly as that of Mārcis; I've been too lazy to do anything about it, but you *can* create a custom collation in Excel.
I know I can. But I need an algorithm, and I don't have one. I have scattered pieces at http://samskrtam.ru/sanskrit-sorting-devanagari/ and I need help.
 

The long-term "fix" is to talk to Unicode about Sanskrit collations. For a start, you can ask on the Unicode Indic mailing list.
Oh, no, that will not work. I've contacted Unicode so many times. They will not help. Why? Because Unicode and Microsoft or Google have nothing in common. Nothing. 

I cannot ask anything at https://groups.google.com/forum/#!forum/technical-hindi because it's a closed group. Could you, please?

M.G.
SAEIE-sorting.jpg

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 21, 2013, 1:53:00 AM
to sanskrit-p...@googlegroups.com

On Fri, Sep 20, 2013 at 10:35 PM, Mārcis Gasūns <gas...@gmail.com> wrote:

I can not ask anything at https://groups.google.com/forum/#!forum/technical-hindi - it's a closed group. Can you, please?

I am not a member there either - I suppose you need to contact the owner, then join the group and then post.

Mārcis Gasūns

Sep 21, 2013, 4:03:47 AM
to sanskrit-p...@googlegroups.com


On Saturday, 21 September 2013 09:53:00 UTC+4, विश्वासो वासुकिजः wrote:

I am not a member there either - I suppose you need to contact the owner, then join the group and then post.

ह्वा
ह्वान
ह्वार
ह्वारय्
ह्वरस्
ह्वर्
ह्वला
ह्वल्
Bad one.

√ ह्लाद् hlād
ह्लाद hlāda
ह्लादक hlādaka 
ह्लादन hlādana 
ह्लादयत् hlādayat 
ह्लादि hlādi 
ह्लादित hlādita
ह्लादिन् hlādin
√ ह्वल् hval 
ह्वाय् hvāy  
Much better.

Anunad Singh

Sep 22, 2013, 12:27:14 AM
to sanskrit-p...@googlegroups.com
नमस्ते

I came to know about this discussion through Mārcis Gasūns's request to join 'वैज्ञानिक तथा तकनीकी हिन्दी समूह' (the Scientific and Technical Hindi group). I have quickly gone through the messages and would like to say the following:

1) I have participated in some of the discussions on Devanagari collation. I found that there is no standard for collation (as far as I know).

2) The Unicode Consortium will not help in this regard. They are very clear about it on this page: http://www.unicode.org/faq/indic.html

Q: What about collation of Indic language data? Is that just a binary sort?

A: No. Collation order is not the same as code point order. A good treatment of some issues specific to collation in Indic languages can be found in the paper Issues in Indic Language Collation by Cathy Wissink.

Collation in general must proceed at the level of language or language variant, not at the script or codepoint levels. See also UTS #10: Unicode Collation Algorithm. Some Indic-specific issues are also discussed in that report.

(3) OpenOffice and Microsoft Office have a default collation (the Unicode Collation Algorithm?) for collating Devanagari and other Indian scripts. They also have a provision to sort according to a specified order.

(4) I designed a Devanagari sorting program in JavaScript in 2007. In it, I followed a sorting order which I thought was appropriate.

(5) This program can be modified to follow a clearly specified sorting order.

So I request the pundits here to discuss and provide one, two or three sorting orders for Devanagari first. Then I will try to modify my program accordingly.

-- अनुनाद सिंहः

Mārcis Gasūns

Sep 22, 2013, 1:56:22 PM
to sanskrit-p...@googlegroups.com
Glad to see you here. I'm afraid you are asking for what will not be found here.


On Sunday, 22 September 2013 08:27:14 UTC+4, Anunad Singh wrote:

I came to know about this discussion through  Mārcis Gasūns's request to join 'वौज्ञानिक तथा तकनीकी हिन्दी समूह'. I have quickly gone through the messages and would like to say the following:

1) I have participated in some of the discussions on Devanagari collation. I found that there is no standard for collation (as far as I know).
Not quite true. There are respected Sanskrit books that follow a pattern. And the pattern is more detailed than any of the schemes I've yet documented at http://samskrtam.ru/sanskrit-sorting-devanagari/
 

2) Unicode consortium will not help in this regard. They are very clear about it at this page:


Q: What about collation of Indic language data? Is that just a binary sort?

http://www.unicode.org/faq/indic.html

A: No. Collation order is not the same as code point order. A good treatment of some issues specific to collation in Indic languages can be found in the paper Issues in Indic Language Collation by Cathy Wissink.


Collation in general must proceed at the level of language or language variant, not at the script or codepoint levels. See also UTS #10: Unicode Collation Algorithm. Some Indic-specific issues are also discussed in that report.
I know, I've read it before. Unicode is no good. Still it's the best we have.
 

(3) OpenOffice and Microsoft Office have a default collation (the Unicode Collation Algorithm?) for collating Devanagari and other Indian scripts. They also have a provision to sort according to a specified order.
And the order has nothing to do with Sanskrit. It has no options, so it's useless in most cases. See http://samskrtam.ru/himl-amd-sorting/
 

(4) I designed a Devanagari sorting program in JavaScript in 2007. In it, I followed a sorting order which I thought was appropriate.
URL?
 

(5) This program can be modified to follow a clearly specified sorting order.

So I request the pundits here discuss and to provide one/two/three sorting orders for Devanagari first. Then I will try to modify my program accordingly.
Oh, I guess the best way would be to open books. Books like http://www.flickr.com/photos/gasyoun/

 M.G.

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 22, 2013, 2:20:29 PM
to sanskrit-p...@googlegroups.com

On Sun, Sep 22, 2013 at 10:56 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
 

(5) This program can be modified to follow a clearly specified sorting order.

So I request the pundits here discuss and to provide one/two/three sorting orders for Devanagari first. Then I will try to modify my program accordingly.

The sorting algorithm Marcis wants seems rather simple:

Given a list of words, split it into two parts:
  • <list a>: words not ending with a virAma.
  • <list b>: words ending with a virAma.

Sort both lists using the default Unicode ordering.

Traverse list b and insert each word into <list a> at the right spot (i.e. just above the spot where the same word without the virAma, ending in अकार, would have appeared).
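
The steps above can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the function name `sortWithFinalVirama` is hypothetical, "default Unicode ordering" is taken to mean plain code point comparison, and the split-and-insert step is expressed as an equivalent two-part sort key (the word without its final virAma, then a flag that puts the virAma form first).

```javascript
const VIRAMA = "\u094D"; // Devanagari sign virama

// Hypothetical helper: sorts by code point, but places a word ending in a
// virama immediately before the spot of the same word without it.
function sortWithFinalVirama(words) {
  const keyed = words.map((word) => {
    const dead = word.endsWith(VIRAMA);
    return { word, base: dead ? word.slice(0, -1) : word, dead };
  });
  keyed.sort((p, q) => {
    // Plain string comparison; for Devanagari this is code point order.
    if (p.base !== q.base) return p.base < q.base ? -1 : 1;
    // Same base: the virama-final form goes first.
    return Number(q.dead) - Number(p.dead);
  });
  return keyed.map((entry) => entry.word);
}

// The list in the order Excel produced it, taken from earlier in the thread.
const excelOrder = ["ह्वरस्", "ह्वर्", "ह्वला", "ह्वल्", "ह्वा", "ह्वान", "ह्वार", "ह्वारय्"];
console.log(sortWithFinalVirama(excelOrder).join(" "));
```

On that eight-word list the sketch reproduces the book order quoted earlier (ह्वर्, ह्वरस्, ह्वल्, ह्वला, ...). It does nothing special for anusvAra or visarga, which are still ordered by code point.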

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 22, 2013, 2:24:12 PM
to sanskrit-p...@googlegroups.com
I take that back: that would not order निशा and निश्चय correctly.

But essentially the ordering is the same as Unicode's, with the following exception: the virAma appears before any vowel.

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 22, 2013, 2:26:00 PM
to sanskrit-p...@googlegroups.com

On Sun, Sep 22, 2013 at 11:24 AM, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
I take that back - 
That would not order निशा and निश्चय correctly. 

But, essentially, the ordering is same as unicode with the following exception: virAma appears before any vowel.

But then, looking at the Flickr images Marcis sent, I just realized that he would want निशा to appear before निश्चय. So the previous algorithm should work.

Mārcis Gasūns

Sep 22, 2013, 3:34:00 PM
to sanskrit-p...@googlegroups.com


On Sunday, 22 September 2013 22:26:00 UTC+4, विश्वासो वासुकिजः wrote:

But then, looking at the flickr images Marcis sent, I just realized that he would want निशा to appear before निश्चय। So the previous algorithm should work.


विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 22, 2013, 4:00:26 PM
to sanskrit-p...@googlegroups.com
Seems like the algorithm I sent should work. Does it not?

Mārcis Gasūns

Sep 23, 2013, 4:44:33 AM
to sanskrit-p...@googlegroups.com


On Monday, 23 September 2013 00:00:26 UTC+4, विश्वासो वासुकिजः wrote:


Seems like the algorithm I sent should work. Does it not?

Sent? Where?
My VBA script still gives
  • सनेमि sanemi
  • सन्त् sant
  • सन्न sanna
instead of
  • सनेमि sanemi
  • सन्त् sant
  • संतत /saṅtata/ (pp. of संतन्) 1) connected 2) continuous, constant
  • संतापन /saṅtāpana/ 1) tormenting 2) aching, hurting
  • संदर्प /saṅdarpa/ m. 1) fervour 2) mischief, prank 3) arrogance, haughtiness 4) stubbornness, wilfulness
  • संध्यासमय /saṅdhyā-samaya/ m. see संध्याकाल
  • सन्न /sanna/ pp. of सद्
 

Anunad Singh

Sep 23, 2013, 5:46:06 AM
to sanskrit-p...@googlegroups.com
I did not study your VBA code, but the following may be the problem:

In Microsoft Word I found that if you search for, say, क in the text कमल क्लेश किम कोमल, only the first क (in कमल) is found, not the others.

You are trying to do something similar, which is not happening as expected.

-- अनुनादः



Mārcis Gasūns

Sep 23, 2013, 9:21:40 AM
to sanskrit-p...@googlegroups.com


On Monday, 23 September 2013 13:46:06 UTC+4, Anunad Singh wrote:
I did not study your VBA code but following may be the problem:
 

In microsoft word I found that if you search for, for example, क  in text  कमल  क्लेश  किम कोमल , only the first क in कमल is found, not others .
This is as expected. 


something similar you are trying to do which is not happening as expected.


"Something" does not help. And it's in no way similar.

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 23, 2013, 10:49:35 AM
to sanskrit-p...@googlegroups.com

On Mon, Sep 23, 2013 at 1:44 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
Sent? Where?
Not code, just an algorithm described in English.
Please see email message starting with "The sorting algorithm Marcis wants seems rather simple:"

Mārcis Gasūns

Sep 23, 2013, 11:15:40 AM
to sanskrit-p...@googlegroups.com


On Monday, 23 September 2013 18:49:35 UTC+4, विश्वासो वासुकिजः wrote:

On Mon, Sep 23, 2013 at 1:44 AM, Mārcis Gasūns <gas...@gmail.com> wrote:
Sent? Where?
Not code, just an algorithm described in english.
No, that is not an algorithm, because it has no details, and the devil is in the details. Look at the .pdf: it has more details, but it does not give the algorithm either.
 
specimen-MacDonell-Grammar1927.pdf

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 23, 2013, 12:33:56 PM
to sanskrit-p...@googlegroups.com
Whoa - this is complicated. The aforementioned algorithm does not meet your needs because it would place anusvAra-s and visarga-s first.

Mārcis Gasūns

Sep 24, 2013, 3:56:45 AM
to sanskrit-p...@googlegroups.com


On Monday, 23 September 2013 20:33:56 UTC+4, विश्वासो वासुकिजः wrote:
Whoa - this is complicated. 
This is only one requirement. So, as I said before, there is no script today that does it right. Maybe only http://sanskrit.inria.fr/DICO/

Anunad Singh

Sep 24, 2013, 4:08:55 AM
to sanskrit-p...@googlegroups.com


On Monday, 23 September 2013 18:51:40 UTC+5:30, Mārcis Gasūns wrote:


On Monday, 23 September 2013 13:46:06 UTC+4, Anunad Singh wrote:
I did not study your VBA code but following may be the problem:
 

In microsoft word I found that if you search for, for example, क  in text  कमल  क्लेश  किम कोमल , only the first क in कमल is found, not others .

 
This is as expected. 

I think it is not expected. Notepad++, JavaScript, etc. do detect the presence of क in all the above words. In MS Word/Excel, even if you search for ि, it will skip किम्. Only when you search for कि are you pointed to 'किम्'.

I am finding to comprehend the central point of the discussion. Can somebody restate the required sorting order in eight to ten lines?

-- अनुनादः

 

Anunad Singh

Sep 24, 2013, 5:58:28 AM
to sanskrit-p...@googlegroups.com
In the above discussion, I wanted to say  "I am finding it difficult to comprehend the central point of the discussion."

Also, ि, and any मात्रा for that matter, can be searched for correctly in Notepad++ and JavaScript. Maatraas are separate code points in Unicode. If they are not detected, it is a sign of some fault in the algorithm. Also, if you cannot detect them separately, it becomes very cumbersome to write code for comparing two strings.
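
The point about maatraas being separate code points can be checked with a few lines of JavaScript. This is a small sketch; the helper names `codePoints` and `findAll` are made up for the illustration and do not come from any program in this thread.

```javascript
// List the code points of a string as "U+XXXX" labels.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

// Return every index at which `needle` occurs in `text`.
function findAll(text, needle) {
  const hits = [];
  if (!needle) return hits; // an empty needle would never advance
  for (let i = text.indexOf(needle); i !== -1; i = text.indexOf(needle, i + 1)) {
    hits.push(i);
  }
  return hits;
}

const sample = "कमल क्लेश किम् कोमल";
console.log(codePoints("किम्"));        // base letter, vowel sign, letter, virama
console.log(findAll(sample, "\u0915")); // every क, including those in conjuncts
console.log(findAll(sample, "\u093F")); // the vowel sign ि on its own
```

A plain substring search works on code points, so it finds क inside क्लेश and किम् as well; a tool that only matches whole rendered syllables will not.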

- अनुनाद

Mārcis Gasūns

Sep 24, 2013, 2:17:54 PM
to sanskrit-p...@googlegroups.com
There are two approaches: phonetic and syllabic.
http://www.cdacmumbai.in/projects/indix/tech/phonemic.pdf - phonetic is the best choice, but it can't be used with Unicode Devanagari.


On Tuesday, 24 September 2013 13:58:28 UTC+4, Anunad Singh wrote:
In the above discussion, I wanted to say  "I am finding it difficult to comprehend the central point of the discussion."
The question is how many parameters should be taken into account. And how to code them in any programming language available.

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 25, 2013, 2:13:42 AM
to sanskrit-p...@googlegroups.com

On Tue, Sep 24, 2013 at 1:08 AM, Anunad Singh <anu...@gmail.com> wrote:
I am finding to comprehend the central point of the discussion. Can somebody remention the required sorting order in eight-ten lines?
Anunad, please see here: https://docs.google.com/document/d/1imUVqdem21bTjbeXI300JxDntfYu1jgbWWV2N5Q3Qkc/edit (Everyone with the link has edit rights, so Marcis can modify it appropriately.)

Anunad Singh

Sep 25, 2013, 5:15:02 AM
to sanskrit-p...@googlegroups.com
Thank you, Vishvas.

-- अनुनादः



Mārcis Gasūns

Sep 25, 2013, 6:30:51 AM
to sanskrit-p...@googlegroups.com

Anunad Singh

Sep 27, 2013, 12:41:57 AM
to sanskrit-p...@googlegroups.com
I have developed a JavaScript-based Devanagari sorting program. I think it fulfills the requirement.

It uses two lists:
(1) a list of ordered 'aadhaar-varna'
(2) a list of ordered 'maatraas'

If some different order is required, it can be implemented by changing the order in these lists.

Please find the HTML file attached. I have tested it in Firefox only.

-- अनुनाद


2013/9/25 Mārcis Gasūns <gas...@gmail.com>
http://mathworld.wolfram.com/LexicographicOrder.html


Devanukarana_11.html

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 27, 2013, 12:58:28 AM
to sanskrit-p...@googlegroups.com
This is all I see on opening your file…
image.png

Anunad Singh

Sep 27, 2013, 1:23:26 AM
to sanskrit-p...@googlegroups.com
This file should not be opened in an editor (Notepad, Word, etc.). It should be run in Firefox or some other browser. There are two text boxes. Put the unsorted text in the first box. The sorted text appears in the second box when you click the button named 'Sort Devanagari'.

-- अनुनादः



Anunad Singh

Sep 27, 2013, 1:42:37 AM
to sanskrit-p...@googlegroups.com

I have just downloaded the HTML file attached to my reply and tried to run it. It has become completely corrupted.

Please download it from here and then use it.

https://docs.google.com/file/d/0B06JOlm5x83YN1RQRVVqUXhUdlk/edit?usp=sharing


-- अनुनादः

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 27, 2013, 1:54:34 AM
to sanskrit-p...@googlegroups.com
Excellent! Congratulations!

Here is the order it yielded:
पुंलिङ्गः
पुँलिङ्गः
हर्
हर
हरः
हरगतिः
हरगतिः
हरङ्गतः
हरचितिः
हरञ्चितः
हरञ्चितः
हरंगतः
हरंचितः
हरांशुः
हरिः
हरिःस्थितिः
हरिरिति

A good bonus would be for it to:
a] be deployed somewhere on the internet;
b] handle mixed text:
हरगतिः asdfasdf
हरगतिः addfasdf





विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 27, 2013, 1:56:17 AM
to sanskrit-p...@googlegroups.com

On Thu, Sep 26, 2013 at 10:54 PM, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
हरङ्गतः
हरचितिः
हरञ्चितः
हरञ्चितः
हरंगतः
हरंचितः
हरांशुः

I see a problem above - 
the expected order is:
हरंगतः
हरङ्गतः
हरंचितः
हरञ्चितः


विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 27, 2013, 1:59:36 AM
to sanskrit-p...@googlegroups.com
At present it looks like this:

हरिःकृतिः
हरिःस्थितिः
हरिरिति
हरिस्
हरिस्स्थितिः

======
Expected:
हरिःकृतिः
हरिरिति
हरिस्
हरिःस्थितिः
हरिस्स्थितिः





Anunad Singh

Sep 27, 2013, 3:06:50 AM
to sanskrit-p...@googlegroups.com
हरङ्गतः
हरचितिः
हरञ्चितः
हरञ्चितः
हरंगतः
हरंचितः
हरांशुः


If the above words are sorted in the following way, is that acceptable?


हरंगतः
हरंचितः
हरङ्गतः
हरचितिः
हरञ्चितः
हरञ्चितः
हरांशुः

---- अनुनादः

Mārcis Gasūns

Sep 27, 2013, 4:25:46 AM
to sanskrit-p...@googlegroups.com

Anusvara sorting is still wrong.

The file now gives:

  • समेधन
  • समोकस्
  • सम्°
  • सम्पच्

It should be, as in the book:

  • समेधन
  • समोकस्
  • संपच्
  • संपठ्

Anunad Singh

Sep 27, 2013, 7:02:20 AM
to sanskrit-p...@googlegroups.com
The present ascending order for what I call the 'aadhaar varna' and the 'maatraa varna' is as follows:

var aadhaar_varna = new Array ( "०",  "१",  "२",   "३",   "४",  "५", "६",  "७",  "८",  "९",

        "ऽ", "अ", "आ", "इ", "ई", "उ", "ऊ", "ऋ", "ए", "ऐ", "ओ", "औ",   

        "क", "क़", "ख", "ख़", "ग", "ग़", "घ", "ङ", "च", "छ", "ज", "ज़", "झ", "ञ",

        "ट", "ठ",  "ड",  "ड़", "ढ",  "ढ़", "ण",     "त", "थ", "द", "ध", "न", "ऩ", "प", "फ", "फ़", "ब", "भ", "म", "य", "य़", "र", "ऱ", "ल",  "ळ",  "व",  "श",  "ष",  "स",  "ह"  )  ;  

maatraa_varna = new Array ( '्', '' , 'ं', 'ँ', ':', 'ा', 'ां', 'ाँ', 'ा:', 'ि', 'िं', 'िँ', 'ि:', 'ी', 'ीं', 'ीँ', 'ी:', 'ु', 'ुं', 'ुँ', 'ु:',            'ू', 'ूं', 'ूँ', 'ू:', 'ृ', 'ृं', 'ृँ', 'ृ:', 'े', 'ें', 'ेँ', 'े:', 'ै', 'ैं', 'ैँ', 'ै:', 'ो', 'ों', 'ोँ', 'ो:', 'ौ', 'ौं', 'ौँ', 'ौ:' )

where  the entry after the 'viraama'  is 'no maatraa'.

Please specify what changes should be made in this order to make it 'standard'  or most appropriate.

-- अनुनादः


Mārcis Gasūns

Sep 27, 2013, 7:56:58 AM
to sanskrit-p...@googlegroups.com

Anunad Singh

Sep 27, 2013, 8:50:20 AM
to sanskrit-p...@googlegroups.com
Mārcis Gasūns,

Frankly speaking, I have not seen the above link or any other algorithm before. I will be happy to know appropriate links such as the above.

--अनुनादः

विश्वासो वासुकिजः (Vishvas Vasuki)

Sep 27, 2013, 1:01:18 PM
to sanskrit-p...@googlegroups.com

2013/9/27 Anunad Singh <anu...@gmail.com>

यदि उपर्युक्त शब्दाः निम्नलिखित प्रकारेण  शाटिताः,  किम इदम्  स्वीकार्यः?

No, Anunad. Both lists are unsatisfactory.

Anunad Singh

Sep 28, 2013, 3:17:41 AM
to sanskrit-p...@googlegroups.com
 
I have uploaded a new version of the Devanagari sorting program (देवानुकरण). It is here:

https://docs.google.com/file/d/0B06JOlm5x83YTFNqZFJvVFBMTUU/edit?usp=sharing

Following modifications were made:

(1) Choice of ascending order or descending order
(2) visarga has been corrected ( ः ). It was ( : ) in the previous file.
(3) ascending order for 'maatraas' is as follows :


maatraa_varna = new Array ( '्', 'ं', 'ँ', 'ः', '' , 'ां','ाँ', 'ाः', 'ा',

        'िं', 'िँ', 'िः', 'ि', 'ीं', 'ीँ', 'ीः', 'ी', 'ुं', 'ुँ', 'ुः', 'ु',

        'ूं', 'ूँ', 'ूः', 'ू', 'ृं', 'ृँ', 'ृः', 'ृ', 'ॄं', 'ॄँ', 'ॄः', 'ॄ',

        'ें', 'ेँ', 'ेः', 'े', 'ैं', 'ैँ', 'ैः', 'ै', 'ों', 'ोँ', 'ोः', 'ो',

        'ौं', 'ौँ', 'ौः', 'ौ'  ) ;

Please test and post your comments.
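
For readers without the attachment, here is a rough sketch of how two ordered lists like these could drive a comparison. It is not the देवानुकरण program, whose source does not appear in this thread. The names `BASES`, `MATRAS`, `toUnits`, `compareWords` and `sortByLists` are hypothetical, the base list is simply the independent vowels and consonants in code point order, and the maatraa list is generated to follow the pattern posted above (virama first, then for each vowel sign the anusvara, candrabindu, visarga and bare forms).

```javascript
// Assumed base letters: independent vowels, then consonants, in code point order.
const BASES = [];
for (let cp = 0x0905; cp <= 0x0939; cp++) BASES.push(String.fromCodePoint(cp));

// Maatraa list: virama, then (sign + anusvara, + candrabindu, + visarga, bare).
const MARKS = ["\u0902", "\u0901", "\u0903", ""];
const SIGNS = ["", "\u093E", "\u093F", "\u0940", "\u0941", "\u0942", "\u0943",
               "\u0944", "\u0947", "\u0948", "\u094B", "\u094C"];
const MATRAS = ["\u094D"];
for (const sign of SIGNS) for (const mark of MARKS) MATRAS.push(sign + mark);

// Split a word into units of one base letter plus whatever signs follow it.
function toUnits(word) {
  const units = [];
  for (const ch of word) {
    if (BASES.includes(ch) || units.length === 0) units.push({ base: ch, matra: "" });
    else units[units.length - 1].matra += ch;
  }
  return units;
}

// Position in an ordered list; anything not listed sorts last.
function rank(list, item) {
  const i = list.indexOf(item);
  return i === -1 ? list.length : i;
}

// Compare unit by unit: base letter first, then its maatraa.
function compareWords(a, b) {
  const ua = toUnits(a), ub = toUnits(b);
  for (let i = 0; i < Math.min(ua.length, ub.length); i++) {
    const d = rank(BASES, ua[i].base) - rank(BASES, ub[i].base) ||
              rank(MATRAS, ua[i].matra) - rank(MATRAS, ub[i].matra);
    if (d !== 0) return d;
  }
  return ua.length - ub.length; // a prefix sorts before the longer word
}

function sortByLists(words) {
  return [...words].sort(compareWords);
}

console.log(sortByLists(["हला", "हल", "हल्"]).join(" "));
```

Changing the order of `MATRAS` or `BASES` changes the resulting order, which is the adjustment being offered. A per-syllable comparison like this still places a conjunct such as कर्म before कर, so it illustrates the two-list idea rather than solving the requirements discussed in the thread.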

-- अनुनादः

Mārcis Gasūns

Sep 28, 2013, 11:12:18 PM
to sanskrit-p...@googlegroups.com
Before even testing: is a reverse index like
http://www.sanskritweb.net/sansdocs/reverse1.pdf possible?

Anunad Singh

Sep 29, 2013, 12:22:34 AM
to sanskrit-p...@googlegroups.com
Do you mean 'is a computer program (or algorithm) for sorting Sanskrit words in retrograde order possible?'

If so, I have a program which can do that.

-- अनुनादः



Mārcis Gasūns

Sep 29, 2013, 5:43:56 AM
to sanskrit-p...@googlegroups.com


On Sunday, 29 September 2013 08:22:34 UTC+4, Anunad Singh wrote:
Do you mean 'is computer program (or algorithm) for sorting Sanskrit words in retrograde order possible?'

Yes, retrograde, from the end.
 
If so, I have a program which can do that.

Would love to see it. The code. 

Anunad Singh

Sep 29, 2013, 6:13:54 AM
to sanskrit-p...@googlegroups.com
Mārcis Gasūns,

Please find the program attached. Please note that in it the left-to-right sorter (forward sorter) is just a simple sorter, not suitable for Devanagari. But I think the right-to-left sorter (retrograde sorter) will serve your purpose.

-- अनुनादः
देवनागरी प्रतिलोम क्रमक_08.zip

Mārcis Gasūns

Sep 29, 2013, 1:54:51 PM
to sanskrit-p...@googlegroups.com
It does not work.
  • हंसपादी
  • हंसबीज
  • हंसमाला
  • हंसरुत
  • हंसवाह
  • हंसाभिख्य
  • हंसाय
  • हंसावली

Left to right gives
  • हंसपादी
  • हंसबीज
  • हंसमाला
  • हंसरुत
  • हंसवाह
  • हंसाभिख्य
  • हंसाय
  • हंसावली

Right to left gives
  • हंसबीज
  • हंसरुत
  • हंसाभिख्य
  • हंसाय
  • हंसवाह
  • हंसमाला
  • हंसपादी
  • हंसावली

Anunad Singh

Sep 29, 2013, 11:36:18 PM
to sanskrit-p...@googlegroups.com
In this message I confine myself to right-to-left sorting.

I will say that this program works and produces the results as expected. Mārcis Gasūns's expectations are different, so he says 'it does not work'.

Moreover, it can be modified to give results as per clearly defined sorting order. It can implement two or more retrograde sorting orders too.

--अनुनादः

Mārcis Gasūns

Sep 30, 2013, 2:30:44 AM
to sanskrit-p...@googlegroups.com
Everything can be modified, but as it is, it's useless for a reverse index. Have you opened my link?

Mārcis Gasūns

Sep 30, 2013, 4:49:12 AM
to sanskrit-p...@googlegroups.com
Input file: https://www.dropbox.com/s/jvcy38lh5nung9y/devanagari.txt
This is a sample of correct sorting with PHP. Next step: VBA.

Mārcis Gasūns

Oct 12, 2013, 3:50:14 PM
to sanskrit-p...@googlegroups.com
Reverse sorting is working as well. This is the first public PHP script for reverse Devanagari sorting of Sanskrit.
Reverse-Palsule-Dhatu-Index.pdf

Anubhav Chattoraj

Oct 19, 2013, 11:21:50 AM
to sanskrit-p...@googlegroups.com
I realize I'm late to the party here, but I only just discovered this thread (via a link to Gasūns' blog posted on the technical-hindi group).

There have been several threads about Devanagari sorting on technical-hindi recently. The requirement that dead consonants be sorted before live consonants (e.g. हल् < हल) has been brought up, as well as the requirement that an anusvāra before a vārgika akṣara be treated as a pañcamākṣara. The requirement that a visarga before a sibilant be treated as the same sibilant is new to me, but isn't fundamentally different from the anusvāra requirement.

On technical-hindi, Hari Raam has pointed out that the dead consonants can be correctly sorted by decomposing क to क्+अ, का to क्+आ, and so on, and then sorting in lexicographic order (not codepoint order). This also properly sorts the "unchangeable" anusvāras and visargas. For instance, (using made-up words)

हल् = ह् अ ल्
हल = ह् अ ल् अ
हंल = ह् अ ं ल् अ
हइ = ह् अ इ
ही = ह् ई

In lexicographic order, इ < ं < ल्, and shorter words sort before longer ones, so the list will sort as हइ, हंल, हल्, हल, ही as expected.

Therefore, here's a skeleton of my proposed algorithm:

Suppose the (unsorted) list of words is stored in `words`.
Let `decomposed_words` be an array of tuples `(word_index, decomposition)`, where
  • `word_index` is the index of the word in the original array;
  • `decomposition` represents the decomposed word (as in the example above). It is an array of positive integers, where each integer is the lexicographic index of a character in the decomposed word (letters with smaller lexicographic indices sort before letters with larger indices).

(While constructing `decomposition`, make sure to replace any anusvāras followed by a vārgika letter with the corresponding pañcamākṣara, and any visarga followed by a sibilant with the same sibilant.)

After constructing `decomposed_words`, sort it by comparing the `decomposition` field of each tuple in the array. (Nothing fancy here, just standard lexicographical comparison.) Call the sorted array `sorted_decomposed_words`.

Now, construct a new array `sorted_words`. Loop through the elements of `sorted_decomposed_words`, and for each element `(word_index, decomposition)` add `words[word_index]` to `sorted_words`.

And you're done. That's all you need for sorting classical Sanskrit.
(General-purpose Devanagari sorting is much more complicated.)

For retrograde sorting, there's only one change you need to make: when comparing two `decomposition`s, compare right-to-left instead of left-to-right.
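
Until a real implementation is posted, the skeleton above might look roughly like this in JavaScript. It is a sketch under assumptions, not the promised implementation: the names `decompose`, `compareKeys` and `sortSanskrit` are invented here, the alphabet is assumed to run vowels, then anusvāra, then visarga, then consonants (consistent with इ < ं < ल् above), and only plain classical Sanskrit letters are handled (no nukta forms, accents or avagraha).

```javascript
const VOWELS = [..."अआइईउऊऋॠऌएऐओऔ"];
// Dependent vowel signs, in the same order as VOWELS[1..] (आ ... औ).
const SIGNS = [..."\u093E\u093F\u0940\u0941\u0942\u0943\u0944\u0962\u0947\u0948\u094B\u094C"];
const VARGAS = ["कखगघङ", "चछजझञ", "टठडढण", "तथदधन", "पफबभम"];
const CONSONANTS = [...VARGAS.join(""), ..."यरलवशषसह"];
const SIBILANTS = "शषस";
const ANUSVARA = "\u0902", VISARGA = "\u0903", VIRAMA = "\u094D";

// Assumed alphabet order: vowels < anusvara < visarga < consonants (ranks from 1).
const RANK = new Map(
  [...VOWELS, ANUSVARA, VISARGA, ...CONSONANTS].map((ch, i) => [ch, i + 1]));

// Turn a word into an array of lexicographic indices, one per phoneme.
function decompose(word) {
  const chars = [...word];
  const out = [];
  for (let i = 0; i < chars.length; i++) {
    const ch = chars[i], next = chars[i + 1];
    if (CONSONANTS.includes(ch)) {
      out.push(ch);
      if (next === VIRAMA) i++;                        // dead consonant: no vowel
      else if (SIGNS.includes(next)) { out.push(VOWELS[SIGNS.indexOf(next) + 1]); i++; }
      else out.push(VOWELS[0]);                        // inherent अ
    } else if (ch === ANUSVARA && next) {
      const varga = VARGAS.find((v) => v.includes(next));
      out.push(varga ? varga[4] : ch);                 // class nasal before a varga letter
    } else if (ch === VISARGA && next && SIBILANTS.includes(next)) {
      out.push(next);                                  // visarga becomes the following sibilant
    } else {
      out.push(ch);
    }
  }
  return out.map((ch) => RANK.get(ch) || 0);           // unknown characters sort first
}

// Standard lexicographic comparison; read from the end when retrograde is true.
function compareKeys(a, b, retrograde = false) {
  const n = Math.min(a.length, b.length);
  for (let k = 0; k < n; k++) {
    const x = retrograde ? a[a.length - 1 - k] : a[k];
    const y = retrograde ? b[b.length - 1 - k] : b[k];
    if (x !== y) return x - y;
  }
  return a.length - b.length;
}

function sortSanskrit(words, retrograde = false) {
  return words
    .map((word, index) => ({ index, key: decompose(word) }))
    .sort((p, q) => compareKeys(p.key, q.key, retrograde))
    .map(({ index }) => words[index]);
}

console.log(sortSanskrit(["ही", "हल", "हंल", "हल्", "हइ"]).join(" "));
```

With the made-up words from the example this prints them in the order हइ, हंल, हल्, हल, ही. Words whose decompositions are identical after the anusvāra or visarga substitution (for example हरंगतः and हरङ्गतः) compare as equal, so their relative order depends on the input order and on the sort being stable.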

I am planning to implement this; I should have a Javascript implementation up and running by the end of next week.

Mārcis Gasūns

Oct 20, 2013, 5:48:09 AM10/20/13
to sanskrit-p...@googlegroups.com
Namaste,


On Saturday, 19 October 2013 19:21:50 UTC+4, Anubhav Chattoraj wrote:
I realize I'm late to the party here, but I only just discovered this thread (via a link to Gasūns' blog posted on the technical-hindi group).

Not too late at all.
 
There have been several threads about Devanagari sorting on technical-hindi recently. The requirement that dead consonants be sorted before live consonants (e.g. हल् < हल) has been brought up, as well as the requirement that an anusvāra before a vārgika akṣara be treated as a pañcamākṣara. The requirement that a visarga before a sibilant be treated as the same sibilant is new to me, but isn't fundamentally different from the anusvāra requirement.
Gérard Huet: Two pitfalls to avoid for computing on Sanskrit words or sentences:
- Do not compute on syllables, but on phonemes - thus translate devanagarii at the phonemic level
- Do not use strings - specially Unicode strings - use lists
 

(While constructing `decomposition`, make sure to replace any anusvāras followed by a vārgika letter with the corresponding pañcamākṣara, and any visarga followed by a sibilant with the same sibilant.)
Dhaval has made an easy fix for it.
 

After constructing `decomposed_words`, sort it by comparing the `decomposition` field of each tuple in the array. (Nothing fancy here, just standard lexicographical comparison.) Call the sorted array `sorted_decomposed_words`.

Now, construct a new array `sorted_words`. Loop through the elements of `sorted_decomposed_words`, and for each element `(word_index, decomposition)` add `words[word_index]` to `sorted_words`.

And you're done. That's all you need for sorting classical Sanskrit.
(General-purpose Devanagari sorting is much more complicated.)
Yes, indeed.
 

For retrograde sorting, there's only one change you need to make: when comparing two `decomposition`s, compare right-to-left instead of left-to-right.

I am planning to implement this; I should have a Javascript implementation up and running by the end of next week.
Anubhav, do you love reinventing the wheel? All of this, and much more, has already been solved at https://docs.google.com/document/d/1t5tWom5GcZIA4TY0U_h84MRdGl4gghQugyti7vacoJA/edit#
If you want to help, there is more to do, but it would be better not to duplicate the same work. If you know VBA, the code could be ported to it so that it can be used in MS Word or Excel.
 

Anubhav Chattoraj

Oct 20, 2013, 10:32:05 AM10/20/13
to sanskrit-p...@googlegroups.com
Sorry, I was under the impression that your system still had unresolved
issues.

I will be building a javascript implementation anyway, because, as I
said earlier, my aim is to sort Devanāgarī in general, not just Sanskrit.

I plan to build a VBA version (of the generalized algorithm) after the
Javascript version has been validated by the users of technical-hindi. I
will make sure to post it to this group when it's done.

Thank you for your time.

(As an aside: It is, of course, gratifying to know that Huet agrees with
my approach to Devanāgarī sorting.)



Mārcis Gasūns

Oct 20, 2013, 2:18:03 PM10/20/13
to sanskrit-p...@googlegroups.com
Namaste,


On Sunday, 20 October 2013 18:32:05 UTC+4, Anubhav Chattoraj wrote:
Sorry, I was under the impression that your system still had unresolved
issues.
No, after a month of coding all solvable issues are solved. Please read the PHP comments to see the list of unsolvable issues.
 

I will be building a javascript implementation anyway, because, as I
said earlier, my aim is to sort Devanāgarī in general, not just Sanskrit.
This is interesting. What do you mean by general? Add Hindi and Marathi or what else?
 

I plan to build a VBA version (of the generalized algorithm) after the
Javascript version has been validated by the users of technical-hindi. I
will make sure to post it to this group when it's done 
VBA sounds great. Will be waiting for it. I hope you will find our algorithm helpful as well.



(As an aside: It is, of course, gratifying to know that Huet agrees with
my approach to Devanāgarī sorting.)
No, it's vice versa. It is you who agree with him. He made his choice in 2003.

ken p

Oct 20, 2013, 3:30:31 PM10/20/13
to sanskrit-p...@googlegroups.com
Why not sort simple Devanagari text using English text sorter?

1- Copy Devanagari text and convert to ITRANS Roman
2-Sort using above link
3-Convert sorted text back to Devanagari


 








On Friday, 20 September 2013 7:51:14 am UTC-5, Mārcis Gasūns wrote:
Namaste

How do you sort Devanagari text? MS Excel sorting is going wrong. Is there a PHP or VBA script who can make Sanskrit order like in Apte's dictionary?

Mārcis Gasūns

Oct 21, 2013, 11:23:45 AM10/21/13
to sanskrit-p...@googlegroups.com


On Sunday, 20 October 2013 23:30:31 UTC+4, ken p wrote:
Why not sort simple Devanagari text using English text sorter?
Oh my. May I ask what your mother tongue is? Or are you just joking?
 

1- Copy Devanagari text and convert to ITRANS Roman
Correct sorting would be:
अटन
अटवी
अटवीबल
अट्टहास
अष्टाङ्ग
अष्टीला
बभस
बभ्रि
बभ्रु
बभ्रुवाहन
 
2-Sort using above link
aṣṭāṅga
aṣṭīlā
aṭana
aṭavī
aṭavībala
aṭṭahāsa
ba
babhasa
babhri
babhru
babhruvāhana
 
3-Convert sorted text back to Devanagari

अष्टाङ्ग
अष्टीला
अटन
अटवी
अटवीबल
अट्टहास
बभस
बभ्रि
बभ्रु
बभ्रुवाहन 

Yes, indeed. I have seen Sanskrit books sorted in Latin alphabetical order. You should have seen how I cried.

ken p

Oct 21, 2013, 1:32:56 PM10/21/13
to sanskrit-p...@googlegroups.com
Hello Mr. Gasūns,
Gujarati, but I have lost touch with it, having been in the US for a long time.
As per Google transliteration, Gujanagari/Gujarati is India's most evolved and simplest script. It is free of the shirorekha (the line above the letters) and is widely used for writing Hindi and Sanskrit in Gujarat state. Try a script converter and see the simplicity.
अटन  अटवी  अटवीबल   अट्टहास अष्टाङ्ग  अष्टीला   ब  बभस  बभ्रि बभ्रु  बभ्रुवाहन
અટન અટવી અટવીબલ  અટ્ટહાસ  અષ્ટાઙ્ગ  અષ્ટીલા બ બભસ બભ્રિ  બભ્રુ બભ્રુવાહન

On Monday, 21 October 2013 10:23:45 am UTC-5, Mārcis Gasūns wrote:

Rahul V

Oct 21, 2013, 11:29:34 PM10/21/13
to sanskrit-p...@googlegroups.com
I think there is a little Sanskrit sorting happening in this paper. I could not look into it and give the gist. My apologies.





--
Regards,
R.V.

Mārcis Gasūns

Oct 22, 2013, 12:11:00 AM10/22/13
to sanskrit-p...@googlegroups.com


On Tuesday, 22 October 2013 07:29:34 UTC+4, Rahul V wrote:
I think there is a little Sanskrit sorting happening in this paper. I could not look into it and give the gist. My apologies.

Actually not, it's about fuzzy search. I've read it before.

Anubhav Chattoraj

Oct 25, 2013, 1:05:49 PM10/25/13
to sanskrit-p...@googlegroups.com
On 10/20/2013 11:48 PM, Mārcis Gasūns wrote:
> This is interesting. What do you mean by general? Add Hindi and Marathi
> or what else?

Yes, Hindi and Marathi, and other languages that are usually written in
Devanagari.

(I'd like to add support for languages that are only occasionally
written in Devanagari, such as Sindhi, but they tend not to have fixed
collation orders.)

Anyway, the Javascript version is done:

http://anubhav-chattoraj.github.io/indic-tools/devanagari_sorter/

For Sanskrit, the only options you'll need to change are:
"Treat anusvāra before vārgika akṣara as" : "pancamākṣara"
"Treat visarga before sibilant as" : "sibilant"

Do try it out, and let me know if you find any bugs.

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 25, 2013, 2:31:03 PM10/25/13
to sanskrit-p...@googlegroups.com

On Fri, Oct 25, 2013 at 10:05 AM, Anubhav Chattoraj <anubhav....@gmail.com> wrote:

Anyway, the Javascript version is done:

http://anubhav-chattoraj.github.io/indic-tools/devanagari_sorter/

That is beautiful. Thank you. Is the source published in some repo? If so, it would be good to link to it from the page.

Mārcis Gasūns

Oct 25, 2013, 3:16:33 PM10/25/13
to sanskrit-p...@googlegroups.com


On Friday, 25 October 2013 21:05:49 UTC+4, Anubhav Chattoraj wrote:
(I'd like to add support for languages that are only occasionally 
written in Devanagari, such as Sindhi, but they tend not to have fixed
collation orders.)
Great idea.
 

Anyway, the Javascript version is done:

http://anubhav-chattoraj.github.io/indic-tools/devanagari_sorter/
Amazing, and a quick one. Now it needs testing with different lists.
Sample list sorted with Chattoraj's Sorter:
  • अंदर
  • अंदर
  • अकंटक
  • अकंटक
  • अकर्ता
  • अकर्ता
  • अकाउंट
  • अकाउंट
  • अकेला
  • अकेला
  • अक्षर
  • अक्षर
  • अक्सर
  • अक्सर
  • आँख
  • आँख
  • आंतरिक
  • आंतरिक
  • आकाश
  • आकाश
  • इंद्र
  • इंद्र
  • इच्छा
  • इच्छा
  • कई
  • कई
  • कठिनाई
  • कठिनाई
  • कहावत
  • कहावत
  • कान
  • कान
  • काम
  • काम
  • कामिनी
  • कामिनी
  • कि
  • कि
  • किंतु
  • किंतु
  • कितना
  • कितना
  • कितनी
  • कितनी
  • किताब
  • किताब
  • की
  • की
  • कुआँ
  • कुआँ
  • कुंजी
  • कुंजी
  • कुछ
  • कुछ
  • कुतर्क
  • कुतर्क
  • कुतुब
  • कुतुब
  • क़ुतुब
  • क़ुतुब
  • क़ुतुबनुमा
  • क़ुतुबनुमा
  • कुत्ता
  • कुत्ता
  • कुमार
  • कुमार
  • कुमारी
  • कुमारी
  • कोई
  • कोई
  • कोन
  • कोन
  • क्रिया
  • क्रिया
  • क़लम
  • क़लम
  • खाना
  • खाना
  • ख़ाना
  • ख़ाना
  • जब
  • जब
  • जवाब
  • जवाब
  • जीवन
  • जीवन
  • ज़बान
  • ज़बान
  • ज़रूर
  • ज़रूर
  • दरबार
  • दरबार
  • दर्शन
  • दर्शन
  • नंगा
  • नंगा
  • नगर
  • नगर
  • निःसंदेह
  • निःसंदेह
  • निकास
  • निकास
  • निदासी
  • निदासी
  • नियम
  • नियम
  • नींबू
  • नींबू
  • नीला
  • नीला
  • पल
  • पल
  • पलंग
  • पलंग
  • पलँग
  • पलँग
  • पलक
  • पलक
  • बंधन
  • बंधन
  • बचपन
  • बचपन
  • बच्चा
  • बच्चा
  • बजा
  • बजा
  • बाल
  • बाल
  • बाल-बच्चे
  • बाल-बच्चे
  • बालिका
  • बालिका
  • महान्
  • महान्
  • महानगर
  • महानगर
  • महानुभाव
  • महानुभाव
  • ल 
  • लं
  • लँ
  • लं 
  • लँ 
  • ला
  • ला 
  • लां
  • लाँ
  • लां 
  • लाँ 
  • वाक्
  • वाक्
  • वाक़िया
  • वाक़िया
  • वाक्छल
  • वाक्छल
  • वाक्य
  • वाक्य
  • विज्ञान
  • विज्ञान
  • शक्ति
  • शक्ति
  • शक्‌ति
  • शक्‌ति
  • शीशा
  • शीशा
  • ऴ 
  • ळ 
  • ळं
  • ळँ
  • ळं 
  • ळँ 
  • ळा
  • ऴा
  • ळा 
  • ऴा 
  • ळां
  • ळाँ
  • ऴां
  • ऴाँ
  • ळां 
  • ळाँ 
  • ऴां 
  • ऴाँ 
  • ऴँ
  • ऴं 
  • ऴँ 

Some UI proposals:
Sort kṣa (क्ष) as … 
Sort tra (त्र) as … 
Sort jña (ज्ञ) as …
to have a subheader Ligatures or Conjunct Consonants.

Sorted Words should be in a box, otherwise hard to copy paste.
 

For Sanskrit, the only options you'll need to change are:
"Treat anusvāra before vārgika akṣara as" : "pancamākṣara"
"Treat visarga before sibilant as" : "sibilant"
So an online help file would be good to have. Good start.
 

Do try it out, and let me know if you find any bugs.
Not more than 20, do not worry. 
 

Anubhav Chattoraj

Oct 26, 2013, 12:12:23 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 12:46 AM, Mārcis Gasūns wrote:
> Sample list sorted with Chattoraj's Sorter:

Eyeballing that list, I discovered a bug with my handling of nuqta
characters. (Now fixed.)

Instead of
कठिनाई
कठिनाई
कहावत
कहावत
कान
...
क्रिया
क्रिया
क़लम
क़लम

I now get

कठिनाई
कठिनाई
क़लम
क़लम
कहावत
कहावत
कान
...
क्रिया
क्रिया

which is the correct order.
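
One way to get the nuqta behaviour shown above (क़लम sorting where कलम would) is to compare the word without its nuqtas first and use the nuqta only as a tie-breaker. The sketch below is an assumption about the approach, not the actual fix: `nuqta_key` is a hypothetical name, and plain codepoint comparison stands in for the real collation key.

```python
import unicodedata

NUQTA = "\u093c"


def nuqta_key(word):
    """Primary key: word without nuqtas. Secondary key: number of nuqtas."""
    # NFD splits precomposed nuqta letters (U+0958..U+095F) into base + nuqta.
    decomposed = unicodedata.normalize("NFD", word)
    base = decomposed.replace(NUQTA, "")
    return (base, decomposed.count(NUQTA))


qalam = "\u0915\u093c\u0932\u092e"      # क़लम, written as क + nuqta
ordered = sorted(["कहावत", qalam, "कठिनाई"], key=nuqta_key)
```

`ordered` is कठिनाई, क़लम, कहावत. The secondary key keeps a plain word ahead of its nuqta twin, as with कुतुब before क़ुतुब in the sample list.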

Can't see any other obvious errors.

> to have a subheader Ligatures or Conjunct Consonants.

Done.

> Sorted Words should be in a box, otherwise hard to copy paste.

Good idea. But first, I'd like to see if it's possible to add a button
to copy the text. (I've read some links suggesting that it is, in fact,
possible to access the user's clipboard.)

> Not more than 20, do not worry.

That file is hard to navigate. It's hard to understand what part is
relevant to what program, or which problems are still outstanding, which
are resolved, and which are unsolvable.

Anubhav Chattoraj

Oct 26, 2013, 12:14:56 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 12:01 AM, विश्वासो वासुकिजः (Vishvas Vasuki) wrote:
> That is beautiful. Thank you. Is the source published in some repo? If
> so, it would be good to link to it from the page.

Added the link.

Pasting here for convenience:
https://github.com/anubhav-chattoraj/indic-tools/tree/gh-pages

dhaval patel

Oct 26, 2013, 1:30:59 AM10/26/13
to sanskrit-p...@googlegroups.com
> Not more than 20, do not worry.

That file is hard to navigate. It's hard to understand what part is relevant to what program, or which problems are still outstanding, which are resolved, and which are unsolvable.



Let me jot down a few.
1. Dictionaries always contain some characters that should be ignored while sorting.
e.g. कुम्भ-कार should be treated as कुम्भकार; the hyphen should be ignored. It would be good to provide a UI that asks users which characters they want to ignore. Note that these need not be common characters like #$% etc.; they may be Unicode characters like  ˚ .

2. Sorting of avagrahas and om.

3. correct order is  नि - निःक्षिप्‌ - निःशिष्‌  - निःसह. Note that the visarga - khar - shar combination is treated before even sibilants.

4. ऋ ॠ लृ लॄ. Unicode order doesn't sort them correctly: लृ लॄ appear after ह.

5. उरँग(म)

क+ठ

क+ण

कर्ण-भु+षण

कार्षि(न्)

would also need to be sorted. Also, a root sign has to be ignored.
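
Point 1 can be illustrated with a key function that strips the ignorable characters before comparing, while the output keeps the original spelling. This is a minimal sketch: `strip_ignored` is a hypothetical helper, and codepoint comparison is only a placeholder for a proper Sanskrit collation key.

```python
def strip_ignored(word, ignore):
    """Remove every character listed in `ignore` from `word`."""
    return word.translate({ord(ch): None for ch in ignore})


entries = ["कुम्भ-कार", "कुम्भक"]

# Without ignoring anything, the hyphen (U+002D) sorts before every letter.
plain = sorted(entries)

# With the hyphen ignored, कुम्भ-कार is compared as कुम्भकार.
ignoring_hyphen = sorted(entries, key=lambda w: strip_ignored(w, "-"))
```

`plain` puts कुम्भ-कार first, whereas `ignoring_hyphen` puts कुम्भक first. The same helper covers brackets and signs such as ˚ by passing them in `ignore`.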




--
Dr. Dhaval Patel, I.A.S
District Development Officer, Rajkot

Anubhav Chattoraj

Oct 26, 2013, 3:23:34 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 11:00 AM, dhaval patel wrote:
> Let me jot down a few.

Thanks for the pointers.

1. Done. The page now has a text field to input the characters to ignore.

2. Is there an accepted order for sorting avagraha and Om? I'm sorting
avagraha as अ and ॐ right before अ.

3. I don't follow. The order नि - निःक्षिप्‌ - निःशिष्‌ - निःसह can be
obtained simply by sorting in Unicode order, without making any special
adjustments for the visarga. (Even after adjustments, the order is the
same: नि - निःक्षिप - निश्शिष् - निस्सह.)

4. Already implemented.

5. Already implemented. Non-Devanagari characters as a whole are sorted
after Devanagari characters, and their relative order is determined by
the default string sort algorithm (whatever it is). Characters to be
ignored can be entered in the "ignore" text field.

dhaval patel

Oct 26, 2013, 3:42:51 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 11:00 AM, dhaval patel wrote:
> Let me jot down a few.

Thanks for the pointers.

1. Done. The page now has a text field to input the characters to ignore.


Great.
 
2. Is there an accepted order for sorting avagraha and Om? I'm sorting avagraha as अ and ॐ right before अ.


It is advisable to give a user interface to decide how he wants to treat Om. The possibilities may be before अ, after ओं, after औं.

As for avagrahas - the same. 
Options can be to ignore them, treat ऽ as अ, treat ऽऽ as आ
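
The avagraha options listed above could be applied to the string before the sort key is computed. The following is only a sketch of that idea: the function name and the mode strings `"ignore"` and `"as_vowel"` are made up for illustration, and the placement of Om is not covered.

```python
AVAGRAHA = "\u093d"  # ऽ


def apply_avagraha_option(word, mode):
    """Rewrite avagrahas according to the chosen (hypothetical) mode."""
    if mode == "ignore":
        return word.replace(AVAGRAHA, "")
    if mode == "as_vowel":
        # Handle the double avagraha first, so ऽऽ becomes आ rather than अअ.
        return word.replace(AVAGRAHA * 2, "आ").replace(AVAGRAHA, "अ")
    raise ValueError("unknown mode: " + mode)
```

For example, सोऽहम् becomes सोहम् with `"ignore"` and सोअहम् with `"as_vowel"`; the result would then be fed to whatever decomposition the sorter uses.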
 
3. I don't follow. The order नि - निःक्षिप्‌ - निःशिष्‌  - निःसह can be obtained simply by sorting in Unicode order, without making any special adjustments for the visarga. (Even after adjustments, the order is the same: नि - निःक्षिप - निश्शिष् - निस्सह.)


Try with
नि
निःक्षिप्‌
निःशब्द
निःशेष
निक
निःक
निखिल

निःक should come at the place of निष्क


4. Already implemented.


Great.
 
5. Already implemented. Non-Devanagari characters as a whole are sorted after Devanagari characters, and their relative order is determined by the default string sort algorithm (whatever it is). Characters to be ignored can be entered in the "ignore" text field.


Great.
Very user friendly, I must say. 


Mārcis Gasūns

Oct 26, 2013, 4:11:28 AM10/26/13
to sanskrit-p...@googlegroups.com
I must confess I love what I see.
I sorted:
  • हैमीकृ  8.Ā.
  • होमय्  10.P.
  • ह्रस्  1.Ā.
  • ह्रासय्  10.P.
  • ह्री  3.P.
  • ह्लादय्  10.Ā.
  • ह्लाद्  1.Ā.
  • ह्वा  4.Ā.
Descending, and got:

  • ह्वा  4.Ā.
  • ह्लादय्  10.Ā.
  • ह्लाद्  1.Ā.
  • ह्री  3.P.
  • ह्रासय्  10.P.
  • ह्रस्  1.Ā.
  • होमय्  10.P.
  • हैमीकृ  8.Ā.
It would be nice to get reverse as well.

GRETIL's sorting:
  • अ-उ-म-कारसंयुक्तं KubjT_8.59c
  • अकरिष्यत् सुधीस् तदा BhStc_40d
  • अकर्ता निर्गुणश्चाहं SvaT_12.49c
  • अकर्ता निर्गुणश्चाहं SvaT_12.75c
  • अकर्ता पुरुषः स्मृतः SvaT_12.76d
  • अकर्तृभावाद्भोक्तुश्च MrgT_1,2.15c
  • अकर्मपथवर्तिनाम् SvaT_10.58b
  • अकल्कवान्सत्त्ववान्यो SvaT_10.71a
  • अकल्को ज्ञानशीलता SvaT_10.64d
  • अकल्यश्च न कल्यते SvaT_11.310b
  • अकस्माच् छ्रीर् उपस्थिता KubjT_2.28b
  • अकस्माज् जायते स्थूलः KubjT_23.37c
  • अकस्माद्धूसरच्छविः SvaT_7.264b
  • अकस्माद्वै भवेत्कृशः SvaT_7.275d
Becomes in your sorter:
  • अकरिष्यत् सुधीस् तदा BhStc_40d
  • अकर्ता निर्गुणश्चाहं SvaT_12.49c
  • अकर्ता निर्गुणश्चाहं SvaT_12.75c
  • अकर्ता पुरुषः स्मृतः SvaT_12.76d
  • अकर्तृभावाद्भोक्तुश्च MrgT_1,2.15c
  • अकर्मपथवर्तिनाम् SvaT_10.58b
  • अकल्कवान्सत्त्ववान्यो SvaT_10.71a
  • अकल्को ज्ञानशीलता SvaT_10.64d
  • अकल्यश्च न कल्यते SvaT_11.310b
  • अकस्माच् छ्रीर् उपस्थिता KubjT_2.28b
  • अकस्माज् जायते स्थूलः KubjT_23.37c
  • अकस्माद्धूसरच्छविः SvaT_7.264b
  • अकस्माद्वै भवेत्कृशः SvaT_7.275d
  • अ-उ-म-कारसंयुक्तं KubjT_8.59c
Which is interesting.

Anubhav Chattoraj

Oct 26, 2013, 5:04:13 AM10/26/13
to sanskrit-p...@googlegroups.com


On 10/26/2013 01:41 PM, Mārcis Gasūns wrote:
> It would be nice to get *reverse* as well
> <https://docs.google.com/document/d/1t5tWom5GcZIA4TY0U_h84MRdGl4gghQugyti7vacoJA/edit#heading=h.mqp334pcac19>.

Already works. (In "sort direction", pick "right-to-left").

The order is slightly different, though. The order you cite is




एक
एकक
लेखक
वायुवेगक
स्फुलिङ्गक

My program gives:




एकक
लेखक
वायुवेगक
स्फुलिङ्गक
एक

which makes sense, because everything sorted before एक ends in अक (e.g.
स्फुलिङ्गक = … + ङ् + ग् + अ + क् + अ). Looks like Ulrich's order ignores the
inherent अ.
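
The difference between the two retrograde orders can be reproduced with hand-written decompositions of the five words. This is a sketch, not either program's code: the names are hypothetical, and "ignoring the inherent अ" is approximated crudely by dropping every अ phoneme, which is only a guess at how the cited order arises.

```python
VIRAMA = "\u094d"
VOWELS = ["अ", "आ", "इ", "ई", "उ", "ऊ", "ऋ", "ॠ", "ऌ", "ॡ", "ए", "ऐ", "ओ", "औ"]
DEAD_CONSONANTS = [chr(cp) + VIRAMA for cp in range(0x0915, 0x093A)]  # क् .. ह्
RANK = {p: i for i, p in enumerate(VOWELS + DEAD_CONSONANTS)}

# Hand-written decompositions, phonemes separated by spaces.
DECOMPOSED = {
    "एक": "ए क् अ",
    "एकक": "ए क् अ क् अ",
    "लेखक": "ल् ए ख् अ क् अ",
    "वायुवेगक": "व् आ य् उ व् ए ग् अ क् अ",
    "स्फुलिङ्गक": "स् फ् उ ल् इ ङ् ग् अ क् अ",
}


def retrograde_key(word, drop_inherent_a=False):
    """Ranks of the phonemes read from the end of the word."""
    phonemes = DECOMPOSED[word].split()
    if drop_inherent_a:
        # Crude stand-in for "ignoring the inherent अ".
        phonemes = [p for p in phonemes if p != "अ"]
    return [RANK[p] for p in reversed(phonemes)]


words = list(DECOMPOSED)
with_a = sorted(words, key=retrograde_key)
without_a = sorted(words, key=lambda w: retrograde_key(w, drop_inherent_a=True))
```

`with_a` is एकक, लेखक, वायुवेगक, स्फुलिङ्गक, एक, while `without_a` is एक, एकक, लेखक, वायुवेगक, स्फुलिङ्गक, which matches the cited order.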

(Speaking of this, I've discovered another bug in my code. The
anusvāra/visarga etc. don't get replaced when sorting in reverse. Will
fix in a few hours.)

> GRETIL's sorting:
>
> * अ-उ-म-कारसंयुक्तं KubjT_8.59c
> * अकरिष्यत् सुधीस् तदा BhStc_40d
> *
> * अकर्ता निर्गुणश्चाहं SvaT_12.49c
> * अकर्ता निर्गुणश्चाहं SvaT_12.75c
> * अकर्ता पुरुषः स्मृतः SvaT_12.76d
> * अकर्तृभावाद्भोक्तुश्च MrgT_1,2.15c
> * अकर्मपथवर्तिनाम् SvaT_10.58b
> * अकल्कवान्सत्त्ववान्यो SvaT_10.71a
> * अकल्को ज्ञानशीलता SvaT_10.64d
> * अकल्यश्च न कल्यते SvaT_11.310b
> * अकस्माच् छ्रीर् उपस्थिता KubjT_2.28b
> * अकस्माज् जायते स्थूलः KubjT_23.37c
> * अकस्माद्धूसरच्छविः SvaT_7.264b
> * अकस्माद्वै भवेत्कृशः SvaT_7.275d
>
> Becomes in your sorter:
>
> * अकरिष्यत् सुधीस् तदा BhStc_40d
> अकर्ता निर्गुणश्चाहं SvaT_12.49c
> अकर्ता निर्गुणश्चाहं SvaT_12.75c
> अकर्ता पुरुषः स्मृतः SvaT_12.76d
> अकर्तृभावाद्भोक्तुश्च MrgT_1,2.15c
> अकर्मपथवर्तिनाम् SvaT_10.58b
> अकल्कवान्सत्त्ववान्यो SvaT_10.71a
> अकल्को ज्ञानशीलता SvaT_10.64d
> अकल्यश्च न कल्यते SvaT_11.310b
> अकस्माच् छ्रीर् उपस्थिता KubjT_2.28b
> अकस्माज् जायते स्थूलः KubjT_23.37c
> अकस्माद्धूसरच्छविः SvaT_7.264b
> अकस्माद्वै भवेत्कृशः SvaT_7.275d
> अ-उ-म-कारसंयुक्तं KubjT_8.59c
>
> Which is interesting.

If you add - to the "ignore" text field, my sort is identical to GRETIL's.

Anubhav Chattoraj

Oct 26, 2013, 5:09:57 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 01:12 PM, dhaval patel wrote:
> It is advisable to give a user interface to decide how he wants to treat
> Om. The possibilities maybe before अ, after ओं, after औं.
>
> As for avagarhas - the same.
> Options can be to ignore them, treat ऽ as अ, treat ऽऽ as आ
>

Will implement.

> 3. I don't follow. The order नि - निःक्षिप्‌ - निःशिष्‌ - निःसह can be
> obtained simply by sorting in Unicode order, without making any
> special adjustments for the visarga. (Even after adjustments, the
> order is the same: नि - निःक्षिप - निश्शिष् - निस्सह.)
>
>
> Try with
> नि
> निःक्षिप्‌
> निःशब्द
> निःशेष
> निक
> निःक
> निखिल
>
> निःक should come at the place of निष्क

Is there a list of such transformations? I looked through a list of
visarga-sandhis: http://learnsanskrit.org/references/sandhi/visarga

But it mentions nothing about visarga before ka = ṣ, or even visarga
before ś, ṣ, s = ś, ṣ, s.

dhaval patel

Oct 26, 2013, 5:13:59 AM10/26/13
to sanskrit-p...@googlegroups.com
निःक should come at the place of निष्क

Is there a list of such transformations? I looked through a list of visarga-sandhis: http://learnsanskrit.org/references/sandhi/visarga

But it mentions nothing about visarga before ka = ṣ, or even visarga before ś, ṣ, s = ś, ṣ, s.



Please have a look at the siddhAntakaumudi visargasandhi prakaraNam.
For your reference I copy paste the relevant portion from my PHP code. 
This will help your endeavour.

// vA shari (8.3.36) - visarga-$shar = visarga-$shar. Dictionaries seem to take this as mandatory, even though it is optional.
// Note: "\u0938" is not a Unicode escape inside a PHP double-quoted string, so this presumably runs on JSON-escaped text in which the letters literally appear as \u0938\u094d and so on.
// Assumed context: $c[$i] is the current word and $shar holds the three sibilants in the same escaped form.
for ($k = 0; $k < 3; $k++) {
    // Turn s + sibilant back into visarga + sibilant.
    $c[$i] = str_replace("\u0938\u094d" . $shar[$k], "\u0903" . $shar[$k], $c[$i]);
}
// kharpare shari vA visargalopo vaktavyaH (8.3.36 vArtika) e.g. rAmaH sthAtA = rAma sthAtA. (used in sentences and not in dictionaries. so not coded)
// kupvoH HkHpau ca (8.3.37) - jihvAmUlIya and upadhmAnIya are not used in most of the dictionaries. so this is skipped.
// so'padAdau (8.3.38), pAzakalpakakAmyeSviti vAcyam - These two rules specify that a visarga is converted to 's' when followed by 'pAza','kalpa','ka','kAmyac'. This has already been taken care of because we have already converted it to 's'.
// ananvayasyeti vAcyam - the rule so'padAdau will not apply if the word is an avyaya. e.g. prAtaHkalpam. avyaya list has to be furnished.
// kAmye roreveti vAcyam - this would entail the analysis of vibhaktis to identify whether this is 'ru' or not. We have not done it.
// iNaH SaH (8.3.39) - in the rule 8.3.38, the 's' would be converted to 'S' if it is preceded by 'i'/'u'. Not done because we didn't do 8.3.38. And anyhow it would be converted by 'idudupadhasya cApratyayasya'.
// namaspurasorgatyoH (8.3.40) - namas / puras would have 's' instead of visarga, if followed by 'pAza','kalpa','ka','kAmyac' and it has gati saJjJA. This has already been done. gati would be decided by sentence, therefore we don't need it in the dictionary.
// idudupadhasya cApratyayasya (8.3.41) 'i'/'u'-visarga-'ka'/'pa' = 'i'/'u'-'Sha'-'ka'/'pa'. Here apratyayasya - the examples are seen only in sentences. so no need to bother about it right now.
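
The last rule in the comments above, together with the remark that निःक should sort where निष्क does, suggests a key-side rewrite along the following lines. This Python sketch is an assumption, not a port of the PHP code: the function name is hypothetical, only short i/u and the letters क and प named in the comment are handled, and no check for avyayas or pratyayas is attempted.

```python
VIRAMA = "\u094d"
VISARGA = "\u0903"
# Short i and u, as independent vowels and as dependent vowel signs.
I_U = {"इ", "उ", "\u093f", "\u0941"}
KA_PA = {"क", "प"}


def visarga_to_sha(word):
    """Rewrite i/u + visarga + ka/pa as i/u + ष् + ka/pa, for sorting only."""
    chars = list(word)
    for i in range(1, len(chars) - 1):
        if chars[i] == VISARGA and chars[i - 1] in I_U and chars[i + 1] in KA_PA:
            chars[i] = "ष" + VIRAMA
    return "".join(chars)
```

With this, `visarga_to_sha("निःक")` returns निष्क, while निःशेष and रामः are left unchanged. Whether a dictionary really wants this for every such word would have to be checked against the dictionary itself.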

 

dhaval patel

Oct 26, 2013, 5:25:35 AM10/26/13
to sanskrit-p...@googlegroups.com



2013/10/26 Mārcis Gasūns <gas...@gmail.com>

I must confess I love what I see.
I sorted:
  • हैमीकृ  8.Ā.
  • होमय्  10.P.
  • ह्रस्  1.Ā.
  • ह्रासय्  10.P.
  • ह्री  3.P.
  • ह्लादय्  10.Ā.
  • ह्लाद्  1.Ā.
  • ह्वा  4.Ā.
Descending, and got:

  • ह्वा  4.Ā.
  • ह्लादय्  10.Ā.
  • ह्लाद्  1.Ā.
  • ह्री  3.P.
  • ह्रासय्  10.P.
  • ह्रस्  1.Ā.
  • होमय्  10.P.
  • हैमीकृ  8.Ā.
 
What Marcis wants to point out is that there has to be some ready-made regex which can ignore the English data mixed into the Devanagari, because most Devanagari data on most sites is mixed like this.
e.g. here the regex can be: starting from a number and ending at the end of the line.
Here also, it is not possible to enumerate this  KubjT_8.59c in the ignore field.
There has to be a regex, which will be something like: from the first capital letter to the end of the line.
Then this sorter can be of more use for pAda indices.
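
The two regexes described above could look like this in Python. They are sketches of the idea only: the pattern names and helper functions are hypothetical, and the patterns assume the reference or root class is separated from the Devanagari text by whitespace and starts with an ASCII capital letter or an ASCII digit.

```python
import re

# From the first capital Latin letter (after whitespace) to the end of the line.
TRAILING_REFERENCE = re.compile(r"\s+[A-Z].*$")
# From the first digit (after whitespace) to the end of the line.
TRAILING_CLASS = re.compile(r"\s+[0-9].*$")


def strip_reference(line):
    """Drop a pada reference such as ' KubjT_8.59c' from the end of a line."""
    return TRAILING_REFERENCE.sub("", line)


def strip_class(line):
    """Drop a root class such as '  4.Ā.' from the end of a line."""
    return TRAILING_CLASS.sub("", line)
```

For instance, `strip_reference("अ-उ-म-कारसंयुक्तं KubjT_8.59c")` leaves अ-उ-म-कारसंयुक्तं, and `strip_class("ह्वा  4.Ā.")` leaves ह्वा. The stripped text would serve as the sort key only; the original line is what gets printed.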


Anubhav Chattoraj

Oct 26, 2013, 10:13:11 AM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 02:55 PM, dhaval patel wrote:
> What Marcis wants to point out is that there have to be some readymade
> regex which can ignore the English data in devanagari, because most of
> the devanagari data is mixed up in most of the sites.
> e.g. here the regex can be - starting from a number and ending in a line
> ending.
>
> *Here also, it is not possible to enumerate this KubjT_8.59c in ignore
> field.*
> *There has to be a regex, which will be something like from first
> Capital letter to the end of line. *
> *Then this sorter can be of more use for pAda indices.*

I see why that would be a problem when sorting in reverse. Thank you for
pointing it out.

However, this is a general-purpose tool for Devanagari sorting, not a
specific tool for sorting messy data collected from the internet. Such
an improvement is quite out of scope.

At best, I can add an option to ignore all non-Devanagari characters.

dhaval patel

Oct 26, 2013, 11:19:06 AM10/26/13
to sanskrit-p...@googlegroups.com
At best, I can add an option to ignore all non-Devanagari characters.


That would still be great.

Mārcis Gasūns

Oct 26, 2013, 2:07:29 PM10/26/13
to sanskrit-p...@googlegroups.com
GRETIL is the only academic source out there and you call it a mess.
General means not usable for any real life needs. All pada indexes
are similar and issues like these will arise. If you want just another
half-working sorter, then you will be #2 in my list, but people
use mixed data as well. See the attached root list. Only Dhaval's batch
PHP sorter managed the task. Yours will then be a small-list solution.
6427-IAST-MWmeanings.pdf

Anubhav Chattoraj

Oct 26, 2013, 10:54:24 PM10/26/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 11:37 PM, Mārcis Gasūns wrote:
> GRETIL is the only academic source out there and you call it a mess.
> General means not usable for any real life needs. All pada indexes
> are similar and issues like these will arise. If you want just another
> half working sorter - than you will be top #2 in my list, but people
> use mixed data as well. See atached root list. Only Dhaval's batch
> PHP sorter managed the task. Your's than will be a small list solution.

Your poor grasp of the nuances of English leads you to see put-downs
where there are none. Calling GRETIL's data "messy" (i.e. not idealized,
containing data not expected by the program) is not at all the same as
calling GRETIL "a mess" (i.e. ineptly run, a total failure).

The pada numbers have no effect on forward sorting, since they're at the
end of the line. For reverse sorting, they can be ignored by adding all
Roman letters, digits, and punctuation to the "Ignore these characters"
field. I recognize this is tedious to do, which is why I'm planning to
add an option to ignore all non-Devanagari text.

This should allow you to sort your padas properly in both forward and
reverse directions.

As for the root list, do you expect a sorting program to transliterate
its input? How is it supposed to know what part of the input corresponds
to Sanskrit text and what part of it is the English explanation? This
can be determined only by a program tailored to the particular format of
the input file.

If you pass the root list through a format-specific transliteration
program, such that only the Sanskrit words are in Devanagari, and feed
its output to the sorter, the forward sort will still work correctly,
and reverse sort can be made to work using the planned "ignore
non-Devanagari text" option.

Anubhav Chattoraj

Oct 28, 2013, 12:56:56 PM10/28/13
to sanskrit-p...@googlegroups.com
On 10/26/2013 08:49 PM, dhaval patel wrote:
>
> At best, I can add an option to ignore all non-Devanagari characters.
>
>
> That would still be great.
>

Done. The sorter now has a checkbox labelled "Ignore numbers and Roman
characters".

This should allow reverse sorting even when the line has a pada
identifier at the end.
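
The effect of such an option on reverse sorting can be sketched as follows. This is not the sorter's actual code: `devanagari_only` and `retrograde_key` are hypothetical names, reversed codepoints stand in for the real collation, and the filter keeps the whole Devanagari block (U+0900 to U+097F), so Devanagari digits and dandas would survive it.

```python
def devanagari_only(text):
    """Keep only characters from the Devanagari block."""
    return "".join(ch for ch in text if "\u0900" <= ch <= "\u097f")


def retrograde_key(line):
    """Placeholder retrograde key: the Devanagari part, read backwards."""
    return devanagari_only(line)[::-1]


lines = ["लेखक A_1", "एकक B_2"]

# Reading the raw lines backwards compares the identifiers first.
raw_order = sorted(lines, key=lambda s: s[::-1])

# With non-Devanagari characters ignored, the word endings decide.
filtered_order = sorted(lines, key=retrograde_key)
```

`raw_order` starts with the लेखक line because "1" sorts before "2", whereas `filtered_order` starts with the एकक line, since the identifiers no longer take part in the comparison.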

(Link for convenience:
http://anubhav-chattoraj.github.io/indic-tools/devanagari_sorter/ )

Mārcis Gasūns

Oct 28, 2013, 2:30:33 PM10/28/13
to sanskrit-p...@googlegroups.com


On Monday, 28 October 2013 20:56:56 UTC+4, Anubhav Chattoraj wrote:
Done. The sorter now has a checkbox labelled "Ignore numbers and Roman 
characters".
Thanks.
 

This should allow reverse sorting even when the line has a pada
identifier at the end.
Understood.
Is the Descending sort order by word beginning? If so, please say so directly there.
Should Retrograde + descending give me an order by word end?

  • ह्रस्वं दीर्घं प्लुतं चैव Stk_16.1c
  • ह्रस्वं दीर्घं प्लुतं परम् SvaT_5.69d
  • ह्रूंकारं तदनन्तरम् Stk_21.19d
  • ह्रस्वं दीर्घं प्लुतं सूक्ष्मम् SvaT_6.4c
  • ह्रस्वदीर्घप्लुतान्वितम् SvaT_3.20d
  • ह्रस्वा त्याज्या प्रयत्नेन KubjT_23.155c
  • ह्रस्वदीर्घविभागेन SvaT_4.181a
  • ह्रस्वदीर्घविभागेन SvaT_4.192c
  • ह्रस्वे नीले भयं विन्द्याद् KubjT_19.79c
  • ह्रस्वः पुत्रविनाशनः SvaT_1.25b
  • ह्रीं हूं स्व्लें स्वाहापतये KubjT_23.70a
  • ह्रास्यमाना पदे पदे KubjT_25.204d
  • ह्रौं ह्रूं च फण्णमश्चान्ते Stk_21.20a
  • ह्रस्वो दहति पापानि Stk_16.2a
  • ह्लादयन्तीव गात्राणि SvaT_10.562a

Anubhav Chattoraj

Oct 28, 2013, 10:29:53 PM10/28/13
to sanskrit-p...@googlegroups.com
> Sort order Descening is by word beginning? Is so, please write it
> directly there.

Descending order works for both sort-from-beginning and sort-from-end.

> Retrogade + descening should give me ordered by word end?

It should. Sorting your list in retrograde + descending gives

ह्रस्वं दीर्घं प्लुतं चैव Stk_16.1c
ह्रस्वं दीर्घं प्लुतं परम् SvaT_5.69d
ह्रूंकारं तदनन्तरम् Stk_21.19d
ह्रस्वं दीर्घं प्लुतं सूक्ष्मम् SvaT_6.4c
ह्रस्वदीर्घप्लुतान्वितम् SvaT_3.20d
ह्रस्वा त्याज्या प्रयत्नेन KubjT_23.155c
ह्रस्वदीर्घविभागेन SvaT_4.192c
ह्रस्वदीर्घविभागेन SvaT_4.181a
ह्रस्वे नीले भयं विन्द्याद् KubjT_19.79c
ह्रस्वः पुत्रविनाशनः SvaT_1.25b
ह्रीं हूं स्व्लें स्वाहापतये KubjT_23.70a
ह्रास्यमाना पदे पदे KubjT_25.204d
ह्रौं ह्रूं च फण्णमश्चान्ते Stk_21.20a
ह्रस्वो दहति पापानि Stk_16.2a
ह्लादयन्तीव गात्राणि SvaT_10.562a

Almost the correct order, except that it seems to be ignoring final-अs.
This happens in ascending order too. Looks like a bug in my
ignore-Romans code.

Will fix.

Anubhav Chattoraj

Oct 28, 2013, 11:18:18 PM10/28/13
to sanskrit-p...@googlegroups.com
Fixed, the list now sorts in the correct retrograde+descending order:

ह्रस्वं दीर्घं प्लुतं परम् SvaT_5.69d
ह्रूंकारं तदनन्तरम् Stk_21.19d
ह्रस्वं दीर्घं प्लुतं सूक्ष्मम् SvaT_6.4c
ह्रस्वदीर्घप्लुतान्वितम् SvaT_3.20d
ह्रस्वे नीले भयं विन्द्याद् KubjT_19.79c
ह्रस्वः पुत्रविनाशनः SvaT_1.25b
ह्रीं हूं स्व्लें स्वाहापतये KubjT_23.70a
ह्रास्यमाना पदे पदे KubjT_25.204d
ह्रौं ह्रूं च फण्णमश्चान्ते Stk_21.20a
ह्रस्वो दहति पापानि Stk_16.2a
ह्लादयन्तीव गात्राणि SvaT_10.562a
ह्रस्वं दीर्घं प्लुतं चैव Stk_16.1c

Mārcis Gasūns

Oct 29, 2013, 4:52:42 AM10/29/13
to sanskrit-p...@googlegroups.com


On Tuesday, 29 October 2013 07:18:18 UTC+4, Anubhav Chattoraj wrote:
Fixed, the list now sorts in the correct retrograde+descending order:
I'm not quite sure that it's fixed.
ह्रस्वं दीर्घं प्लुतं परम् SvaT_5.69d
ह्रस्वं दीर्घं प्लुतं सूक्ष्मम् SvaT_6.4c
makes more sense to me.

But a real retrograde (=reverse) list would look like:
  • हाटक
  • त्रिपिटक
  • कीटक
  • नर्कुटक
  • संपुट(क)
  • आखेटक
  • चेटक
  • कोशपेटक
  • कोटक
  • कर्कोटक
  • मोटक
  • पट्टक
  • कण्टक
  • अकण्टक
  • लोककण्टक
  • सकण्टक
  • निष्कण्टक
  • अष्टक
  • ऐष्टक
  • प्रविष्टक
  • पक्वेष्टक
  • पठक
  • काठक
  • प्रपाठक
  • शितिकण्ठक
  • लुण्ठक
  • पृष्ठक
  • नडक
  • ताडक
  • नीडक
  • गुडक
  • भयेडक
  • खण्डक
  • शिखण्डक
  • दण्डक
  • पण्डक
  • करण्डक
  • विभाण्डक
  • पिण्डक
  • मुण्डक
  • गणक
  • क्षपणक
  • धारणक
  • प्रेक्षणक
  • भाणक
  • हरिणक
  • गुणक
  • सकर्णक
  • तर्णक
  • वर्णक
  • प्रकीर्णक
  • युगतक
  • शतक
  • सप्तशतक
  • वैराग्यशतक
  • नितिशतक
  • अमरुशतक
  • हतक
  • दैवहतक
  • घातक
  • ब्रह्मघातक
  • स्त्रीघातक
  • चातक
  • पिष्टातक
  • स्नातक
  • महापातक
  • नाटित(क)
  • हितक
  • संगीतक
  • असितपीतक
  • क्रीतक
  • दूतक
  • मूतक
  • सूतक
  • कृतक
  • भृतक
And not as:
  • भृतक
    कृतक
    सूतक
    मूतक
    दूतक
    क्रीतक
    असितपीतक
    संगीतक
    हितक
    महापातक
    स्नातक
    पिष्टातक
    चातक
    स्त्रीघातक
    ब्रह्मघातक
    घातक
    दैवहतक
    हतक
    अमरुशतक
    नितिशतक
    वैराग्यशतक
    सप्तशतक
    शतक
    युगतक
    प्रकीर्णक
    वर्णक
    तर्णक
    सकर्णक
    गुणक
    हरिणक
    भाणक
    प्रेक्षणक
    धारणक
    क्षपणक
    गणक
    मुण्डक
    पिण्डक
    विभाण्डक
    करण्डक
    पण्डक
    दण्डक
    शिखण्डक
    खण्डक
    भयेडक
    गुडक
    नीडक
    ताडक
    नडक
    पृष्ठक
    लुण्ठक
    शितिकण्ठक
    प्रपाठक
    काठक
    पठक
    ऐष्टक
    पक्वेष्टक
    प्रविष्टक
    अष्टक
    निष्कण्टक
    सकण्टक
    लोककण्टक
    अकण्टक
    कण्टक
    पट्टक
    मोटक
    कर्कोटक
    कोटक
    कोशपेटक
    चेटक
    आखेटक
    नर्कुटक
    कीटक
    त्रिपिटक
    हाटक
    नाटित(क)
    संपुट(क)
And I understand that you will say that the entries with (क) are at the end because I have not entered the parentheses in the "Ignore these characters" input box. But some preset options would not hurt. () would be one of them. The hyphen "-" one more. Otherwise very interesting work done on your end. I'm very impressed. And I give critique not because I do not like it, but because I love it.

Anubhav Chattoraj

Oct 29, 2013, 9:15:31 AM10/29/13
to sanskrit-p...@googlegroups.com
On 10/29/2013 02:22 PM, Mārcis Gasūns wrote:
> But some preset options would not hurt. () would be one of them. The
> hyphen "-" one more.

That's exactly what the new "Ignore numbers and non-Devanagari
characters" checkbox is for.

Once I check that and choose sort direction → right-to-left, my result
is identical to yours.
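
Roughly, an option like this can be pictured as a filter that runs before any comparison. The Python sketch below is only an illustration of the idea, not the sorter's actual code; `strip_ignored` is a made-up name, and the choice to keep the range U+0900 to U+0963 (Devanagari letters and signs, without dandas and digits) plus spaces is my assumption.

```python
def strip_ignored(text):
    """Keep Devanagari letters and signs (U+0900..U+0963) and spaces; drop the rest."""
    kept = (ch for ch in text if ch == " " or 0x900 <= ord(ch) <= 0x963)
    return "".join(kept).strip()

print(strip_ignored("संपुट(क)"))                       # संपुटक
print(strip_ignored("ह्रस्वो दहति पापानि Stk_16.2a"))  # ह्रस्वो दहति पापानि
```

With the parentheses gone, संपुट(क) is compared as संपुटक and can take its place among the other words in -टक.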

> ह्रस्वः पुत्रविनाशनः SvaT_1.25b
> ह्रस्वं दीर्घं प्लुतं चैव Stk_16.1c
> ह्रस्वो दहति पापानि Stk_16.2a
> ह्लादयन्तीव गात्राणि SvaT_10.562a
> ह्रीं हूं स्व्लें स्वाहापतये KubjT_23.70a
> ह्रास्यमाना पदे पदे KubjT_25.204d
> ह्रौं ह्रूं च फण्णमश्चान्ते Stk_21.20a
> ह्रस्वे नीले भयं विन्द्याद् KubjT_19.79c
> ह्रूंकारं तदनन्तरम् Stk_21.19d
> ह्रस्वदीर्घप्लुतान्वितम् SvaT_3.20d
> ह्रस्वं दीर्घं प्लुतं परम् SvaT_5.69d
> ह्रस्वं दीर्घं प्लुतं सूक्ष्मम् SvaT_6.4c
> This makes more sense to me.

I can't figure out the logic behind that, to be honest.

It seems to be in ascending order, as the final इs are placed before the
final एs, but then within the इs न is placed before ण, and within the एs
द is placed before त. And why is the visarga placed at the very beginning?

> And I give critique not because I do not like it. But because I love it.

I understand that, and appreciate your critique. No hard feelings.

Mārcis Gasūns

Oct 30, 2013, 5:03:07 AM10/30/13
to sanskrit-p...@googlegroups.com
Is it sorting https://docs.google.com/document/d/1t5tWom5GcZIA4TY0U_h84MRdGl4gghQugyti7vacoJA/edit#heading=h.lovravyrs5ca letter by letter or word by word if there are several words in a line?

Anubhav Chattoraj

Oct 30, 2013, 11:06:29 AM10/30/13
to sanskrit-p...@googlegroups.com

Correct me if I'm wrong, but I'm guessing that by "word by word", you mean

क ख < क घ < कक < कग

and by "letter by letter", you mean

कक < क ख < कग < क घ

i.e. spaces are ignored when sorting "letter-by-letter".

If so, it sorts word-by-word by default. You can sort letter-by-letter
by adding a space to the ignore field.
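
To make the two modes concrete, here is a small Python sketch. It uses plain code point comparison as a stand-in for a real Devanagari collation key, which happens to be good enough for these four consonant-only strings but is an assumption of the example, not how the sorter works.

```python
lines = ["कग", "क घ", "कक", "क ख"]

# Word-by-word: compare the lines word for word, so the shorter first word wins.
word_by_word = sorted(lines, key=lambda s: s.split())

# Letter-by-letter: drop the spaces before comparing.
letter_by_letter = sorted(lines, key=lambda s: s.replace(" ", ""))

print(word_by_word)      # ['क ख', 'क घ', 'कक', 'कग']
print(letter_by_letter)  # ['कक', 'क ख', 'कग', 'क घ']
```

Adding a space to the ignore field corresponds to the second key: the space is removed before the comparison, so it can no longer separate words.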

Mārcis Gasūns

Nov 5, 2013, 3:15:35 AM11/5/13
to sanskrit-p...@googlegroups.com


On Wednesday, 30 October 2013 19:06:29 UTC+4, Anubhav Chattoraj wrote:
> Correct me if I'm wrong, but I'm guessing that by "word by word", you mean
>
> क ख < क घ < कक < कग
>
> and by "letter by letter", you mean
>
> कक < क ख < कग < क घ
>
> i.e. spaces are ignored when sorting "letter-by-letter".

Right. In a letter-by-letter sort, the spaces between words are ignored, so "Offenbach" would come before "off switch."

> If so, it sorts word-by-word by default. You can sort letter-by-letter
> by adding a space to the ignore field.

That's a smart enough solution, accepted.
Eagerly waiting for the VBA version of your Devanagari sorter.

Mārcis Gasūns

Nov 11, 2013, 7:44:33 PM11/11/13
to sanskrit-p...@googlegroups.com
Anubhav, may I have your opinion on one more reverse-sorting issue?

Reverse Issues PWG


1) In reverse sorting, words ending in aṃ, āḥ, oḥ are placed above those ending in "a" (अ), which looks fishy. It does not look correct to me. See https://www.dropbox.com/s/5db5z7u3bi2d52r/devanagarisorted.txt:

  • ऊंं

  • सयेफखांं

  • सारिस्थाखांं

  • भारिकं

  • पारिभाषिकं

  • महैरण्डं

  • सुतृष्णं

  • आनिधनं

  • अभिनिधनं

  • व्युत्क्रमं

  • भगनरायं

  • अधिहस्त्यं

  • यौथ्यं

  • प्रवरं

  • वृत्तिकारं

  • महद्बिलं

  • स्थूलनासं

  • किसं

  • ह्रस्वगवेधुकां

  • उच्चां

  • पुत्रीकरणमीमांसां

  • षडिडः

  • पृत्सुधः

  • व्यथ्ययः

  • शैलवालुकाः

  • गोपद्रुमलताः

  • अनूराधाः

  • आञ्जनाभ्यञ्जनाः

  • प्रहितोः

Anubhav Chattoraj

Nov 11, 2013, 11:31:42 PM11/11/13
to sanskrit-p...@googlegroups.com
Doesn't look right to me either. ं and ः should sort between औ and क IMO.

My program sorts your list with भारिकं … प्रहितोः between होहौ and अक्.

I've seen Hindi sorting orders that place ं before अ, but this seems to have started out as an implementation mistake which some government bodies have since legitimized.
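
As an illustration of the difference, the two orders can be written down as rank tables. This is a Python sketch only: the vowel and consonant lists are taken straight from Unicode block order, the names are made up, and where visarga goes in the "marks first" variant is my assumption.

```python
vowels = [chr(cp) for cp in range(0x905, 0x915)]      # अ ... औ (Unicode block order)
marks = ["\u0902", "\u0903"]                          # anusvara, visarga
consonants = [chr(cp) for cp in range(0x915, 0x93A)]  # क ... ह

def make_rank(order):
    """Map each symbol to its position in the given alphabet order."""
    return {symbol: position for position, symbol in enumerate(order)}

sanskrit_rank = make_rank(vowels + marks + consonants)     # औ < anusvara < visarga < क
marks_first_rank = make_rank(marks + vowels + consonants)  # anusvara < अ (visarga assumed)

symbols = ["क", "\u0902", "अ", "\u0903", "औ"]
print(sorted(symbols, key=sanskrit_rank.get))     # अ, औ, anusvara, visarga, क
print(sorted(symbols, key=marks_first_rank.get))  # anusvara, visarga, अ, औ, क
```

In a retrograde sort the final sound is compared first, so the first table puts words ending in ं or ः after every vowel-final word and before every consonant-final one.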

Mārcis Gasūns

Nov 12, 2013, 2:55:11 AM11/12/13
to sanskrit-p...@googlegroups.com
Namaste,

  Thanks for the quick comment.


On Tuesday, 12 November 2013 08:31:42 UTC+4, Anubhav Chattoraj wrote:
> Doesn't look right to me either. ं and ः should sort between औ and क IMO.

Right, that was exactly what I was thinking.

> My program sorts your list with भारिकं … प्रहितोः between होहौ and अक्.

So it's smart enough. Ours is not yet.

> I've seen Hindi sorting orders that place ं before अ, but this seems to have started out as an implementation mistake which some government bodies have since legitimized.

If you ever see one and scan a page, I'll be thankful. Never thought it could exist on paper. Devanagari sorting is done so wrongly because MS Office failed to get it right in 2003 and had still not fixed it in 2013.
As per https://github.com/anubhav-chattoraj/indic-tools/tree/gh-pages I see the last change was made 12 days ago, and still no VBA sorter. Oh, Anubhav, hurry up :) The Sanskrit community is looking at you, following your every step.

Mārcis Gasūns

Nov 12, 2013, 7:57:33 AM11/12/13
to sanskrit-p...@googlegroups.com
http://www.unicode.org/faq/indic.html: Unicode doesn't use nukta for the "om" character (e.g. chandrabindu + nukta in ISCII, which is encoded as a separate character in Unicode).

In Hindi, chandrabindu is replaced in writing by anusvara when it would sit above a consonant carrying a vowel sign that extends above the top line.

Let's have before anunasika in reverse order. Right? Wrong?

Anubhav Chattoraj

Nov 12, 2013, 8:57:21 AM11/12/13
to sanskrit-p...@googlegroups.com

> If you ever see one and scan a page, I’ll be thankful. Never thought it could exist on paper.

Not on paper, but see here (scroll down to page 57). TDIL, run by the Government of India, presents ँ < ं < अ as an “order pertinent to sorting by a computer program”.

Also, in Prabhat Prakashan’s “Brihat Hindi Shabdakosh”, the author complains about improper sorting in other dictionaries. Among other things, he accuses some (unnamed) dictionaries of using an अंकारादिक्रम.

> I see the last change was made 12 days ago, and still no VBA sorter. Oh, Anubhav, hurry up :) The Sanskrit community is looking at you, following your every step.

The Sanskrit community is gently reminded that I don’t do this full-time, informed that I have more pressing work to take care of at the moment, and strongly urged to be patient.

When I have any progress to report, I’ll make sure to report it here.

Mārcis Gasūns

Jan 14, 2014, 10:13:46 AM1/14/14
to sanskrit-p...@googlegroups.com
Namaste,

Some time has passed, and I wanted to ask whether there has been any luck with the VBA sorter tool. A gentle reminder from the community to the gentle coder :)

M.G.

Åke Persson

Mar 28, 2014, 5:44:17 AM3/28/14
to sanskrit-p...@googlegroups.com
 
Mimer's method is mentioned on http://samskrtam.ru/sanskrit-sorting-devanagari/.
 
Here's some additional information...
 
MimerSQL is a DBMS that includes a complete implementation of the Unicode Collation Algorithm (UCA), with extended collation attributes, http://developer.mimer.com/collations/doc/Linguistic_Sorting_and_Searching_in_Mimer_SQL.pdf.
The collation rules for different languages can be found at http://developer.mimer.com/charts/index.tml.
A Windows command-line version of the sort engine can be downloaded from http://developer.mimer.se/unidrv/unidrv_win.zip (used by others to sort Tibetan, Hindi, Tamil, Kurdish, etc.).
 
Monier-Williams.txt (158744 entries) is available on request.
 

Mārcis Gasūns

Mar 30, 2014, 5:44:00 AM3/30/14
to sanskrit-p...@googlegroups.com


On Friday, 28 March 2014 13:44:17 UTC+4, Åke Persson wrote:
> Mimer's method is mentioned on http://samskrtam.ru/sanskrit-sorting-devanagari/.

It's my post, so of course I know it.

> Here's some additional information...

It does not help much; it does not sort a Sanskrit dictionary correctly.

Åke Persson

Mar 30, 2014, 7:26:18 AM3/30/14
to sanskrit-p...@googlegroups.com
Oh, that's surprising! An example of that inability would be greatly appreciated.
 

Mārcis Gasūns

Oct 1, 2014, 2:53:06 AM10/1/14
to sanskrit-p...@googlegroups.com, anubhav....@gmail.com


On Sunday, 20 October 2013 18:32:05 UTC+4, Anubhav Chattoraj wrote:
> I plan to build a VBA version (of the generalized algorithm) after the
> Javascript version has been validated by the users of technical-hindi. I
> will make sure to post it to this group when it's done.

Any chance of seeing it move to VBA?
And could you please add an option to upload files with a Browse button, not only copy-paste? And maybe even a download button for results, for larger lists.

Anubhav Chattoraj

Oct 1, 2014, 11:44:02 AM10/1/14
to sanskrit-p...@googlegroups.com

First of all, let me say that I am really sorry for the lack of updates.

I see that it has been eleven whole months since I last worked on this thing. That isn’t because I don’t want to work on it, but because I’ve honestly been too busy to.

What keeps a man busy day and night, weekdays and weekends, for eleven months? Being an actuary, which means working a full-time job and studying for exams on one’s off-time. Very financially rewarding, but doesn’t leave one with any free time.

Unfortunately, I don’t see the situation easing up any time soon. I expect to remain extremely busy at least until the next round of exams in May 2015. But that’s a very optimistic date… The best I can say is that I’ll probably be free enough to work on this after the round of exams after that one, in November 2015.

Of course, I understand that people really don’t want to wait another 13 months for updates, but I really can’t help it. All I can say is that pull requests are welcome; if anyone else wants to work on this, I’ll gladly clean up their code and merge it with mine. (That would take a long time… probably weeks for every pull request. But that’s still much shorter than 13 months.)

The silver lining (if you can call it that) is that once I get back to working on this, it won’t take all that long. All the changes to the Javascript app are just two or three weekends’ worth of work. (The existing app was built over a single weekend.) Porting to VBA would take another 2-3 months.

TL;DR: This is on the back burner for probably the next 13 months (*@#!!!), but it’ll only take a few months after that to add all the features everyone wants.


Mārcis Gasūns

Apr 27, 2015, 3:57:27 PM4/27/15
to sanskrit-p...@googlegroups.com
After I checked "Ignore numbers and non-Devanagari characters", the list below is what I got: longer dhatus sorted at the end, nothing to do with Devanagari :)

On Friday, 25 October 2013 20:05:49 UTC+3, Anubhav Chattoraj wrote:
On 10/20/2013 11:48 PM, Mārcis Gasūns wrote:
> This is interesting. What do you mean by general? Add Hindi and Marathi
> or what else?

Yes, Hindi and Marathi, and other languages that are usually written in
Devanagari.

(I'd like to add support for languages that are only occasionally
written in Devanagari, such as Sindhi, but they tend not to have fixed
collation orders.)

Anyway, the Javascript version is done:

http://anubhav-chattoraj.github.io/indic-tools/devanagari_sorter/

For Sanskrit, the only options you'll need to change are:
"Treat anusvāra before vārgika akṣara as" : "pancamākṣara"
"Treat visarga before sibilant as" : "sibilant"

Do try it out, and let me know if you find any bugs.
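
One plausible reading of those two options, sketched in Python: before sorting, an anusvāra in front of a consonant of one of the five vargas is rewritten as that varga's nasal, and a visarga in front of a sibilant is rewritten as that sibilant. This is a guess at the behaviour, not the sorter's code; `normalise` and the tables are names invented for the example.

```python
ANUSVARA, VISARGA, VIRAMA = "\u0902", "\u0903", "\u094D"

# Class nasal (pancamākṣara) for each of the five vargas.
VARGAS = {"कखगघङ": "ङ", "चछजझञ": "ञ", "टठडढण": "ण", "तथदधन": "न", "पफबभम": "म"}
NASAL_FOR = {c: nasal for letters, nasal in VARGAS.items() for c in letters}
SIBILANTS = "शषस"

def normalise(word):
    """Rewrite anusvara/visarga according to the consonant that follows."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == ANUSVARA and nxt in NASAL_FOR:
            out.append(NASAL_FOR[nxt] + VIRAMA)   # anusvara + ग -> ङ् + ग
        elif ch == VISARGA and nxt and nxt in SIBILANTS:
            out.append(nxt + VIRAMA)              # visarga + श -> श् + श
        else:
            out.append(ch)                        # everything else is left alone
    return "".join(out)

print(normalise("संगीतक"))
print(normalise("संपुट"))
```

Under this reading, a spelling with anusvāra and the same word spelled with the class nasal get the same sort key, which is what a Sanskrit dictionary order expects.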


muh — 76
abhī — 230
apekṣ — 172
arc — 42
ṣo — 134
ṣic — 240
ṣad — 209
arṣ — 138
ṇam — 35
asūy — 225
īç — 68
çvas — 231
çumbh — 136
aç — 140
añj — 79
çram — 98
çoṣay — 165
bhaj — 168
bhakṣ — 50
çliṣ — 96
çiṣ — 137
bhid — 123
çaṃs — 237
bhuj — 137
bhāvay — 71
bhāṣ — 211
çam — 98
çak — 71
bhūṣ — 34
çabd — 52
çabdāy — 52
yaj — 117
bādh — 120
vī — 132
cal — 196
vyādiç — 187
ceṣṭ — 70
vrīḍ — 188
chid — 79
chiṣ — 86
vraçc — 124
vep — 210
cit — 121
vañc — 103
coday — 94
varṣ — 96
vadh — 80
dah — 161
dam — 193
upeta — 151
dham — 223
upapre — 146
tras — 110
tiro — 146
tij — 77
tarp — 238
duḥkhay — 18
tark — 74
tard — 123
tapasya — 201
dyut — 147
takṣ — 123
sṛ — 18
ej — 138
sū — 118
gad — 104
sāntvay — 202
gardh — 137
sādh — 196
syand — 123
gharṣ — 97
ghuṣ — 239
ghātaya — 78
glai — 201
svid — 135
gras — 102
guh — 222
svad — 153
su — 184
star — 239
sru — 98
hary — 131
snā — 185
sarp — 151
has — 63
hu — 180
hvar — 141
samutthā — 181
sac — 116
indh — 128
ruj — 125
riṣ — 129
jan — 55
jar — 78
ric — 114
rac — 38
pū — 135
pīḍ — 95
jāgar — 135
jāgṛ — 45
kalpay — 115
pār — 203
pyai — 142
prī — 184
pretya — 90
karṣay — 85
prer — 152
pratīṣ — 199
pratīkṣ — 204
klid — 79
kliç — 89
pratyujjīv — 60
krīḍ — 59
kās — 51
pracch — 233
kḷp — 77
paṭh — 187
parī — 134
kṣubh — 99
palāy — 48
lag — 60
lakṣay — 209
niveday — 238
nind — 164
lap — 160
nand — 112
likh — 192
loc — 48
lokay — 107
nad — 92
lup — 48
mānay — 156
mad — 231
mahīya — 145
majj — 237
ac — 137
muc — 143
miṣ — 147
mil — 58
marj — 227
maṇḍ — 40
math — 95
çudh — 70, 181
bandh — 81, 168
atī — 138, 238
manth — 91, 95
lī — 83, 158
bhañj — 56, 165
lok — 107, 229
laṣ — 199, 207
naç — 34, 191
lamb — 54, 66
çuṣ — 79, 165
arh — 95, 199
çaṅs — 90, 91
bhraṅç — 46, 200
cyu — 96, 199
kṣip — 91, 96
paṭ — 187, 208
piṣ — 97, 125
kāṅkṣ — 89, 200
aṇ — 118, 144
ṣidh — 61, 205
brū — 122, 194
karç — 96, 231
preṣ — 142, 190
kart — 102, 104
kam — 118, 161
pāl — 164, 183
kalp — 77, 115
çuc — 79, 226
upās — 139, 234
rabh — 119, 202
upe — 151, 194
rakṣ — 122, 161
cint — 176, 202
arth — 156, 187
ruc — 196, 224
ukṣ — 114, 230
tvar — 45, 187
vāñch — 148, 214
ūh — 71, 141
bhar — 115, 125
tuṣ — 58, 160
hvā — 119, 124
saparya — 18, 132
dhāv — 47, 217
īr — 152, 216
sev — 92, 168
smar — 88, 148
snih — 86, 154
harṣ — 65, 104
sparç — 109, 118
çās — 143, 177
gāh — 221, 222
gup — 50, 139
vyath — 77, 218
gaṇ — 45, 233
ve — 124, 210
chad — 50, 65
e — 136, 138
dīp — 92, 163
març — 155, 187
thā — 69, 111
dviṣ — 132, 136
chri — 94, 226
duḥkh — 18, 186
vraj — 208, 217
ji — 82, 127, 145
tan — 77, 114, 121
ṣvaj — 133, 166, 228
sarj — 84, 87, 126
sah — 95, 205, 224
vā — 118, 148, 214
rudh — 108, 126, 224
gar — 122, 135, 137
juṣ — 118, 207, 218
arj — 53, 59, 170
avekṣ — 80, 208, 218
bhā — 36, 71, 211
lakṣ — 41, 203, 209
mantray — 93, 158, 237
cud — 94, 188, 190
labh — 58, 221, 224, 227
çī — 45, 53, 152, 161
sidh — 61, 122, 126, 205
ruh — 32, 104, 113, 174
mud — 111, 167, 184, 200
rah — 33, 57, 87, 160
nī — 18, 197, 200, 240
darç — 108, 136, 139, 181
cakṣ — 87, 140, 143, 215
mar — 78, 155, 187, 227
du — 18, 186, 207, 209
bhī — 47, 63, 127, 129
vad — 80, 143, 146, 186
dru — 101, 146, 237, 238
yā — 78, 120, 187, 229
jval — 91, 97, 185, 223
rud — 108, 126, 224, 226, 231
mantr — 58, 93, 128, 158, 237
dhar — 53, 94, 212, 232, 237
diç — 89, 111, 187, 198, 202
tar — 45, 74, 123, 125, 238
tap — 85, 163, 191, 201, 235
sā — 127, 134, 196, 202, 219
vyadh — 92, 139, 171, 209, 231
grah — 101, 185, 190, 204, 221
vac — 62, 107, 123, 187, 235
budh — 110, 176, 215, 221, 232
yam — 48, 190, 215, 217, 225
pā — 130, 164, 183, 203, 216
karṣ — 85, 153, 162, 212, 221
khyā — 67, 102, 121, 190, 199
pat — 51, 96, 103, 133, 196
mā — 78, 92, 156, 159, 180
ram — 139, 142, 167, 179, 203, 207
pre — 90, 138, 142, 143, 152, 190
viç — 64, 117, 119, 126, 167, 189
vas — 49, 86, 137, 185, 215, 221
kram — 110, 114, 164, 179, 194, 211
vardh — 47, 82, 83, 184, 191, 234
par — 53, 54, 108, 134, 180, 208
kṣi — 91, 96, 118, 127, 194, 211
hā — 57, 71, 87, 205, 215, 236
iṣ — 50, 86, 142, 186, 190, 199
gā — 38, 122, 139, 180, 221, 222
sad — 65, 88, 99, 183, 209, 218, 230
vid — 43, 78, 124, 140, 186, 198, 238
āp — 34, 48, 49, 93, 160, 178, 181
çri — 94, 101, 104, 110, 123, 161, 176
ṣṭhā — 51, 73, 121, 125, 148, 174, 225
ās — 44, 68, 128, 189, 198, 234, 239
han — 78, 126, 127, 134, 148, 213, 223
dā — 33, 111, 124, 136, 137, 187, 205
īkṣ — 61, 80, 145, 172, 204, 208, 218, 221
pad — 55, 83, 104, 106, 108, 169, 214, 215
as — 60, 71, 125, 139, 143, 200, 210, 225
sar — 18, 84, 87, 98, 126, 151, 221, 231
gam — 43, 57, 74, 138, 158, 179, 185, 188
vart — 89, 116, 119, 152, 169, 194, 200, 218
ci — 67, 92, 121, 145, 176, 202, 224, 229, 230
car — 46, 49, 92, 107, 155, 176, 188, 213, 229
jñā — 78, 89, 128, 139, 143, 183, 188, 206, 234
bhū — 34, 53, 63, 69, 71, 82, 127, 130, 145, 161
yuj — 59, 60, 64, 85, 88, 90, 132, 143, 170, 211
ar — 42, 53, 59, 95, 122, 138, 156, 170, 187, 199
ru — 32, 104, 108, 113, 125, 126, 174, 196, 223, 224, 226, 231
yu — 59, 60, 64, 85, 88, 90, 128, 132, 141, 143, 170, 211
har — 65, 68, 79, 88, 104, 106, 131, 182, 191, 213, 222, 237
man — 58, 91, 93, 95, 128, 144, 156, 158, 161, 175, 185, 237, 240
sthā — 51, 56, 65, 69, 73, 99, 111, 118, 120, 121, 125, 148, 174, 180, 181, 186, 191, 207, 225, 228, 239
kar — 45, 48, 71, 79, 85, 96, 102, 104, 127, 142, 153, 155, 162, 181, 190, 195, 211, 212, 213, 216, 217, 221, 226, 231
var — 47, 82, 83, 89, 93, 96, 113, 116, 119, 120, 126, 138, 140, 152, 169, 172, 184, 185, 188, 191, 194, 200, 206, 218, 231, 234
dhā — 45, 47, 50, 54, 85, 87, 89, 90, 99, 115, 117, 122, 126, 135, 139, 140, 141, 146, 161, 162, 179, 183, 195, 217, 219, 228, 232
i — 50, 55, 60, 85, 86, 90, 94, 105, 122, 128, 132, 134, 136, 138, 142, 143, 146, 151, 156, 162, 172, 181, 186, 190, 192, 194, 199, 230, 238 

विपुल पाण्डे

Apr 27, 2015, 4:15:25 PM4/27/15
to sanskrit-p...@googlegroups.com
I tried 

सरस्वती नमस्तुभ्यं वरदे काम रुपिणी।
विश्व रुपं विशालाक्षि,विध्यां देहि सरस्वतीं

It gave me the following two words in this order:
विशालाक्षि
विश्व रूपं

Shouldn’t श् come before श, or is it because श is seen as श् + अ and hence it comes before श् + व?
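
The second reading can be checked with a small Python sketch that splits each word into sounds, treating a bare consonant letter as consonant + अ. This is only an illustration under that assumption, not the sorter's code: `phonemes`, `rank` and `sort_key` are made-up names, and anusvāra, visarga and the ordering of ॠ are left out.

```python
VIRAMA = "\u094D"

def phonemes(word):
    """Split a Devanagari word into sounds; bare consonants are written with a virama."""
    out = []
    for ch in word:
        cp = ord(ch)
        if 0x915 <= cp <= 0x939:            # consonant letter = consonant + inherent अ
            out += [ch + VIRAMA, "अ"]
        elif 0x93E <= cp <= 0x94C and out:  # vowel sign overrides the inherent अ
            # signs sit 0x38 above their independent vowels, except U+0944 (-> U+0960)
            out[-1] = chr(0x960 if cp == 0x944 else cp - 0x38)
        elif ch == VIRAMA and out:          # virama removes the inherent अ
            out.pop()
        elif 0x905 <= cp <= 0x914:          # independent vowel
            out.append(ch)
    return out

def rank(unit):
    # vowels (one code point) sort before consonants (letter + virama)
    return (len(unit), unit)

def sort_key(word):
    return [rank(unit) for unit in phonemes(word)]

print(phonemes("विशालाक्षि"))  # ['व्', 'इ', 'श्', 'आ', 'ल्', 'आ', 'क्', 'ष्', 'इ']
print(phonemes("विश्व"))       # ['व्', 'इ', 'श्', 'व्', 'अ']
print(sort_key("विशालाक्षि") < sort_key("विश्व"))  # True
```

Both words share व् इ श्; the fourth sound is आ in one and व् in the other, and since vowels sort before consonants, विशालाक्षि comes first.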

