Reverse Sanskrit Dictionaries

Mārcis Gasūns

Nov 18, 2013, 12:49:38 PM

to bvpar...@googlegroups.com

Namaste,

I'm aware of two reverse dictionaries from Germany and of a two-year-old discussion at https://groups.google.com/forum/#!msg/bvparishat/OtynEDhVtRQ/qTT7jQzG7VMJ (from that discussion I understood that I should find some नानार्थs and give references to at least मेदिनीकोश in my planned reverse dictionary, which is scheduled for 2014 and will hold 255k words in total from 13 dictionaries). But is the "Sanskrit-Tibetan dictionary: being the reverse of the 19 volumes of the Tibetan-Sanskrit dictionary" (http://books.google.ru/books/about/Sanskrit_Tibetan_dictionary.html?id=YWoaAQAAIAAJ&redir_esc=y) a reverse dictionary as well? Has anyone seen it? If so, could someone scan a few pages for me, please?

Thanks,

M.G.

Sample stats from my dictionary, words ending in:

Letter  Count  Share
s       32260  12.6%
a       28957  11.3%
p       28229  11.0%
v       23912   9.3%
k       18579   7.3%
m       13848   5.4%
d       13227   5.2%
ś       12613   4.9%
b       10918   4.3%
n       10537   4.1%
u        7488   2.9%
t        7252   2.8%
g        7165   2.8%
r        6553   2.6%
ā        6473   2.5%
c        5420   2.1%
j        4674   1.8%
h        4544   1.8%
y        3928   1.5%
l        3429   1.3%
i        1355   0.5%
e        1245   0.5%

dictionary-word-legth-stats.gif

Nityanand Misra

Nov 18, 2013, 7:27:22 PM

to bvpar...@googlegroups.com

On Tuesday, November 19, 2013 1:49:38 AM UTC+8, Mārcis Gasūns wrote:

Namaste,

I'm aware of 2 reverse dictionaries from Germany and a 2 years old discussion at https://groups.google.com/forum/#!msg/bvparishat/OtynEDhVtRQ/qTT7jQzG7VMJ (I understood from that discussion I should find some नानार्थs and give reference to at least मेदिनीकोश in my to-be reverse dictionary, which is planned in 2014, 255k words total from 13 dictionaries).

Are you planning a Samskrita -> Samskrita reverse dictionary? Or do you mean an English/Russian -> Sanskrit dictionary?

If you mean sorting the words by how they end (or by the last member of a compound), then rather than publishing something anew, an interface to existing dictionaries could be provided. For example, if I want to know the words ending in "sama" in Samskrita, I go to http://www.sanskrit-lexicon.uni-koeln.de/mwquery/, type "sama" in the Samskrita word field, choose "suffix" from the dropdown and search.

What I miss in this Advanced Search feature is the option of searching with grep-style "regular expressions". I can currently search by prefix, suffix or substring, but with a regular expression I could search for any sort of word pattern.
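To illustrate the difference, here is a minimal sketch of regular-expression search over a local word list. The `WORDS` sample and the `search` helper are hypothetical and are not part of the Cologne interface; a real list would be loaded from a local copy of a dictionary.

```python
import re

# Hypothetical sample of headwords in IAST transliteration.
WORDS = ["sama", "asama", "viṣama", "samatā", "aṅgam", "nisama", "agāram"]

def search(pattern, words):
    """Return the words in which the regular expression matches somewhere."""
    rx = re.compile(pattern)
    return [w for w in words if rx.search(w)]

# The suffix and prefix searches that the web form already offers:
suffix_hits = search(r"sama$", WORDS)
prefix_hits = search(r"^sama", WORDS)

# A pattern that prefix/suffix/substring search cannot express in one query:
# words that start with "a" and end in "am".
mixed_hits = search(r"^a.*am$", WORDS)

print(suffix_hits)
print(prefix_hits)
print(mixed_hits)
```

Note that "viṣama" is not a suffix hit here, because "ṣ" and "s" are different characters in the transliteration.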

But is http://books.google.ru/books/about/Sanskrit_Tibetan_dictionary.html?id=YWoaAQAAIAAJ&redir_esc=y Sanskrit-Tibetan dictionary: being the reverse of the 19 volumes of the Tibetan-Sanskrit dictionary a reverse dictionary as well? Who has seen it? If seen, who can scan a few pages for me, please?

Now, is this a simple Sanskrit-Tibetan dictionary or a non-conventional "reverse dictionary"?

A very good English-English reverse dictionary is the Reader's Digest Reverse Dictionary (http://www.readersdigestdirect.co.in/reverse-dictionary), which gives words for meanings and much more information. For example, if you look up "stair", it gives you a diagram showing what the "tread", "rise", "baluster", et cetera are. In a sense, the 19th-century dictionaries Sabdakalpadruma and Vacaspatyam doubled as thesauruses and reverse dictionaries too, as under most words they have तत्पर्यायाः (synonyms), तत्प्रकाराः (types), तल्लक्षणम् (characteristics), etc.

How do I read your bell shaped frequency count tables?

Mārcis Gasūns

Nov 19, 2013, 2:50:03 AM

to bvpar...@googlegroups.com

Namaste,

On Tuesday, 19 November 2013 04:27:22 UTC+4, Nityanand Misra wrote:

Are you planning a Samskrita -> Samskrita reverse dictionary? Or do you mean an English/Russian -> Sanskrit dictionary?

None of the above. Reverse dictionaries are word indexes with some grammatical data and without meanings. Because mine is based on real books, maybe 20-21 of them by the end, there is no need to add the meanings; one can look them up in the "usual" dictionaries. Otherwise, instead of 1100 pages it would take 4400.

If you mean sorting the words by how they end (or by the last member of a compound), then rather than publishing something anew, an interface to existing dictionaries could be provided. For example, if I want to know the words ending in "sama" in Samskrita, I go to http://www.sanskrit-lexicon.uni-koeln.de/mwquery/, type "sama" in the Samskrita word field, choose "suffix" from the dropdown and search.

Yes, I'm aware of this website. More than that: how many OCR mistakes have you submitted to them? Lots of fixing needs to be done. But web is web. Print is print. I go for print. I'm no fan of the web. Even the dictionaries you mentioned I use locally, because it's much faster that way.

What I miss in this Advanced Search feature is the option of searching with grep-style "regular expressions". I can currently search by prefix, suffix or substring, but with a regular expression I could search for any sort of word pattern.

And also because locally I can use RegEx, which in this context is another word for grep. So I already have grep-style search on my PC for MW.

Now, is this a simple Sanskrit-Tibetan dictionary or a non-conventional "reverse dictionary"?

As I've not seen it, I can't tell.

A very good English-English reverse dictionary is the Reader's Digest Reverse Dictionary (http://www.readersdigestdirect.co.in/reverse-dictionary), which gives words for meanings and much more information. For example, if you look up "stair", it gives you a diagram showing what the "tread", "rise", "baluster", et cetera are.

Let me explore it, never heard of it before, thanks. Do you have a scanned page of it?

In a sense the 19th century dictionaries Sabdakalpadruma and Vacaspatyam doubled up as thesauruses

No, they did not. I will show it to you one day, I hope. I have some interesting numbers, but not enough time to analyze them. Maybe you can help me?

and reverse dictionaries too,

No, I guess.

as under most words they have तत्पर्यायाः (synonyms), तत्प्रकाराः (types), तल्लक्षणम् (characteristics), etc.

What stands behind the "etc."? Which other "types" are you aware of in them? Please tell me.

How do I read your bell shaped frequency count tables?

It shows how many words have a particular length. The first column on the left gives the word length; the top row gives the name of each dictionary.
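A minimal sketch of how such a length table could be produced for one dictionary, assuming that "length" means the number of characters in the IAST transliteration (with each diacritic letter counted once). The sample words are taken from the Apte list in this post; `length_table` is a hypothetical helper name.

```python
from collections import Counter
import unicodedata

def length_table(words):
    """Return sorted (length, number_of_words) pairs for a list of IAST words."""
    # NFC normalization makes a letter such as "ṅ" count as one character
    # even if the input stores it as a base letter plus a combining mark.
    counts = Counter(len(unicodedata.normalize("NFC", w)) for w in words)
    return sorted(counts.items())

sample = ["akam", "aṅgam", "aktram", "aṃśanam", "aṃśukam", "akaraṇam"]
for length, count in length_table(sample):
    print(length, count)
```

One such column per dictionary, placed side by side, would give a table with the word length on the left and the dictionary names in the top row.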

I've got one idea. I want to strip the final visarga and the final -m (4759 cases) from the headwords in Apte, so that they have the same word forms as in MW. Is that OK, or am I missing something?

8 A aṃśanam अंशनम् 7 m
14 A aṃśukam अंशुकम् 7 m
33 A akam अकम् 4 m
43 A akaraṇam अकरणम् 8 m
59 A akalpanam अकल्पनम् 9 m
79 A akālikam अकालिकम् 8 m
92 A akupyam अकुप्यम् 7 m
121 A aktram अक्त्रम् 6 m
153 A akṣarakam अक्षरकम् 9 m
163 A akṣitaram अक्षितरम् 9 m
180 A akṣaurimam अक्षौरिमम् 10 m
224 A agāram अगारम् 6 m
229 A agulmakam अगुल्मकम् 9 m
278 A aṅkanam अङ्कनम् 7 m
279 A aṅkasam अङ्कसम् 7 m
286 A aṅkupam अङ्कुपम् 7 m
301 A aṅgam अङ्गम् 5 m
302 A aṅgakam अङ्गकम् 7 m
307 A aṅgaṇam अङ्गणम् 7 m
309 A aṅgadam अङ्गदम् 7 m
310 A aṅganam अङ्गनम् 7 m
313 A aṅgavam अङ्गवम् 7 m
330 A aṅgirasāmayanam अङ्गिरसामयनम् 15 m
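A minimal sketch of the proposed normalization, assuming the headwords are in IAST and that a blind strip of one final "ḥ" or "m" is what is meant. Whether that is safe for every entry is exactly the open question, so `strip_final` (a hypothetical helper) should be read as an illustration and not as a vetted rule.

```python
import unicodedata

VISARGA = "\u1e25"  # the IAST letter "ḥ"

def strip_final(headword):
    """Drop a single final visarga or -m from an IAST headword."""
    w = unicodedata.normalize("NFC", headword)
    if w.endswith(VISARGA) or w.endswith("m"):
        return w[:-1]
    return w

for apte_form in ["akam", "agāram", "aṅgam", "aja"]:
    print(apte_form, "->", strip_final(apte_form))
```

A safer variant might only strip the ending when the result is actually found as an MW headword, and log the remaining cases for manual review.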

M.

Nityanand Misra

Nov 19, 2013, 7:22:08 PM

to bvpar...@googlegroups.com, Me

On Tuesday, November 19, 2013 3:50:03 PM UTC+8, Mārcis Gasūns wrote:

Namaste,

On Tuesday, 19 November 2013 04:27:22 UTC+4, Nityanand Misra wrote:
Are you planning a Samskrita -> Samskrita reverse dictionary? Or do you mean an English/Russian -> Sanskrit dictionary?
None of the above. Reverse dictionaries are word indexes with some grammatical data and without meanings. Because mine is based on real books, maybe 20-21 of them by the end, there is no need to add the meanings; one can look them up in the "usual" dictionaries. Otherwise, instead of 1100 pages it would take 4400.

Well, reverse dictionaries are of different types: some give words for concepts or meanings, others give words for an ending suffix, morpheme or phoneme, etc. But all dictionaries, including reverse dictionaries, are lists of sorted key-value pairs. The key is what you use to look up; the value is what you get when you look up the key. If your keys (concept, ending syllable, suffix, etc.) are in Samskrita and the values are also in Samskrita, then it is a Samskrita-Samskrita reverse dictionary. If your keys are not in Samskrita, Russian or English, what are they in? Are they hashmaps or md5sums? If not, then it is most likely one of the above and not none of the above.
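A minimal sketch of this key-value view for an ending-ordered word index, assuming the key is simply the transliterated word read backwards and the value is the word itself. Plain code-point order is used here, which is not the Sanskrit alphabetical order a printed dictionary would need; `reverse_index` and `ending_in` are hypothetical helper names.

```python
def reverse_index(words):
    """Return (key, value) pairs sorted by key, where key is the reversed word."""
    return sorted((w[::-1], w) for w in words)

def ending_in(index, ending):
    """Look up all words whose reversed key starts with the reversed ending."""
    prefix = ending[::-1]
    return [word for key, word in index if key.startswith(prefix)]

index = reverse_index(["aktram", "akalpanam", "akam", "agulmakam"])
for key, word in index:
    print(key, word)

print(ending_in(index, "kam"))
```

In this picture both keys and values are Samskrita words, so the result is a Samskrita -> Samskrita index in the sense described above, just without meanings.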

Yes, I'm aware of this website. More than that: how many OCR mistakes have you submitted to them? Lots of fixing needs to be done. But web is web. Print is print. I go for print. I'm no fan of the web. Even the dictionaries you mentioned I use locally, because it's much faster that way.

I have not submitted OCR mistakes, but the online MW does provide a link to the scan of the page for every lookup.

A very good English-English reverse dictionary is the Reader's Digest Reverse Dictionary (http://www.readersdigestdirect.co.in/reverse-dictionary), which gives words for meanings and much more information. For example, if you look up "stair", it gives you a diagram showing what the "tread", "rise", "baluster", et cetera are.
Let me explore it, never heard of it before, thanks. Do you have a scanned page of it?

I would if I could but I can't so I won't. I don't have it with me now.

In a sense the 19th century dictionaries Sabdakalpadruma and Vacaspatyam doubled up as thesauruses
No they did not. I will show you it one day, I hope. I have some interesting numbers, but not enough time to analyze them. Maybe you can help me?

They list synonyms for many words, which is what a thesaurus does.

and reverse dictionaries too,
No, I guess.

They list types, parts and related concepts, which is what the Reader's Digest reverse dictionary does.

as under most words they have तत्पर्यायाः (synonyms), तत्प्रकाराः (types), तल्लक्षणम् (characteristics), etc.
What stands behind the "etc."? Which other "types" are you aware of in them? Please tell me.

Look up "Chandas" (छन्दस्) in Sabdakalpadruma, an entry spanning multiple pages, and you will know.

How do I read your bell shaped frequency count tables?
It shows how many words have a particular length. The first column on the left gives the word length; the top row gives the name of each dictionary.

That is obvious. What is not obvious is what the rows are: is column A missing? I am a statistician for a good part of my job. I see frequency count tables day in and day out, but a frequency count table without row and column labels is like a plot without X and Y axis labels. So send another snapshot with the missing column and a legend, and I can read it better. I want to know whether the distribution looks skewed.
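Once the table is labelled, skewness can be checked directly from the counts. A minimal sketch, assuming the table is available as a mapping from word length to number of words; the numbers in `hypothetical_counts` are invented for illustration and are not taken from the attached image. The function computes the moment coefficient of skewness (third central moment divided by the 1.5th power of the second).

```python
def skewness(freq):
    """Moment coefficient of skewness for a {value: count} frequency table."""
    n = sum(freq.values())
    mean = sum(x * c for x, c in freq.items()) / n
    m2 = sum(c * (x - mean) ** 2 for x, c in freq.items()) / n
    m3 = sum(c * (x - mean) ** 3 for x, c in freq.items()) / n
    return m3 / m2 ** 1.5

# Invented counts: many short words and a long tail of long compounds.
hypothetical_counts = {3: 50, 5: 120, 7: 90, 9: 40, 12: 15, 20: 3}
print(skewness(hypothetical_counts))
```

A value near zero would suggest a roughly symmetric bell shape; a clearly positive value would suggest a long right tail of long words.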

I've got one idea. I want to strip the final visarga and the final -m (4759 cases) from the headwords in Apte, so that they have the same word forms as in MW. Is that OK, or am I missing something?

Your string-length concept may be flawed, as you seem to apply it to the transliteration and not to the syllable count, which is what matters in Samskrita. Do you count the length of "dh" (ध्) as two or as one? In your table "Sample stats from my dictionary word ending in:", the percentages add up to only 98.5%, not 100%, so the table is incomplete. I do not see entries for the diphthongs "au" and "ai" or for aspirated consonants like "dh", whereas most Samskrita dictionaries have words ending in these. Do you count them separately, or under words ending in "i", "u" or "h"? For Samskrita prosody, rhyme, creative writing and crosswords, अज and कार्त्स्न्य are both two-syllabled, which is what matters. You would have length 3 for the first one and length 8 for the second one.

You probably want something that counts the syllables in the original language and not the letters in the transliteration. Something like what this tool does:

http://sanskrit.sai.uni-heidelberg.de/Chanda/HTML/
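As a rough illustration of the difference, here is a minimal sketch that counts syllables in an IAST transliteration by counting vowel nuclei. It assumes lowercase IAST input and treats every "ai" and "au" as a diphthong, which will be wrong for the rare words with a genuine vowel hiatus. It is not how the Heidelberg tool works internally; `syllable_count` is a hypothetical helper.

```python
import re
import unicodedata

# Vowel nuclei in IAST. The diphthongs "ai" and "au" are listed first so that
# each is matched as one nucleus rather than as two separate vowels.
_VOWELS = re.compile(unicodedata.normalize("NFC", "ai|au|[aāiīuūṛṝḷḹeo]"))

def syllable_count(iast_word):
    """Count syllables as the number of vowel nuclei in an IAST word."""
    word = unicodedata.normalize("NFC", iast_word.lower())
    return len(_VOWELS.findall(word))

for word in ["aja", "kārtsnya"]:
    letters = len(unicodedata.normalize("NFC", word))
    print(word, "letters:", letters, "syllables:", syllable_count(word))
```

For the two words from the post (aja and kārtsnya), the letter counts differ widely while the syllable count is the same, which is the point being made above.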
