---------- Forwarded message ----------
From: Guntupalli Karunakar <karu...@indlinux.org>
Date: Mar 19, 2008 12:44 AM
Subject: [Indlinux-hindi] Hindi aspell wordlist review
To: IndLinux Hindi <indlinu...@lists.sourceforge.net>
Hi,
This has been a long-pending task: reviewing the wordlist
used in Hindi Aspell. Reviewing all the words in it and making
corrections would give a more accurate Hindi spellchecker.
The current wordlist has about 83514 words in it. I have split the
file into chunks of about 2000 words each, about 42 files in total.
All the files are available here:
http://indlinux.sourceforge.net/hindi/dict/
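For anyone curious how the split works out, here is a minimal sketch in Python. It is not the script actually used to produce the files; the function name and the stand-in data are assumptions made only for illustration.

```python
def split_wordlist(words, chunk_size=2000):
    """Split a list of words into consecutive chunks of at most chunk_size."""
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


# Stand-in data (hypothetical) with the same size as the wordlist above.
words = [f"word{i}" for i in range(83514)]
chunks = split_wordlist(words)

print(len(chunks))      # 42 files
print(len(chunks[-1]))  # the last file is shorter: 1514 words
```

With 83514 words and 2000 words per file, the first 41 files are full and the 42nd holds the remaining 1514 words.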
Check if the word is valid (that is, whether it exists or not):
* If it is valid and the spelling is correct, leave it as is.
* If it looks valid but the spelling is wrong, correct it and suffix it with C,
  e.g. सवास्थ्य becomes
  स्वास्थ्य , C
* If there are variants of a base word,
  e.g. लडका, लडके, लडकों etc., put them on one line like
  लडका , लडके लडकों
* If it is an invalid word, it could be deleted; suffix it with D:
  लेडेकों , D
* If it is a transliteration of, or originates from, an English/foreign word,
  leave it as it is and suffix it with E:
  इलेक्ट्रिकल , E
  स्विच , E
(These are not good examples, but just to give an idea.)
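The marking scheme above could later be read back by a small script. What follows is only a sketch, assuming each reviewed line has the form "word , X" or "base , variant variant"; the function name and the returned labels are made up for illustration and are not part of any existing tool.

```python
def parse_review_line(line):
    """Classify one reviewed line as (kind, word, variants).

    Assumed format (taken from the review rules above):
      "word"             -> left as is
      "word , C"         -> spelling corrected
      "word , D"         -> invalid, to be deleted
      "word , E"         -> English/foreign origin
      "base , v1 v2 ..." -> base word followed by its variants
    """
    head, sep, tail = line.partition(",")
    word = head.strip()
    tail = tail.strip()
    if not sep or not tail:
        return ("keep", word, [])
    labels = {"C": "corrected", "D": "delete", "E": "foreign"}
    if tail in labels:
        return (labels[tail], word, [])
    # Anything else after the comma is treated as a list of variants.
    return ("variants", word, tail.split())


print(parse_review_line("स्वास्थ्य , C"))
print(parse_review_line("लडका , लडके  लडकों"))
```

Because the suffixes C, D and E are Latin letters, they cannot be confused with a Devanagari variant, which is why a plain string comparison is enough here.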
Send back the file once you have completed it as per the steps above.
Anyone with good Hindi knowledge can volunteer to correct the word
lists.
* Announce that you are interested.
* Take up one file (you can take up to 3 weeks to complete it).
* If you finish sooner, you can take another part.
* Once all parts are assigned, you can review another part (one that
you didn't do).
* Keep the file names the same, and don't add extra words.
Ravi
----- Original Message -----
From: narayan prasad
Sent: Tuesday, March 25, 2008 2:23 PM
Subject: [technical-hindi] Hindi aspell wordlist review
<< * if there are variants to a base word
   eg, लडका लडके लडकों etc put them in one line like
   लडका , लडके लडकों >>
Do the other rows containing the variants have to be deleted?
<< * if transliteration/origin of english/foreign word leave as it is, suffix it by E
इलेक्ट्रिकल , E
स्विच , E >>
Even when the pronunciation (as represented by the nearest devanAgarii characters) is wrong?
---Narayan Prasad
Work needs to be done on the files after file no. 17, so you can pick up
any of the files from 17 to 42, and as many as you like. Download the files from here:
http://indlinux.sourceforge.net/hindi/dict/
And yes, be sure to enter your name against the file number in the assignment list above, so that work
is not duplicated.
Ravi
A request to friends: by 15 April, your