Namaste!
FYI, I am working on a voluntary activity to extract some shlokas in Grantha script from a scanned PDF and then transliterate them to Devanagari. While researching this process, I came across some of your work on Google and at https://sites.google.com/site/sanskritcode/ocr/0-introduction and thought I would take the liberty of reaching out for any possible help or leads on this extraction and transliteration.
The source scripture is a scanned PDF, which I tried to extract (OCR scan-to-text) using tools like Adobe and Sejda. The text gets extracted, but with distortion, since these tools don't support some of the native fonts. I am also exploring Google Cloud Vision OCR, with no luck yet.
Can you please advise on the best approach or tool to extract Grantha scripture from a scanned PDF?
Sample text
--
Thanks & Regards
Prashanth Anantharaman
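On the transliteration half of the question, a minimal sketch, assuming the text is already available as Unicode Grantha (which is the hard part that OCR has to solve first). It relies on the Unicode Grantha block (U+11300 to U+1137F) being laid out largely in parallel with the Devanagari block (U+0900 to U+097F), so most letters and vowel signs can be shifted by a fixed offset. The function name and the choice of "aligned" ranges are illustrative assumptions; characters outside those ranges are passed through unchanged and would need an explicit table.

```python
# Sketch only: offset-based Grantha -> Devanagari mapping for Unicode text.
GRANTHA_BASE = 0x11300
DEVANAGARI_BASE = 0x0900

# Ranges assumed to line up one-to-one with Devanagari at the same offset:
# signs, vowels and consonants, then nukta through virama.
ALIGNED_RANGES = ((0x11301, 0x11339), (0x1133C, 0x1134D))


def grantha_to_devanagari(text):
    """Shift aligned Grantha code points into the Devanagari block."""
    out = []
    for ch in text:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in ALIGNED_RANGES):
            out.append(chr(cp - GRANTHA_BASE + DEVANAGARI_BASE))
        else:
            # Not covered by this sketch: leave the character untouched.
            out.append(ch)
    return "".join(out)


# GRANTHA LETTER RA + VOWEL SIGN AA + LETTER MA
sample = "\U00011330\U0001133E\U0001132E"
print(grantha_to_devanagari(sample))
```

This does nothing for text that OCR has emitted as Tamil letters; it only covers the step after a correct Grantha transcription exists.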
You received this message because you are subscribed to the Google Groups "sanskrit-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-ocr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sanskrit-ocr/CAO2eXKY%3D9tRn6i_TNrA6CXR92Z95zuhvAA5aWe1PAXNAPxsuNQ%40mail.gmail.com
--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sanskrit-programmers/CAFY6qgG2-NrPcYxCy60izYAN%2B8F72w%2BfvQ3%3DWYUB%2BWbTdGZ5MA%40mail.gmail.com.
Namaste,
Even the Tamil recognition seems to be bad. Is there a better sample of Tamil text extracted by OCR? I am just curious whether it is even worth the trouble of going the OCR route.
From: विश्वासो वासुकिजः (Vishvas Vasuki)
Sent: Sunday, February 20, 2022 7:19 PM
To: Prashanth Anantharamu; sanskrit-programmers
Cc: sanskrit-ocr
Subject: Re: extraction of Grantha Script from a Scanned PDF (OCR Scan extract)
On Mon, 21 Feb 2022 at 07:43, Prashanth Anantharamu <prashanth...@gmail.com> wrote:
Namaste!
FYI, I am working on a voluntary activity to extract some shlokas in Grantha script from a scanned PDF and then transliterate them to Devanagari. While researching this process, I came across some of your work on Google and at https://sites.google.com/site/sanskritcode/ocr/0-introduction and thought I would take the liberty of reaching out for any possible help or leads on this extraction and transliteration.
That site is obsolete; https://sanskrit-coders.github.io/content/ocr/ is newer.
Someone on sanskrit-programmers (cc'ed) might have ideas you could benefit from. But you won't necessarily get their responses unless you become a member/subscriber of https://groups.google.com/g/sanskrit-programmers/ and https://groups.google.com/g/sanskrit-ocr/
The source scripture is a scanned PDF, which I tried to extract (OCR scan-to-text) using tools like Adobe and Sejda. The text gets extracted, but with distortion, since these tools don't support some of the native fonts. I am also exploring Google Cloud Vision OCR, with no luck yet.
I suspect Google OCR won't work. It only understands Tamil letters, not Grantha (see below). I suspect the situation is the same for other OCR tools.
Can you please advise on the best approach or tool to extract Grantha scripture from a scanned PDF?
Sample text
This is what Google Drive OCR from https://ocr.sanskritdictionary.com/# yields for the above: all Tamil.
வெ வாரண நி குவெரா உஹாவில்விழoner திணெ த . | கவொறே மஜா நெலானவாறு ஜெ வாவபு தீவத, BUTo ராணா
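The "all Tamil" observation can be checked mechanically by looking at which Unicode block each output character falls into: Tamil occupies U+0B80 to U+0BFF, while Grantha has its own block at U+11300 to U+1137F. A minimal sketch follows; `script_of` and `script_counts` are hypothetical helper names, and the string is the OCR output quoted above.

```python
from collections import Counter

# Inclusive Unicode block ranges for the two scripts discussed here.
TAMIL = (0x0B80, 0x0BFF)
GRANTHA = (0x11300, 0x1137F)


def script_of(ch):
    """Classify one character as Tamil, Grantha, or other."""
    cp = ord(ch)
    if TAMIL[0] <= cp <= TAMIL[1]:
        return "Tamil"
    if GRANTHA[0] <= cp <= GRANTHA[1]:
        return "Grantha"
    return "other"


def script_counts(text):
    """Count non-whitespace characters per script."""
    return Counter(script_of(ch) for ch in text if not ch.isspace())


OCR_OUTPUT = (
    "வெ வாரண நி குவெரா உஹாவில்விழoner திணெ த . | "
    "கவொறே மஜா நெலானவாறு ஜெ வாவபு தீவத, BUTo ராணா"
)

counts = script_counts(OCR_OUTPUT)
print(counts)
```

A Grantha count of zero here means the engine never emitted a Grantha code point, so any post-processing has to start from Tamil look-alikes rather than from real Grantha text.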
Thanks & Regards
Prashanth Anantharaman
--
Vishvas /विश्वासः
Thanks! (adding shreeshree and shrIramaNa, ravi to bcc)
Namaste Shreedevi,
Could you please point me to a guide which describes how to use this model?
(adding Vinodh Rajan also to the conversation)
I haven't looked at this (experimental training) for a while.
It would help if those who are familiar with Grantha script created line images and corresponding Unicode text ground truth for training.
I had tried using the limited Unicode Grantha fonts to create training data, but the printed texts use legacy fonts that look quite different, hence the results are suboptimal.
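For anyone contributing such line images and transcriptions, a small consistency check can catch problems before training. The sketch below assumes each line image is a PNG with a same-named `.gt.txt` transcription next to it; that layout is an assumption (some OCR training tools use it, but the thread does not name one), and `check_pairs` is a hypothetical helper name.

```python
from pathlib import Path

# Unicode Grantha block, inclusive.
GRANTHA_START, GRANTHA_END = 0x11300, 0x1137F


def check_pairs(folder):
    """Return (image name, problem) tuples for a folder of line images."""
    problems = []
    for image in sorted(Path(folder).glob("*.png")):
        # Assumed naming: line1.png <-> line1.gt.txt
        gt = image.with_name(image.stem + ".gt.txt")
        if not gt.exists():
            problems.append((image.name, "missing ground truth"))
            continue
        text = gt.read_text(encoding="utf-8").strip()
        if "\n" in text:
            problems.append((image.name, "more than one line"))
        elif not any(GRANTHA_START <= ord(ch) <= GRANTHA_END for ch in text):
            # Typical slip: the line was typed in Tamil or Devanagari.
            problems.append((image.name, "no Grantha code points"))
    return problems
```

Calling `check_pairs("ground-truth")` on a folder (the folder name is illustrative) returns an empty list when every image has a one-line transcription containing Grantha characters.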
Indian Sanatani people are learning and speaking Sanskrit with their families; this is my way of making Sanskrit popular among Hindu people. So, see the link below.
The above link is only one video; if you look at the channel, you can see more videos related to learning Sanskrit.
This is my one effort (prayas) for the Sanskrit language.
By profession, I am an IT engineer.