OCR for Sanskrit

Prasanna Swaroopa

Oct 7, 2017, 12:31:40 PM
to sams...@googlegroups.com
Namaste!

Could you kindly suggest a good free OCR (Optical Character Recognition) tool, online or offline, for Sanskrit that takes a JPG (or a similar image file) and outputs the text?

Thank you.

with warm regards
Br. Prasanna Swaroopa




Sunder Hattangadi

Oct 7, 2017, 1:32:21 PM
to sams...@googlegroups.com
--
You received this message because you are subscribed to the Google Groups "samskrita" group.
To unsubscribe from this group and stop receiving emails from it, send an email to samskrita+...@googlegroups.com.
To post to this group, send email to sams...@googlegroups.com.
Visit this group at https://groups.google.com/group/samskrita.
For more options, visit https://groups.google.com/d/optout.

Prasanna Swaroopa

Oct 8, 2017, 2:09:06 PM
to sams...@googlegroups.com
Namaste!
Thank you Sunderji.

with warm regards
Br. Prasanna Swaroopa

ePandit | ई-पण्डित

Oct 9, 2017, 12:53:07 PM
to sams...@googlegroups.com
I have found Google Docs' OCR to be the best for Devanagari, and many people are not aware of it. Upload an image containing Devanagari text to Google Drive, then right-click the file in your Drive folder and select 'Open with > Google Docs'. The OCRed text is placed below the image, so you can compare the recognized text against the original. You may need to make minor edits, either within Google Docs or later in Microsoft Word after downloading the file as a Word document.

If you want to use Tesseract, use an easy GUI such as VietOCR.
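
For anyone who prefers scripting to a GUI, the Tesseract route can also be sketched in a few lines of Python. This is only a minimal sketch, not something from the post above: it assumes the `tesseract` command-line program is installed and on the PATH, that the Sanskrit language data (`san`) is installed with it, and the function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="san"):
    """Return the argument list for one Tesseract CLI run.

    Tesseract appends ".txt" to output_base when it writes the result.
    "san" is assumed to be the installed Sanskrit language data.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="san"):
    """OCR one image file and return the recognized text (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    image_path = Path(image_path)
    output_base = image_path.with_suffix("")  # page.jpg -> page
    command = build_tesseract_command(image_path, output_base, lang)
    subprocess.run(command, check=True)
    # Build the ".txt" name by plain concatenation so that file names
    # containing extra dots (scan.v2.jpg) still resolve correctly.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `ocr_image("page.jpg")` would leave a `page.txt` next to the image and return its contents. As with the Google Docs route, the text will probably still need some manual correction.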
Shrish Benjwal Sharma (श्रीश बेंजवाल शर्मा)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If u can't beat them, join them.

ePandit.NET

विश्वासो वासुकिजः

Oct 9, 2017, 3:57:42 PM
to samskrita
(Pending ability to cc other email ids, suggested in https://screenshots.firefox.com/69NZFh4SzQ1I80hl/groups.google.com , sending to my mailbox first)

विश्वासो वासुकिजः (Vishvas Vasuki)

Oct 9, 2017, 4:15:53 PM
to संस्कृतसन्देशश्रेणिः samskrta-yUthaH, sanskrit-programmers
(+sanskrit-programmers)

--
Vishvas /विश्वासः

Prasanna Swaroopa

Oct 10, 2017, 7:19:03 AM
to sams...@googlegroups.com
Namaste!

Thank you all for your inputs on this subject.

The Google Docs option was a very pleasant surprise, and such a simple method on top of it.

With warm regards
Br. Prasanna Swaroopa

Anunad Singh

Oct 17, 2017, 12:45:47 AM
to sams...@googlegroups.com
Google OCR has also been integrated with Sanskrit Wikisource. This makes it easy to put a Sanskrit book on Wikisource and convert it to Unicode. Many very old books are already available on the Internet Archive and elsewhere in PDF or DjVu format, which are the formats Wikisource accepts. With Google OCR integrated, the OCR process has become simple, fast, and problem-free. The recognition accuracy is excellent. To me it currently seems more accurate than Tesseract.
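
One way to put a number on "more accurate" is the character error rate: the edit distance between the OCR output and a hand-corrected version of the same page, divided by the length of the corrected text. The sketch below is only an illustration under that assumption; the function names are made up, and it counts Unicode code points, so a wrong vowel sign or a missing virama each count as one error.

```python
import unicodedata


def edit_distance(a, b):
    """Levenshtein distance between two strings, counted in code points."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # drop char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def character_error_rate(ocr_text, reference_text):
    """Edit distance to the reference, divided by the reference length."""
    # Normalize both sides so equivalent Unicode encodings compare equal.
    ocr_text = unicodedata.normalize("NFC", ocr_text)
    reference_text = unicodedata.normalize("NFC", reference_text)
    if not reference_text:
        raise ValueError("reference_text must not be empty")
    return edit_distance(ocr_text, reference_text) / len(reference_text)
```

Running `character_error_rate` on the Google OCR output and on the Tesseract output for the same page, against the same corrected text, gives two figures that can be compared directly; the lower one is the more accurate result for that page.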

-- anunAda

Taff Rivers

Oct 18, 2017, 10:24:44 AM
to samskrita
anunAda,

  Bless you!

In my case, almost, but not quite, problem-free...

I had to locate a detailed tutorial to get me started.
I found this recent one, dated May 2017, to be helpful:

https://business.tutsplus.com/tutorials/how-to-ocr-documents-for-free-in-google-drive--cms-20460?_ga=2.205751760.2121351967.1508318986-1995006691.1508318986

With that, and the addition of a little mental elbow grease, I managed to get both an .html and an .rtf output from a .pdf input.

Magic!

Taff Rivers