Re: [tesseract-ocr] Support Language

ShreeDevi Kumar

Nov 7, 2014, 4:57:51 AM
to tesser...@googlegroups.com


It is possible that it will be included in 3.04.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Thu, Nov 6, 2014 at 8:10 PM, Giorgi Gognadze <gognadz...@gmail.com> wrote:
Hi, I'm George. I want to add support for the Georgian language but don't know where to start or what to do. Can anyone give me some advice?

Derek

Nov 7, 2014, 10:38:57 AM
to tesser...@googlegroups.com
ShreeDevi,

Where did this training text come from? It includes two different Georgian scripts (mkhedruli and asomtavruli). Only mkhedruli is in common usage today, so it seems to me that it would be best to remove the asomtavruli to increase accuracy on modern texts. If complete historical accuracy is desired, then the third Georgian script (nuskhuri) should probably be included as well.
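As a rough illustration of the filtering suggested above, the following sketch drops every whitespace-separated token that contains an asomtavruli or nuskhuri letter, based on Unicode code point ranges. The function names are hypothetical and this is not part of the Tesseract training tools; it only assumes the training text is ordinary Unicode text.

```python
# Hypothetical helper, not part of Tesseract: filters training text by script.
# Code point ranges taken from the Unicode Georgian blocks:
#   U+10A0..U+10CF  asomtavruli part of the Georgian block
#   U+10D0..U+10FF  mkhedruli (kept)
#   U+2D00..U+2D2F  Georgian Supplement (nuskhuri)
ASOMTAVRULI = range(0x10A0, 0x10D0)
NUSKHURI = range(0x2D00, 0x2D30)


def is_historical_georgian(ch):
    """Return True if ch is an asomtavruli or nuskhuri character."""
    cp = ord(ch)
    return cp in ASOMTAVRULI or cp in NUSKHURI


def keep_modern_tokens(line):
    """Drop whole tokens that contain any historical Georgian letter."""
    tokens = line.split()
    kept = [t for t in tokens
            if not any(is_historical_georgian(ch) for ch in t)]
    return " ".join(kept)


def filter_training_text(text):
    """Apply keep_modern_tokens to every line of the training text."""
    return "\n".join(keep_modern_tokens(line) for line in text.splitlines())
```

Dropping whole tokens rather than single characters avoids leaving word fragments behind; whether that is the right trade-off depends on how the training text was assembled.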

Giorgi, there is some further information about training tesseract with Georgian here (I have trained tesseract to read Georgian and got decent results, but using the old training methods, not the new ones): https://groups.google.com/forum/#!searchin/tesseract-ocr/Georgian/tesseract-ocr/_ytk3bU592A/lHhwYd67xHsJ

In addition, you might try contacting Levan Gelashvili (CCed), who has created a tesseract-based OCR program for Georgian; I haven't had very good results with SunnyPage, but he may have improved it since the last time I tried it.

ShreeDevi Kumar

Nov 8, 2014, 9:14:27 PM
to tesser...@googlegroups.com

ShreeDevi

On Sun, Nov 9, 2014 at 3:06 AM, Giorgi Gognadze <gognadz...@gmail.com> wrote:
First, thanks for the quick answer. I'm new here and don't know how to train Tesseract OCR; if you have anything that could help me, I'd appreciate it. So can I contact Levan Gelashvili via SunnyPage or not? Do you know the exact date when version 3.04 will be available?

Thanks,
George