Re: [tesseract-ocr] Support Georgian Language


ShreeDevi Kumar

Nov 7, 2014, 11:26:10 AM
to tesser...@googlegroups.com, tesser...@googlegroups.com, Ray Smith
CCing Ray and the dev group.

That language data is part of the update done by Ray Smith on August 12. Ray is planning an update to the language data and traineddata soon, so if you have suggestions for improvement, please file an issue and provide more details, samples of each script, etc.

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Fri, Nov 7, 2014 at 9:08 PM, Derek <doh...@gmail.com> wrote:
ShreeDevi,

Where did this training text come from? It includes two different Georgian scripts (mkhedruli and asomtavruli). Only mkhedruli is in common usage today, so it seems to me that it would be best to remove the asomtavruli to increase accuracy on modern texts. If complete historical accuracy is desired, then the third Georgian script (nuskhuri) should probably be included as well.
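For anyone who wants to check how much of each script a training text actually contains, here is a minimal Python sketch. It is only an illustration, not part of the Tesseract tooling: the function names are made up, and the code point ranges are block-level approximations taken from the Unicode charts (asomtavruli and mkhedruli share the Georgian block, nuskhuri lives in Georgian Supplement). Dropping whole lines that contain an unwanted script is one possible policy; replacing individual characters would be another.

```python
from collections import Counter

# Approximate code point ranges per script (assumption: block-level
# ranges, so a few unassigned code points are included).
SCRIPT_RANGES = {
    "asomtavruli": (0x10A0, 0x10CF),  # Georgian block, capital letters
    "mkhedruli": (0x10D0, 0x10FF),    # Georgian block, modern letters
    "nuskhuri": (0x2D00, 0x2D2F),     # Georgian Supplement block
}


def georgian_script(ch):
    """Return the Georgian script name for one character, or None."""
    cp = ord(ch)
    for name, (lo, hi) in SCRIPT_RANGES.items():
        if lo <= cp <= hi:
            return name
    return None


def script_counts(text):
    """Count characters per Georgian script; other characters are ignored."""
    return Counter(s for s in map(georgian_script, text) if s)


def drop_lines_with(text, unwanted=("asomtavruli", "nuskhuri")):
    """Keep only the lines that contain no character from an unwanted script."""
    kept = [
        line
        for line in text.splitlines()
        if not any(georgian_script(ch) in unwanted for ch in line)
    ]
    return "\n".join(kept)
```

To use it, read the training text into a string (for example from a hypothetical training_text.txt opened with encoding="utf-8"), print script_counts(text) to see the mix, and write drop_lines_with(text) back out if a mkhedruli-only text is wanted.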

Giorgi, there is some further information about training tesseract with Georgian here: https://groups.google.com/forum/#!searchin/tesseract-ocr/Georgian/tesseract-ocr/_ytk3bU592A/lHhwYd67xHsJ (I have trained tesseract to read Georgian and got decent results, but with the old training methods, not the new ones).

In addition, you might try contacting Levan Gelashvili (CCed), who has created a tesseract-based OCR program for Georgian; I haven't had very good results with SunnyPage, but he may have improved it since the last time I tried it.

On Friday, November 7, 2014 4:57:51 AM UTC-5, shree wrote:


It is possible that it will be included in 3.04.

ShreeDevi
____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

On Thu, Nov 6, 2014 at 8:10 PM, Giorgi Gognadze <gognadz...@gmail.com> wrote:
Hi, I'm George. I want to add support for the Georgian language, but I don't know where to start or what to do. Can anyone give me some advice?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/b2af59da-4fbb-425e-9c29-5a9003702b9a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.