Marathi OCR

Amarjeet Chopade

Nov 14, 2014, 3:27:10 AM
to tesser...@googlegroups.com
Dear Shreeji,
Thank you very much for sending the Konkani traineddata file. I tried it right away, and about 70 to 80 percent of the OCR output is correct. I have one question: in the tessdata folder, besides the traineddata file for Hindi, there are seven other files. What are they? And what if similar files were created for Konkani and pasted into the same folder?
Thanks again. I would be very grateful if you could answer my question.
Amarjeet

ShreeDevi Kumar

Nov 14, 2014, 4:14:46 AM
to tesser...@googlegroups.com
Amarjeet,

Glad that you are getting 70-80% correct OCR for Marathi using the Konkani traineddata I posted.  

Google trained the Hindi traineddata with the 'cube' method, but that method is not available to us.

Training can be improved with better training text or with a font similar to the one being OCRed.

I will post a newer version of the file; you can test it to see whether there are improvements.
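
One way to check whether a newer traineddata file actually helps is to compare the OCR output from each version against a hand-typed reference for the same page. The sketch below is only an illustration: the function name `char_accuracy` and the sample strings are hypothetical, and it uses the similarity ratio from Python's `difflib`, which is a rough stand-in for a proper character error rate.

```python
from difflib import SequenceMatcher


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Return a rough 0..1 similarity between reference text and OCR output.

    This is difflib's ratio (2 * matched characters / total length), not a
    formal character error rate, so treat it as a relative measure only.
    """
    if not reference and not ocr_output:
        return 1.0
    # autojunk=False keeps difflib from ignoring frequent characters
    # in long texts, which would distort the score.
    matcher = SequenceMatcher(None, reference, ocr_output, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # Hypothetical sample strings; replace with real reference and OCR text.
    reference = "tesseract"
    old_output = "tessaract"
    new_output = "tesseract"
    print(f"old: {char_accuracy(reference, old_output):.2f}")
    print(f"new: {char_accuracy(reference, new_output):.2f}")
```

The comparison works on Unicode code points, so for Devanagari text a single visible character that is made of several code points counts more than once. For a like-for-like comparison, score both versions on the same pages.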

ShreeDevi

____________________________________________________________
Bhajan - Kirtan - Aarti @ http://bhajans.ramparivar.com

Ameya Badadare

Dec 17, 2014, 12:20:20 PM
to tesser...@googlegroups.com

These files are created when the tessdata is trained.
You can create your own traineddata file for Marathi, but it will take a lot of effort.
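
To see which companion files sit next to a language's traineddata file, a small script can group the contents of the tessdata folder by language prefix. This is a minimal sketch, assuming the files are named `<lang>.<rest>` (for example `hin.traineddata`); the function name `files_by_language` and the `tessdata` path are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def files_by_language(tessdata_dir: str) -> dict[str, list[str]]:
    """Group file names in a folder by the prefix before the first dot."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(Path(tessdata_dir).iterdir()):
        name = path.name
        # Skip sub-folders, hidden files, and names without a dot.
        if not path.is_file() or name.startswith(".") or "." not in name:
            continue
        lang = name.split(".", 1)[0]
        groups[lang].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever tessdata lives on your system.
    tessdata = Path("tessdata")
    if tessdata.is_dir():
        for lang, names in files_by_language(str(tessdata)).items():
            print(lang, len(names), names)
```

A language that ships with only one file will show just its `.traineddata` entry, while a language with extra files will list them all under the same prefix.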