Sinhala trained data


Thilina Mendis

Feb 23, 2015, 9:43:01 PM
to tesser...@googlegroups.com
Hi guys, does anyone know whether there is a sinhala.traineddata file for Sinhala fonts? I need to use one. Please share if you have any files for Sinhala fonts :)

Thanks
Thilina Mendis

Ruwanka De Silva

Feb 24, 2015, 5:20:29 AM
to tesser...@googlegroups.com
Hi,

There is no official Sinhalese traineddata file; no one has published trained data for Tesseract yet. There is a Sinhala OCR developed by UCSC, but their traineddata file is not openly accessible. You can find a Sinhalese traineddata file in this Sinhala OCR, but it lacks accuracy. I am planning to train Tesseract for Sinhalese (especially for the letters in old newspapers, which don't have exact matching fonts). I'll post here if I succeed with training Sinhalese. If anyone knows how to train Tesseract for Sinhalese with high accuracy, please comment here or share training files.

Regards. 

Manusha Dilan

Nov 10, 2015, 4:10:48 AM
to tesseract-ocr

What you may be looking for is this: https://github.com/tesseract-ocr/tessdata/blob/master/sin.traineddata
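
In case it helps, here is a rough sketch of how that file might be used once it is downloaded. It assumes the `tesseract` command-line tool is installed and on the PATH, and that sin.traineddata has been saved into a local tessdata directory. The helper names and the example paths (`page.png`, `out`, `./tessdata`) are made up for illustration; only the `-l` and `--tessdata-dir` options come from the Tesseract CLI, and the exact behaviour may differ between Tesseract versions.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, tessdata_dir, lang="sin"):
    """Build the argument list for one Tesseract run (hypothetical helper).

    "sin" matches the file name sin.traineddata; the paths are caller-supplied.
    """
    return [
        "tesseract",
        str(image_path),
        str(output_base),
        "--tessdata-dir", str(tessdata_dir),
        "-l", lang,
    ]


def run_sinhala_ocr(image_path, output_base, tessdata_dir):
    """Run Tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    if not (Path(tessdata_dir) / "sin.traineddata").is_file():
        raise FileNotFoundError("sin.traineddata not found in " + str(tessdata_dir))

    cmd = build_tesseract_command(image_path, output_base, tessdata_dir)
    subprocess.run(cmd, check=True)

    # Tesseract writes plain-text output to <output_base>.txt by default.
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

Calling `run_sinhala_ocr("page.png", "out", "./tessdata")` would then return the recognized text, if everything is set up as assumed. Accuracy on old newspaper print is a separate question and would need to be checked on real scans.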

Best Regards.
