Hi Sheeyam, sorry for not replying to your emails sooner.
On Sun, Jun 22, 2014 at 04:43:27AM -0700, sheeyam shellvacumar wrote:
> Does Tesseract support sinhala. How do u guys train them ??? Actually i am
> confused help me
It looks like some people have trained Tesseract for Sinhala; see
http://www.ucsc.cmb.ac.lk/sdu/research.html and
http://192.248.22.122/ocrsinhala/
However, as far as I can see, they aren't sharing their .traineddata
file or the source files for it. It would be a good idea to contact
them and ask whether they can share those with you, and with the
community more broadly, so that Sinhala support can potentially be
improved in the future.
If they don't respond, you can train Tesseract for Sinhala yourself;
instructions are on the wiki:
https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Nick