Hi,
Here's the point: I have to train Tesseract on a new font in Traditional
Chinese. So far, none of the results have been good enough.
I've just tried training it with only a small set of characters and a single input image.
Then I took a sample of that image to test it.
The image is:

And the detected text is:
客戶服務 置龍擇語言 設交置 社交
I'm using Tesseract 3.02 on Windows.
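To put a number on "not good enough", here is a minimal sketch of how the detected text could be scored against the text that is actually in the image. It is not part of Tesseract: the function names `edit_distance` and `char_error_rate` are my own, and the expected text has to be supplied by hand. Whitespace is ignored, on the assumption that spacing errors matter less than wrong characters.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(expected: str, detected: str) -> float:
    """Edit distance divided by the length of the expected text,
    with all whitespace removed from both strings first."""
    expected = "".join(expected.split())
    detected = "".join(detected.split())
    if not expected:
        raise ValueError("expected text must not be empty")
    return edit_distance(expected, detected) / len(expected)


if __name__ == "__main__":
    detected = "客戶服務 置龍擇語言 設交置 社交"
    # Placeholder: replace with the text that is really in the test image.
    expected = detected
    print(char_error_rate(expected, detected))
```

A value of 0 means every character matched; comparing this number before and after a training change would show whether the change helped.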
The questions are:
- What kind of machine learning approach does Tesseract use?
- How can I get better results with Tesseract?
- Do I have to train it with many different images?
- Are there any parameters I can tune in the training step?
Thanks.