Hello,
I am making a transcript of YouTube videos using Tesseract.
The images I pass to Tesseract look like this:
The output is mostly correct, but sometimes the same character is recognized differently from one image to the next.
Example:
Input:
Output: 大叔中文 - CORRECT
Input:
Output: 今天不是3位 大档 - INCORRECT
To prepare the images I:
- apply dilation,
- crop the image to the area containing the characters,
- add a border.
For dilation I use a 2x2 kernel, and the border is 2 px thick.
For the page segmentation mode I am currently experimenting with --psm 7 and --psm 13; --psm 7 seems to give slightly better results. The language is, of course, lang='chi_sim'.
Could you give me any advice on how to improve the robustness of the output?
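To compare the two modes on the same image, something like the following sketch can be used. It assumes the pytesseract wrapper, whose `image_to_string` accepts `lang` and `config` arguments. The helper names and the `ocr` parameter are hypothetical and exist only so the comparison can be tried without Tesseract installed.

```python
from typing import Callable, Dict, Iterable, Optional


def tesseract_config(psm: int) -> str:
    """Build the Tesseract option string for a page segmentation mode."""
    return f"--psm {psm}"


def run_psm_modes(
    image,
    psm_modes: Iterable[int] = (7, 13),
    lang: str = "chi_sim",
    ocr: Optional[Callable[..., str]] = None,
) -> Dict[int, str]:
    """Run OCR once per segmentation mode and return the text for each mode."""
    if ocr is None:
        # Imported lazily so the helper can be exercised with a stand-in OCR function.
        import pytesseract
        ocr = pytesseract.image_to_string

    return {
        psm: ocr(image, lang=lang, config=tesseract_config(psm)).strip()
        for psm in psm_modes
    }
```

Calling `run_psm_modes(image)` on one prepared image returns a dictionary keyed by mode, which makes it easy to see where mode 7 and mode 13 disagree.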
Thank you in advance,
Jan