such as this image. It's not very complex.
It is interpreted as the two characters 青 and 争. But this is actually a relatively good case. For this one,
the recognition is totally off: Tesseract interprets the character as three parts from top to bottom, and reads the bottom part as the symbol ^. The worst case is this one,
which produces completely garbage output.
In all my use cases, I only need to detect a single Chinese character at a time. My question is: what can I do to improve the accuracy of the recognition? Thanks.
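For reference, here is a minimal sketch of one way to run Tesseract on a single-character image. This assumes the pytesseract Python wrapper and the chi_sim language data are installed; the file name single_char.png is just a placeholder.

```python
# Minimal sketch: recognizing one Chinese character with Tesseract.
# Assumes pytesseract, Pillow, and the chi_sim traineddata are installed;
# "single_char.png" is a placeholder file name.
from PIL import Image
import pytesseract

img = Image.open("single_char.png")

# --psm 10 tells Tesseract to treat the image as a single character,
# which matches the one-character-per-image use case described above.
text = pytesseract.image_to_string(img, lang="chi_sim", config="--psm 10")
print(text.strip())
```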
Here's an example where the Chinese characters are very large and clear, but Tesseract gets the wrong result. Could you advise on what image processing could help Tesseract's accuracy?
爸爸说我
It looks correct to me, but I don't read Chinese.
Basically, I just thresholded the image so that anything that wasn't very white became completely black. I didn't even bother inverting the white-on-black result.
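A minimal sketch of that kind of thresholding, assuming OpenCV is available (the cutoff value 200 and the file names are just illustrative placeholders):

```python
# Minimal sketch of the thresholding described above: any pixel that is not
# very white is forced to pure black. Assumes OpenCV (cv2); the cutoff of
# 200 and the file names are illustrative, not taken from the original post.
import cv2

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

# Pixels above the cutoff stay white (255); everything else becomes black (0).
# No inversion is applied afterwards, matching the description above.
_, binary = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY)

cv2.imwrite("thresholded.png", binary)
```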