Problem with single line three character Chinese


Sayang

Jul 31, 2014, 2:36:12 AM
to tesser...@googlegroups.com
a) Tesseract correctly OCR'd eight lines of Chinese (each more than 30 characters), scanned from a book.

b) Tesseract seemed to fail when OCR'ing a single-line image containing three characters (xingqisi, "Thursday").

   (i) Four different fonts were tried, giving four different single-line images (attached).
   (ii) The binary data produced by Tesseract was identical for all four attempts.
   (iii) There were no error messages.

Any suggestions would be greatly appreciated.
xingqisi04.png
Xingqisi01.png
Xingqisi02.png
Xingqisi03.png

Paul

Jul 31, 2014, 3:17:34 PM
to tesser...@googlegroups.com
Did you try the options -psm 7 or -psm 8?
You will probably get better results with one of them.
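
For anyone trying this suggestion, here is a minimal sketch of how the two modes could be invoked from Python. Page segmentation mode 7 treats the image as a single text line and mode 8 treats it as a single word. The helper names (`build_tesseract_command`, `run_single_line_ocr`), the output base name `out`, and the `chi_sim` language code are assumptions for illustration, not something stated in the thread. The `-psm` spelling matches the Tesseract 3.x releases current when this was posted; newer releases spell it `--psm`.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, psm, lang="chi_sim",
                            legacy_flag=True):
    """Build the argument list for one Tesseract run.

    legacy_flag=True uses the 3.x spelling "-psm"; set it to False for
    newer releases that expect "--psm". The default language code is an
    assumption; use whichever traineddata is actually installed.
    """
    psm_flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, output_base, "-l", lang, psm_flag, str(psm)]


def run_single_line_ocr(image_path, output_base="out", psm=7, lang="chi_sim",
                        legacy_flag=True):
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_command(image_path, output_base, psm, lang, legacy_flag)
    subprocess.run(cmd, check=True)
    # Tesseract writes its plain-text result to <output_base>.txt
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read().strip()


# Show the commands that would be run for the two suggested modes.
for mode in (7, 8):
    print(" ".join(build_tesseract_command("xingqisi04.png", "out", mode)))
```

Calling `run_single_line_ocr("xingqisi04.png", psm=7)` and then again with `psm=8` on each of the four attached images would show whether either mode recovers the three characters.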

Paul

HanmoLingfeng

Aug 15, 2014, 5:49:36 AM
to tesser...@googlegroups.com
I sometimes see this problem in my project too: when the image (.jpg) file is converted to a .tif file, the resulting file is empty. Do you know the reason?
Thanks for your help.