Why can't such a simple word be recognized?


xll...@gmail.com

Aug 14, 2018, 6:59:01 AM
to tesseract-ocr
I use OpenCV to extract characters from an image and combine them, but Tesseract fails to recognize the result.
I have tested with the parameters "-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.-\\'" together with "-psm 7" and "-psm 8", still with no luck.
Please see the attachment, ears.png.

However, some other words were recognized successfully, such as godmother.png.

Could someone please tell me what is going wrong?
ears.png
godmother.png
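
For anyone wanting to reproduce the runs described above, here is a minimal sketch of how the command lines might be assembled and compared across page segmentation modes. It assumes the `tesseract` binary is on the PATH; the helper names `build_tesseract_cmd` and `try_modes` are hypothetical, and the whitelist is a simplified stand-in for the one in the post. Tesseract 3.x spells the flag `-psm`, while 4.x uses `--psm`, hence the `legacy_flag` switch.

```python
import shutil
import string
import subprocess

# Simplified stand-in for the whitelist quoted in the post (assumption).
WHITELIST = string.ascii_uppercase + string.ascii_lowercase + ",.-'"


def build_tesseract_cmd(image, psm, whitelist=None, legacy_flag=False):
    """Return the argv list for one Tesseract run that prints text to stdout."""
    # Tesseract 3.x accepts "-psm"; 4.x expects "--psm".
    psm_flag = "-psm" if legacy_flag else "--psm"
    cmd = ["tesseract", image, "stdout", psm_flag, str(psm)]
    if whitelist:
        cmd += ["-c", "tessedit_char_whitelist=" + whitelist]
    return cmd


def try_modes(image, modes=(7, 8)):
    """Run Tesseract once per PSM and collect the recognized text.

    Returns {psm: text}; the value is None when tesseract is not installed.
    """
    results = {}
    have_tesseract = shutil.which("tesseract") is not None
    for psm in modes:
        cmd = build_tesseract_cmd(image, psm, WHITELIST)
        if not have_tesseract:
            results[psm] = None
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[psm] = proc.stdout.strip()
    return results
```

Calling `try_modes("ears.png")` would then show side by side what PSM 7 (single text line) and PSM 8 (single word) return for the same image.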

zwwts...@gmail.com

Aug 14, 2018, 8:41:06 AM
to tesseract-ocr
It's interesting. I've tried many ways of processing the image: inverse binarization, cropping, and resizing.
I've tried the OEM of 3.0.0 and 4.0.0, with PSM 3, 6, and 7.
I thought at least one combination might work, but none did, and nothing was output.
Maybe this particular font just hits some weakness of Tesseract.


On Tuesday, August 14, 2018 at 6:59:01 PM UTC+8, xll...@gmail.com wrote:
ears_1.png
ear_inv.png
ear_inv_1.png
ear_inv_2.png

xll...@gmail.com

Aug 16, 2018, 12:54:57 PM
to tesseract-ocr
Yes, and the funnier thing is that if I feed all the words to Tesseract together, it successfully reads "ears". You can test this with the attachment.

On Tuesday, August 14, 2018 at 8:41:06 PM UTC+8, zwwts...@gmail.com wrote:
hat.png

xll...@gmail.com

Aug 16, 2018, 1:24:27 PM
to tesseract-ocr
Many thanks for your information.
I have tried a range of scaling factors, from y 1.0, x 0.8 down to y 1.0, x 0.5, and none of them work.
I will try fine-tuning Tesseract on this font.
Thanks again for letting me know the name of the font :)


On Wednesday, August 15, 2018 at 9:49:01 PM UTC+8, James Q wrote:
It looks like you may need to fine-tune Tesseract on this particular font. From the letters in your images it looks like 'Bevan', which you can download from here:

https://www.fontsquirrel.com/fonts/bevan

If you are unable to train Tesseract, I have sometimes had success by stretching the image (changing its aspect ratio). In this case it is quite a fat font, so stretching it taller might improve the result.

Hope this helps
James
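
The stretching idea above can be sketched as follows. This is only an illustration: `stretched_size` and `build_stretch_cmd` are hypothetical helpers, the 200x100 pixel size in the usage note is made up, and the sketch assumes ImageMagick's `convert` is available. The trailing `!` in the geometry asks ImageMagick to ignore the original aspect ratio.

```python
def stretched_size(width, height, x_scale=1.0, y_scale=1.0):
    """Return the (width, height) after scaling each axis independently."""
    if x_scale <= 0 or y_scale <= 0:
        raise ValueError("scale factors must be positive")
    # Never let a dimension collapse to zero pixels.
    return max(1, round(width * x_scale)), max(1, round(height * y_scale))


def build_stretch_cmd(src, dst, width, height, x_scale=1.0, y_scale=1.0):
    """Return an ImageMagick argv list that resizes src to the stretched size."""
    new_w, new_h = stretched_size(width, height, x_scale, y_scale)
    # "WxH!" forces the exact size instead of preserving the aspect ratio.
    return ["convert", src, "-resize", f"{new_w}x{new_h}!", dst]
```

For a hypothetical 200x100 image, `build_stretch_cmd("ears.png", "ears_tall.png", 200, 100, y_scale=1.5)` yields a command that makes the letters 50% taller while keeping the width unchanged. The resulting list can be passed to `subprocess.run`.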

xll...@gmail.com

Aug 16, 2018, 1:27:05 PM
to tesseract-ocr
Here are all the scaled images.

On Friday, August 17, 2018 at 1:24:27 AM UTC+8, xll...@gmail.com wrote:
ears-0.5.png
ears-0.6.png
ears-0.8.png
ears-0.7.png

Art Rhyno

Aug 18, 2018, 3:48:18 PM
to tesser...@googlegroups.com

Try reducing the images, e.g.:

convert -resize 25% ears.png ears2.png

art
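
Since recognition can depend on character size, it may be worth trying several reduction factors rather than a single one. The sketch below only builds the command lines for such a sweep. `build_downscale_cmds` is a hypothetical helper, the percentages other than 25 are arbitrary choices, and `--psm 8` (single word) is an assumption about the Tesseract version in use.

```python
import os


def build_downscale_cmds(src, percents=(25, 50, 75)):
    """Return (resize_cmd, ocr_cmd) argv pairs, one pair per scale percentage."""
    stem, ext = os.path.splitext(src)
    pairs = []
    for pct in percents:
        dst = f"{stem}-{pct}{ext}"
        # ImageMagick: shrink uniformly to pct percent of the original size.
        resize_cmd = ["convert", src, "-resize", f"{pct}%", dst]
        # Tesseract: treat the reduced image as a single word, print to stdout.
        ocr_cmd = ["tesseract", dst, "stdout", "--psm", "8"]
        pairs.append((resize_cmd, ocr_cmd))
    return pairs
```

Each pair can be run in order with `subprocess.run`, and the outputs compared to see which size, if any, lets the word come through.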
