Tesseract does not detect text with a different font size


Abhishek Khandelwal

Jul 16, 2019, 2:04:45 AM
to tesseract-ocr
After pre-processing, I get the attached binary image. Tesseract detects the title text but fails to detect the word "TITLE" itself. Any suggestions?
513380440_sdpage_5_element5.tiff

Zdenko Podobny

Jul 16, 2019, 2:26:07 AM
to tesser...@googlegroups.com
tesseract 513380440_sdpage_5_element5.tiff - --psm 12

TITLE

0.4 B-TO-B REC ASSY

EMBSTP PKG

Zdenko
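
For readers who want to reproduce the comparison, here is a minimal Python sketch that runs the same command line under several page segmentation modes. It assumes the `tesseract` executable is on PATH; the helper names (`build_command`, `run_ocr`, `compare_modes`) are hypothetical and not part of Tesseract. The mode descriptions follow `tesseract --help-psm`.

```python
import subprocess

# A few page segmentation modes, described as in `tesseract --help-psm`.
PSM_MODES = {
    3: "fully automatic page segmentation, no OSD (default)",
    11: "sparse text",
    12: "sparse text with OSD",
}


def build_command(image_path, psm):
    """Return the argv list for `tesseract IMAGE - --psm N` (text goes to stdout)."""
    return ["tesseract", image_path, "-", "--psm", str(psm)]


def run_ocr(image_path, psm):
    """Run tesseract once and return the recognized text.

    Assumes the tesseract binary is installed and on PATH.
    """
    result = subprocess.run(
        build_command(image_path, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def compare_modes(image_path, modes=PSM_MODES):
    """Map each PSM number to the text tesseract returns for it."""
    return {psm: run_ocr(image_path, psm) for psm in modes}
```

Calling `compare_modes("513380440_sdpage_5_element5.tiff")` would return one text result per mode, which makes it easy to see which mode picks up the word "TITLE".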


Abhishek Khandelwal

Jul 16, 2019, 8:27:00 AM
to tesseract-ocr
Thanks. With --psm 12, however, I do not get the same accuracy as in the default mode, so I would rather crop the top-left corner and run Tesseract on that region to detect the word "TITLE".
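
A rough sketch of that cropping approach in Python, under some assumptions: Pillow is installed, `tesseract` is on PATH, and the label sits in roughly the top-left 30% of the image. The fractions, the choice of `--psm 7` (treat the image as a single text line) for the crop, and the function names `top_left_box` and `ocr_top_left` are illustrative guesses, not values taken from this thread.

```python
import subprocess
import tempfile
from pathlib import Path


def top_left_box(width, height, frac_w=0.3, frac_h=0.3):
    """Return a (left, upper, right, lower) box covering the top-left corner."""
    if not (0 < frac_w <= 1 and 0 < frac_h <= 1):
        raise ValueError("fractions must be in (0, 1]")
    right = max(1, round(width * frac_w))
    lower = max(1, round(height * frac_h))
    return (0, 0, right, lower)


def ocr_top_left(image_path, frac_w=0.3, frac_h=0.3, psm=7):
    """Crop the top-left corner and OCR only that region.

    psm=7 is an assumption that suits a one-word label such as "TITLE".
    """
    # Imported here so the box helper above works without Pillow installed.
    from PIL import Image

    with Image.open(image_path) as img:
        crop = img.crop(top_left_box(img.width, img.height, frac_w, frac_h))
        with tempfile.TemporaryDirectory() as tmp:
            crop_path = Path(tmp) / "corner.png"
            crop.save(crop_path)
            result = subprocess.run(
                ["tesseract", str(crop_path), "-", "--psm", str(psm)],
                capture_output=True,
                text=True,
                check=True,
            )
    return result.stdout.strip()
```

The rest of the image can still be processed in the default mode, so the accuracy on the main title text is unaffected; only the small label region goes through the separate pass.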