Way to set minimum font size to reduce errors


John James

Apr 21, 2015, 2:33:37 AM
to tesser...@googlegroups.com
Hi All

I am looking for a parameter that sets the minimum bounding-rectangle size that Tesseract will accept as a character. For example, every character in my image has a rectangle height of 40 pixels, but occasionally Tesseract erroneously recognizes a character that is only 15 pixels high.

All help appreciated.

Regards 

Codger

Dmitri Silaev

Apr 21, 2015, 3:39:29 AM
to tesser...@googlegroups.com
You can try "textord_min_xheight" (in pixels). However, it might not work, since there are many possible reasons why this happens, and other parameters may help instead. Please post a few sample images and explain what is going wrong, along with your command line and the Tesseract version you are using.

Best regards,
Dmitri Silaev
www.CustomOCR.com


B. Zaręba

May 24, 2016, 7:00:17 AM
to tesseract-ocr
Probably not useful for the OP anymore, but if someone else lands here looking for answers:

Try:
textord_old_xheight 1 #this is important
textord_min_xheight 35

Also you may find this useful:
textord_max_noise_size 18
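
For anyone unsure how to apply the settings above, here is a minimal sketch in Python. It assumes they go into a plain-text config file (one "name value" pair per line) whose path is passed to tesseract after the output base. The file name xheight.cfg, the image name page.png, and both helper functions are hypothetical placeholders, and whether these values help will depend on your images and Tesseract version.

```python
import shlex


def render_config(params):
    """Render parameters as Tesseract config-file text, one 'name value' per line."""
    return "".join(f"{name} {value}\n" for name, value in params.items())


def build_command(image_path, output_base, config_path):
    """Build the argument list: tesseract <image> <output base> <config file>."""
    return ["tesseract", image_path, output_base, config_path]


# Values taken from the post above; tune them for your own images.
params = {
    "textord_old_xheight": 1,
    "textord_min_xheight": 35,
    "textord_max_noise_size": 18,
}

config_text = render_config(params)
command = build_command("page.png", "page", "xheight.cfg")

print(config_text, end="")
print(shlex.join(command))
```

Save the printed config text as xheight.cfg and then run the printed command. On versions that support it, the same values can instead be passed on the command line as "-c name=value".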