How can I define character size for Tesseract?


ali madad

Aug 21, 2018, 5:25:05 AM

to tesseract-ocr

Dear researchers,

When I use Tesseract for character recognition, I get many wrong results in which the character width or height is about 4 pixels.

Is there any way to restrict Tesseract's recognition to a range of character sizes?

Thank you very much for your time.

Best Regards

vishnu thampan

Sep 18, 2018, 5:47:44 AM

to tesseract-ocr

From your description, it sounds like you are trying to extract text from a small image, such as one that contains very few characters or perhaps a single word. To improve accuracy, you can use the PSM (Page Segmentation Mode) option. Tesseract 4 offers 14 page segmentation modes (0 to 13). The usage can be:

pytesseract.image_to_string(image, lang='eng', config='--psm 8')

psm 8: Treat the image as a single word

psm 7: Treat the image as a single text line

By default, Tesseract uses psm 3 (fully automatic page segmentation, without orientation and script detection).
For more information: Go here
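
On the original question about restricting results by size: I am not aware of a simple switch for this, so one workaround is to filter the results after recognition. The sketch below assumes the pytesseract wrapper and its image_to_data call, which returns a bounding box for each recognized word. The helper names filter_words_by_size and ocr_with_size_filter and the pixel thresholds are made up for illustration, so tune them for your images. Note that the boxes are per word, not per character, so the box height only approximates the character height.

```python
from typing import Dict, List


def filter_words_by_size(data: Dict[str, list], min_h: int = 8,
                         max_h: int = 80, min_w: int = 4) -> List[str]:
    """Keep recognized words whose bounding box lies inside a size range.

    `data` is expected to look like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    The default thresholds are illustrative assumptions, in pixels.
    """
    kept = []
    for text, w, h in zip(data["text"], data["width"], data["height"]):
        if not str(text).strip():
            continue  # rows for pages, blocks and lines carry no text
        if int(w) >= min_w and min_h <= int(h) <= max_h:
            kept.append(text)
    return kept


def ocr_with_size_filter(image, psm: int = 7, **size_limits) -> List[str]:
    """Run Tesseract with a given PSM, then drop boxes outside the size range."""
    # Imported here so the filter above works without Tesseract installed.
    import pytesseract
    from pytesseract import Output

    data = pytesseract.image_to_data(
        image, lang="eng", config=f"--psm {psm}", output_type=Output.DICT
    )
    return filter_words_by_size(data, **size_limits)
```

For example, ocr_with_size_filter(image, psm=7, min_h=10) would keep only words whose box is at least 10 pixels tall, where image is a PIL image you have already loaded. If you need real per-character boxes, pytesseract.image_to_boxes may be a better starting point.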
