Tesseract low accuracy in single character recognition

Mosn

Jan 2, 2016, 12:53:27 PM

to tesseract-ocr

Hello, I've been going through the posts in this group for almost 3 days but cannot find a fix for my issue. I need to recognize single characters with Tesseract, but the accuracy is very poor, even after image preprocessing and configuring Tesseract for this case.

To start, this is my original image: http://i.imgur.com/MwTswFA.jpg

I preprocess the image (invert colors, resize, and sharpen) to get this: http://i.imgur.com/Pl6OVE3.png
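To make the first two preprocessing steps concrete, here is a minimal sketch on a toy grayscale grid. It is not my actual iOS image pipeline: the function names and the 2x2 sample are made up for illustration, the upscaling is plain nearest-neighbor, and sharpening is left out.

```python
def invert(pixels):
    """Invert an 8-bit grayscale image given as a list of rows."""
    return [[255 - p for p in row] for row in pixels]


def upscale(pixels, factor):
    """Nearest-neighbor upscale: repeat each pixel `factor` times in both directions."""
    out = []
    for row in pixels:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 2x2 glyph: 0 is black, 255 is white.
glyph = [[0, 255],
         [255, 0]]
processed = upscale(invert(glyph), 2)
for row in processed:
    print(row)
```

In a real pipeline an image library would do this work, and the resize would normally use a smoother interpolation than nearest-neighbor.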

I pass this to Tesseract and most of the time I get back "W". This also happens with "A" and other characters. The confidence values for the M are: ( "(64.64%) 'W'", "(57.08%) 'M'")

I also tried to fine-tune the Tesseract settings and did the following:

1. limited the characters with char_whitelist

2. disabled dictionaries

3. set the page segmentation mode to 10 to process a single character

4. also modified the language_model_penalty_non_dict_word and other settings related to it.
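For anyone who wants to reproduce this outside iOS, the settings above roughly correspond to the command line built below. This is only a sketch under assumptions: it uses the Tesseract 4+ flag spelling (`--psm`; 3.x builds use `-psm`), the uppercase-only whitelist and the file name `char.png` are placeholders, and I'm actually setting these through the iOS wrapper rather than the CLI. The `language_model_penalty_non_dict_word` tweak is left out.

```python
import string


def build_tesseract_args(image_path, whitelist=string.ascii_uppercase):
    """Build an argv list for the tesseract CLI (the command is not executed here)."""
    # Assumption: Tesseract 4+ flag spelling (--psm); 3.x builds use -psm.
    return [
        "tesseract", image_path, "stdout",
        "--psm", "10",                                 # treat the image as a single character
        "-c", f"tessedit_char_whitelist={whitelist}",  # restrict the candidate characters
        "-c", "load_system_dawg=0",                    # disable the main dictionary
        "-c", "load_freq_dawg=0",                      # disable the frequent-words dictionary
    ]


# Placeholder file name; pass the list to subprocess.run to actually invoke tesseract.
args = build_tesseract_args("char.png")
print(" ".join(args))
```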

But none of this helped. I still cannot recognize a simple M. I might be able to do font training, but I don't think font training can help with an upside-down "W" issue.

I would really appreciate any help on this.

I am using Tesseract OCR iOS.

cheers,

IMG_0494 2.JPG
