Problem recognizing perfectly readable numbers: "6" is read as "5"


max.r...@auxilion.de

May 9, 2016, 10:24:39 AM
to tesseract-ocr
Hi,

In the attached images, the "6" in the last number of the smaller image is read as a "5". Only when I increase the resolution considerably, as in the larger image, is the "6" read correctly. Is such a high resolution really necessary? Why is the number not recognized correctly in the first image?

Thanks for your help and Best Regards,
Max
test_2_65_400.png
test_2_65_700.png

max.r...@auxilion.de

May 10, 2016, 4:24:05 AM
to tesseract-ocr
In another case I have the opposite situation: recognition is fine at the lower resolution, but an "8" becomes a "3" when read from the higher-resolution image. Any advice on which resolution should work best, or how I could improve reading those numbers?
test_1_80_400.png
test_1_80_700.png

Meh Hem

May 16, 2016, 11:38:06 PM
to tesseract-ocr
Hi Max,

Since you are getting a really nice image with clearly defined glyphs, I would suggest that the next step be training on the original-resolution image.

jtess is a really easy way to train on simple text like this.

I actually have the opposite issue in one of my projects, where increasing the resolution reduces my accuracy.

max.r...@auxilion.de

May 17, 2016, 3:47:05 AM
to tesseract-ocr
Thank you for your answer.
Meanwhile, I found out that my problem seems to be caused by the vertical lines in the images. I had assumed that only horizontal lines hurt recognition, but removing the vertical lines as well improves it.
If I encounter any more problems with these images, I will try training with jtess.
Best Regards,
Max
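
For readers wondering what removing the lines could look like: the thread does not say how Max did it, so the following is only a minimal sketch of one possible approach. It assumes the image has already been binarized into a list of rows with 1 for ink and 0 for background, and that the ruling lines are longer than any stroke of a digit. The function name `remove_long_runs` and the `min_len` parameter are made up for this illustration and are not part of Tesseract.

```python
def remove_long_runs(img, min_len):
    """Return a copy of a binary image (list of rows, 1 = ink, 0 = background)
    in which every horizontal or vertical run of ink pixels that is at least
    min_len pixels long has been cleared."""
    height = len(img)
    width = len(img[0]) if height else 0
    to_clear = set()

    # Collect long horizontal runs, row by row.
    for y in range(height):
        x = 0
        while x < width:
            if img[y][x]:
                start = x
                while x < width and img[y][x]:
                    x += 1
                if x - start >= min_len:
                    to_clear.update((y, i) for i in range(start, x))
            else:
                x += 1

    # Collect long vertical runs, column by column.
    for x in range(width):
        y = 0
        while y < height:
            if img[y][x]:
                start = y
                while y < height and img[y][x]:
                    y += 1
                if y - start >= min_len:
                    to_clear.update((i, x) for i in range(start, y))
            else:
                y += 1

    # Work on a copy so the caller's image stays untouched.
    out = [row[:] for row in img]
    for y, x in to_clear:
        out[y][x] = 0
    return out


# Tiny hypothetical example: a vertical line in column 0, a horizontal line
# in the last row, and a small 3x2 "glyph" that should survive.
image = [
    [1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
cleaned = remove_long_runs(image, min_len=5)
```

`min_len` has to be larger than the tallest and widest digit stroke, otherwise parts of the digits are erased too. Pixels where a line crosses a digit are also cleared, so lines that touch the glyphs may still need extra cleanup.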