Difficult image, any tips would be appreciated

Chris E.

Nov 12, 2022, 12:57:55 PM

to tesseract-ocr

Hi,

I want to OCR this kind of image, which comes from a video grabber and is unfortunately of pretty bad quality. With Tesseract's default options, the output is pretty useless.

Before I start digging deeper into training tesseract, I would love to hear some recommendations. Would it be possible to achieve a good result from this kind of image with proper training?

Any further ideas/tips would be appreciated!

Greetings,

Chris

Lorenzo Bolzani

Nov 13, 2022, 6:15:17 AM

to tesser...@googlegroups.com

Hi Chris,

you should try to get something like this:

I inverted the header section and then applied a different threshold to each part. If you are not interested in the titles, you can just crop them out.

The image is blurry; maybe it was upscaled a little? If so, try different upscaling factors, preferably whole integers like 2x, 3x, etc., to see if that improves things. Alternatively, check whether other frames from the video are better, or improve the video capture itself (resolution, lighting, frame rate, etc.).

This is what I get:

Modes Dunchieachiungszet

= ro
oF wn [3
HF omen | mm
Gesamt 00s 0%

Quite unusable, but at least it is starting to find something.

I think training will help IF all your images have this kind of blurry text and you use actual crops from these images for training.

Bye

Lorenzo

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/edf2898d-e442-46a5-bf0c-46f38561c20en%40googlegroups.com.

Mehmet Furkan

Nov 13, 2022, 7:27:06 AM

to tesseract-ocr

Wow, good job! Could you share the source code for this OCR? If that's okay, I'd be really happy.

Lorenzo Bolzani

Nov 13, 2022, 7:41:36 AM

to tesser...@googlegroups.com

I did it by hand with Gimp.

The code depends on what you know about the image. If it has a fixed size and a fixed location, you can easily do this with, for example, Python and OpenCV: crop, invert the header, apply two different thresholds.

If the size/alignment are not fixed you could use SIFT to align the image with a fixed template (or use Hough lines to rotate it or something similar if there is not a lot of perspective correction to do).

If it is aligned but not fixed size, you can detect the darkest part with threshold and findContours (with open/close/erode to clean the image) or in simpler ways; it really depends on how much the gray tones change between frames. You could do a floodFill in a few known locations of the header with a different color and find the contours for this colored region (and use the rectangle rotation to rotate the image, if needed).

It may take a few hours or a few days depending on the images.

Bye

Lorenzo

Chris E.

Nov 13, 2022, 9:10:51 AM

to tesseract-ocr

Hi Lorenzo,

thank you so much for your ideas! Unfortunately, I don't think I can get a better image quality. It's a VGA signal that's being grabbed, and well, that's the result. Maybe I'll try a different converter.

I did some more tests, too, and the only way I found to get slightly better results is to segment the image manually and then feed the individual segments into tesseract. My problem is that I need to be able to rely on the results (perhaps not 99%, but at least 90%), and that sounds pretty hard to achieve.
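Feeding pre-cropped segments to tesseract one by one could be scripted roughly as follows. This is only a sketch: it assumes the `tesseract` command-line tool is on the PATH and that each segment has already been saved as its own image file. The function names are hypothetical, `--psm 7` (treat the image as a single text line) is an assumed setting for single-line segments, and `eng` is a placeholder language.

```python
import subprocess


def tesseract_cmd(image_path, psm=7, lang="eng"):
    """Build the command line for OCR-ing one pre-cropped segment image."""
    # Using "stdout" as the output base makes tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "--psm", str(psm), "-l", lang]


def ocr_segments(image_paths, psm=7, lang="eng"):
    """Run tesseract on each segment file and return {path: recognized text}."""
    results = {}
    for path in image_paths:
        proc = subprocess.run(tesseract_cmd(path, psm, lang),
                              capture_output=True, text=True, check=True)
        results[path] = proc.stdout.strip()
    return results
```

For segments that hold more than one line, a different page segmentation mode such as `--psm 6` (a single uniform block of text) may fit better.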

Greetings,

Chris

Chris E.

Nov 13, 2022, 9:25:11 AM

to tesseract-ocr

BTW, Google Lens detects ALL text on the image perfectly.... ;)

Tom Morris

Nov 13, 2022, 3:12:57 PM

to tesseract-ocr

The image has "mosquito noise" around the characters, which indicates that it's been compressed with JPEG or a similar algorithm. You should definitely try to avoid any compression at this low a resolution.

I think your idea of investigating different video capture devices is a good one. It looks to me like there is horizontal "ringing" or an echo in the signal which is showing up as two ghost images slightly offset from each other in the X axis. With a clean signal you'd have a much easier time. If you are forced to deal with this, you can construct your filter matrices to operate in the X axis, but leave the Y axis untouched.
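One way to picture a filter that works only along the X axis is the sketch below. It assumes a deliberately simple ghosting model that is not confirmed by the thread: each row is the clean signal plus one attenuated copy shifted right by `shift` pixels with strength `gain`. Both values are hypothetical and would have to be measured from a real frame; a second ghost would need another pass or an extra term.

```python
import numpy as np


def remove_horizontal_echo(image, shift, gain):
    """Undo observed[x] = clean[x] + gain * clean[x - shift] along each row."""
    observed = image.astype(np.float64)
    clean = observed.copy()
    width = observed.shape[1]
    # Work left to right in blocks of `shift` columns: every block depends
    # only on the already-restored block to its left. Rows (the Y axis)
    # are never mixed with each other.
    for start in range(shift, width, shift):
        stop = min(start + shift, width)
        clean[:, start:stop] = (observed[:, start:stop]
                                - gain * clean[:, start - shift:stop - shift])
    return clean
```

The result is a float array; before saving or passing it to OCR it would typically be clipped to 0..255 and converted back to `uint8`.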

Tom

Chris E.

Nov 13, 2022, 3:44:07 PM

to tesseract-ocr

Hi Tom,

the compression artifacts are of course easy to avoid, but the “ghosting” in the image is definitely a severe problem. I noticed that, too, but I have no idea what the reason could be. Again, a different AD converter could help. I already tried to clean the “ghosting”, but had no success.

Another interesting fact: I found a different OCR solution called "https://ocr.space/", which seems to handle my kind of images pretty well. It's not a free service (decent free tier, though), but it could be an alternative if I can't manage to get better image quality.

I would still prefer a local solution with tesseract and will post my updates.

Greetings,

Chris
