OCR pipeline with OpenCV

Nicolas Colomer

Jun 18, 2019, 3:45:17 PM

to tesseract-ocr

Hello tesseract community,

I'm looking for help with an OCR use case I am trying to implement using a combination of Python, OpenCV and Tesseract.

The goal is to periodically take a picture of the screen of my heat pump controller and extract the measurements it displays.

It's a (fun) DIY project mixing electronics, a Raspberry Pi Zero with a Pi camera, a 3D-printed support, etc.

Here is a photo of the project in its current state:

My problem is that the results are quite disappointing and not usable for now. I tried various hardware changes (camera focus, lighting) and software changes (image processing algorithms and filters), with the help of various internet resources (Tesseract docs, OpenCV docs, OCR blog posts), with no convincing success.

From this image (scaled down; see the step0.jpeg attachment for the original picture):

I end up with this:

Finally, when I apply Tesseract, the best result I could obtain is:

. YALEURS MESUREES

EATER PEUF 21.a%

Se PAM 24.2%

SG EAL Be Las

ECT OLIF: EAW INT 22.4%

7 at

What I don't understand from this point:

the image looks quite readable to me, especially the numbers
why is there a leftover "7 at" at the end?
I tried to extract each line independently, with no improvement
I tried adding a white border, which doesn't help
I tried both Tesseract 3 and Tesseract 4 (LSTM), with no significant difference

You'll find attached the pipeline.py file that contains the image preparation pipeline I use.

It is implemented in Python 3 with OpenCV (via opencv-python-headless 3.4.4.19) and Tesseract (via pytesseract 0.2.6).

In addition, I attached the image output of each step:

step0.jpeg: original file
step1-blur.jpeg: apply gaussian blur filter (helps later when applying adaptive threshold)
step2-rotate.jpeg: rotate the image 180°
step3-crop.jpeg: select only the interesting part
step4-perspective.jpeg: transform the image to "flatten" it (i.e. remove the perspective)
step5-threshold.jpeg: apply adaptive threshold filter

I think the problem lies somewhere in my image preparation pipeline and is related to output quality, but I can't figure out what is causing the poor results or how to improve them.

Any thoughts or guidance?

Thanks for your help,

Nicolas

step0.jpeg

step1-blur.jpeg

step2-rotate.jpeg

step3-crop.jpeg

step4-perspective.jpeg

step5-threshold.jpeg

pipeline.py

Mox Betex

Jun 18, 2019, 6:16:21 PM

to tesseract-ocr

Did you train Tesseract?

The image quality is too poor for OCR; you have to improve it.

Also check the resolution of the image.

ElGato ElMago

Jun 18, 2019, 9:32:20 PM

to tesseract-ocr

Those images and fonts are obviously not meant for OCR. You need to improve the images and train on the fonts.

Do you only need to read temperatures? Then some pattern recognition method in OpenCV might be easier to work with.

On Wednesday, June 19, 2019 at 7:16:21 AM UTC+9, Mox Betex wrote:

Lorenzo Bolzani

Jun 19, 2019, 5:49:49 AM

to tesser...@googlegroups.com

Hi Nicolas, I think what you did is good; you just need to experiment more with pre-processing.

I usually process the images with Gimp until I get good results, then I try to reproduce the same processing with OpenCV/PIL.

You do not strictly need to threshold the image; a very strong contrast is enough and may work better. Play with curves, histogram normalization (cv2.normalize with cv2.NORM_MINMAX, cv2.equalizeHist, skimage.exposure.rescale_intensity, PIL.ImageOps.autocontrast), multiplying the image with itself, and sharpening. A difference of Gaussians could also give good results. A little blur/denoise should remove the small specks of dust, or you can apply close/erode after thresholding.

You can try CLAHE to normalize the illumination, which looks like a big problem. The left part is terrible, barely readable; maybe the plane of focus is not parallel to the screen? The "EXT" text is very hard to recover. Try to get a better starting image if possible. The screen contrast in the other shots is very good, so try to understand why it gets lost. Maybe the viewing angle? Camera settings? Maybe the LCD looks bad when seen from above, and the side or bottom would work better, or a few more degrees of tilt? Too much light washing out the black text (from the LEDs)?

See the attached script for some ideas and examples (not fine tuned for this image, I used it for something else). Try the other one to understand where the text is coming from (I think you need libtesseract-dev to install tesserocr).

I would crop the image into five separate lines and process each one individually, especially for adaptive steps such as Otsu, CLAHE, etc. You could also separate the text part (left) from the numbers and process them separately. Do you really need the text on the left? Isn't it fixed?

You do not need so much resolution; downscale the text so that each line is about 30 to 50 px high, and try different scales to see which works best. Usually it is better to downscale after the pre-processing.

I think the DPI is only used for the page segmentation part; I never set it, I just downscale the text and use single lines.

I do not think you need to fine tune the model right now; try as much as possible not to do it. If you cannot get good results with hand-tuned Gimp pre-processing of individual lines, consider fine tuning on this font. The big advantage of fine tuning is also that you can limit the set of characters. There is a digits-only model around; you may try that on the numbers part.

This is what I get from tesseract:

$ tesseract -l eng --psm 6 step5-threshold.jpeg -

YY ALEIRES MESUREES
EXTER | EUR 21.9%
SHEE ANT 24.2%
EG EA 2a .e%T
HBETTOUE EAD ITNT 22.4%

Downscale the image to width 400:

YRLEURS MESUREES :
EETERIEDR 21 .9¢
SIE AMT 24.2%
CEG EAU 20.0%
SETOUR EAD INT 22.4%

This is almost perfect, but it is quite fragile, using 500 I get this:

YALEURS MESUREES :
ENTER EUR 21.9%
SAEIEE AMT 24.2%
EG EA 26.8%
SETOUR EAD INT 22.4%

Cutting out the Celsius unit gives slightly better results. Also, a darker threshold, where the characters are more connected, seems to work better. Maybe the font needs some fine tuning; the third line with the zeros is the main problem. Cutting individual lines or numbers does not help.

I started over with Gimp from step0, focusing on the numbers, and got the attached image. With this one the results are more stable across different image downscales (400, 500, 600). Maybe the third line was just a bad case.

VAIL FURS NMESUREES
XTE (EI IF 21.9%
Sead T 24.2%
EAU 20.0%
EFAS INT 22.94%

Bye

Lorenzo

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/e5006f70-5c86-4c12-a115-b406edb12432%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

clean_test.py

ocr_boxes.py

step01.jpeg

Nicolas Colomer

Jun 19, 2019, 7:08:14 AM

to tesser...@googlegroups.com

Thanks all for your answers!

@Mox Betex

Did you train Tesseract?

@ElGato ElMago

Those images and fonts obviously are not for OCR. Need to improve images and train fonts.

No, I use vanilla Tesseract and only tune the parameters of the binary.

I'd like to avoid training my own model at first, but I'm adding it to my "things to explore" list.

@ElGato ElMago

Do you only need to read temperatures? Then some pattern recognition method in OpenCV might be easier to work with.

Yes, only the numbers on the right are needed. Actually, this looks like a very good idea (in that the results may be more predictable); it's worth a try.
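
Since only the numbers matter, the labels in the OCR output can simply be ignored. Below is a small sketch that pulls one decimal number per line out of the raw text. The function name is hypothetical, and the repair of a space before the decimal point is a heuristic based on outputs seen earlier in this thread (such as "21 .9").

```python
import re

# A decimal number such as 21.9
NUMBER = re.compile(r"\d+\.\d+")
# A decimal point separated from its digits by stray spaces, e.g. "21 .9"
SPLIT_DECIMAL = re.compile(r"(?<=\d)\s*\.\s*(?=\d)")


def extract_readings(ocr_text):
    """Return the first decimal number found on each line, skipping lines without one."""
    readings = []
    for line in ocr_text.splitlines():
        repaired = SPLIT_DECIMAL.sub(".", line)
        match = NUMBER.search(repaired)
        if match:
            readings.append(float(match.group()))
    return readings
```

This only parses what Tesseract returned; a misread digit still comes through as a valid-looking number, so a plausibility check against the previous reading would be a sensible addition.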

@Lorenzo Bolzani, thanks for your very detailed answer! This is valuable stuff that will help me iterate, for sure.

I'll keep you posted on my progress,

Cheers

Nicolas


Mox Betex

Jun 19, 2019, 8:13:27 AM

to tesseract-ocr

@Mox Betex
Did you train Tesseract?

Yes, I have.
