Image pre-processing pipeline for a general image captured from a camera.


s4re...@gmail.com

Jun 11, 2017, 3:06:43 PM
to tesseract-ocr
I am trying to do OCR on images using Tesseract, but I have not been able to work out a proper pre-processing technique for them.

The problems I am facing are:

1. Low-contrast images: The images contain different texts with different font sizes. What should my approach be to enhance the contrast of an arbitrary image?

2. Touching characters: Sometimes, after applying adaptive thresholding, two adjacent characters end up touching each other. What is the best way to deal with that?

3. Non-uniform illumination: How should I proceed if I want to correct non-uniform illumination?

How can image segmentation solve my problem?

I have attached a sample image. Assume that the image is not rotated the way it is in the picture; the variety of font sizes and text segments, however, is exactly what I am asking about. Apart from the steps mentioned above, I would appreciate any suggestions for pre-processing this image. Let me know if you have worked out a solution for something similar.

Thanks


upload.jpg

Andres

Jun 14, 2017, 2:47:24 PM
to tesseract-ocr
The things that you mention are not just details; they are subjects of extensive study and specialization.

Of all the computer vision / image processing books that I have, the one that best addresses your questions is "Algorithms for Image Processing and Computer Vision" by J.R. Parker, 2nd edition, ISBN 978-0-470-64385-3.

Cheers,

Andres

s4re...@gmail.com

Jun 15, 2017, 2:13:22 PM
to tesseract-ocr
Hey! Thanks for your suggestion.

Can you give me a brief outline of the general pre-processing steps for this kind of image?

Andres

Jun 16, 2017, 12:53:37 AM
to tesseract-ocr
Some things you can do:

1 - If your image is in color, take advantage of that: transform it to HSV and use H (mainly) to filter by color. That will remove most of the pixels that you don't want. Before you start programming, try it in Photoshop or something similar to see how much it helps.
2 - Search for some blobs to figure out the size of your characters, and then adjust the parameters of your adaptive threshold filter accordingly.
3 - Find the paragraphs using your own method (this is not an easy thing), then segment the characters (not easy either), extract them yourself, and OCR them separately.

I would recommend starting with 1 and 2. 3 requires a lot of work and a good knowledge of image processing with libraries like OpenCV.
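
A minimal sketch of suggestions 1 and 2, written with NumPy only so that it runs without OpenCV. The function names (`hue_mask`, `estimate_text_height`, `window_for_height`, `adaptive_threshold`), the row-projection shortcut used in place of a real blob search, the `min_sat` cut-off, and the `2 * height + 1` window rule are all assumptions made for illustration; treat them as starting points to tune on your own images.

```python
import numpy as np


def hue_mask(rgb, hue_lo, hue_hi, min_sat=0.3):
    """Suggestion 1: keep pixels whose hue (degrees, 0-360) is in [hue_lo, hue_hi].

    rgb is an (H, W, 3) uint8 array in R, G, B order. min_sat is an assumed
    cut-off that discards near-gray pixels, whose hue carries no information.
    Ranges that wrap around 0 degrees (reds) are not handled here.
    """
    rgb = rgb.astype(np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx = rgb.max(axis=-1)
    delta = mx - rgb.min(axis=-1)
    safe = np.where(delta == 0, 1.0, delta)  # avoid division by zero on grays
    hue = 60.0 * np.where(
        mx == r, ((g - b) / safe) % 6.0,
        np.where(mx == g, (b - r) / safe + 2.0, (r - g) / safe + 4.0),
    )
    sat = np.where(mx == 0, 0.0, delta / np.where(mx == 0, 1.0, mx))
    return (hue >= hue_lo) & (hue <= hue_hi) & (sat >= min_sat)


def estimate_text_height(ink):
    """Suggestion 2, simplified: median height of runs of rows that contain ink.

    ink is a rough boolean mask (True = text pixel). This row projection is a
    stand-in for a proper blob / connected-component search.
    """
    rows = ink.any(axis=1).astype(np.int8)
    edges = np.diff(np.concatenate(([0], rows, [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    heights = ends - starts
    return int(np.median(heights)) if heights.size else 0


def window_for_height(height):
    """Assumed rule of thumb: an odd window about twice the text height."""
    return max(3, 2 * height + 1)


def adaptive_threshold(gray, window, offset=10.0):
    """Mark a pixel as ink when it is darker than (local mean - offset).

    gray is a 2-D array; window must be odd. The local mean is computed with
    an integral image over an edge-padded copy of the input.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    gray = gray.astype(np.float64)
    h, w = gray.shape
    padded = np.pad(gray, window // 2, mode="edge")
    integ = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integ[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    total = (
        integ[window:window + h, window:window + w]
        - integ[:h, window:window + w]
        - integ[window:window + h, :w]
        + integ[:h, :w]
    )
    return gray < total / (window * window) - offset
```

One possible flow: build a rough mask (for example with `hue_mask`, or a first threshold pass with a guessed window), feed it to `estimate_text_height`, then rerun `adaptive_threshold` with `window_for_height(...)`. Because the threshold follows the local mean, it also tolerates a fair amount of uneven lighting. If you use OpenCV, `cv2.cvtColor` with `cv2.COLOR_BGR2HSV`, `cv2.inRange`, `cv2.connectedComponentsWithStats` and `cv2.adaptiveThreshold` cover the same steps; note that OpenCV stores 8-bit hue as 0-179 rather than 0-360.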

s4re...@gmail.com

Jun 16, 2017, 2:17:13 AM
to tesseract-ocr
Thanks Andres! I will try implementing those methods.