Use Tesseract to capture digits in a picture


Chan

Jul 4, 2015, 2:24:42 AM
to tesser...@googlegroups.com
Hi there,

I'm new to image processing and know little about it. Recently, I was assigned an image processing task: capture the digits from pictures and output them as text. An example picture is attached for reference. Here is my rough idea of how to implement it:


Step 1. Locate the black block.
Step 2. Enhance the image quality so that it is suitable for Tesseract OCR.
Step 3. Run Tesseract to capture the digits.

However, given my limited experience, I have no idea which tools are suitable for these steps. Are there any suggestions or sample code I could refer to?

Thanks in advance.
Example.jpg

Tom Morris

Jul 4, 2015, 1:25:25 PM
to tesser...@googlegroups.com
I'd suggest looking at OpenCV. This looks more like a computer vision task than an OCR task. Some of the specific issues, like dials not being fully aligned in the window, are things OCR systems aren't designed to deal with, but you could use domain knowledge to handle them in a system like OpenCV.
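
To make the three steps from the original post a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under several assumptions: the grayscale image is represented as a nested list of 0-255 values, the thresholds (60 and 128) are arbitrary, the function names are hypothetical, and the `tesseract` executable is assumed to be on the PATH. The dark-pixel bounding box is a naive stand-in for what OpenCV's thresholding and contour functions would do far more robustly on a real photo.

```python
import subprocess


def find_dark_block(gray, dark_max=60):
    """Step 1: bounding box (top, left, bottom, right) of all pixels <= dark_max.

    Naive: any dark pixel elsewhere in the picture enlarges the box.
    """
    rows = [r for r, row in enumerate(gray) if any(p <= dark_max for p in row)]
    cols = [c for c in range(len(gray[0])) if any(row[c] <= dark_max for row in gray)]
    if not rows or not cols:
        return None
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(gray, box):
    """Cut the region described by box out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in gray[top:bottom]]


def binarize_inverted(gray, threshold=128):
    """Step 2: light digits on a dark block become black (0) on white (255).

    Tesseract generally works better with dark text on a light background.
    """
    return [[0 if p >= threshold else 255 for p in row] for row in gray]


def tesseract_digits_command(image_path):
    """Step 3: command line that asks Tesseract for a single line of digits.

    Flags are written for recent Tesseract releases; 3.x releases
    spelled the page segmentation option as -psm.
    """
    return [
        "tesseract", image_path, "stdout",
        "--psm", "7",                                # treat image as one text line
        "-c", "tessedit_char_whitelist=0123456789",  # restrict output to digits
    ]


def read_digits(image_path):
    """Run Tesseract on an already cropped and cleaned image file."""
    result = subprocess.run(
        tesseract_digits_command(image_path),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

In use, the cropped and binarized region would be saved to an image file with whatever imaging library is at hand and then passed to `read_digits`. Misaligned dials, as mentioned above, would still need handling that is specific to the meter.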

Here's a description of implementing a meter reader in OpenCV:

Tom

Srinivasa TN

Sep 10, 2015, 7:18:05 AM
to tesseract-ocr
Hi Tom,
Where can I get the images required for training emeocv? (I am running ./emeocv -i ../../image/ -l -v DEBUG, but I am trying to find out which images I should have in the image directory.)

Regards,
Seenu.

Tom Morris

Sep 10, 2015, 12:49:08 PM
to tesseract-ocr
On Thursday, September 10, 2015 at 7:18:05 AM UTC-4, Srinivasa TN wrote:
   Where can I get the images required for training emeocv? (I am running ./emeocv -i ../../image/ -l -v DEBUG, but I am trying to find out which images I should have in the image directory.)
 
Sorry, I'm not the author of the article.  I just found it and posted a pointer.

Having said that, I would have thought you'd want to use representative images of the style of meters that you'll be targeting, rather than the examples that were used in the original experiments.

Tom

Srinivasa TN

Sep 14, 2015, 4:24:07 AM
to tesseract-ocr
I used an image from the original link (attached for reference - 20151014132000.png) and it is recognized correctly:

131601  -------

I used the image posted by the OP (attached for reference - 20151014132100.png) and it does not recognize the 3 in it:

4578  -------

I used another image, of a digital display (attached for reference - 20151014133400.png), and the whole image is recognized as the single digit 0:

0  -------

Any suggestions on how to make it recognize digital displays?

Regards,
Seenu.
20151014132000.png
20151014132100.png
20151014133400.png

Tom Morris

Sep 14, 2015, 2:22:48 PM
to tesseract-ocr
Well, first of all, since this is an OpenCV question, I'd suggest you use an OpenCV mailing list or forum to ask it.

You don't mention re-training the model. If you try to recognize a completely different style of digit using a model trained on the images in the example, I wouldn't expect it to work at all.

Tom