OCR of photograph - bounding boxes

Adam Ellsworth

Sep 27, 2017, 1:33:39 AM
to tesseract-ocr
Hello,

I am trying to recognize the digits in the attached image, which is from my water meter. I am having decent success if I crop and manipulate each digit separately with a script and then run tesseract once for each digit. That might work, but it is not very flexible: if the image shifts by a few pixels, the crops no longer line up with the digits and it will stop working.
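To make the per-digit approach concrete, here is a minimal sketch in Python. It is only an illustration, not the actual script: it assumes Pillow and the pytesseract wrapper are installed, that the digits sit in one evenly spaced row, and the names `digit_boxes` and `read_digits` as well as all geometry values are hypothetical. The `--psm 10` option asks tesseract to treat each crop as a single character; whether the digit whitelist is honoured depends on the tesseract version and engine.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels, origin at the top-left corner
Box = Tuple[int, int, int, int]


def digit_boxes(left: int, top: int, digit_width: int, digit_height: int,
                count: int, gap: int = 0, margin: int = 0) -> List[Box]:
    """Compute one crop rectangle per digit for an evenly spaced row.

    `margin` widens every box on all sides so that a shift of a few
    pixels still leaves the digit inside its crop.
    """
    boxes = []
    for i in range(count):
        x0 = left + i * (digit_width + gap)
        boxes.append((x0 - margin, top - margin,
                      x0 + digit_width + margin, top + digit_height + margin))
    return boxes


def read_digits(image_path: str, boxes: List[Box]) -> str:
    """Crop each box and run tesseract once per crop."""
    # Imported here so the geometry helper works without OCR installed.
    from PIL import Image
    import pytesseract

    image = Image.open(image_path).convert("L")  # grayscale
    # Assumed options: single-character mode, digits only.
    config = "--psm 10 -c tessedit_char_whitelist=0123456789"
    digits = []
    for box in boxes:
        crop = image.crop(box)
        digits.append(pytesseract.image_to_string(crop, config=config).strip())
    return "".join(digits)
```

With made-up geometry, usage would look like `read_digits("waterdigits.jpg", digit_boxes(10, 20, 30, 40, count=6, gap=5, margin=3))`. The margin only tolerates small shifts; a larger shift would still need the boxes to be located again.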

Is this the best way to process such an image?

If so, is it possible to specify a box file for an image to be processed (not for training but for recognition)? Perhaps this is simple and I'm just not reading the documentation correctly. Since the image is hopefully static, I could use a box editor to create a box file that tells tesseract where to expect the digits, instead of tesseract guessing what is a digit and getting it wrong. It is still a very rigid approach, but it would take fewer steps than slicing the image before running tesseract.

I appreciate any suggestions. Or is tesseract perhaps the wrong tool for image recognition like this, and should I be considering OpenCV instead?

Thank you,
Adam
waterdigits.jpg