Reading spinning numberwheels

Robert Andersson

Oct 20, 2024, 7:03:43 AM

to tesseract-ocr

Hi. I'm on Windows using tesseract. Any ideas for improving my very basic command line to get better readings on "in between" images of my water meter?

tesseract C:\DATA\WATER\grayscale.png C:\DATA\WATER\newmeasure -l eng --psm 13 digits

It gives me 0634154, which is a very good guess for the second-to-last digit: the bottom of the 2 and the top of the 3 make a 5 ;)

For the last digit, I assume it's the bottom of the 9 that makes the 4.

Next idea is to crop the digit parts into smaller pictures and process them separately.

Robert Andersson

Oct 20, 2024, 12:51:30 PM

to tesseract-ocr

Hello again, I guess it's as good as it gets, not sure what I thought could be better from the above image. :D

I will just do more frequent readings of the meter and sort the "bad ones" out, focusing the logic and OCR on the last digits after I get a "bad reading".

Very happy with the tesseract result, thanks for developing it! :)

Ger Hobbelt

Oct 21, 2024, 7:47:53 AM

to tesseract-ocr

One minor addition that could improve your data-cleaning postprocess: check out tesseract's TSV and hOCR output formats for obtaining some indicative confidence numbers for each digit OCR'ed from your image. (IIRC there's an issue with vanilla tesseract, hence the advice to check both formats: one of them reports char-level confidence numbers while the other doesn't and gives score values of -1 :'-( )
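A rough sketch of pulling the confidence numbers out of TSV output, assuming the file was produced by adding the tsv config to the command line and has the usual 12-column header ending in conf and text. The sample string is made up for illustration (it is not real output from the meter image), and parse_tsv is just a name picked here.

```python
import csv
import io


def parse_tsv(tsv_text):
    """Return (text, confidence) pairs for rows that carry recognized text."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    results = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row.get("conf") or -1)
        # Layout rows (page, block, line) carry conf -1 and no text: skip them.
        if text and conf >= 0:
            results.append((text, conf))
    return results


# Made-up sample in the TSV column layout; not real tesseract output.
SAMPLE = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t120\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t12\t20\t610\t80\t71.5\t0634154\n"
)

for text, conf in parse_tsv(SAMPLE):
    print(text, conf)
```

The text is kept as a string so the leading zero survives. As far as I know the TSV rows stop at word level, so this yields one number for the whole digit string; per-digit values would have to come from the other format.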

I don't know how often you process that meter display or how fast your meter is running, but knowing that this thing can only count up would also aid data cleaning / smoothing once you have processed multiple images.
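The "can only count up" rule could be sketched like this. It assumes the readings are already converted to integers, that the first sample is trustworthy, and that max_step (the largest plausible increase between two samples) is something you pick for your own meter and sampling rate. The numbers are hypothetical.

```python
def filter_monotonic(readings, max_step):
    """Keep readings that never decrease and never jump more than max_step."""
    accepted = []
    for value in readings:
        if not accepted:
            # Assumption: the first sample is trusted as-is.
            accepted.append(value)
            continue
        delta = value - accepted[-1]
        if 0 <= delta <= max_step:
            accepted.append(value)
    return accepted


# Hypothetical series; 634154 stands in for the misread from the example image.
samples = [634127, 634154, 634130, 634132]
print(filter_monotonic(samples, max_step=10))
```

A bad first sample would poison everything after it, so in practice the starting value probably wants a manual check or a vote over the first few readings.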

Tesseract may get confused by those last two digits (they rotate up and don't snap into place like the others), including the chance that no digit at all will be recognized at either spot. So OCR output validation, alignment (with previous samples) and/or obscuring the last two digits in the OCR input image are all options to consider for your particular input data stream. Here those per-char confidence numbers may also assist in your bespoke data postprocessing (plus you can get char position-in-image coordinates from tesseract!).

Ger Hobbelt

Oct 21, 2024, 8:08:35 AM

to tesseract-ocr

Oh, and DON'T crop into single digits: tesseract's AI is designed to read books, so it likes to read words instead of single characters. OK, your data isn't exactly a "human language word", but a sequence of digits generally decodes better and more easily than single digits fed into the machine one at a time. (Read up on LSTM neural nets if you are interested in the technical background.)

Usually you can also observe this in the reported per-char confidence numbers; at least in my trials single digits generally produce lower confidence scores; not every time but it adds up when done in bulk.

What I would do in your case is feed tesseract two images: one as you do now, plus one where the last two rotating digits are blanked (erased) from the input image. Then take both results and align / compare them to get a more definitive answer re OCR quality assessment for that particular input, before taking the produced number and feeding it into the subsequent data processing stage.
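The align / compare step could look roughly like this, assuming both OCR results arrive as plain digit strings and that exactly two trailing wheels rotate. reconcile is a hypothetical helper name and the strings are examples, not measured output.

```python
def reconcile(full, masked, rotating=2):
    """Compare the full reading with the run where the rotating digits were blanked.

    Returns (stable_digits, rotating_digits) when both runs agree on the
    snapped-in digits, or (None, None) when they disagree.
    """
    stable_from_full = full[:-rotating] if len(full) > rotating else ""
    if not masked.isdigit() or masked != stable_from_full:
        return None, None
    return masked, full[-rotating:]


# Both runs agree on the leading digits; the tail is still only a guess.
print(reconcile("0634154", "06341"))

# One digit went missing in the full run, so the two results do not line up.
print(reconcile("063415", "06341"))
```

Agreement only vouches for the snapped-in digits. Whether the two trailing digits are worth keeping is a separate call, for which the confidence numbers are the obvious input.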

That way I could decide if any of the trailing two rotating digits has any potential 'worth' to me: you may get lucky and have either one in full view; your example image is a good example of the general worst-case scenario. :-)
