Tesseract-ocr duration time calculation


beein...@gmail.com

Jul 10, 2016, 2:01:59 PM
to tesseract-ocr
Hi,
is there any way to estimate how long the OCR of an image will take before the recognition itself starts? I would like to show some kind of progress bar indicating how long the image will take to process. I tried to find a relationship between processing time and image size or volume, but I could not find one. Could counting pixels before the OCR starts give a usable estimate?

Tom Morris

Jul 16, 2016, 11:21:01 PM
to tesseract-ocr
There's no such linear relation between number of pixels and computation time. Noisier images take longer to process, as do those with ambiguities which require extra processing to resolve.

Unfortunately, until you finish, you're not going to know how much processing is required -- and by then it's too late for a progress bar.

If you wanted to put some effort into it, perhaps you could sample a bunch of images with different characteristics, see what the processing time correlates with, and fit a model to predict it, but I have my doubts.
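The sampling idea above could start from something as small as the sketch below. It fits a straight line between one image feature and the measured OCR time and reports the correlation, so you can see quickly whether the feature predicts anything. The feature (fraction of dark pixels) and the sample numbers are placeholders made up for illustration, not measurements, and `fit_line` and `predict` are hypothetical helper names.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of ys ~ slope * xs + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("feature has no variation across samples")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # r close to 0 means the feature tells you little about the time.
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r


def predict(model, x):
    """Predicted processing time for feature value x."""
    slope, intercept, _ = model
    return slope * x + intercept


# Placeholder data: (dark-pixel fraction, OCR seconds). Replace with
# timings you measure yourself on a representative set of images.
samples = [(0.05, 1.2), (0.10, 2.1), (0.20, 4.3), (0.30, 5.8)]
model = fit_line([s[0] for s in samples], [s[1] for s in samples])
print("slope=%.2f intercept=%.2f r=%.2f" % model)
```

If the correlation stays low for every feature you try, that supports the doubts above and a time estimate made before recognition starts is probably not worth pursuing.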

Tess has an internal pipeline which should give more and more accurate estimates about how much work there is to do, but you'd need to dig into it a little bit to see whether you could leverage it for your purposes.
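If you do manage to get a progress figure out of the engine, turning it into a remaining-time estimate is the easy part. The sketch below assumes you can obtain a percentage between 0 and 100 while recognition runs; how you obtain it is left open (the C++ API's `Recognize` call accepts a monitor structure that carries a progress value, but check whether it behaves usefully for your images). `EtaEstimator` is a hypothetical name.

```python
import time


class EtaEstimator:
    """Extrapolates remaining time from elapsed time and percent done."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the estimator can be tested
        # without waiting for real time to pass.
        self._clock = clock
        self._start = clock()

    def remaining_seconds(self, percent_done):
        """Return estimated seconds left, or None if there is no basis yet."""
        if percent_done <= 0:
            return None
        percent_done = min(percent_done, 100)
        elapsed = self._clock() - self._start
        # Linear extrapolation: assumes the remaining work proceeds at
        # the same average rate as the work done so far.
        return elapsed * (100 - percent_done) / percent_done
```

The linear extrapolation is only as good as the progress figure itself. Since the amount of work depends on noise and ambiguity in the image, expect the estimate to jump around early on and settle as recognition proceeds.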

Tom

Zdenko Podobný

Jul 19, 2016, 10:53:39 AM
to tesser...@googlegroups.com
Have a look at the Text Fairy app video[1] at 0:22. Is that the kind of thing you are looking for? If so, the source code is available on GitHub[2].


Zdenko

