Issue 589 in tesseract-ocr: tesseract 3.0.1 return empty result for some Very clear and big number image


tesser...@googlecode.com

Nov 28, 2011, 4:55:43 AM
to tesserac...@googlegroups.com
Status: New
Owner: ----

New issue 589 by derick...@gmail.com: tesseract 3.0.1 return empty result
for some Very clear and big number image
http://code.google.com/p/tesseract-ocr/issues/detail?id=589

What steps will reproduce the problem?
1. Run the tesseract 3.0.1 executable as follows:
tesseract.exe tmp.bmp output makebox nobatch digits
2. Check the output.box file for the result.
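
For anyone who wants to script the reproduction, here is a minimal sketch in Python. It only wraps the command line given in step 1 and checks whether the resulting box file is empty. The helper names `makebox_command` and `box_is_empty` are hypothetical, and it assumes `tesseract.exe` is on the PATH and writes `output.box` next to the image.

```python
import subprocess


def makebox_command(image="tmp.bmp", output_base="output", exe="tesseract.exe"):
    """Build the command line from step 1 as an argument list."""
    return [exe, image, output_base, "makebox", "nobatch", "digits"]


def box_is_empty(box_path="output.box"):
    """Return True when the box file contains nothing but whitespace."""
    with open(box_path, encoding="utf-8") as fh:
        return fh.read().strip() == ""


# Usage (requires tesseract to be installed, so it is left as a comment):
#   subprocess.run(makebox_command(), check=True)
#   print("empty result" if box_is_empty() else "got boxes")
```

With the attached tmp.bmp, the report above implies that `box_is_empty()` would return True.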

What is the expected output? What do you see instead?
1. The result should be "698".
2. Tesseract actually always returns an empty result for this image and for
some similar images that are very clear, even though it can recognize some
very poor images.

What version of the product are you using? On what operating system?
3.0.1

Please provide any additional information below.


Attachments:
tmp.bmp 32.1 KB

tesser...@googlecode.com

Nov 28, 2011, 5:09:50 AM
to tesserac...@googlegroups.com

Comment #1 on issue 589 by derick...@gmail.com: tesseract 3.0.1 return
empty result for some Very clear and big number image
http://code.google.com/p/tesseract-ocr/issues/detail?id=589

Tesseract is also very sensitive to small noise, even when the noise blob
is far away from the clear blob. For example, add a small black blob to the
bottom-left corner of the attached tmp.bmp, far away from the number
block, and run tesseract again: it may now return a result instead of the
empty result described above. However, a slightly different noise blob
changes the result. Sometimes the result is "698", but after changing the
shape of the noise blob a little, the result may turn into "598", even
though the noise blob does not overlap the number blob at all.

tesser...@googlecode.com

Jun 8, 2012, 6:21:35 PM
to tesserac...@googlegroups.com

Comment #2 on issue 589 by anotherm...@gmail.com: tesseract 3.0.1 return
empty result for some Very clear and big number image
http://code.google.com/p/tesseract-ocr/issues/detail?id=589

This needs to be combined with Issue 718. This issue applies to the
digits 0, 6, 8 and 9. The workaround I now use is to 'manually' add or
append one of the non-affected digits to the image and then remove the
extra digit from the OCR result. Alternatively, you can 'prebreak' the
image into individual characters and use -psm 10 to recognise each digit
successfully.
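
A rough sketch of both workarounds in Python, in case it helps others. Everything here is an assumption for illustration, not part of tesseract: the helper names, the choice of "1" as the extra digit, and the "digits" config borrowed from the original command line. Appending the digit to the image and cutting the image into characters are left to whatever imaging tool you already use.

```python
import subprocess
from pathlib import Path

SENTINEL = "1"  # assumption: any digit outside the affected set 0, 6, 8, 9


def strip_sentinel(ocr_text, sentinel=SENTINEL):
    """Drop the extra digit that was appended to the image before OCR."""
    digits = "".join(ch for ch in ocr_text if ch.isdigit())
    if not digits.endswith(sentinel):
        raise ValueError("sentinel digit missing from OCR result: %r" % ocr_text)
    return digits[: -len(sentinel)]


def single_char_command(image_path, output_base, exe="tesseract"):
    """Command line for one pre-cut character image, using -psm 10."""
    return [exe, str(image_path), str(output_base), "-psm", "10", "digits"]


def recognise_chars(char_images, workdir, exe="tesseract"):
    """Run tesseract once per character image and join the results."""
    result = []
    for i, image in enumerate(char_images):
        base = Path(workdir) / ("char%d" % i)
        subprocess.run(single_char_command(image, base, exe), check=True)
        # tesseract writes its text output to <output_base>.txt
        result.append(base.with_suffix(".txt").read_text().strip())
    return "".join(result)
```

For the first workaround, `strip_sentinel` raises an error when the extra digit is not found at the end of the result, so a misread is not silently passed on as a valid number.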

tesser...@googlecode.com

Jun 17, 2012, 8:41:15 AM
to tesserac...@googlegroups.com

Comment #3 on issue 589 by derick...@gmail.com: tesseract 3.0.1 return
empty result for some Very clear and big number image
http://code.google.com/p/tesseract-ocr/issues/detail?id=589

I'm using the same solution as you do :)

tesser...@googlecode.com

Jul 21, 2012, 11:27:45 AM
to tesserac...@googlegroups.com

Comment #4 on issue 589 by zde...@gmail.com: tesseract 3.0.1 return empty
result for some Very clear and big number image
http://code.google.com/p/tesseract-ocr/issues/detail?id=589

Issue 718 has been merged into this issue.

tesser...@googlecode.com

Jun 30, 2015, 8:24:14 AM
to tesserac...@googlegroups.com

Comment #5 on issue 589 by ja...@formigas.de: tesseract 3.0.1 return empty
result for some Very clear and big number image
https://code.google.com/p/tesseract-ocr/issues/detail?id=589

This is a really annoying bug. Could somebody explain what is going on so
that we could find a better workaround in the meantime?
Thanks for pointing it out, been sitting on this for hours already...


tesser...@googlecode.com

Jun 30, 2015, 9:05:37 AM
to tesserac...@googlegroups.com

Comment #6 on issue 589 by ja...@formigas.de: tesseract 3.0.1 return empty
result for some Very clear and big number image
https://code.google.com/p/tesseract-ocr/issues/detail?id=589

Just to let you know, this bug still exists in the current trunk
version, and it has been around since 2011, I think...