how to tell tesseract the character is white not black


wh cc

Feb 16, 2017, 1:47:02 AM
to tesseract-ocr
Hi,
I have an image like this one. It is recognized as "lLDIII". I guess the reason is that Tesseract treats the black line in "0" as the text. Could I set some parameters to tell Tesseract to recognize the white characters instead?


Thanks.
Auto Generated Inline Image 1

ShreeDevi Kumar

Feb 16, 2017, 2:58:07 AM
to tesser...@googlegroups.com
Preprocess the image: invert it so that the text is on a white background, and make it 300 dpi.
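
A minimal sketch of this preprocessing in Python, assuming the Pillow library is available for the file step. The helper names (`mean_intensity`, `needs_inversion`, `invert`, `preprocess_file`), the threshold of 128, and the `scale` parameter are illustrative assumptions, not part of Tesseract. Writing 300 dpi metadata does not add pixels, so small text may also need upscaling.

```python
def mean_intensity(pixels):
    """Mean of an 8-bit grayscale image given as rows of 0..255 ints."""
    flat = [p for row in pixels for p in row]
    return sum(flat) / len(flat)


def needs_inversion(mean, threshold=128):
    """Heuristic (assumption): a mostly dark image holds light text."""
    return mean < threshold


def invert(pixels):
    """Swap dark and light in an 8-bit grayscale pixel grid."""
    return [[255 - p for p in row] for row in pixels]


def preprocess_file(src, dst, scale=1.0):
    """File-based variant; Pillow is imported lazily so the helpers
    above stay dependency-free."""
    from PIL import Image, ImageOps, ImageStat

    img = Image.open(src).convert("L")  # 8-bit grayscale
    if needs_inversion(ImageStat.Stat(img).mean[0]):
        img = ImageOps.invert(img)  # dark text on a light background
    if scale != 1.0:
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    img.save(dst, dpi=(300, 300))  # records 300 dpi in the file metadata
```

For example, `preprocess_file("wb.png", "wb-invert-300.png", scale=2.0)` would write an inverted, upscaled copy; the scale factor here is a guess and depends on the original text height.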

See the following for more tips


--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/17d3b43e-e287-4f00-8414-70684aa1fdf6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

wh cc

Feb 16, 2017, 4:59:00 AM
to tesseract-ocr
Inverting the image gives the same result.



ShreeDevi Kumar

Feb 17, 2017, 3:05:58 AM
to tesser...@googlegroups.com, wenhu...@gmail.com
Which version of Tesseract are you using?

I get good results using Tesseract 4.0.0-alpha with --oem 1.

See attached files.
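
A run like the one described above could be scripted roughly as follows. This is a sketch, assuming a Tesseract 4 binary named `tesseract` is on the PATH and the `eng` language data is installed. The helper names `build_command` and `run_ocr` are hypothetical, and the exact command used for the attached files is not stated in this thread.

```python
import subprocess


def build_command(image, out_base, lang="eng", oem=1):
    """Build the argument list for the tesseract CLI.

    --oem 1 selects the LSTM engine in Tesseract 4; --oem 0 selects
    the legacy engine (which needs legacy-capable language data).
    """
    return ["tesseract", image, out_base, "-l", lang, "--oem", str(oem)]


def run_ocr(image, out_base, lang="eng", oem=1):
    """Run Tesseract and return the recognized text.

    Tesseract writes its result to '<out_base>.txt'.
    """
    subprocess.run(build_command(image, out_base, lang, oem), check=True)
    with open(out_base + ".txt", encoding="utf-8") as fh:
        return fh.read()


# Hypothetical usage, mirroring the attachment names in this thread:
# text = run_ocr("wb.png", "wb-eng-oem1", oem=1)
```

Comparing the output for `oem=0` and `oem=1` on both the original and the inverted image shows which combination reads the white characters correctly.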


ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

wb-eng-oem0.txt
wb-eng-oem1.txt
wb-invert-300-eng-oem0.txt
wb-invert-300-eng-oem1.txt
wb-invert-300.png
wb.png

wenhuach

Feb 20, 2017, 1:37:06 AM
to tesseract-ocr, wenhu...@gmail.com
Thank you for your reply. I am using version 3.04. I have tried OEM 1, but it is not well suited to my other cases. So a better option would be to tell Tesseract whether the characters are white or black, but I don't know how.