Tesseract with Grayscale


Mayce Al

Oct 14, 2011, 10:50:22 AM
to tesser...@googlegroups.com
Hi Guys,

I am using the default Tesseract model for German Fraktur, and I found that it gives poor recognition results on grayscale images.

I was thinking of binarizing the images before recognizing them with Tesseract, but I found out that Tesseract already applies Otsu binarization during recognition.

Does anyone know of a parameter for tuning the thresholding inside Tesseract to get good results?


Cheers, Mayce
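
For readers following the thread, here is a minimal pure-Python sketch of Otsu's method, meant only to illustrate why a single global threshold can struggle on grayscale scans. It is not Tesseract's internal code. The function names and the tiny unevenly lit "page" are made up for the example: ink on the bright half (120) is lighter than the paper on the dark half (100), so no single threshold can separate ink from paper everywhere.

```python
def otsu_threshold(gray):
    """Return the global threshold t that maximizes between-class variance.

    gray is a list of rows of integers in 0..255; pixels <= t count as dark.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels at or below t
    sum0 = 0  # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def make_shaded_page(rows=8):
    """Hypothetical test page: bright left half, dark right half, one stroke each."""
    page, truth = [], []
    for _ in range(rows):
        page.append([220] * 3 + [120] + [220] * 4 + [100] * 4 + [20] + [100] * 3)
        truth.append([0] * 3 + [1] + [0] * 4 + [0] * 4 + [1] + [0] * 3)
    return page, truth


page, truth = make_shaded_page()
t = otsu_threshold(page)
binary = [[1 if v <= t else 0 for v in row] for row in page]
errors = sum(b != g for brow, grow in zip(binary, truth) for b, g in zip(brow, grow))
print("Otsu threshold:", t, "misclassified pixels:", errors)
```

Whatever threshold is chosen, some pixels of this page end up in the wrong class, which is the usual argument for a locally adaptive method.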


Max Cantor

Oct 14, 2011, 11:33:17 PM
to tesser...@googlegroups.com
Try a Sauvola binarization first. Factors between 0.15 and 3 are usually good. A lower factor gives more speckles and noise; a higher factor gives a more washed-out image.
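
As an illustration of the suggestion above, here is a small pure-Python sketch of Sauvola thresholding. It assumes the commonly cited formula T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation in a window around each pixel, k is the "factor", and R = 128 for 8-bit images. The function name, the default window size, and the default k = 0.35 are choices made for this example, not values taken from Tesseract or Leptonica.

```python
import math


def sauvola_binarize(gray, half_window=7, k=0.35, r=128.0):
    """Return a 2D list of 0/1 values (1 = ink) using a local Sauvola threshold.

    gray is a list of equal-length rows of integers in 0..255.
    The window is (2 * half_window + 1) pixels wide and is clipped at the borders.
    """
    h, w = len(gray), len(gray[0])
    # Integral images of the values and of the squared values (zero-padded).
    s1 = [[0] * (w + 1) for _ in range(h + 1)]
    s2 = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        run1 = run2 = 0
        for x in range(w):
            v = gray[y][x]
            run1 += v
            run2 += v * v
            s1[y + 1][x + 1] = s1[y][x + 1] + run1
            s2[y + 1][x + 1] = s2[y][x + 1] + run2

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half_window), min(h, y + half_window + 1)
        for x in range(w):
            x0, x1 = max(0, x - half_window), min(w, x + half_window + 1)
            n = (y1 - y0) * (x1 - x0)
            total = s1[y1][x1] - s1[y0][x1] - s1[y1][x0] + s1[y0][x0]
            total_sq = s2[y1][x1] - s2[y0][x1] - s2[y1][x0] + s2[y0][x0]
            mean = total / n
            std = math.sqrt(max(0.0, total_sq / n - mean * mean))
            threshold = mean * (1.0 + k * (std / r - 1.0))
            out[y][x] = 1 if gray[y][x] < threshold else 0
    return out


# Hypothetical 20x20 page: light paper (200) with one dark vertical stroke (40).
image = [[40 if x == 10 else 200 for x in range(20)] for _ in range(20)]
result = sauvola_binarize(image)
for row in result[:3]:
    print("".join("#" if bit else "." for bit in row))
```

Because the threshold follows the local mean, the flat paper stays white and only the stroke is marked as ink. Changing k moves the threshold relative to the local mean, which is the knob the factor advice above refers to.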

Mayce Al

Oct 16, 2011, 4:37:21 PM
to tesser...@googlegroups.com
Thanks a lot, Max.

Does Tesseract provide Sauvola binarization through the command line?

I used: tesseract image output -l lang -psm, but I see no way to invoke the binarization from there.

So I guess it is provided through Leptonica and I should write a few lines of code to call it, right?
Cheers, Mayce

Max Cantor

Oct 16, 2011, 10:57:13 PM
to tesser...@googlegroups.com
You'd need to call it yourself. But the code for the tesseract executable does exactly that, so you could just hack up your own executable. I'm on my phone now, so I can't give you line numbers, but it shouldn't be too hard to find. Look for PIX* types in the cpp files.

Max

Mayce Al

Oct 17, 2011, 9:31:21 AM
to tesser...@googlegroups.com
Thanks Max. Indeed, I found a function called "pixSauvolaBinarize".
I appreciate your support.
Cheers, Mayce
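
An alternative to patching the executable is to binarize outside Tesseract and pass the already binary image to the command line used earlier in the thread. The sketch below writes a 0/1 bitmap as a binary PBM (P4) file and builds the command. The helper names are hypothetical, and it assumes the installed tesseract build can read PBM input, which depends on how it was built.

```python
import subprocess


def write_pbm(bitmap, path):
    """Write a 2D list of 0/1 values (1 = black ink) as a binary PBM (P4) file."""
    h, w = len(bitmap), len(bitmap[0])
    row_bytes = (w + 7) // 8  # each row is padded to a whole number of bytes
    data = bytearray()
    for row in bitmap:
        packed = bytearray(row_bytes)
        for x, bit in enumerate(row):
            if bit:
                packed[x >> 3] |= 0x80 >> (x & 7)  # most significant bit first
        data += packed
    with open(path, "wb") as f:
        f.write(b"P4\n%d %d\n" % (w, h))
        f.write(data)


def tesseract_command(image_path, output_base, lang):
    """Build the command line from the thread: tesseract image output -l lang."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_tesseract(image_path, output_base, lang):
    """Run tesseract on the pre-binarized image (requires tesseract on PATH)."""
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
```

A typical use would be to call write_pbm on the output of your own binarization step and then run_tesseract on the resulting file, with lang set to whatever code your Fraktur model is installed under.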
