New fast implementation of Sauvola binarizer

Ilya Mezhirov

Jan 30, 2016, 5:46:31 PM
to tesseract-ocr
Hi everyone,

I've written a binarizer that does Sauvola thresholding a lot faster than Leptonica. It can also double the resolution.
It only accepts JPEG input and outputs G4-compressed TIFFs.

Check it out: https://gitlab.com/alih/turbinarize.git

Have fun!
Ilya
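
For readers unfamiliar with the method, here is a minimal sketch of textbook Sauvola thresholding in pure Python. It is not the code from turbinarize or Leptonica; it is the plain per-pixel formula T = m * (1 + k * (s / R - 1)), with the local mean m and standard deviation s taken from summed-area tables. The function names, the window size, and the defaults k = 0.5 and R = 128 (the values usually quoted from Sauvola's paper) are assumptions chosen for illustration.

```python
import math

def integral(img, f=lambda v: v):
    """Summed-area table of f(pixel), padded with a zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += f(img[y][x])
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat

def box_sum(sat, y0, x0, y1, x1):
    """Sum over rows y0..y1-1 and columns x0..x1-1."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]

def sauvola(img, half_window=7, k=0.5, r=128.0):
    """Binarize a grayscale image given as a list of rows (0..255).

    Returns a list of rows where 1 marks ink and 0 marks background.
    k and r are illustrative defaults, not values taken from any tool.
    """
    h, w = len(img), len(img[0])
    s1 = integral(img)                    # sums of pixel values
    s2 = integral(img, lambda v: v * v)   # sums of squared pixel values
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half_window), min(h, y + half_window + 1)
        for x in range(w):
            x0, x1 = max(0, x - half_window), min(w, x + half_window + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = box_sum(s1, y0, x0, y1, x1) / n
            var = max(0.0, box_sum(s2, y0, x0, y1, x1) / n - mean * mean)
            threshold = mean * (1.0 + k * (math.sqrt(var) / r - 1.0))
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out
```

The summed-area tables make the cost per pixel independent of the window size, which is the usual way to keep the spatial-domain version reasonably fast.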

Tom Morris

Jan 31, 2016, 11:35:30 AM
to tesseract-ocr
On Saturday, January 30, 2016 at 5:46:31 PM UTC-5, Ilya Mezhirov wrote:

I've written a binarizer that does Sauvola a lot faster than Leptonica. It also can double the resolution.
Works strictly with JPEGs and outputs G4-compressed TIFFs.

Check it out: https://gitlab.com/alih/turbinarize.git

Cool idea to work in the frequency domain to save processing time. Since it's GPL v3, though, it can't be incorporated into Tesseract, which is Apache 2.0 licensed.

Tom

Ilya Mezhirov

Feb 1, 2016, 9:54:29 AM
to tesseract-ocr
Correct. It can still be used as a separate tool, like this:
    turbinarize input.jpg binary.tif
    tesseract binary.tif output

Ilya
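
The two-step usage above could also be scripted. The following is a rough sketch, assuming both `turbinarize` and `tesseract` are on the PATH, that `turbinarize` takes an input JPEG and an output TIFF as shown in the thread, and that `tesseract` is given an output base name as its second argument and writes `<base>.txt`. The helper names `pipeline_commands` and `run_pipeline` are hypothetical.

```python
import subprocess

def pipeline_commands(jpeg_path, tiff_path, output_base):
    """Build the two command lines: binarize first, then run OCR."""
    return [
        ["turbinarize", jpeg_path, tiff_path],
        ["tesseract", tiff_path, output_base],
    ]

def run_pipeline(jpeg_path, tiff_path, output_base):
    """Run both tools as separate processes; raise if either one fails."""
    for cmd in pipeline_commands(jpeg_path, tiff_path, output_base):
        subprocess.run(cmd, check=True)
    # Assumption: tesseract's default text output goes to <output_base>.txt
    return output_base + ".txt"
```

Because the binarizer runs as its own process and only passes a file to Tesseract, the two programs stay separate tools, which is the arrangement described above.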