Passing binarized image to Tesseract


Syed Uzair

Aug 16, 2017, 7:50:41 AM
to tesseract-ocr
Hello

I am binarizing my images with a separate algorithm, saving the result as a PNG, and passing that PNG to Tesseract. Since the image is already binarized, I want Tesseract to skip its own binarization (Otsu's method). According to the SetImage documentation in thresholder.cpp:

// SetImage makes a copy of all the image data, so it may be deleted
// immediately after this call.
// Greyscale of 8 and color of 24 or 32 bits per pixel may be given.
// Palette color images will not work properly and must be converted to
// 24 bit.
// Binary images of 1 bit per pixel may also be given but they must be
// byte packed with the MSB of the first byte being the first pixel, and a
// one pixel is WHITE. For binary images set bytes_per_pixel=0.
void ImageThresholder::SetImage(const unsigned char* imagedata,
                                int width, int height,
                                int bytes_per_pixel, int bytes_per_line)
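
As a rough illustration of the 1-bit layout that comment describes (byte packed, MSB of the first byte is the first pixel, a 1 bit is white), the packing could be sketched in plain Python as below. The function name pack_binary_rows is hypothetical, and padding each row to a whole number of bytes is an assumption; this only shows the byte layout, not how to hand the buffer to Tesseract.

```python
def pack_binary_rows(pixels):
    """Pack rows of 0/1 pixels into bytes, MSB first, with 1 meaning white.

    `pixels` is a list of equal-length rows of 0/1 values.
    Returns (data, bytes_per_line). Each row is padded to a whole byte
    (an assumption); the padding bits are left as 0.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    bytes_per_line = (width + 7) // 8  # round up to a whole byte per row
    data = bytearray(bytes_per_line * height)
    for y, row in enumerate(pixels):
        for x, value in enumerate(row):
            if value:
                # Pixel x goes into bit (7 - x % 8) of byte x // 8 in its row.
                data[y * bytes_per_line + x // 8] |= 0x80 >> (x % 8)
    return bytes(data), bytes_per_line


# A single 10-pixel row: white at positions 0, 8 and 9, black elsewhere.
data, bytes_per_line = pack_binary_rows([[1, 0, 0, 0, 0, 0, 0, 0, 1, 1]])
```

With such a buffer, the comment says bytes_per_pixel should be 0, and bytes_per_line would be the second value returned here.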


But I don't know how to set these parameters from Python. I am using Tesserocr, a Python wrapper around Tesseract's C++ API. My Tesseract version is 3.05.

What is the correct way to pass a binarized image to Tesseract so that it doesn't apply its own binarization? I am looking for example code in Python using Tesserocr.

Thanks
Uzair