We use Tesseract in our document imaging app. We started with version 2.x and recently upgraded from 3.05 to 5.3.1, and something broke. We supply images to Tesseract using SetImage and then call SetRectangle. In one of our apps, we often OCR only the top third of an invoice to gather information about the vendor. This worked fine in 3.05 but does not work in 5.3.1. If I pass the full image dimensions to SetRectangle (the same dimensions given to SetImage), everything works, but if I pass a rectangle covering just the top third of the image, I get total garbage back. We provide one-bit black-and-white images to SetImage (white = 1) and specify the target area in pixels. Something changed between 3.05 and 5.3.1 that makes this fail. Is there something I missed in the interim? Perhaps SetRectangle(x,y,w,h) wants dimensions that start on 8-bit (byte) boundaries, or something equally restrictive? Any suggestions are welcome.
IMO a safer solution is to crop the image yourself and pass the cropped image to SetImage, rather than relying on SetRectangle.
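A minimal sketch of that cropping step, assuming the source is a byte-packed 1-bpp buffer (MSB first, 1 = white) as described in the question. The helper name CropPackedToGray and its parameters are hypothetical and not part of Tesseract. It expands the selected rectangle into an 8-bit grayscale buffer, which avoids any question about where the rectangle starts within a byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Tesseract): copy the rectangle
// (left, top, width, height) out of a 1-bpp image into a new 8-bit
// grayscale buffer. Source rows are byte-packed, MSB first, 1 = white.
std::vector<std::uint8_t> CropPackedToGray(const std::uint8_t* src,
                                           int src_width, int src_height,
                                           int src_bytes_per_line,
                                           int left, int top,
                                           int width, int height) {
  // Each source row must be long enough to hold src_width bits.
  assert(src_bytes_per_line >= (src_width + 7) / 8);
  if (left < 0 || top < 0 || width <= 0 || height <= 0 ||
      left + width > src_width || top + height > src_height) {
    throw std::invalid_argument("crop rectangle outside source image");
  }
  std::vector<std::uint8_t> out(static_cast<std::size_t>(width) * height);
  for (int y = 0; y < height; ++y) {
    const std::uint8_t* row =
        src + static_cast<std::size_t>(top + y) * src_bytes_per_line;
    for (int x = 0; x < width; ++x) {
      const int sx = left + x;
      // Test the bit for source pixel sx; bit 7 of each byte is the
      // leftmost pixel of that byte.
      const bool white = (row[sx / 8] & (0x80 >> (sx % 8))) != 0;
      out[static_cast<std::size_t>(y) * width + x] = white ? 255 : 0;
    }
  }
  return out;
}
```

The returned buffer could then be handed to the raw-buffer overload of SetImage with bytes_per_pixel = 1 and bytes_per_line = width, e.g. api.SetImage(gray.data(), width, height, 1, width), with no SetRectangle call at all. Whether this avoids the 5.3.1 behaviour described above has not been verified here.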