Assertion failure

George Papadopoulos

Nov 15, 2016, 5:44:35 AM

to tesseract-ocr

Hello,

I ran into an assertion failure when running tesseract on a scanned image. The output I get is:

Page 1
Detected 146 diacritics
split_pt >0 && split_pt < word->chopped_word->NumBlobs():Error:Assert failed:in file ..\..\ccmain\tfacepp.cpp, line 186

I am testing on Windows, and the tesseract version is:

tesseract 3.04.02dev
leptonica-1.71 (Oct 21 2016, 18:04:17) [MSC v.1800 DLL Release x86]
libgif 4.1.6(?) : libjpeg 8c : libpng 1.4.3 : libtiff 3.9.4 : zlib 1.2.8

The image is a black-and-white TIFF file compressed with CCITT Group 4 fax encoding. Its resolution is 300 dpi, its size is 2461 x 3478 pixels, and the file size is 112K. Unfortunately, I cannot attach the file.

The image is a form that includes some handwritten areas. If I redact the image by placing black boxes over every handwritten area, tesseract processes the file without crashing. So my first thought was that the problem lies in the recognition of handwriting.

However, I also tried resizing the original (unredacted) image to 1920 x 2714, and that worked too. So it seems the handwriting is no longer a problem when the image is somewhat smaller.

I am trying to use tesseract on an automated system that processes scanned images. Any ideas on how to resolve this?
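
A possible stopgap for an automated pipeline is to retry on a downscaled copy whenever tesseract fails on the original. The following is only a minimal sketch of that idea, not a fix for the assertion itself. It assumes the assertion failure makes the `tesseract` process exit with a non-zero code, which is not confirmed here. The function name `ocr_with_fallback` and the `resize` callback are hypothetical: `resize` stands for whatever tool produces the smaller copy (for example the 1920 x 2714 version described above) and returns its path.

```python
import subprocess


def ocr_with_fallback(image, out_base, resize, run=subprocess.run):
    """Run tesseract on `image`; on failure, retry once on a downscaled copy.

    `resize` is a caller-supplied (hypothetical) function that takes the
    original image path and returns the path of a smaller copy.
    `run` defaults to subprocess.run and is injectable for testing.
    Returns the path of the image that was processed successfully.
    """
    # Basic CLI form: tesseract <input image> <output base name>
    result = run(["tesseract", str(image), str(out_base)])
    if result.returncode == 0:
        return image

    # Assumption: an assertion failure shows up as a non-zero exit code.
    smaller = resize(image)
    result = run(["tesseract", str(smaller), str(out_base)])
    if result.returncode != 0:
        raise RuntimeError(f"tesseract failed on {image} and on {smaller}")
    return smaller
```

Keeping the resize step as a callback leaves the choice of image tool open, and the return value records which files needed the fallback so they can be reviewed later.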

Thank you very much

George
