Issue 595 in tesseract-ocr: Crashbug in tesseract 3.0.1


tesser...@googlecode.com

Dec 14, 2011, 9:46:59 AM
to tesserac...@googlegroups.com
Status: New
Owner: ----

New issue 595 by chef...@gmail.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

I am using tesseract with custom OCRB training to decode the machine
readable zone (MRZ) of ID cards and passports. It usually works well, except
in a few particular cases (decoding ~50000 images triggered this bug a few
times).

What steps will reproduce the problem?

tesseract /tmp/char.bmp tmp -l ocrb -psm 10 && cat tmp.txt
Tesseract Open Source OCR Engine v3.01 with Leptonica
Erreur de segmentation [Segmentation fault]
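
Since the crash only shows up on a handful of images out of a large batch, a small driver script can help isolate the offending files. The following is a minimal sketch, not part of the original report: the function names (build_command, describe_exit, find_crashing_images) and the directory layout are hypothetical, and it assumes a POSIX system where a child killed by a signal reports a negative return code. The tesseract command line is the one shown above.

```python
import signal
import subprocess
from pathlib import Path


def build_command(image, out_base, lang="ocrb", psm=10):
    """Build the command line from the report (Tesseract 3.x '-psm' syntax)."""
    return ["tesseract", str(image), str(out_base), "-l", lang, "-psm", str(psm)]


def describe_exit(returncode):
    """Translate a subprocess return code into a short status string."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "crashed (%s)" % name
    return "failed (exit %d)" % returncode


def find_crashing_images(image_dir, out_dir, run=subprocess.run):
    """Run tesseract on every .bmp in image_dir; return the non-ok results."""
    problems = []
    for image in sorted(Path(image_dir).glob("*.bmp")):
        cmd = build_command(image, Path(out_dir) / image.stem)
        result = run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        status = describe_exit(result.returncode)
        if status != "ok":
            problems.append((image.name, status))
    return problems
```

Calling find_crashing_images("images", "out") on a directory of single-character bitmaps would list each file on which tesseract did not exit cleanly, which makes it easier to attach the exact failing inputs to a report like this one.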

Running with gdb:

Program received signal SIGSEGV, Segmentation fault.
0xb7e72332 in restore_underlined_blobs (block=0x81650c8) at underlin.cpp:65
65 &chop_cells);
(gdb) bt
#0 0xb7e72332 in restore_underlined_blobs (block=0x81650c8) at
underlin.cpp:65
#1 0xb7e22bc0 in tesseract::Textord::cleanup_rows_fitting (this=0x80a78e8,
page_tr=..., block=0x81650c8, gradient=0, rotation=..., block_edge=0,
testing_on=1 '\001') at makerow.cpp:623
#2 0xb7e22e15 in tesseract::Textord::fit_rows (this=0x80a78e8, gradient=0,
page_tr=..., blocks=0xbfffd984) at makerow.cpp:225
#3 0xb7e55ccd in tesseract::Textord::TextordPage (this=0x80a78e8,
pageseg_mode=tesseract::PSM_SINGLE_CHAR, width=24, height=33,
pix=0x8164760, blocks=0x80aa570,
to_blocks=0xbfffd984) at textord.cpp:306
#4 0xb7da0db0 in tesseract::Tesseract::SegmentPage (this=0x809c798,
input_file=0x80aa508, blocks=0x80aa570, osd_tess=0x0, osr=0xbfffd9fc) at
pagesegmain.cpp:177
#5 0xb7d794cc in tesseract::TessBaseAPI::FindLines (this=0xbffff334) at
baseapi.cpp:1413
#6 0xb7d798e0 in tesseract::TessBaseAPI::Recognize (this=0xbffff334,
monitor=0x0) at baseapi.cpp:523
#7 0xb7d7c185 in tesseract::TessBaseAPI::ProcessPage (this=0xbffff334,
pix=0x80aa4c0, page_index=0, filename=0xbffff607 "/tmp/char.bmp",
retry_config=0x0,
timeout_millisec=0, text_out=0xbffff384) at baseapi.cpp:732
#8 0xb7d7c4e2 in tesseract::TessBaseAPI::ProcessPages (this=0xbffff334,
filename=0xbffff607 "/tmp/char.bmp", retry_config=0x0, timeout_millisec=0,
text_out=0xbffff384) at baseapi.cpp:659
#9 0x08048fc2 in main (argc=7, argv=0xbffff454)
at ../api/tesseractmain.cpp:138


What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?
Tesseract 3.0.1 running on Linux

Attachments:
char.bmp 1.8 KB
ocrb.traineddata 307 KB

tesser...@googlecode.com

Feb 28, 2012, 4:59:38 PM
to tesserac...@googlegroups.com

Comment #1 on issue 595 by jbrei...@google.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Confirmed, the bug is still present.

tesser...@googlecode.com

Feb 28, 2012, 5:03:46 PM
to tesserac...@googlegroups.com

Comment #2 on issue 595 by jbrei...@google.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Crashes with English as well on this image.

tesser...@googlecode.com

May 20, 2012, 7:03:36 AM
to tesserac...@googlegroups.com

Comment #3 on issue 595 by MartinCh...@gmail.com: Crashbug in tesseract
3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Hi, could you post all files you are using, not only ocrb.traineddata?

tesser...@googlecode.com

May 21, 2012, 10:15:13 AM
to tesserac...@googlegroups.com

Comment #4 on issue 595 by chef...@gmail.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Well, apart from ocrb.traineddata, there is a file named char.bmp. Or am I
missing something?

tesser...@googlecode.com

Aug 3, 2012, 4:13:47 PM
to tesserac...@googlegroups.com

Comment #5 on issue 595 by zde...@gmail.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Can you please try creating ocrb.traineddata with version 3.02 (in svn)?

tesser...@googlecode.com

Nov 5, 2012, 9:13:28 AM
to tesserac...@googlegroups.com
Updates:
Status: Invalid

Comment #6 on issue 595 by zde...@gmail.com: Crashbug in tesseract 3.0.1
http://code.google.com/p/tesseract-ocr/issues/detail?id=595

Closed because of missing input from reporter
