Tesseract trained successfully but gives: "Tesseract Open Source OCR Engine v3.03 with Leptonica" followed by "Segmentation fault (core dumped)"


Dovhani Foneworx

Aug 21, 2014, 6:03:47 AM
to tesser...@googlegroups.com
Hi guys, I have a problem. I have successfully trained Tesseract 3.03 on Ubuntu 14.04, but when I run tesseract on an image it crashes, even though that image was part of the training image.

I joined 4 images with ImageMagick to make one big image, and when I run tesseract it does the following:


fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ ls
aa.tif  dd.tif           kfc.normproto   kfc.Times_New_Roman.exp0.box  kfc.Times_New_Roman.exp0.txt  kfc.unicharset
bb.tif  font_properties  kfc.pffmtable   kfc.Times_New_Roman.exp0.tif  kfc.traineddata               kfc.unicharset2
cc.tif  kfc.inttemp      kfc.shapetable  kfc.Times_New_Roman.exp0.tr   kfc.unicharambigs             output_unicharset
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ tesseract aa.tif aak -l kfc
Tesseract Open Source OCR Engine v3.03 with Leptonica
Segmentation fault (core dumped)

Nick White

Aug 21, 2014, 10:04:34 AM
to tesser...@googlegroups.com
Hi Dovhani,

Does this happen with all images when using your training, or just
one?

Nick
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an email
> to tesseract-oc...@googlegroups.com.
> To post to this group, send email to tesser...@googlegroups.com.
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit https://groups.google.com/d/msgid/
> tesseract-ocr/57d62452-3bbd-4ae8-96a2-fb2d1404cee9%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Dovhani Foneworx

Aug 21, 2014, 10:19:44 AM
to tesser...@googlegroups.com
Hi Nick, this happens with every image I test.

I also have 4 images that I joined together into one big image using ImageMagick, and I am now testing with each of the individual images as well as the big image.

These images are till slips from the same shop.

The same problem happens every time.





Nick White

Aug 21, 2014, 10:49:45 AM
to tesser...@googlegroups.com
In that case it must be a problem with your training data. Can you
let us know the exact commands you used to create it?

Alternatively, you could post a gdb backtrace, if you know how to do
that.
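
One way to capture such a backtrace, as a minimal sketch: it assumes gdb is installed and reuses the image, output base, and language from the original post. The helper name `gdb_backtrace_cmd` is hypothetical. The trace will likely be more readable if tesseract was built with debug symbols, which is an assumption about the local build.

```shell
# Build a gdb command line that runs tesseract and prints a backtrace
# if the program crashes.
# Arguments: input image, output base name, language code.
gdb_backtrace_cmd() {
    printf 'gdb -batch -ex run -ex bt --args tesseract %s %s -l %s\n' "$1" "$2" "$3"
}

# Print the command for review; paste it into the shell to actually run it.
gdb_backtrace_cmd aa.tif aak kfc
```

Here `-batch` makes gdb exit when it is done, `-ex run` starts the program, and `-ex bt` prints the call stack at the point of the crash.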

Nick

Dovhani Foneworx

Aug 21, 2014, 10:55:02 AM
to tesser...@googlegroups.com
The following is the output from each step of the training process.





fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ tesseract kfc.Times_New_Roman.exp0.tif kfc.Times_New_Roman.exp0 box.train

Tesseract Open Source OCR Engine v3.03 with Leptonica
row xheight=40, but median xheight = 48
row xheight=40.4667, but median xheight = 48
row xheight=11, but median xheight = 48
row xheight=11, but median xheight = 48
row xheight=38.8333, but median xheight = 48
row xheight=26, but median xheight = 48
row xheight=63, but median xheight = 48
row xheight=61, but median xheight = 48
row xheight=39.8571, but median xheight = 48
row xheight=10, but median xheight = 48
row xheight=10, but median xheight = 48
row xheight=39.1667, but median xheight = 48
row xheight=39.1667, but median xheight = 48
row xheight=78.5, but median xheight = 48
row xheight=70.5, but median xheight = 48
row xheight=70, but median xheight = 48
row xheight=64.6667, but median xheight = 48
row xheight=90.5, but median xheight = 48
row xheight=90, but median xheight = 48
row xheight=65, but median xheight = 48
row xheight=64, but median xheight = 48
row xheight=64.1667, but median xheight = 48
row xheight=64.1667, but median xheight = 48
row xheight=64.1667, but median xheight = 48
row xheight=22.1875, but median xheight = 48
row xheight=40.75, but median xheight = 48
row xheight=40.75, but median xheight = 48
row xheight=40.1667, but median xheight = 48
row xheight=70, but median xheight = 48
row xheight=70.5, but median xheight = 48
row xheight=8264, but median xheight = 48
FAIL!
APPLY_BOXES: boxfile line 431/’ ((1346,11952),(1348,11954)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 1244/~ ((0,0),(3096,16512)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
   Boxes read from boxfile:    1244
   Boxes failed resegmentation:       2
APPLY_BOXES: Unlabelled word at :Bounding box=(2064,698)->(2067,700)
APPLY_BOXES: Unlabelled word at :Bounding box=(0,-8)->(3096,16520)
   Found 1242 good blobs.
   Leaving 11 unlabelled blobs in 0 words.
   2 remaining unlabelled words deleted.
TRAINING ... Font name = Times_New_Roman
Generated training data for 231 words
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$
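
In the output above, box file line 1244 has the bounding box ((0,0),(3096,16512)), which looks like it spans the whole page, and one row reports an xheight of 8264. A minimal sketch for listing such oversized entries follows. It assumes the usual box-file layout of `<glyph> <left> <bottom> <right> <top> <page>`; the function name `find_oversized_boxes` and the 500-pixel limit in the usage comment are hypothetical choices, not values from the training tools.

```shell
# List box-file entries whose width or height exceeds a pixel limit.
# Arguments: box file path, maximum allowed width/height in pixels.
# Assumed line format: <glyph> <left> <bottom> <right> <top> <page>
find_oversized_boxes() {
    awk -v max="$2" '
        # $2..$5 here are awk fields: left, bottom, right, top
        ($4 - $2) > max || ($5 - $3) > max { print NR ": " $0 }
    ' "$1"
}

# Usage (file name from the post, limit chosen arbitrarily):
#   find_oversized_boxes kfc.Times_New_Roman.exp0.box 500
```

Each reported line is prefixed with its line number in the box file, so it can be compared against the line numbers in the APPLY_BOXES messages.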




fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ unicharset_extractor kfc.Times_New_Roman.exp0.box
Extracting unicharset from kfc.Times_New_Roman.exp0.box
Wrote unicharset file ./unicharset.
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$




fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ set_unicharset_properties -U unicharset -O outunicharset --script_dir=../../
Loaded unicharset of size 89 from file unicharset
Setting unichar properties
Other case JOINED of Joined is not in unicharset
Other case |BROKEN|0|1 of |Broken|0|1 is not in unicharset
Mirror > of < is not in unicharset
Other case H of h is not in unicharset
Other case Y of y is not in unicharset
Other case Z of z is not in unicharset
Other case Q of q is not in unicharset
Mirror [ of ] is not in unicharset
Other case x of X is not in unicharset
Other case DE of de is not in unicharset
Other case J of j is not in unicharset
Writing unicharset to file outunichar...@foneworxtest.foneworx.co.za:




fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ shapeclustering -F font_properties -U outunicharset kfc.Times_New_Roman.exp0.tr
Reading kfc.Times_New_Roman.exp0.tr ...
Building master shape table
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85
Distance = 0.000000: Distance = 0.000000: Distance = 0.000000: Distance = 0.003497: Distance = 0.004219: Distance = 0.008547: Distance = 0.011494: Distance = 0.011905: Distance = 0.017964: Distance = 0.024390: Distance = 0.024793: Stopped with 11 merged, min dist 0.025424
Master shape_table:Number of shapes = 75 max unichars = 4 number with multiple unichars = 8
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$





fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ mftraining -F font_properties -U outunicharset -O chk.outunicharset kfc.Times_New_Roman.exp0.tr
Read shape table shapetable of 75 shapes
Reading kfc.Times_New_Roman.exp0.tr ...
Warning: no protos/configs for sh0072 in CreateIntTemplates()
Warning: no protos/configs for sh0073 in CreateIntTemplates()
Warning: no protos/configs for sh0074 in CreateIntTemplates()
Done!
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$





fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ cntraining kfc.Times_New_Roman.exp0.tr
Reading kfc.Times_New_Roman.exp0.tr ...
Clustering ...

Writing normproto ...
fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$

fone...@foneworxtest.foneworx.co.za:~/DM/Tesseracting/TESTDIR/Test2/images2/kfc$ combine_tessdata kfc.
Combining tessdata files
TessdataManager combined tesseract data files.
Offset for type  0 (kfc.config                ) is -1
Offset for type  1 (kfc.unicharset            ) is 140
Offset for type  2 (kfc.unicharambigs         ) is -1
Offset for type  3 (kfc.inttemp               ) is 5692
Offset for type  4 (kfc.pffmtable             ) is 609452
Offset for type  5 (kfc.normproto             ) is 610092
Offset for type  6 (kfc.punc-dawg             ) is -1
Offset for type  7 (kfc.word-dawg             ) is -1
Offset for type  8 (kfc.number-dawg           ) is -1
Offset for type  9 (kfc.freq-dawg             ) is -1
Offset for type 10 (kfc.fixed-length-dawgs    ) is -1
Offset for type 11 (kfc.cube-unicharset       ) is -1
Offset for type 12 (kfc.cube-word-dawg        ) is -1
Offset for type 13 (kfc.shapetable            ) is 620609
Offset for type 14 (kfc.bigram-dawg           ) is -1
Offset for type 15 (kfc.unambig-dawg          ) is -1
Offset for type 16 (kfc.params-model          ) is -1
Output kfc.traineddata created sucessfully.
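
In the listing above, an offset of -1 appears to mark a component that was not packed into kfc.traineddata. A minimal sketch for pulling out only the components that were included, assuming the `Offset for type ...` line format shown above and reading the saved output on standard input (the function name `list_included_components` is hypothetical):

```shell
# Read combine_tessdata output on stdin and print the names of the
# components whose offset is not -1.
list_included_components() {
    awk '/^Offset for type/ && $NF != -1 {
        name = $0
        sub(/^[^(]*\(/, "", name)   # drop everything up to and including "("
        sub(/ *\).*$/, "", name)    # drop the padding, ")" and the rest
        print name
    }'
}

# Usage, with the output saved to a hypothetical file combine.log:
#   list_included_components < combine.log
```

Applied to the output above, this would list kfc.unicharset, kfc.inttemp, kfc.pffmtable, kfc.normproto, and kfc.shapetable.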





