Issue after generating trained data


Venkata Vijaya Krishna NR

Sep 12, 2019, 4:17:27 AM
to tesseract-ocr
Hi,

I am preparing trained data with jTessBoxEditor and qt-box-editor on Windows 10. Once the trained data is generated, I use it with the Tesseract engine in a .NET application.
Note: How do I generate a trained data file with qt-box-editor? I am only able to prepare the box file. In the settings I set the Tesseract path to C:/User/tesseract-OCR, but the trained data comes out empty.
Note:
1. In jTessBoxEditor I load the .tif file in the Box Editor. If any letters in the box file are wrong, I change that row's x position, width and height.
2. Next, I go to the Trainer tab, browse to the Tesseract executable, and choose "combine tess_data".
3. Next, in the Training Data browser, I select the created .box file and click "Run Tesseract for Training".
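For reference, the first steps that the Trainer tab runs can be sketched as plain argument lists, based only on the commands echoed in the log below. This is a rough sketch, not what jTessBoxEditor does internally: `training_commands` and its parameter names are hypothetical, the `langdata` folder in the usage lines is a placeholder, and any steps that come after shape clustering are left out because the log does not show them.

```python
def training_commands(tess_dir, stems, font_properties, script_dir):
    """Return argument lists for the training steps seen in the log, in order.

    tess_dir        -- folder holding the Tesseract training executables
    stems           -- image/box base names, e.g. "ims.exp0"
    font_properties -- name of the font_properties file
    script_dir      -- folder passed to --script_dir (assumed to hold Latin.unicharset)
    """
    def tool(name):
        return f"{tess_dir}/{name}"

    cmds = []
    # One box.train run per training image.
    for stem in stems:
        cmds.append([tool("tesseract"), f"{stem}.tif", stem, "box.train"])
    # Build the character set from all box files.
    cmds.append([tool("unicharset_extractor")] + [f"{s}.box" for s in stems])
    # Fill in character properties from the script directory.
    cmds.append([
        tool("set_unicharset_properties"),
        "-U", "unicharset", "-O", "unicharset",
        f"--script_dir={script_dir}",
    ])
    # Cluster shapes from the .tr files written by box.train.
    cmds.append(
        [tool("shapeclustering"), "-F", font_properties, "-U", "unicharset"]
        + [f"{s}.tr" for s in stems]
    )
    return cmds


if __name__ == "__main__":
    stems = ["ims.exp0", "ims.exp1", "IMS.ims.exp2"]
    for cmd in training_commands("tesseract-ocr", stems, "IMS.font_properties", "langdata"):
        print(" ".join(cmd))
```

Printing the lists this way makes it easier to rerun a single step by hand and see which one goes wrong.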
I am facing this issue:

** Run Tesseract for Training **
[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/tesseract, ims.exp0.tif, ims.exp0, box.train]
Tesseract Open Source OCR Engine v4.0.0.20181030 with Leptonica
Page 1
row xheight=28, but median xheight = 37.4674
row xheight=2, but median xheight = 37.4674
row xheight=2, but median xheight = 37.4674
FAIL!
APPLY_BOXES: boxfile line 340/ë ((753,2219),(772,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 341/û ((768,2219),(822,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 342/ì ((816,2219),(835,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 343/à ((832,2219),(886,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 344/ù ((880,2219),(902,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 345/î ((898,2219),(935,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 346/î ((931,2219),(967,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 347/ó ((964,2219),(1005,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 348/Þ ((999,2219),(1054,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 349/ì ((1047,2219),(1066,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 350/À ((1062,2219),(1105,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 351/Þ ((1100,2219),(1154,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 352/í ((1143,2219),(1160,2268)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 440/ë ((411,1817),(430,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 441/³ ((426,1817),(480,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 442/ì ((473,1817),(492,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 443/à ((488,1817),(543,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 444/ù ((537,1817),(559,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 445/î ((555,1817),(592,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 446/ï ((588,1817),(625,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 447/î ((620,1817),(657,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 448/í ((648,1817),(665,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 450/ï ((680,1817),(718,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 451/ð ((713,1817),(755,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 452/ù ((750,1817),(772,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 453/î ((768,1817),(805,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 454/î ((801,1817),(838,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 455/Þ ((834,1817),(888,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 456/í ((877,1817),(894,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 457/ë ((928,1817),(947,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 458/û ((943,1817),(998,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 459/ì ((991,1817),(1010,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 460/à ((1007,1817),(1061,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 461/ù ((1055,1817),(1077,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 462/î ((1073,1817),(1110,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 463/î ((1106,1817),(1143,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 464/ó ((1139,1817),(1180,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 465/Þ ((1174,1817),(1229,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 466/ì ((1222,1817),(1241,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 467/À ((1237,1817),(1280,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 468/Þ ((1275,1817),(1329,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES: boxfile line 469/í ((1318,1817),(1335,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
   Boxes read from boxfile:    1078
   Boxes failed resegmentation:      42
APPLY_BOXES: Unlabelled word at :Bounding box=(928,1819)->(1335,1864)
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1863)->(1335,1867)
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1816)->(1335,1820)
   Found 1036 good blobs.
   Leaving 23 unlabelled blobs in 0 words.
   3 remaining unlabelled words deleted.
Generated training data for 57 words
Page 2
row xheight=12, but median xheight = 24.4821
APPLY_BOXES:
   Boxes read from boxfile:     916
   Found 916 good blobs.
Generated training data for 50 words

[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/tesseract, ims.exp1.tif, ims.exp1, box.train]
Tesseract Open Source OCR Engine v4.0.0.20181030 with Leptonica
Page 1
row xheight=2, but median xheight = 38.4333
row xheight=2, but median xheight = 38.4333
row xheight=2, but median xheight = 38.4333
row xheight=2, but median xheight = 38.4333
FAIL!
APPLY_BOXES: boxfile line 355/ë ((1202,2219),(1217,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 356/À ((1217,2219),(1254,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 357/í ((1254,2219),(1266,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 483/ë ((1456,1817),(1471,1866)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 484/À ((1471,1817),(1509,1866)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 485/í ((1509,1817),(1520,1866)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
   Boxes read from boxfile:    1097
   Boxes failed resegmentation:       6
APPLY_BOXES: Unlabelled word at :Bounding box=(753,2265)->(1361,2269)
APPLY_BOXES: Unlabelled word at :Bounding box=(753,2218)->(1361,2222)
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1863)->(1615,1867)
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1816)->(1615,1820)
   Found 1091 good blobs.
   4 remaining unlabelled words deleted.
Generated training data for 56 words
Page 2
row xheight=12, but median xheight = 24.4821
APPLY_BOXES:
   Boxes read from boxfile:     917
   Found 917 good blobs.
Generated training data for 46 words

[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/tesseract, IMS.ims.exp2.tif, IMS.ims.exp2, box.train]
Tesseract Open Source OCR Engine v4.0.0.20181030 with Leptonica
Page 1
row xheight=2, but median xheight = 36.4674
row xheight=2, but median xheight = 36.4674
FAIL!
APPLY_BOXES: boxfile line 340/ë ((738,2219),(753,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 341/û ((755,2219),(803,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 342/ì ((805,2219),(816,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 343/à ((817,2219),(866,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 344/ù ((866,2219),(883,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 345/î ((884,2219),(916,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 346/î ((917,2219),(949,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 347/ó ((950,2219),(984,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 348/Þ ((986,2219),(1033,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 349/ì ((1034,2219),(1047,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 350/À ((1047,2219),(1085,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 351/Þ ((1086,2219),(1134,2268)): FAILURE! Couldn't find a matching blob
FAIL!
APPLY_BOXES: boxfile line 352/í ((1134,2219),(1144,2268)): FAILURE! Couldn't find a matching blob
APPLY_BOXES:
   Boxes read from boxfile:    1081
   Boxes failed resegmentation:      13
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1863)->(1399,1867)
APPLY_BOXES: Unlabelled word at :Bounding box=(411,1816)->(1399,1820)
   Found 1068 good blobs.
   2 remaining unlabelled words deleted.
Generated training data for 56 words
Page 2
row xheight=12, but median xheight = 24.5
row xheight=53, but median xheight = 24.5
APPLY_BOXES:
   Boxes read from boxfile:     921
   Found 921 good blobs.
Generated training data for 47 words

** Compute the Character Set **
[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/unicharset_extractor, ims.exp0.box, ims.exp1.box, IMS.ims.exp2.box]
Extracting unicharset from box file ims.exp0.box
Extracting unicharset from box file ims.exp1.box
Extracting unicharset from box file IMS.ims.exp2.box
Other case d of D is not in unicharset
Other case r of R is not in unicharset
Other case a of A is not in unicharset
Other case w of W is not in unicharset
Other case i of I is not in unicharset
Other case n of N is not in unicharset
Other case g of G is not in unicharset
Other case o of O is not in unicharset
Other case f of F is not in unicharset
Other case m of M is not in unicharset
Other case e of E is not in unicharset
Other case h of H is not in unicharset
Other case t of T is not in unicharset
Other case v of V is not in unicharset
Other case c of C is not in unicharset
Other case k of K is not in unicharset
Other case p of P is not in unicharset
Other case u of U is not in unicharset
Other case b of B is not in unicharset
Other case l of L is not in unicharset
Other case y of Y is not in unicharset
Other case Ë of ë is not in unicharset
Other case Ì of ì is not in unicharset
Other case Ù of ù is not in unicharset
Other case Î of î is not in unicharset
Other case Ó of ó is not in unicharset
Other case þ of Þ is not in unicharset
Other case Í of í is not in unicharset
Other case Ï of ï is not in unicharset
Other case Ð of ð is not in unicharset
Other case q of Q is not in unicharset
Other case z of Z is not in unicharset
Other case j of J is not in unicharset
Other case ú of Ú is not in unicharset
Wrote unicharset file unicharset

[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/set_unicharset_properties, -U, unicharset, -O, unicharset, --script_dir=C:\Users\Admin\Desktop\NewProject]
Loaded unicharset of size 71 from file unicharset
Setting unichar properties
Other case d of D is not in unicharset
Other case r of R is not in unicharset
Other case a of A is not in unicharset
Other case w of W is not in unicharset
Other case i of I is not in unicharset
Other case n of N is not in unicharset
Other case g of G is not in unicharset
Other case o of O is not in unicharset
Other case f of F is not in unicharset
Other case m of M is not in unicharset
Other case e of E is not in unicharset
Other case h of H is not in unicharset
Other case t of T is not in unicharset
Other case v of V is not in unicharset
Other case c of C is not in unicharset
Other case k of K is not in unicharset
Other case p of P is not in unicharset
Other case u of U is not in unicharset
Other case b of B is not in unicharset
Other case l of L is not in unicharset
Other case y of Y is not in unicharset
Other case Ë of ë is not in unicharset
Other case Ì of ì is not in unicharset
Other case Ù of ù is not in unicharset
Other case Î of î is not in unicharset
Other case Ó of ó is not in unicharset
Other case þ of Þ is not in unicharset
Other case Í of í is not in unicharset
Other case Ï of ï is not in unicharset
Other case Ð of ð is not in unicharset
Other case q of Q is not in unicharset
Other case z of Z is not in unicharset
Other case j of J is not in unicharset
Other case ú of Ú is not in unicharset
Setting script properties
Failed to load script unicharset from:C:\Users\Admin\Desktop\NewProject/Latin.unicharset
Warning: properties incomplete for index 3 = 8
Warning: properties incomplete for index 4 = 7
Warning: properties incomplete for index 5 = 6
Warning: properties incomplete for index 6 = 5
Warning: properties incomplete for index 7 = 4
Warning: properties incomplete for index 8 = 3
Warning: properties incomplete for index 9 = 2
Warning: properties incomplete for index 10 = 1
Warning: properties incomplete for index 11 = D
Warning: properties incomplete for index 12 = R
Warning: properties incomplete for index 13 = A
Warning: properties incomplete for index 14 = W
Warning: properties incomplete for index 15 = I
Warning: properties incomplete for index 16 = N
Warning: properties incomplete for index 17 = G
Warning: properties incomplete for index 18 = O
Warning: properties incomplete for index 19 = F
Warning: properties incomplete for index 20 = M
Warning: properties incomplete for index 21 = E
Warning: properties incomplete for index 22 = S
Warning: properties incomplete for index 23 = H
Warning: properties incomplete for index 24 = T
Warning: properties incomplete for index 25 = .
Warning: properties incomplete for index 26 = V
Warning: properties incomplete for index 27 = C
Warning: properties incomplete for index 28 = ±
Warning: properties incomplete for index 29 = 0
Warning: properties incomplete for index 30 = K
Warning: properties incomplete for index 31 = +
Warning: properties incomplete for index 32 = -
Warning: properties incomplete for index 33 = P
Warning: properties incomplete for index 34 = U
Warning: properties incomplete for index 35 = B
Warning: properties incomplete for index 36 = X
Warning: properties incomplete for index 37 = L
Warning: properties incomplete for index 38 = 9
Warning: properties incomplete for index 39 = (
Warning: properties incomplete for index 40 = )
Warning: properties incomplete for index 41 = /
Warning: properties incomplete for index 42 = Y
Warning: properties incomplete for index 43 = ¡
Warning: properties incomplete for index 44 = ë
Warning: properties incomplete for index 45 = û
Warning: properties incomplete for index 46 = ì
Warning: properties incomplete for index 47 = à
Warning: properties incomplete for index 48 = ù
Warning: properties incomplete for index 49 = î
Warning: properties incomplete for index 50 = ó
Warning: properties incomplete for index 51 = Þ
Warning: properties incomplete for index 52 = À
Warning: properties incomplete for index 53 = í
Warning: properties incomplete for index 54 = °
Warning: properties incomplete for index 55 = ³
Warning: properties incomplete for index 56 = ï
Warning: properties incomplete for index 57 = ð
Warning: properties incomplete for index 58 = »
Warning: properties incomplete for index 59 = x
Warning: properties incomplete for index 60 = :
Warning: properties incomplete for index 61 = Q
Warning: properties incomplete for index 62 = ,
Warning: properties incomplete for index 63 = Z
Warning: properties incomplete for index 64 = J
Warning: properties incomplete for index 65 = =
Warning: properties incomplete for index 66 = s
Warning: properties incomplete for index 67 = Û
Warning: properties incomplete for index 68 = Ú
Warning: properties incomplete for index 69 = ´
Warning: properties incomplete for index 70 = «
Writing unicharset to file unicharset

** Shape Clustering **
[E:\jTessBoxEditor-2.2.0\jTessBoxEditor\tesseract-ocr/shapeclustering, -F, IMS.font_properties, -U, unicharset, ims.exp0.tr, ims.exp1.tr, IMS.ims.exp2.tr]
Reading ims.exp0.tr ...
Reading ims.exp1.tr ...
Reading IMS.ims.exp2.tr ...
Building master shape table
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances...
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0
Stopped with 0 merged, min dist 999.000000
Computing shape distances... 0    
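
Since the log above is long, a small script can pull out just the "Couldn't find a matching blob" entries and group them by training image, so the affected box file lines can be rechecked in the box editor. This is a minimal sketch that assumes the log text has been saved to a file in exactly the format shown above; `failed_boxes` and the file name `training.log` are hypothetical names, and the four coordinates are kept in the order the log prints them.

```python
import re
from collections import defaultdict

# Matches lines such as:
# APPLY_BOXES: boxfile line 340/x ((753,2219),(772,2268)): FAILURE! Couldn't find a matching blob
FAIL_RE = re.compile(
    r"APPLY_BOXES: boxfile line (\d+)/(\S+) "
    r"\(\((\d+),(\d+)\),\((\d+),(\d+)\)\): FAILURE!"
)


def failed_boxes(log_text):
    """Map each training image to a list of (box line, character, coordinates)."""
    failures = defaultdict(list)
    current = "(unknown image)"
    for raw in log_text.splitlines():
        line = raw.strip()
        # A bracketed command ending in box.train starts a new image.
        if line.startswith("[") and line.endswith("box.train]"):
            current = line[1:-1].split(", ")[1]
            continue
        match = FAIL_RE.search(line)
        if match:
            line_no = int(match.group(1))
            char = match.group(2)
            coords = tuple(int(g) for g in match.groups()[2:])
            failures[current].append((line_no, char, coords))
    return dict(failures)


if __name__ == "__main__":
    with open("training.log", encoding="utf-8") as handle:
        for image, items in failed_boxes(handle.read()).items():
            print(image, "failed boxes:", len(items))
            for line_no, char, coords in items:
                print("  box line", line_no, char, coords)
```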