Unicharambigs not working properly


odnowa...@gmail.com

Jul 11, 2017, 12:37:54 PM
to tesseract-ocr

I have a problem training Tesseract with a font I created. After the whole process of generating the Tesseract files and combining them, Tesseract reads every "7" as "?". The font has both characters.

I created unicharambigs file containing:


v1
1   ?   1   7   1


It's saved in Vi in Unix file format and contains a newline character after the last line. It should replace every '?' with '7'.
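To rule out a malformed file, the layout can be checked with a small script. This is only a sketch under my own reading of the format: a `v1` header line, then whitespace-separated fields giving the source count, the source characters, the target count, the target characters, and a type flag. The function names `parse_ambig_line` and `check_ambigs` are made up for this post, and this is not Tesseract's own parser.

```python
def parse_ambig_line(line):
    """Split one data line into (source_chars, target_chars, type_flag).

    Assumed layout: <n_src> <src...> <n_dst> <dst...> <type>
    """
    tokens = line.split()
    if len(tokens) < 5:
        raise ValueError(f"too few fields: {line!r}")
    n_src = int(tokens[0])
    src = tokens[1:1 + n_src]
    if len(tokens) < 2 + n_src:
        raise ValueError(f"missing target count: {line!r}")
    n_dst = int(tokens[1 + n_src])
    dst = tokens[2 + n_src:2 + n_src + n_dst]
    rest = tokens[2 + n_src + n_dst:]
    # The declared counts must match the fields actually present,
    # and exactly one trailing type flag must remain.
    if len(src) != n_src or len(dst) != n_dst or len(rest) != 1:
        raise ValueError(f"field counts do not match: {line!r}")
    return src, dst, int(rest[0])


def check_ambigs(text):
    """Check the header and parse every non-empty data line."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "v1":
        raise ValueError("first line is not the 'v1' header")
    return [parse_ambig_line(l) for l in lines[1:] if l.strip()]


if __name__ == "__main__":
    print(check_ambigs("v1\n1   ?   1   7   1\n"))
```

My file passes this check, so the layout looks consistent. That says nothing about whether the rule is actually applied during recognition.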

Combining gives me this result:

 
    Combining tessdata files
    TessdataManager combined tesseract data files.
    Offset for type  0 (SmAftersale.config                ) is -1
    Offset for type  1 (SmAftersale.unicharset            ) is 140
    Offset for type  2 (SmAftersale.unicharambigs         ) is 3047
    Offset for type  3 (SmAftersale.inttemp               ) is 3061
    Offset for type  4 (SmAftersale.pffmtable             ) is 350802
    Offset for type  5 (SmAftersale.normproto             ) is 351219
    Offset for type  6 (SmAftersale.punc-dawg             ) is -1
    Offset for type  7 (SmAftersale.word-dawg             ) is -1
    Offset for type  8 (SmAftersale.number-dawg           ) is -1
    Offset for type  9 (SmAftersale.freq-dawg             ) is -1
    Offset for type 10 (SmAftersale.fixed-length-dawgs    ) is -1
    Offset for type 11 (SmAftersale.cube-unicharset       ) is -1
    Offset for type 12 (SmAftersale.cube-word-dawg        ) is -1
    Offset for type 13 (SmAftersale.shapetable            ) is 357761
    Offset for type 14 (SmAftersale.bigram-dawg           ) is -1
    Offset for type 15 (SmAftersale.unambig-dawg          ) is -1
    Offset for type 16 (SmAftersale.params-model          ) is -1
    Output SmAftersale.traineddata created successfully.


The offset for the "SmAftersale.unicharambigs" file is not -1, so I assume the file was read. But still, after all that, Tesseract keeps reading every '7' as '?'.
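For anyone who wants to check the same thing on their own output, here is a rough sketch that lists the components with an offset other than -1. It assumes the log lines look exactly like the ones pasted above and that -1 marks a component that was not packed. The function name `included_components` is made up for this post.

```python
import re

# Matches lines such as:
#   Offset for type  2 (SmAftersale.unicharambigs         ) is 3047
LINE_RE = re.compile(r"Offset for type\s+(\d+)\s+\((\S+)\s*\)\s+is\s+(-?\d+)")


def included_components(log_text):
    """Return {component_name: offset} for every entry whose offset is not -1."""
    found = {}
    for match in LINE_RE.finditer(log_text):
        name = match.group(2)
        offset = int(match.group(3))
        if offset != -1:
            found[name] = offset
    return found


if __name__ == "__main__":
    sample = (
        "Offset for type  0 (SmAftersale.config                ) is -1\n"
        "Offset for type  2 (SmAftersale.unicharambigs         ) is 3047\n"
    )
    print(included_components(sample))
```

A component appearing in this list only shows that it was written into the combined file, not that it has any effect at recognition time.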

What did I do wrong, or what did I miss?
