What to Do With Multiple .traineddata Files of the Same Language?


Ferretsroq

Jul 8, 2015, 1:42:12 PM
to tesser...@googlegroups.com
I'm trying to train tesseract for a new font. I'm on OS X, if that matters. I've followed these steps so far, but I'm confused about what to do once you have the eng.traineddata file for the font. If I just move it to the /usr/local/share/tessdata directory, it conflicts with the existing eng.traineddata that came with tesseract. What am I supposed to do about this conflict?

I'm attaching all of the files made during the process since I'm not sure if I've just messed up elsewhere down the line. They're compressed into a .zip since it's not letting me upload some of them.
fontTrainingFiles.zip

ShreeDevi Kumar

Jul 10, 2015, 2:41:31 AM
to tesser...@googlegroups.com

Usually, if you have multiple traineddata files for the same language, you give each one a distinct name, e.g. eng and en2.

Then, if you want to use both:
-l eng+en2
or
-l en2+eng
depending on which one you want to give priority to (list that one first).

To use only your own traineddata, en2:
-l en2
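
As a rough sketch of the renaming approach above, the Python helpers below copy a custom traineddata file into the tessdata directory under a distinct name and build the matching command line. The function names `install_traineddata` and `tesseract_command` are made up for illustration, and the sketch assumes the usual `tesseract imagename outputbase -l lang` form of the command.

```python
import shutil
from pathlib import Path


def install_traineddata(src, tessdata_dir, lang_name):
    """Copy a custom .traineddata file into tessdata under a distinct name.

    Hypothetical helper: lang_name is the name you will later pass to -l,
    for example "en2".
    """
    dest = Path(tessdata_dir) / f"{lang_name}.traineddata"
    if dest.exists():
        # Refuse to overwrite an existing file such as the stock eng.traineddata.
        raise FileExistsError(f"{dest} already exists; pick another name")
    shutil.copyfile(src, dest)
    return dest


def tesseract_command(image, output_base, langs):
    """Build the argument list for tesseract.

    langs is ordered by priority, so ["en2", "eng"] becomes "-l en2+eng".
    """
    return ["tesseract", str(image), str(output_base), "-l", "+".join(langs)]
```

For example, `install_traineddata("eng.traineddata", "/usr/local/share/tessdata", "en2")` followed by `subprocess.run(tesseract_command("page.png", "out", ["en2", "eng"]), check=True)` would run with the custom font data first and the stock English data second. The image and output names are placeholders, and writing to /usr/local/share/tessdata may need elevated permissions.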

- sent from my phone. excuse the brevity

ShreeDevi Kumar

Jul 10, 2015, 2:50:47 AM
to tesser...@googlegroups.com

See https://tesseract-ocr.googlecode.com/git/doc/tesseract.1.html for the command syntax.

- sent from my phone. excuse the brevity
