Re: [tesseract-ocr] Training Tesseract Can't Find Files

ShreeDevi Kumar

Nov 10, 2014, 9:07:18 PM
to tesser...@googlegroups.com
What method are you using for training? 

Which version of tesseract?

What platform?

Please see instructions on

The following shell script will be useful if you are using the latest source from git.

It gives a good overview of the training process.

tlog "\n=== Starting training for language '${LANG_CODE}'"

tlog "Cleaning workspace directory ${TRAINING_DIR}..."
mkdir -p "${TRAINING_DIR}"
# ${VAR:?} aborts if TRAINING_DIR is unset or empty, so this can never expand to /*
rm -fr "${TRAINING_DIR:?}"/*

phaseI_generate_image
phaseUP_generate_unicharset
phaseD_generate_dawg
phaseE_extract_features
phaseC_cluster_prototypes
phaseS_cluster_shapes
phaseM_cluster_microfeatures
phaseB_generate_ambiguities
make_traineddata

tlog "\nCompleted training for language '${LANG_CODE}'\n"
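
On the missing files in the question quoted below: as far as I recall, in the manual 3.0x training steps inttemp and pffmtable are written by mftraining and normproto by cntraining, and each file then has to be renamed with the language prefix before combining. Before running combine_tessdata it can help to check which components are still absent. A minimal sketch, assuming the file list from the question; the function name missing_components and its two arguments are made up for illustration and are not part of tesseract:

```shell
#!/usr/bin/env bash
# Sketch only: print every expected component file that is absent for a
# given language prefix. The extension list is copied from the question;
# missing_components is a hypothetical helper, not a tesseract tool.

missing_components() {
  local dir="$1" lang="$2" ext
  for ext in config unicharset unicharambigs inttemp pffmtable normproto; do
    # Print the file name only when the file does not exist.
    [ -f "${dir}/${lang}.${ext}" ] || printf '%s\n' "${lang}.${ext}"
  done
}

# Example call (the directory and prefix are assumptions):
#   missing_components tessdata VIN
```

With only the three VIN files listed in the question present, this would report VIN.config, VIN.inttemp, VIN.pffmtable and VIN.normproto as missing.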




ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Tue, Nov 11, 2014 at 12:54 AM, <ste...@fortyau.com> wrote:
I'm rather new to the tesseract game. I've followed the training steps and got to the point where I need to combine. I notice I need at least the files below:
  • tessdata/eng.config
  • tessdata/eng.unicharset
  • tessdata/eng.unicharambigs
  • tessdata/eng.inttemp
  • tessdata/eng.pffmtable
  • tessdata/eng.normproto
Now, I have followed the steps and have:
  • VIN.shapetable
  • VIN.unicharambigs
  • VIN.unicharset
Where do I get the config, inttemp, pffmtable, and normproto from? What are those files? The instructions for training do not include their generation anywhere from what I can see. Any help would be appreciated.
