Recognizing numbers only and reducing the traineddata file size

alshikh alshikh

Nov 15, 2017, 10:04:03 AM
to tesseract-ocr
Hi everybody, and thank you for the incredible work.

After training I always end up with the same file (ara.traineddata): the same size and the same recognition results. For testing purposes I only need to train Tesseract on three fonts. I also tried training on numbers only, in a single font, and the output traineddata was still the same as the original one that I give Tesseract to extract the files needed for training. My commands are as follows. Did I do something wrong? Please help me.

training/tesstrain.sh \
--fonts_dir /home/idris/mylot \
--lang ara --linedata_only \
--noextract_font_properties --langdata_dir ./langdata \
--tessdata_dir ./tessdata \
--output_dir ~/mylottutorial \
--fontlist "mylotus Bold"


training/combine_tessdata -e ./tessdata/ara.traineddata \
  ~/mylottutorial/ara.lstm


training/lstmtraining --model_output ~/mylottutorial/plusminus/ \
--continue_from  ~/mylottutorial/ara.lstm \
--train_listfile ~/mylottutorial/ara.training_files.txt \
--traineddata    ./tessdata/ara.traineddata \
--old_traineddata ./tessdata/ara.traineddata \
--max_iterations 1000

training/lstmtraining --stop_training \
  --continue_from ~/mylottutorial/plusminus/_checkpoint \
  --model_output  ~/mylottutorial/plusminus/aranew.traineddata \
  --traineddata   ./tessdata/ara.traineddata
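
To check whether the new file really is byte-for-byte the same as the original, and not just similar in size, something like the following sketch could be used. The `same_file` helper is a hypothetical name, not part of Tesseract, and the paths are assumed from the commands above:

```sh
#!/bin/sh
# Hypothetical helper (not a Tesseract tool): prints "identical" when two
# files have exactly the same bytes, otherwise "different".
same_file() {
  # cmp -s prints nothing and exits 0 only when the contents match
  if cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}

# Paths assumed from the commands above; compare only if both files exist.
orig=./tessdata/ara.traineddata
new="$HOME/mylottutorial/plusminus/aranew.traineddata"
if [ -f "$orig" ] && [ -f "$new" ]; then
  wc -c "$orig" "$new"        # byte size of each file
  same_file "$orig" "$new"
fi
```

If this prints "different" while the sizes match, the training did change the model even though the file size stayed the same.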

I also noticed that the best and fast traineddata files differ in size. Does the file size affect recognition speed?