training Indian Rupee symbol


PD

Mar 4, 2020, 1:23:29 AM
to tesseract-ocr
Hello,
To train the Indian Rupee symbol (₹), I added it in about 20 places in the eng.training_text file. I then ran tesstrain.sh with various arguments. I am running Tesseract 4.1.1 on Ubuntu, so I used /usr/share/fonts as the fonts directory. After some time it finished without errors, and the final message said to run lstmtraining. I then ran the lstmtraining command:
training/lstmtraining --debug_interval 100 \
  --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
  --net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
  --model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
  --train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
This produced a base_checkpoint file. I ran the above command for 5000 iterations.
After this, I ran lstmtraining again with the --stop_training argument:
training/lstmtraining --stop_training \
  --continue_from ~/tesstutorial/eng_from_chi/base_checkpoint \
  --traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
Do I need to perform any other steps?
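A quick sanity check on the first step (that the ₹ symbol really appears in the training text as often as intended) could look roughly like the sketch below. The helper name count_symbol is hypothetical and not part of Tesseract, and the temporary file only stands in for eng.training_text.

```shell
#!/usr/bin/env bash
# count_symbol is a hypothetical helper, not a Tesseract tool.
# It prints how many times a literal symbol occurs in a text file.
count_symbol() {
  local file="$1" symbol="$2"
  # grep -o prints every match on its own line; -F matches the symbol literally.
  # "|| true" keeps the pipeline from failing when there are no matches.
  { grep -oF -- "$symbol" "$file" || true; } | wc -l | tr -d ' '
}

# Demo on a throwaway file standing in for eng.training_text.
sample="$(mktemp)"
printf 'Price: ₹500\nTotal ₹ 1,200 and ₹75\nNo symbol here\n' > "$sample"
count_symbol "$sample" '₹'
rm -f "$sample"
```

Pointing the same function at the real eng.training_text should report a count close to the number of places where the symbol was added.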
Regards
PD
