Hi,
I have the latest version of Tesseract and Leptonica 1.74.4. I ran this command:
training/lstmtraining --model_output /home/ibr/latest_leptonica_4/lstmf_old_jpn/jpn \
--continue_from /home/ibr/latest_leptonica_4/jpn_tune/extracted/jpn.lstm \
--traineddata /home/ibr/latest_leptonica_4/lstmf_jpn_lep4/jpn/jpn.traineddata \
--old_traineddata /home/ibr/latest_leptonica_4/jpn_tune/original_traineddata/jpn.traineddata \
--train_listfile /home/ibr/latest_leptonica_4/jpn_tune/jpn.training_files.txt \
--max_iterations 18000
and then this command:
training/lstmtraining --stop_training \
--continue_from /path/to/fine_tune/results/lang_checkpoint \
--traineddata /path/to/starter_traineddata/lang.traineddata \
--model_output /path/to/new/tuned_lang.traineddata/lang.traineddata
to create the traineddata file. However, the accuracy of the official "best_traineddata" is better than what I got. I have read in this OCR group and in some GitHub comments that too many iterations do not give the best results, so I was wondering: what is the optimal number of iterations for the best results?
The first command above is from "Fine Tuning for ± a few characters". What are the commands for "Fine Tuning for Impact"? I tried this command:
training/lstmtraining --model_output ~/tesstutorial/impact_from_small/impact \
--continue_from ~/tesstutorial/engoutput/base_checkpoint \
--traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
--train_listfile ~/tesstutorial/engeval/eng.training_files.txt \
--max_iterations 1200
but what are the next commands?
Thanks