I trained on 8000 samples with the set of commands below:
echo "~/source/source.lstmf" > /home/j/trainingCurrentEng/data/list.eval
echo "~/source/source.lstmf" > /home/j/trainingCurrentEng/data/list.train
lstmtraining \
  --continue_from /home/j/trainingCurrentEng/data/checkpoints/eng_trained_checkpoint \
  --traineddata /home/j/trainingCurrentEng/data/eng.traineddata \
  --train_listfile /home/j/trainingCurrentEng/data/list.train \
  --eval_listfile /home/j/trainingCurrentEng/data/list.eval \
  --model_output /home/j/trainingCurrentEng/data/checkpoints/eng_trained \
  --learning_rate 0.001 \
  --debug_interval 10 \
  --max_iterations 8000000
lstmtraining \
  --stop_training \
  --continue_from /home/j/trainingCurrentEng/data/checkpoints/eng_trained_checkpoint \
  --traineddata /home/j/trainingCurrentEng/data/eng.traineddata \
  --model_output /home/j/trainingCurrentEng/data/eng_trained.traineddata
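For comparison, here is a minimal sketch of how the list files could be built. It rests on two assumptions that are not confirmed anywhere above: that every sample's .lstmf file sits in one directory (`sample_dir` and the sample file names below are hypothetical), and that the list files should hold absolute paths. It also shows that the shell leaves `~` unexpanded inside double quotes, so the echo commands above write the literal string `~/source/source.lstmf`, and that `>` overwrites the list file each time rather than appending to it.

```shell
# Hypothetical scratch layout standing in for the real training directories.
work_dir="$(mktemp -d)"
sample_dir="$work_dir/source"
mkdir -p "$sample_dir"
touch "$sample_dir/sample_0001.lstmf" "$sample_dir/sample_0002.lstmf"

# Inside double quotes "~" is NOT expanded: the file receives the literal text.
echo "~/source/source.lstmf" > "$work_dir/list.literal"
literal_entry="$(cat "$work_dir/list.literal")"

# One absolute path per line, covering every sample instead of a single file.
find "$sample_dir" -type f -name '*.lstmf' | sort > "$work_dir/list.train"

# Reusing the training list for evaluation mirrors the commands above.
cp "$work_dir/list.train" "$work_dir/list.eval"

train_count="$(grep -c '' "$work_dir/list.train")"
echo "literal entry: $literal_entry"
echo "training samples listed: $train_count"
```

Whether one list of all samples is what this workflow needs is an open question. The sketch only makes the contents of the list files visible.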
IMPORTANT/RESULT:
tesseract source.tiff output_text -l eng --tessdata-dir /home/j/trainingCurrentEng/data --psm 7
cat output_text.txt
abcdef
tesseract source.tiff output_text_1 -l eng_trained --tessdata-dir /home/j/trainingCurrentEng/data --psm 7
cat output_text_1.txt
laldlfk
Question:
This syntax looks better, but after training on 8000 samples the resulting eng_trained model is distorted and reads the image completely wrong.
However, if I read THE LAST sample that eng_trained was trained/updated on, it reads that exact data flawlessly.
What am I doing wrong? How do I fix the current syntax?
IMPORTANT: I use images in the same color palette: black background with a white (close to gray) font, without any masks applied.
j@j-Aspire-A515-58M:~/source$ ls
source.box source.lstmf source.tiff source.txt unicharset
Source Box:
a 251 52 355 178 0
b 356 51 444 176 0
c 446 22 530 175 0
d 534 22 622 173 0
e 626 60 766 174 0
f 768 59 870 173 0
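As a sanity check on the box file, here is a small sketch that joins the symbols into the line's text and prints each glyph's width and height. It assumes the usual Tesseract box layout of `symbol left bottom right top page` and copies the six lines above into a temporary file.

```shell
box_file="$(mktemp)"
cat > "$box_file" <<'EOF'
a 251 52 355 178 0
b 356 51 444 176 0
c 446 22 530 175 0
d 534 22 622 173 0
e 626 60 766 174 0
f 768 59 870 173 0
EOF

# Column 1 is the symbol; joined together they give the text of the line.
box_text="$(awk '{ printf "%s", $1 }' "$box_file")"

# Columns 2-5 are left, bottom, right, top, so width = right - left
# and height = top - bottom (in pixels).
box_sizes="$(awk '{ print $1, $4 - $2, $5 - $3 }' "$box_file")"

echo "$box_text"
echo "$box_sizes"
```

If the joined text ever differs from the contents of source.txt, the box file and the ground truth are probably out of sync for that sample.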
source.txt
abcdef
unicharset
9
NULL 0 Common 0
Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 15 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # Broken
a 3 0,255,0,255,0,0,0,0,0,0 Latin 3 0 3 a # a [6f ]a
b 3 0,255,0,255,0,0,0,0,0,0 Latin 4 0 4 b # b [65 ]a
c 3 0,255,0,255,0,0,0,0,0,0 Latin 5 0 5 c # c [64 ]a
d 3 0,255,0,255,0,0,0,0,0,0 Latin 6 0 6 d # d [6b ]a
e 3 0,255,0,255,0,0,0,0,0,0 Latin 7 0 7 e # e [6d ]a
f 3 0,255,0,255,0,0,0,0,0,0 Latin 8 0 8 f # f [63 ]a
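A quick consistency check on the unicharset, sketched under the assumption that the first line holds the entry count and each following line starts with the symbol itself. The temporary file below just reuses the content shown above.

```shell
unicharset_file="$(mktemp)"
cat > "$unicharset_file" <<'EOF'
9
NULL 0 Common 0
Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined # Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 15 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1 # Broken
a 3 0,255,0,255,0,0,0,0,0,0 Latin 3 0 3 a # a [6f ]a
b 3 0,255,0,255,0,0,0,0,0,0 Latin 4 0 4 b # b [65 ]a
c 3 0,255,0,255,0,0,0,0,0,0 Latin 5 0 5 c # c [64 ]a
d 3 0,255,0,255,0,0,0,0,0,0 Latin 6 0 6 d # d [6b ]a
e 3 0,255,0,255,0,0,0,0,0,0 Latin 7 0 7 e # e [6d ]a
f 3 0,255,0,255,0,0,0,0,0,0 Latin 8 0 8 f # f [63 ]a
EOF

# The count declared on line 1 should match the number of entries after it.
declared_count="$(head -n 1 "$unicharset_file")"
actual_count="$(tail -n +2 "$unicharset_file" | grep -c '')"

# First field of every entry, space-separated on one line.
symbols="$(tail -n +2 "$unicharset_file" | awk '{ printf "%s%s", (NR > 1 ? " " : ""), $1 } END { print "" }')"

echo "declared: $declared_count, actual: $actual_count"
echo "symbols: $symbols"
```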