Hello Shree,
I tried that. The command was
lstmtraining --traineddata data/akk/akk.traineddata --old_traineddata /usr/share/tesseract-ocr/4.00/tessdata/akk-1m.traineddata --continue_from data/akk-1m/akk.lstm --model_output data/akk/checkpoints/akk --train_listfile data/akk/list.train --eval_listfile data/akk/list.eval --max_iterations 1000 --debug_level -1
and the output started with
Loaded file data/akk/checkpoints/akk_checkpoint, unpacking...
Successfully restored trainer from data/akk/checkpoints/akk_checkpoint
Loaded 1/1 pages (1-1) of document data/akk-ground-truth/P336598.000347.CuneiformComposite.exp0.lstmf
Loaded 1/1 pages (1-1) of document data/akk-ground-truth/P238121.000012.CuneiformNAOutline_Medium.exp0.lstmf
and ended with
Loaded 1/1 pages (1-1) of document data/akk-ground-truth/Q005388.000005.Segoe_UI_Historic.exp0.lstmf
At iteration 4716762/4760600/4760600, Mean rms=1.436%, delta=8.366%, char train=105.86%, word train=86.31%, skip ratio=0%, wrote checkpoint.
Finished! Error rate = 88.246
Do I have to retrain completely from scratch, i.e., without loading the previous checkpoint?
Maybe I should try another of the approaches you suggested and train with one font excluded, so that the LSTM converges.
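If I go that route, I suppose excluding a font just means filtering its .lstmf lines out of the training list before rerunning lstmtraining. A minimal sketch, assuming my paths above and using Segoe_UI_Historic purely as an example font name taken from the log:

```shell
# Hedged sketch, paths assumed: drop one font's .lstmf files from the
# training list so lstmtraining never sees them.
EXCLUDE=Segoe_UI_Historic        # example font name from the log above
LIST=data/akk/list.train
if [ -f "$LIST" ]; then
  # grep -v keeps every line that does NOT mention the excluded font
  grep -v "$EXCLUDE" "$LIST" > "${LIST}.no_${EXCLUDE}"
fi
```

Then I would pass the filtered file via --train_listfile instead of the original list.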
Another thought: I once tried training Akkadian with Tesseract 4 using ground truth that consisted of short multi-line text files rather than single-line images. Accordingly, I used PSM 6 instead of PSM 11. Is there anything wrong with that approach?