I followed the steps for fine-tuning Tesseract for handwriting recognition. I have the character images and the corresponding box files. Then I generated the .lstmf files, followed by the lstm_train.txt and lstm_test.txt files.
However, when I launch the training using these list files, it doesn't work. When I put only a single path in each of the train and test list files, the training starts correctly.
I also know that all the .lstmf files were generated properly, because I wrote a script that trains on each file one by one, continuing from the last checkpoint each time, and this worked for every .lstmf file.
I'm not sure whether the issue is with how lstm_train.txt is generated, or whether lstmtraining only accepts a single .lstmf file as input.
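One way to rule out formatting problems in the list files themselves is a small check like the one below. It is only a sketch under stated assumptions: it assumes a list file is meant to contain one .lstmf path per line with Unix line endings, and `check_list_file` is a hypothetical helper name, not part of Tesseract.

```python
from pathlib import Path


def check_list_file(list_path):
    """Return a list of human-readable problems found in a list file.

    Assumption: the file should hold one .lstmf path per line,
    separated by "\n", with no blank lines or stray whitespace.
    """
    problems = []
    raw = Path(list_path).read_bytes()

    # A UTF-8 byte order mark would become part of the first path.
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte order mark")
        raw = raw[3:]

    # Carriage returns usually mean the file was written with Windows line endings.
    if b"\r" in raw:
        problems.append("file contains carriage returns (Windows line endings)")

    lines = raw.decode("utf-8").split("\n")
    # A single trailing newline yields one empty final element; drop it.
    if lines and lines[-1] == "":
        lines.pop()

    for number, line in enumerate(lines, start=1):
        entry = line.rstrip("\r")
        if not entry.strip():
            problems.append(f"line {number}: blank line")
        elif entry != entry.strip():
            problems.append(f"line {number}: leading or trailing whitespace")
        elif not Path(entry).is_file():
            problems.append(f"line {number}: no such file: {entry}")

    return problems
```

Calling `check_list_file("lstm_train.txt")` returns an empty list when none of these problems are found. Relative paths are resolved against the current working directory, so the check should be run from the same directory as the training command.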
Here is the code for generating the lstm_train.txt and lstm_test.txt files:
Here is an excerpt from the lstm_train.txt file: