Hey everyone,
I am training my own LSTM model using some specific images that I want Tesseract to work efficiently on. I used the following command:
$ lstmtraining --model_output=my_output.lstm --traineddata="C:\Program Files\Tesseract-OCR\tessdata\eng.traineddata" --old_traineddata="C:\Program Files\Tesseract-OCR\tessdata\eng.traineddata" --train_listfile=traindata.txt
but it is giving a very high error rate:
At iteration 40/40/40, Mean rms=5.874000%, delta=47.785000%, BCER train=99.487000%, BWER train=100.000000%, skip ratio=0.000000%, New worst BCER = 99.487000 wrote checkpoint.
Finished! Selected model with minimal training error rate (BCER) = 99.367
So my question is: what is the reason for this high error rate, given that my training file contains normal English sentences?
I think my custom model may not be leveraging the pretrained "eng.lstm" model.
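For context, as far as I understand the training docs, fine-tuning from the pretrained model would mean extracting the LSTM network from eng.traineddata with combine_tessdata and passing it to lstmtraining via --continue_from, roughly as in the sketch below. The tessdata path is taken from my command above; the eng.lstm file name, the my_output prefix and the iteration count are my own assumptions, and I believe the extraction only gives a trainable network if eng.traineddata is the float (tessdata_best) version. I have not confirmed that this fixes the error rate.

```shell
# Sketch only: paths, file names and the iteration count are assumptions.
TESSDATA='C:\Program Files\Tesseract-OCR\tessdata'
BASE_MODEL="eng.lstm"

if command -v combine_tessdata >/dev/null 2>&1 && command -v lstmtraining >/dev/null 2>&1; then
  # Extract the LSTM network from the existing traineddata file.
  combine_tessdata -e "$TESSDATA\eng.traineddata" "$BASE_MODEL"

  # Continue training from the extracted network instead of starting from scratch.
  lstmtraining \
    --continue_from="$BASE_MODEL" \
    --model_output=my_output \
    --traineddata="$TESSDATA\eng.traineddata" \
    --train_listfile=traindata.txt \
    --max_iterations=400
else
  echo "Tesseract training tools not found on PATH"
fi
```

Compared with my original command, this drops --old_traineddata (which, as I understand it, is only needed when the character set changes) and trains for more than 40 iterations.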
Thanks