Lstmeval results

Augustin Fourcaud

Jun 22, 2023, 8:38:51 AM
to tesseract-ocr
Hello, I fine-tuned the eng.traineddata model because I wanted it to recognize the Greek lambda symbol (λ). I got the ocrlambda.traineddata file and I want to evaluate it using lstmeval.

When I evaluate a checkpoint file with the --traineddata parameter set to eng.traineddata, I get terrible results, with this error on every iteration where the lambda appears:
************************
Encoding of string failed! Failure bytes: ce bb 4d 4e 44
Can't encode transcription: 'LY O kcXλMND' in language ''
Truth:8Az7V I vUOs
OCR  :i g8 A z. 7 V I vlU O0 s.l
Line BCER=1.000000, BWER=0.666667
*************************
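
As a side note on reading this log, the "Failure bytes" can be decoded to see which character could not be encoded. The snippet below is only a minimal Python sketch, not part of Tesseract, and the variable names are my own. It assumes the bytes are UTF-8, which matches the transcription shown in the error.

```python
import unicodedata

# Bytes copied from the "Failure bytes" line of the log.
# bytes.fromhex() skips the spaces between the hex pairs.
failure_bytes = bytes.fromhex("ce bb 4d 4e 44")

# Assumption: the bytes are UTF-8 encoded text.
text = failure_bytes.decode("utf-8")

# List each character with its code point and Unicode name.
chars = [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]
for ch, codepoint, name in chars:
    print(ch, codepoint, name)
```

The decoded text starts at the lambda (ce bb is U+03BB), so the failure begins exactly at that character. That would be consistent with the lambda being missing from the character set of the traineddata file passed to lstmeval, although I have not verified this.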

But when I evaluate with the --traineddata parameter set to ocrlambda.traineddata, I get really good results.

However, the documentation says that the --traineddata parameter should be set to the file that was given to the trainer.

Is this an error, or am I misunderstanding or doing something wrong?

Thanks a lot.

Augustin Fourcaud

Jun 26, 2023, 10:07:05 AM
to tesseract-ocr
Do you need more information about my installation and my training command? Or should I just evaluate using the --traineddata parameter?

Thanks in advance.