lstmtraining query

Samruddhi Dhake

Sep 16, 2021, 5:24:25 AM
to tesseract-ocr
Hello,

lstmtraining --model_output="D:\Test\output" --continue_from="D:\Test\Dim_test.lstmf" --train_listfile="D:\Test\eng.training_files.txt"  --traineddata="D:\Test\eng\eng.traineddata" --debug_interval -1 -max_iterations 10

After running the above command, I got:

Warning: given outputs 111 not equal to unicharset of 110.
Num outputs,weights in Series:
  1,36,0,1:1, 0
Num outputs,weights in Series:
  C3,3:9, 0
  Ft16:16, 160
Total weights = 160
  [C3,3Ft16]:16, 160
  Mp3,3:16, 0
  Lfys48:48, 12480
  Lfx96:96, 55680
  Lrx96:96, 74112
  Lfx256:256, 361472
  Fc110:110, 28270
Total weights = 532174
Built network:[1,36,0,1[C3,3Ft16]Mp3,3Lfys48Lfx96Lrx96Lfx256Fc110] from request [1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]
Training parameters:
  Debug interval = -1, weights = 0.1, learning rate = 0.001, momentum=0.5
null char=109
Loaded 2/2 lines (1-2) of document D:\Test\Dim_test.lstmf
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 0: GROUND  TRUTH : +1.5
Iteration 0: ALIGNED TRUTH : ++11..55
Iteration 0: BEST OCR TEXT : _B_f_t_t_t_t_f
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.855%, delta=27.586%, train=450%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 1: GROUND  TRUTH : +1.5
Iteration 1: ALIGNED TRUTH : ++11..55
Iteration 1: BEST OCR TEXT : _______
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.839%, delta=27.586%, train=362.5%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 2: GROUND  TRUTH : +1.5
Iteration 2: ALIGNED TRUTH : ++11..55
Iteration 2: BEST OCR TEXT : _+++++++
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.821%, delta=27.586%, train=325%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 3: GROUND  TRUTH : +1.5
Iteration 3: ALIGNED TRUTH : ++11..55
Iteration 3: BEST OCR TEXT : +++++++
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.798%, delta=25.862%, train=300%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 4: GROUND  TRUTH : +1.5
Iteration 4: ALIGNED TRUTH : ++11..55
Iteration 4: BEST OCR TEXT : +++++++
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.765%, delta=24.138%, train=285%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 5: GROUND  TRUTH : +1.5
Iteration 5: ALIGNED TRUTH : +11..55
Iteration 5: BEST OCR TEXT : ++++...
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.704%, delta=22.414%, train=266.667%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 6: GROUND  TRUTH : +1.5
Iteration 6: ALIGNED TRUTH : +1..555
Iteration 6: BEST OCR TEXT : ++.
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.519%, delta=19.704%, train=239.286%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 7: GROUND  TRUTH : +1.5
Iteration 7: ALIGNED TRUTH : +1..5555
Iteration 7: BEST OCR TEXT : +.
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.308%, delta=17.672%, train=215.625%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 8: GROUND  TRUTH : +1.5
Iteration 8: ALIGNED TRUTH : +1...555
Iteration 8: BEST OCR TEXT : .
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=5.101%, delta=16.092%, train=200%(100%), skip ratio=100%
Encoding of string failed! Failure bytes: ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30
Can't encode transcription: '├ÿ423.1 0.0' in language ''
Iteration 9: GROUND  TRUTH : +1.5
Iteration 9: ALIGNED TRUTH : +1...55
Iteration 9: BEST OCR TEXT : .......5
File D:\Test\Dim_test.lstmf line 1 :
Mean rms=4.888%, delta=14.828%, train=200%(100%), skip ratio=100%
At iteration 10/10/20, Mean rms=4.888%, delta=14.828%, char train=200%, word train=100%, skip ratio=100%,  New worst char error = 200 wrote checkpoint.

Finished! Error rate = 100
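
The "Failure bytes" in the log can be decoded to see which character could not be encoded. A minimal sketch in Python, assuming the ground-truth text is UTF-8 and that the console displayed it using code page 437 (both are assumptions, not confirmed by the log itself):

```python
# Bytes copied from the "Failure bytes" log line. The "ffffff" prefix looks like
# sign extension from printing a signed char, so only the low byte is kept.
failure_hex = "ffffffc3 ffffff98 34 32 33 2e 31 20 30 2e 30"

raw = bytes(int(token, 16) & 0xFF for token in failure_hex.split())

# Assumption: the transcription was stored as UTF-8.
text = raw.decode("utf-8")

# Assumption: the console rendered the first two bytes with code page 437,
# which would explain the garbled prefix shown in the log.
console_view = raw[:2].decode("cp437")

print(repr(text))
print(repr(console_view))
```

If these assumptions hold, the transcription starts with the character U+00D8 (Ø), which would have to be present in the unicharset of the traineddata used for training before this line can be encoded.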

Later, I ran this command:
lstmtraining --stop_training --continue_from="D:\Test\output_checkpoint" --traineddata="D:\Test\eng\eng.traineddata --model_output="D:\Test\abc.traineddata"

I am getting:
mgr_.Init(traineddata_path.c_str()):Error:Assert failed:in file ../../../../../src/training/lstmtrainer.h, line 96
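
The assertion names `mgr_.Init(traineddata_path.c_str())`, which suggests the file passed to `--traineddata` could not be loaded. In the command above, the quote opened before `D:\Test\eng\eng.traineddata` is never closed, so the shell may hand over a different path than intended. This is one possible cause, not a confirmed diagnosis. A minimal Python sketch that passes each argument as a separate list element and so avoids shell quoting altogether (the helper name `stop_training_args` is hypothetical; the flags are the ones used in the command above):

```python
import subprocess


def stop_training_args(checkpoint, traineddata, model_output):
    """Build the argument list for the --stop_training step (hypothetical helper)."""
    return [
        "lstmtraining",
        "--stop_training",
        "--continue_from", checkpoint,
        "--traineddata", traineddata,
        "--model_output", model_output,
    ]


# Raw strings keep the Windows backslashes intact.
args = stop_training_args(
    r"D:\Test\output_checkpoint",
    r"D:\Test\eng\eng.traineddata",
    r"D:\Test\abc.traineddata",
)
print(" ".join(args))

# To actually run it (requires lstmtraining on PATH), uncomment:
# subprocess.run(args, check=True)
```

Because each path is its own list element, no quoting is needed and an unbalanced quote cannot occur.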


My question is: does lstmtraining first generate the output checkpoints, and then, when those are passed to lstmtraining --stop_training, does it generate the .traineddata file?
Thank you.

Regards,
Samruddhi

Samruddhi Dhake

Sep 16, 2021, 5:28:50 AM
to tesseract-ocr

Hi,
One more question to add here: after running the second command mentioned above, I get an assert in the file lstmtrainer.h, but I can't find this file under src/training in my Tesseract-OCR folder.
Can you please help me with this too?

Regards,
Samruddhi

Samruddhi Dhake

Sep 23, 2021, 12:37:40 AM
to tesseract-ocr
Hi,
Can anyone help me resolve the above issues?

Regards,
Samruddhi
