Couldn't find components in fine-tuned traineddata file

Surya VaraPrasad Alla
Apr 19, 2024, 9:35:25 AM
to tesseract-ocr
When I use the fine-tuned model, I get the error below:

(1, "read_params_file: Can't open tessedit_char_blacklist=,;: Error: Tesseract (legacy) engine requested, but components are not present in tesstrain/data/eng_pcb/eng_pcb.traineddata!! Failed loading language 'eng_pcb' Tesseract couldn't load any languages! Could not initialize tesseract.")

Fine-tuned model (below is the output at the end of training, which looks successful):
At iteration 393/10000/10017, Mean rms=0.188%, delta=0.284%, char train=1.149%, word train=2.877%, skip ratio=0.3%,  New worst char error = 1.149 wrote checkpoint.

Finished! Error rate = 1.116

lstmtraining \
--stop_training \
--continue_from ./data/eng_pcb/checkpoints/eng_pcb_checkpoint \
--traineddata ./data/eng_pcb/eng_pcb.traineddata \
--model_output data/eng_pcb.traineddata
Loaded file ./data/eng_pcb/checkpoints/eng_pcb_checkpoint, unpacking...

tesseract -v:
tesseract 4.1.1
 leptonica-1.82.0
  libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
 Found AVX512BW
 Found AVX512F
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.6.0 zlib/1.2.11 liblzma/5.2.5 bz2lib/1.0.8 liblz4/1.9.3 libzstd/1.4.8

Help:
Can someone see what is missing here?
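
One possible reading of the error, offered as a guess rather than a confirmed diagnosis: the path in the message (tesstrain/data/eng_pcb/eng_pcb.traineddata) is the starter file that was passed to --traineddata, not the file written by --model_output (data/eng_pcb.traineddata). If so, Tesseract may be loading a file that contains no LSTM model and then falling back to the legacy engine. The "read_params_file: Can't open tessedit_char_blacklist=,;" part also suggests the blacklist setting is being read as a config file name instead of being passed through -c. Below is a minimal Python sketch of a command line that avoids both. The helper name build_tesseract_cmd and the image name sample.png are made up for illustration, and the tessdata directory is an assumption based on the paths above.

```python
import shlex


def build_tesseract_cmd(image_path, tessdata_dir, lang, blacklist):
    """Build an argument list for the tesseract CLI (nothing is executed here)."""
    return [
        "tesseract", image_path, "stdout",
        # Directory that holds <lang>.traineddata. Assumption: the file written
        # by --model_output (data/eng_pcb.traineddata), not the starter file
        # under data/eng_pcb/.
        "--tessdata-dir", tessdata_dir,
        "-l", lang,
        # Request the LSTM engine explicitly; a fine-tuned LSTM model carries
        # no legacy components.
        "--oem", "1",
        # Config variables go through -c; a bare "name=value" argument is
        # treated as a config file name.
        "-c", f"tessedit_char_blacklist={blacklist}",
    ]


# Hypothetical inputs for illustration only.
cmd = build_tesseract_cmd("sample.png", "tesstrain/data", "eng_pcb", ",;")

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

If pytesseract is in use, the same flags would go into its config string, again with -c in front of the blacklist setting.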