Deserialize header failed: while making lstmf file


Meet Yogi

Sep 23, 2021, 3:07:26 AM
to tesseract-ocr
I'm using the following command to generate the .lstmf file:
tesseract tiff_file_path name_of_lstm_file lstm.train
For example:
tesseract batch3.tiff batch3 lstm.train

When I run it, I get the following error:
Tesseract Open Source OCR Engine v4.1.1 with Leptonica
Page 1
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 335
Empty page!!
Page 2
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 316
Deserialize header failed: /home/petpooja/meet/ocr/tesseract_annotated_images/training_data_phase_1/cn_id_in_gstno_cleaned_images/batch3.lstmf
Failed to read training data from /home/petpooja/meet/ocr/tesseract_annotated_images/training_data_phase_1/cn_id_in_gstno_cleaned_images/batch3.lstmf!
Error during processing.

How can this be solved?
I have checked the box file and verified the annotations; there is no issue with them.

Sim Tov

Sep 23, 2021, 8:36:44 AM
to tesseract-ocr
What is inside your training_text file? I had a similar issue when the lines in that file were too long. Try making each line as short as 5-7 words, then break it with a newline.
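
A minimal Python sketch of that line-shortening step, in case it helps. It assumes the training text is plain, whitespace-separated text; the function name `wrap_training_text` and the `max_words` parameter are made up for illustration and are not part of Tesseract's tooling. Whether this resolves the "Deserialize header failed" error in your case is not guaranteed.

```python
def wrap_training_text(text, max_words=7):
    """Re-break text so that no output line has more than max_words words.

    Hypothetical helper, not part of Tesseract. Blank input lines are dropped.
    """
    lines = []
    for raw_line in text.splitlines():
        words = raw_line.split()
        # Emit the words of this line in chunks of at most max_words.
        for start in range(0, len(words), max_words):
            lines.append(" ".join(words[start:start + max_words]))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "one two three four five six seven eight nine"
    print(wrap_training_text(sample, max_words=7))
```

To apply it to a real file, read the training text in, pass it through the function, and write the result back out before regenerating the images and box files from it.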