Tesseract training for our custom image is not working as expected


Rajeswari Gopal

Apr 6, 2025, 11:36:41 PM
to tesseract-ocr
Hi Team,
We are trying to create custom trained data for our own TIFF image.
These are the steps we followed:

1) Created the box file using the command below:
tesseract customimage.tif -l eng lstmbox

2) Created the .lstmf file from the box file:
tesseract customimage boxfile -psm 6 lstm.train

3) Ran training, continuing from the English LSTM model:
lstmtraining --continue_from eng.lstm --traineddata tessdata/eng.traineddata --train_listfile trainingfile.txt --model_output custom_model --debug_level -1

4) Converted the checkpoint to a traineddata file:
lstmtraining --stop_training --continue_from Custom_Model_checkpoint --traineddata tessdata/eng.traineddata --model_output CustomModel.TrainedData

I'm able to generate the custom traineddata file, but it copies the unicharset and related files from eng.traineddata.

It does not use the unicharset from the box file created for our custom image.

Our requirement is to create custom trained data that uses eng.traineddata as the base but takes its unicharset and related files from our own custom image.
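
To make the requirement concrete, here is a minimal Python sketch of how one might check which symbols in the custom box file are absent from the base unicharset. The function names are hypothetical, and two layouts are assumed rather than confirmed: box lines of the form `<symbol> <left> <bottom> <right> <top> <page>`, and a unicharset text file whose first line is an entry count and whose remaining lines start with the symbol.

```python
def box_symbols(box_text):
    """Collect the distinct non-whitespace symbols named in box-file text."""
    symbols = set()
    for line in box_text.splitlines():
        # Assumed layout: <symbol> <left> <bottom> <right> <top> <page>
        parts = line.rsplit(" ", 5)
        # Skip malformed lines and the space / end-of-line marker entries.
        if len(parts) == 6 and parts[0].strip():
            symbols.add(parts[0])
    return symbols


def unicharset_symbols(unicharset_text):
    """Collect the first field of every entry in unicharset text."""
    symbols = set()
    # Assumption: the first line holds the entry count, so it is skipped.
    for line in unicharset_text.splitlines()[1:]:
        fields = line.split()
        if fields:
            symbols.add(fields[0])
    return symbols


def missing_symbols(box_text, unicharset_text):
    """Return box-file symbols that the unicharset does not list, sorted."""
    return sorted(box_symbols(box_text) - unicharset_symbols(unicharset_text))


if __name__ == "__main__":
    # Tiny made-up samples, not taken from real training data.
    sample_box = "T 10 10 20 30 0\n€ 22 10 30 30 0\n  31 10 35 30 0\n"
    sample_unicharset = "2\nNULL 0 Common 0\nT 5 0,255,0,255 Latin 1\n"
    print(missing_symbols(sample_box, sample_unicharset))
```

If the list comes back empty, the custom image introduces no new characters, and reusing the eng unicharset would lose nothing.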

Please correct me if there is any issue with my understanding.

Any help would be greatly appreciated.

