Improving speed of a fine-tuned tessdata-best data file

Kunal Singh

May 17, 2020, 1:43:30 AM
to tesseract-ocr
Hello,

I am using a fine-tuned traineddata file (from tessdata_best), but it is a lot slower than tessdata (legacy+LSTM) or tessdata_fast.
Is there any way to make the fine-tuned traineddata file faster at the cost of slight accuracy? Could we reduce some of the layers of the LSTM model?
Any suggestions would be great.

Regards,
Kunal

Shree Devi Kumar

May 17, 2020, 2:12:00 AM
to tesseract-ocr
Convert it to a fast model.

combine_tessdata -c

compresses the traineddata file.

You can also do it when you stop lstmtraining (--stop_training), by adding the --convert_to_int flag.

Please check the syntax.
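For the lstmtraining route, a minimal sketch of the syntax, assuming a checkpoint from the fine-tuning run and the traineddata file the training started from. The wrapper name export_int_model and its three path arguments are hypothetical placeholders; the flags are the ones lstmtraining uses when it stops training and writes a final model.

```shell
#!/bin/sh
# Hypothetical wrapper; all three paths are placeholders supplied by the caller.
# Usage: export_int_model CHECKPOINT BASE_TRAINEDDATA OUTPUT_TRAINEDDATA
export_int_model() {
    # --stop_training writes a final traineddata from the checkpoint;
    # --convert_to_int makes that output an integer (fast) model.
    lstmtraining --stop_training --convert_to_int \
        --continue_from "$1" \
        --traineddata "$2" \
        --model_output "$3"
}
```

Leaving out --convert_to_int in the same command should give the float (best) model instead, so both variants can be produced from one checkpoint.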

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/b51213f9-9a3c-4ede-807c-036136e2e8c6%40googlegroups.com.

Kunal Singh

May 17, 2020, 3:16:39 AM
to tesseract-ocr
Thanks for the reply. It worked. 
combine_tessdata -c <disk-location/of/current/eng.traineddata>
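A note for anyone repeating this: as far as I can tell, combine_tessdata -c rewrites the file it is given, so the float model is lost unless a copy is kept. A small sketch that converts a copy instead; the function name make_fast_copy and both paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make an integer (fast) copy of a fine-tuned model
# and leave the original float model untouched.
# Usage: make_fast_copy path/to/eng.traineddata path/to/eng_fast.traineddata
make_fast_copy() {
    src=$1
    dst=$2
    # Assumption: combine_tessdata -c converts the file in place,
    # so run it on a copy rather than on the original.
    cp "$src" "$dst" || return 1
    combine_tessdata -c "$dst"
}
```

Keeping the float file around matters if you want to continue fine-tuning later, since training normally starts from the best (float) model rather than the integer one.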

