Tesstraining Non-Word Traineddata File


Gary

Jan 20, 2022, 3:47:01 PM
to tesseract-ocr
I've already configured Tesseract not to use dictionaries when processing non-word images, but what is the best practice for training a non-word traineddata file with tesstrain?
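
For context, here is a minimal sketch of what "not using dictionaries" typically looks like at recognition time. It assumes the standard `load_system_dawg` and `load_freq_dawg` config variables are the ones being disabled; the exact configuration used above is not stated. The helper name `build_no_dict_command` and the file names are hypothetical, and the code only builds the command line; it does not run Tesseract.

```python
import shlex


def build_no_dict_command(image_path, output_base, lang="eng"):
    """Build a tesseract CLI invocation with the word-list dawgs disabled.

    Assumption: dictionaries are switched off through the load_system_dawg
    and load_freq_dawg variables, passed with the -c option.
    """
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "-c", "load_system_dawg=0",  # skip the main word-list dawg
        "-c", "load_freq_dawg=0",    # skip the frequent-words dawg
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_no_dict_command("sample.png", "sample_out")
    print(shlex.join(cmd))
```

Note that this only affects recognition; whether the dictionary components are present inside the traineddata file is a separate matter, which is what the questions below are about.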

Currently, I'm training with tesstrain, starting from tessdata_best and using my corrected ground truth.

In my situation, is it better not to train with tessdata_best? Does tessdata_best contain word dictionaries that will needlessly bloat my traineddata file?

Thank you for your time and guidance.

Respectfully,

Gary