Training with Font Files


Rob

Nov 28, 2018, 12:02:44 PM
to tesseract-ocr
Hello,

I want to create a traineddata file based on a few different fonts. I'm using Tesseract 4.0 with LSTM.
What's the easiest way? Is there a tool to train Tesseract directly from font files (.ttf files), or do I have to create text images from the fonts and then train on those?

Thanks in advance.

Raniem

Nov 30, 2018, 7:22:20 AM
to tesseract-ocr
You can use tesstrain.sh: add the font to your system, then pass the font's name to the script.
Complete details are in the Tesseract training documentation; please check the "Use Tesstrain" part for reference.

Training data is created using tesstrain.sh as follows (note that your fonts location may vary):

src/training/tesstrain.sh --fonts_dir /usr/share/fonts --lang eng --linedata_only \
  --noextract_font_properties --langdata_dir ../langdata \
  --tessdata_dir ./tessdata --output_dir ~/tesstutorial/engtrain
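
The command above does not name any particular font. To limit training to the fonts you care about, tesstrain.sh also takes a --fontlist option, and it renders the text images from those fonts itself, so you should not need to create images by hand. Below is a minimal sketch under some assumptions: the two font names are placeholders (not from this thread), and the paths are the same ones used above. It only prints the arguments unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: the same tesstrain.sh call, limited to chosen fonts via --fontlist.
# ASSUMPTION: the two font names are placeholders. List the names that
# are actually available with:
#   text2image --list_available_fonts --fonts_dir "$FONTS_DIR"
FONTS_DIR="${FONTS_DIR:-/usr/share/fonts}"

# Build the argument list; each font name is its own quoted argument.
set -- src/training/tesstrain.sh \
  --fonts_dir "$FONTS_DIR" \
  --fontlist "Arial" "DejaVu Sans Bold" \
  --lang eng --linedata_only --noextract_font_properties \
  --langdata_dir ../langdata \
  --tessdata_dir ./tessdata \
  --output_dir "$HOME/tesstutorial/engtrain"

# Print one argument per line for review; set RUN=1 to execute.
printf '%s\n' "$@"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Font names must match what text2image reports exactly, including style words such as "Bold"; a name that is not found will make the script stop with an error.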


Good Luck
Regards 