When using text2image for training, I get the error: Could not find font named... how can I know the correct name of a font?


Ernesto Borio

Feb 15, 2018, 3:03:20 AM
to tesseract-ocr
When using text2image for training, I get the error:
 

$ text2image --text=charset.txt --outputbase=[eng].[HeroicCondensedBoldRegular].exp0 --font='Heroic Condensed Bold Regular' --fonts_dir=.


(process:29818): Pango-WARNING **: couldn't load font "Heroic Bold Condensed", modified variant/weight/stretch as fallback, expect ugly output.

Could not find font named Heroic Condensed Bold Regular.

Pango suggested font Helvetica.

Please correct --font arg.


The font is Heroic Condensed Bold Regular. At least, that's the full name macOS reports when I use Get Info on the font.

What's wrong here? Am I naming the font incorrectly?

I'm following this documentation:
https://github.com/tesseract-ocr/tesseract/wiki/Training-Tesseract#questions-about-the-training-process

Thanks!

ShreeDevi Kumar

Feb 15, 2018, 3:21:28 AM
to tesser...@googlegroups.com
You can list the fonts available on your system, under the names tesseract uses for them, by running text2image with --find_fonts.

Example command with output (please adjust the paths to match your setup):

text2image --find_fonts --text ./langdata/eng/eng.training_text --outputbase ./langdata/eng/  --min_coverage 0.999  --fonts_dir=./fonts/


Total chars = 6694
Font AA_NAGARI_SHREE_L3 failed with 1865 hits = 27.86%
Adobe Devanagari : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 0 to file ./langdata/eng/.Adobe_Devanagari.tif
Akchyarunicode : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 1 to file ./langdata/eng/.Akchyarunicode.tif
Akchyarunicode : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 2 to file ./langdata/eng/.Akchyarunicode.tif
Arial Heavy : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 3 to file ./langdata/eng/.Arial_Heavy.tif
Arial Italic Condensed : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 4 to file ./langdata/eng/.Arial_Italic_Condensed.tif
Arial Unicode MS : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 5 to file ./langdata/eng/.Arial_Unicode_MS.tif
Font BRH Devanagari failed with 6666 hits = 99.58%
Font BRH Devanagari Extra failed with 6666 hits = 99.58%
Font BRH Devanagari RN failed with 6666 hits = 99.58%
Calibri Bold Italic : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 6 to file ./langdata/eng/.Calibri_Bold_Italic.tif
Charter Indologique Capital : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 7 to file ./langdata/eng/.Charter_Indologique_Capital.tif
Courier New Italic : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 8 to file ./langdata/eng/.Courier_New_Italic.tif
Font GIST-DVOTKishor failed with 6645 hits = 99.27%
Font GIST-DVOTMohini failed with 6645 hits = 99.27%
Font GIST-MROTDhruv failed with 6645 hits = 99.27%
Font GIST-MROTVinit failed with 6645 hits = 99.27%
Font GIST-SDOTDhruv failed with 6645 hits = 99.27%
Font GIST-SDOTVinit failed with 6645 hits = 99.27%
Lohit Devanagari : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 9 to file ./langdata/eng/.Lohit_Devanagari.tif
Old Standard Indologique Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 10 to file ./langdata/eng/.Old_Standard_Indologique_Bold.tif
Segoe UI : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 11 to file ./langdata/eng/.Segoe_UI.tif
Segoe UI Heavy : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 12 to file ./langdata/eng/.Segoe_UI_Heavy.tif
Font Sharad76 failed with 1510 hits = 22.56%
Shobhika : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 13 to file ./langdata/eng/.Shobhika.tif
Shobhika Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 14 to file ./langdata/eng/.Shobhika_Bold.tif
Tahoma Bold : 6694 hits = 100.00%, raw = 112 = 100.00%
Rendered page 15 to file ./langdata/eng/.Tahoma_Bold.tif
Font Yashomudra failed with 6648 hits = 99.31%
Font Yashomudra Bold failed with 6648 hits = 99.31%
Font Yashomudra Bold Italic failed with 6648 hits = 99.31%
Font Yashomudra Italic failed with 6648 hits = 99.31%
Font YashomudraLight failed with 6648 hits = 99.31%
Font YashomudraLight Italic failed with 6648 hits = 99.31%
Font YashomudraMedium failed with 6648 hits = 99.31%
Font YashomudraMedium Italic failed with 6648 hits = 99.31%
Font YashomudraSemiBold Bold failed with 6648 hits = 99.31%
Font YashomudraSemiBold Bold Italic failed with 6648 hits = 99.31%
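
If the report is long, it may help to pull out just the names that passed the coverage check, since those are the strings to try with --font. The following is only a sketch: it assumes the report keeps the "Name : N hits = ..." layout shown above, and the function name list_usable_fonts and the log file name find_fonts.log are made up for illustration.

```shell
# Print the font names that a text2image --find_fonts report lists as usable.
# Assumption: usable fonts appear as "<Font Name> : <N> hits = ...", while
# rejected fonts appear as "Font <name> failed with ...", as in the output above.
list_usable_fonts() {
  # sed keeps only the "Name : N hits" lines and strips everything after the name;
  # awk drops repeated names while preserving the original order.
  sed -n 's/^\(.*\) : [0-9][0-9]* hits = .*$/\1/p' | awk '!seen[$0]++'
}

# Hypothetical usage: save the report (both streams, in case it goes to stderr),
# then filter it.
# text2image --find_fonts --text ./langdata/eng/eng.training_text \
#   --outputbase ./langdata/eng/ --min_coverage 0.999 --fonts_dir=./fonts/ \
#   > find_fonts.log 2>&1
# list_usable_fonts < find_fonts.log
```

A name printed this way can then be quoted and passed as the --font argument, for example --font='Arial Heavy'.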


ShreeDevi

