Pango library doesn't recognize my font


이경준

Mar 13, 2018, 6:40:27 AM
to tesseract-ocr
Hi Shree, my name is June. I have a question about the bash script you gave me.

The script contains:


# the EVAL handles the quotes in the font list
eval $tesstrain_dir/tesstrain.sh \
   --lang $Lang \
   --linedata_only \
   --noextract_font_properties \
   --exposures "0" \
   --fonts_dir $fonts_dir \
   --fontlist $fonts_for_training \
   --langdata_dir $langdata_dir \
   --training_text $langdata_dir/$Lang/$Lang.$plusTraining_text \
   --tessdata_dir $bestdata_dir \
   --output_dir $train_output_dir
 
P.S. All the variables are assigned (e.g. fonts_for_training="Baekmuk Batang").

I ran the script above, but it fails with an error.

So I had to delete " --fontlist $fonts_for_training " and make a copy of each script, tesstrain1.sh and language-specific1.sh, with the training fonts set there. With those copies it works.

I checked my system (Ubuntu 16.04.3 LTS) with $ fc-list and I have lots of Korean fonts installed, but they still don't work.

Why doesn't the Pango library recognize the fonts I installed?


이경준

Mar 13, 2018, 6:41:54 AM
to tesseract-ocr


On Tuesday, March 13, 2018 at 7:40:27 PM UTC+9, 이경준 wrote:
tesstrain1.sh
language-specific1.sh

ShreeDevi Kumar

Mar 13, 2018, 6:48:44 AM
to tesser...@googlegroups.com
Remove these two lines and try again:

   --fonts_dir $fonts_dir \
   --fontlist $fonts_for_training \


Passing them overrides what is given in language-specific.sh.
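
A minimal sketch of the override behaviour described here, not the actual tesstrain.sh code: when fonts are passed on the command line they win, and when they are left out the per-language defaults apply. The choose_fonts function, the DEFAULT_FONTS variable, and the font names are hypothetical stand-ins for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for the Korean defaults kept in language-specific.sh.
DEFAULT_FONTS="Baekmuk Batang,Baekmuk Gulim"

# Print the fonts that would be used, one per line: the arguments
# (playing the role of --fontlist values) if any were given,
# otherwise the per-language defaults.
choose_fonts() {
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@"
  else
    printf '%s\n' "$DEFAULT_FONTS" | tr ',' '\n'
  fi
}

choose_fonts                 # no override: the defaults are used
choose_fonts "NanumGothic"   # override: the defaults are ignored
```

With no arguments the function falls back to the defaults, which matches the observation that the run works once the two flags are removed.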

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscribe@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4783cb3a-09ad-47dd-8d0f-099c2fdfafe6%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

이경준

Mar 13, 2018, 7:02:00 AM
to tesseract-ocr
Thank you. I have lots of Korean fonts, but only the Baekmuk fonts work.

I would still really like to know why the Pango library doesn't recognize the others.

On Tuesday, March 13, 2018 at 7:48:44 PM UTC+9, shree wrote:

ShreeDevi Kumar

Mar 13, 2018, 8:09:45 AM
to tesser...@googlegroups.com
Run the following command, after changing the directories to match your setup:

text2image --find_fonts \
--fonts_dir /usr/share/fonts \
--text ../langdata/kor/kor.training_text \
--min_coverage .9  \
--render_per_font false \
--outputbase ../langdata/kor/kor \
|& grep raw | sed -e 's/ :.*/" \\/g'  | sed -e 's/^/  "/' >../langdata/kor/fontslist.txt

and then check the selected fonts in 
../langdata/kor/fontslist.txt
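
To see what the grep and sed stages of this pipeline do, here is a small sketch that runs them on a single made-up line. The sample line is an assumption: its layout is not copied from real text2image output and only has the two properties the pipeline relies on (the line contains the word raw, and the font name comes before " :"). The format_font_line name is hypothetical as well.

```shell
#!/bin/sh
# Hypothetical sample line; real text2image output may look different.
sample='Baekmuk Batang : 100 hits = 99.00%, raw = 200 = 98.00%'

# Same filter as in the command above:
#   1. keep only lines that mention "raw"
#   2. replace everything from " :" onwards with a closing quote and a backslash
#   3. prefix the line with two spaces and an opening quote
format_font_line() {
  grep raw | sed -e 's/ :.*/" \\/g' | sed -e 's/^/  "/'
}

printf '%s\n' "$sample" | format_font_line
# prints:   "Baekmuk Batang" \
```

In bash, |& is shorthand for 2>&1 |, so the filter sees both standard output and standard error. If fontslist.txt comes out empty, running the text2image command without the last line shows the unfiltered output.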


이경준

Mar 13, 2018, 12:02:11 PM
to tesseract-ocr
Thank you. I now have a fontslist.txt file, but when I open it in vim it is empty: no fonts are listed.

Does that mean I cannot use Korean fonts?

On Tuesday, March 13, 2018 at 9:09:45 PM UTC+9, shree wrote:

ShreeDevi Kumar

Mar 13, 2018, 12:07:11 PM
to tesser...@googlegroups.com
Did you use the fonts_dir where the fonts are installed?


이경준

Mar 13, 2018, 12:27:46 PM
to tesseract-ocr
Yes... ㅜㅜ

I also saw this issue on GitHub: https://github.com/tesseract-ocr/tesseract/issues/688

On Wednesday, March 14, 2018 at 1:07:11 AM UTC+9, shree wrote:

이경준

Mar 13, 2018, 12:35:20 PM
to tesseract-ocr



Thank you. I deleted the last line of the command you gave me.

Now I can see lots of Korean fonts in the output.

Can all of them be used for training?

But I get an error, something like:

argument fonts "(specific_font') are not assigned

ShreeDevi Kumar

Mar 13, 2018, 1:16:05 PM
to tesser...@googlegroups.com
Change the double quotes to single quotes:

" to '

ShreeDevi
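
One way to see why the quote style matters when a font list is stored in a variable and expanded through eval, as in the script at the top of this thread. This is a sketch under assumptions: show_args is a hypothetical stand-in for tesstrain.sh that only prints each argument it receives, and the variable names are made up.

```shell
#!/bin/sh
# Hypothetical helper: print every argument on its own line, in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

# No inner quotes: the font name is split at the space.
plain="Baekmuk Batang"
# Single quotes nested inside the double-quoted assignment survive
# until eval parses the command line a second time.
quoted="'Baekmuk Batang'"

eval show_args --fontlist $plain
# prints [--fontlist], [Baekmuk], [Batang]

eval show_args --fontlist $quoted
# prints [--fontlist], [Baekmuk Batang]
```

Double quotes written inside a double-quoted assignment would end the string early, which is why the inner quotes need to be single quotes when the outer ones are double.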
