Could not find font named AR PL UMing Patched Light


roberty...@gmail.com

Jul 25, 2017, 9:47:27 PM
to tesseract-ocr
Hello,

I'm trying to train my own traineddata with Tesseract 4.0, following the tutorial: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00---Replace-Top-Layer

When executing the command:
training/tesstrain.sh --fonts_dir /usr/share/fonts --lang chi_sim \
--training_text ../training_data/part.txt \
--linedata_only --noextract_font_properties \
--langdata_dir ../langdata --tessdata_dir ./tessdata \
--output_dir ~/tesstutorial/chisim

An error appears: "Could not find font named AR PL UMing Patched Light", as shown in the following image.

I then searched for a package containing "AR PL UMing Patched Light.ttf" with Baidu, Google, and some other search engines, but could not find anything.

Can you help me? I don't know if there are other solutions for this problem.

ShreeDevi Kumar

Jul 25, 2017, 11:31:23 PM
to tesser...@googlegroups.com

You need to update the font list to match the fonts you have available for the script you are training, and specify the correct location of the fonts directory.

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
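
One way to act on this advice is to check which candidate fonts the training tools can actually see before running tesstrain.sh. The sketch below is not from the thread: available_fonts is a hypothetical helper, and it assumes the font listing has one "<index>: <font name>" entry per line, which is the format text2image --list_available_fonts is believed to print.

```shell
# Print each candidate font that appears in a font listing.
# $1      : listing text, one "<index>: <font name>" entry per line (assumed format)
# $2 ... : candidate font names to look for
available_fonts() {
  listing=$1
  shift
  # Strip the leading "<index>: " so only the bare font names remain.
  names=$(printf '%s\n' "$listing" | sed 's/^[[:space:]]*[0-9]*:[[:space:]]*//')
  for font in "$@"; do
    # -x: match the whole line, -F: fixed string, -q: no output
    if printf '%s\n' "$names" | grep -qxF -- "$font"; then
      printf '%s\n' "$font"
    fi
  done
}
```

A possible call, with the flags to be checked against the installed version, is available_fonts "$(text2image --list_available_fonts --fonts_dir=/usr/share/fonts)" "AR PL UMing Patched Light". It prints the font name only if that exact name is in the listing, so empty output means the font is not available under that name.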


roberty...@gmail.com

Jul 26, 2017, 1:39:06 AM
to tesseract-ocr
Yes, I understand that. But I don't have the AR PL UMing Patched Light font, and I cannot find it anywhere on the Internet.

I'm afraid I may need to get this font (AR PL UMing Patched Light) from you. If you don't mind sharing it, I would be sincerely grateful.


ShreeDevi Kumar

Jul 26, 2017, 2:48:13 AM
to tesser...@googlegroups.com
I do not have this font.

The training is done at Google. They probably use a number of commercial fonts in addition to freely available fonts. The fonts are not provided as part of the training data.

You have to get your own set of fonts to train with, or wait for the new traineddata from Ray (expected in the next few weeks).

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
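
If other fonts for the script are installed, tesstrain.sh can be told to use them instead of its default list. The sketch below assumes the --fontlist option of tesstrain.sh, which takes one or more font names. train_with_fonts and the TESSTRAIN variable are hypothetical names introduced for illustration, and the remaining flags are copied unchanged from the original command.

```shell
# Script to run. TESSTRAIN is a hypothetical override variable; the default
# is the path used in the original command.
TESSTRAIN=${TESSTRAIN:-training/tesstrain.sh}

# Run the original training command, but with an explicit font list.
# Each argument is one font name and is passed on as one quoted word.
train_with_fonts() {
  "$TESSTRAIN" --fonts_dir /usr/share/fonts --lang chi_sim \
    --training_text ../training_data/part.txt \
    --linedata_only --noextract_font_properties \
    --langdata_dir ../langdata --tessdata_dir ./tessdata \
    --fontlist "$@" \
    --output_dir ~/tesstutorial/chisim
}
```

A call might look like train_with_fonts "AR PL UMing CN Light" "AR PL UKai CN", where both font names are placeholders to be replaced with fonts that are actually installed under /usr/share/fonts.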


roberty...@gmail.com

Jul 26, 2017, 3:34:21 AM
to tesseract-ocr
OK. Thank you sincerely for the reply, Shree.
