How to add some new chars to traineddata?


Xun Huang

Oct 13, 2022, 1:00:15 AM
to tesseract-ocr
I have a traineddata file that contains the digits 0-9, but I want to add some symbols such as "/" and "%".

Is it possible to add new symbols to the traineddata without retraining?

Saman Kurdi

Oct 13, 2022, 2:21:24 AM
to tesser...@googlegroups.com
Hello

This might help




Shree Devi Kumar

Oct 16, 2022, 1:59:56 PM
to tesseract-ocr
Thank you for sharing information regarding successful training of Kurdish traineddata for Tesseract.

Please also let us know whether the traineddata is available for others to use. You may want to contribute to the tess_contrib repo.

Let us know whether the recognition covers 0-9 digits in Arabic/Persian/English and the accuracy for those.




Saman Kurdi

Oct 16, 2022, 6:11:35 PM
to tesser...@googlegroups.com
It covers the digits 0-9, though with a lower accuracy rate. So far the GT files and the images are publicly available on

Shree Devi Kumar

Oct 17, 2022, 1:15:49 PM
to tesseract-ocr
Thank you.

When ready, please add links to https://github.com/tesseract-ocr/tessdata_contrib so that others can easily find your work.
