What is the difference between script *.traineddata and normal *.traineddata models


Essam Zaky

Mar 19, 2020, 12:42:19 PM
to tesseract-ocr
Hi all,

What is the difference between the script *.traineddata models and the normal *.traineddata models?
For example, there are the script model Arabic.traineddata and the language model ara.traineddata.

When should each of them be used?


Best Regards
Essam

Shree Devi Kumar

Mar 19, 2020, 9:09:02 PM
to tesseract-ocr
Script traineddata files have been trained on all languages written in that script, plus English. So Script/Arabic would have been trained with ara, fas, urd (etc.) + eng. Please check the README file in the tessdata_best / tessdata_fast repo for the explanation by Ray.

You have to try both for your use case to see which works better. 

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/f29d4fef-cc7f-4582-8720-4861d922ce92%40googlegroups.com.

Essam Zaky

Mar 20, 2020, 8:13:35 AM
to tesseract-ocr
Thanks @Shreeshrii

So both of the following commands recognize Arabic/English text:
tesseract AE.jpg AE1 -l ara+eng
tesseract AE.jpg AE2 -l script/Arabic

Shree Devi Kumar

Mar 20, 2020, 8:28:20 AM
to tesseract-ocr
Yes, and the results of the two commands could be different.
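
One way to see how the two models differ on a given image is to run both and diff the text they produce. The following is only a minimal sketch: compare_models is a hypothetical helper name (not part of Tesseract), and it assumes both the language models and the script model are installed where tesseract can find them.

```shell
#!/bin/sh
# Sketch: OCR one image with two -l settings and report whether the text differs.
# compare_models is a hypothetical helper, not a Tesseract command.
compare_models() {
    image=$1    # e.g. AE.jpg
    lang_a=$2   # e.g. ara+eng
    lang_b=$3   # e.g. script/Arabic
    out_dir=$(mktemp -d) || return 2

    # tesseract appends .txt to the output base name it is given.
    tesseract "$image" "$out_dir/a" -l "$lang_a" || { rm -rf "$out_dir"; return 2; }
    tesseract "$image" "$out_dir/b" -l "$lang_b" || { rm -rf "$out_dir"; return 2; }

    # diff prints the differing lines (if any) and exits non-zero when the files differ.
    if diff -u "$out_dir/a.txt" "$out_dir/b.txt"; then
        verdict="identical output"
    else
        verdict="outputs differ"
    fi
    rm -rf "$out_dir"
    echo "$verdict"
}

# Usage, with the file and model names from this thread:
#   compare_models AE.jpg ara+eng script/Arabic
```

A diff only shows where the two outputs disagree. Deciding which model works better still means checking both against the known correct text for your own images.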
