Arabic OCR


Baraah Kassab

Nov 25, 2025, 6:15:39 AM
to tesseract-ocr
I have a problem with the following file:
the Arabic numbers do not appear.

I am using Tesseract 5 with C# code.

The result was:


١‏ التخطيط للمشروع ‎١‏ تحدید الاهداف ‎١‏ وضع الجدول الزمني ‎١‏ . الجدول الزمني ۲ ,. الجدول الزمني ‎.٢‏ التخطيط للمشروع ١ر۲‏ تحدید الاهداف ۲ . وضع الجدول الزمني ‎.۲,۲١‏ الجدول الزمني ۲ . الجدول الزمني ۳ ,. الجدول الزمني ۳ الجدول الزمني
testarabic number (1).pdf

Baraah Kassab

Nov 27, 2025, 1:16:10 AM
to tesseract-ocr
This is a very critical issue for my application; I need urgent help.

Sara Elshobaky

Nov 27, 2025, 4:55:22 AM
to tesser...@googlegroups.com
Try using the Arabic.traineddata script model instead of the ara model; it handles the Arabic (Hindi) digits better.
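
A minimal sketch of that suggestion, written in Python with pytesseract rather than the C# wrapper used in this thread. It assumes that Arabic.traineddata (found under script/ in the tessdata repositories) has been copied directly into the tessdata folder under its original name, and that the page has already been converted from PDF to an image. Tesseract looks for a file named <language>.traineddata, so the language string is the file name without the extension. The names language_name, ocr_with_script_model, image_path and tessdata_dir are made up for illustration.

```python
from pathlib import Path


def language_name(traineddata_file: str) -> str:
    """Return the language string for a model file.

    Tesseract loads `<language>.traineddata` from the tessdata folder,
    so the language string is the file name without its extension.
    """
    return Path(traineddata_file).stem


def ocr_with_script_model(image_path: str, tessdata_dir: str) -> str:
    """Run OCR on one image using the Arabic script model.

    Assumes pytesseract and Pillow are installed and that
    Arabic.traineddata sits directly inside `tessdata_dir`.
    """
    # Imported here so language_name() works without these packages.
    import pytesseract
    from PIL import Image

    model = Path(tessdata_dir) / "Arabic.traineddata"
    if not model.is_file():
        raise FileNotFoundError(f"missing model file: {model}")

    return pytesseract.image_to_string(
        Image.open(image_path),
        lang=language_name(model.name),  # "Arabic", not "ara"
        config=f'--tessdata-dir "{tessdata_dir}"',
    )
```

The same naming rule should hold for any wrapper: the language passed to the engine has to match the traineddata file name that is actually present in the tessdata folder.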


Baraah Kassab

Nov 30, 2025, 12:46:20 AM
to tesseract-ocr
Dear,
Is this a standalone trained data file, or do I need to combine it with other trained data?

When I use it with my C# code, it causes the following error:
Failed to initialise tesseract engine.. See https://github.com/charlesw/tesseract/wiki/Error-1 for details.

Note that I changed the file name to ara.traineddata. Is this correct, or do I need to keep the name as is? And what language should I pass to the engine?

