Not recognizing Arabic numbers, but recognizes letters

Aijolomohi Egwaikhide

Apr 6, 2020, 2:12:22 PM
to tesseract-ocr
Hi, I am using OCR to recognize Arabic words and numbers (dates) from a scanned PDF (I have done some enhancement on it). It reads the letters properly, but it can't seem to read the Arabic numbers accurately. What can I do to improve this?

An example of the numbers (in date format):

Screen Shot 2020-04-06 at 10.41.21 AM.png

Screen Shot 2020-04-06 at 10.42.42 AM.png

The first one is the input and the second one is the output. The output was read as English even though I specified that the text is all Arabic. When I put the document in Google Docs and read it as a Word document, it reads it fine.
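
For anyone trying to reproduce this, here is a minimal sketch of what "specifying Arabic" can look like with the Tesseract command line, assuming the `tesseract` binary and the `ara` language data are installed. The helper names and the file name `date.png` are hypothetical and not taken from my actual setup; `--psm 7` (treat the image as a single text line) is only a guess at a reasonable mode for a cropped date.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, output_base="stdout", lang="ara", psm=7):
    """Build the argument list for a Tesseract CLI run restricted to one language.

    -l selects the language model; --psm selects the page segmentation mode.
    Using "stdout" as the output base makes Tesseract print the text.
    """
    return ["tesseract", image_path, output_base, "-l", lang, "--psm", str(psm)]


def run_ocr(image_path, lang="ara", psm=7):
    """Run Tesseract on one image and return the recognized text."""
    # Assumption: the tesseract executable is available on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    cmd = build_tesseract_cmd(image_path, "stdout", lang, psm)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file name: a cropped image containing only the date.
    print(build_tesseract_cmd("date.png"))
```

Calling `run_ocr("date.png")` would then return whatever text Tesseract recognizes for that crop, which makes it easier to compare results across languages or segmentation modes.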