Not recognizing Arabic numbers, but recognizes letters

Aijolomohi Egwaikhide

Apr 6, 2020, 2:12:22 PM

to tesseract-ocr

Hi, I am working with OCR to recognize Arabic words and numbers (dates) from a scanned PDF, which I have already enhanced somewhat. It reads the letters properly, but it can't seem to read the Arabic numbers accurately. What can I do to improve this?

An example of the numbers (in date format):

Screen Shot 2020-04-06 at 10.41.21 AM.png

Screen Shot 2020-04-06 at 10.42.42 AM.png

The first one is the input and the second one is the output. It read the date as English even though I specified that the text is all Arabic. When I put the document in Google Docs and read it as a Word document, it reads it fine.
