Wolf Assi
Mar 10, 2022, 3:35:00 AM
to tesseract-ocr
I have noticed that the "ara-Scheherazade" trained data was trained for the "Traditional Arabic" font. I have tried it: it works, but with low accuracy, and it has a problem with Arabic numerals, as the numbers come out inverted. I want to fix this issue. I have tried to fine-tune the model so that it better suits my data, but the fine-tuning is not working. The documentation also mentions that, in order to fine-tune, I need to use the trained data found in the tessdata_best repo.
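To illustrate the numeral problem, the inversion could in principle be undone after recognition. The following is only a minimal sketch, assuming that the digits inside each number come out in reversed order (for example 123 read as 321) and that everything else is correct. `fix_inverted_digits` is a hypothetical helper name, not part of Tesseract, and the function works on plain text output only.

```python
import re

# Runs of ASCII digits (0-9) or Arabic-Indic digits (U+0660 to U+0669).
DIGIT_RUN = re.compile(r"[0-9\u0660-\u0669]+")


def fix_inverted_digits(text: str) -> str:
    """Reverse every contiguous run of digits, leaving other characters as-is.

    Assumption: the OCR output reverses the digit order inside each number.
    """
    return DIGIT_RUN.sub(lambda match: match.group(0)[::-1], text)
```

This only handles contiguous digit runs; numbers containing separators (decimal points, slashes in dates) are treated as several independent runs, so it is a workaround for checking the hypothesis rather than a fix for the model itself.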
My main aim is to recognize both Arabic letters and numbers. I know that Tesseract has a small issue when Arabic letters and numbers appear together, but the fact that the "ara-Scheherazade" trained data manages to recognize both, even with low accuracy, means that it can be done, and I want to try to make it better. Does anyone know what I can do?