How to correct characters for Arabic Language in Tesseract 4.0 LSTM?


Ahmad Moawad

Apr 9, 2017, 2:18:07 PM
to tesseract-ocr

Hello All,

I am using Tesseract 4.0, and its results for Arabic are better than those of previous Tesseract versions, but some characters are still recognized incorrectly. How can I correct these characters? Should I use jTessBoxEditor 2.0 beta for that? I tried correcting some characters with jTessBoxEditor and copying the result to the Tesseract directory.
Unfortunately there was no improvement, and Tesseract still doesn't recognize the characters.

universal reseller

Apr 9, 2017, 2:38:26 PM
to tesser...@googlegroups.com
Tesseract 4 is still in alpha testing.
In the near future Ray/Google will release another build of this version.
Some of the changes will improve training for right-to-left languages.

Quan Nguyen

Apr 10, 2017, 11:34:48 PM
to tesseract-ocr
Once you correct the box file using the editor, you'll then have to manually run the training commands and/or scripts described in https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00 . Copying the corrected box file into the Tesseract directory has no effect on its own.
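
As a rough illustration of what "manually run the commands" can look like, here is a small Python sketch that only builds the command lines for one fine-tuning pass and prints them; it does not run anything. The sequence (make an .lstmf file, extract the LSTM model, fine-tune, finalize) and the flags follow the fine-tuning workflow documented for later 4.x releases, so they may not match the alpha build discussed in this thread. The helper name, the file names (`page1.tif`, `ara_tuned`, `ara.training_files.txt`) and the iteration count are made up for the example, and it assumes the corrected `.box` file sits next to the image with the same base name. Please check everything against the wiki page before using it.

```python
from pathlib import Path


def build_finetune_commands(lang, image, work_dir, tessdata_dir, max_iterations=400):
    """Return argument lists for one fine-tuning pass (hypothetical helper).

    Nothing is executed here. All paths and the iteration count are
    illustrative assumptions, not values taken from the thread.
    """
    work = Path(work_dir)
    base = work / Path(image).stem                      # output base name for the .lstmf file
    traineddata = Path(tessdata_dir) / f"{lang}.traineddata"
    extracted_model = work / f"{lang}.lstm"
    # Text file listing the generated .lstmf paths, one per line.
    # The caller has to create it; this sketch only references it.
    list_file = work / f"{lang}.training_files.txt"
    checkpoint_prefix = work / f"{lang}_tuned"

    return [
        # 1. Turn the image and its corrected box file into an .lstmf training file.
        ["tesseract", str(image), str(base), "--psm", "6", "lstm.train"],
        # 2. Extract the existing LSTM model from the stock traineddata.
        ["combine_tessdata", "-e", str(traineddata), str(extracted_model)],
        # 3. Continue training from that model on the corrected data.
        ["lstmtraining",
         "--continue_from", str(extracted_model),
         "--traineddata", str(traineddata),
         "--train_listfile", str(list_file),
         "--model_output", str(checkpoint_prefix),
         "--max_iterations", str(max_iterations)],
        # 4. Convert the resulting checkpoint into a usable traineddata file.
        ["lstmtraining", "--stop_training",
         "--continue_from", f"{checkpoint_prefix}_checkpoint",
         "--traineddata", str(traineddata),
         "--model_output", str(work / f"{lang}.traineddata")],
    ]


# Print the commands for review instead of running them.
for cmd in build_finetune_commands("ara", "page1.tif", "out", "tessdata"):
    print(" ".join(cmd))
```

The commands are printed rather than executed so you can compare them with the wiki and adjust paths first. Once they look right, each list can be passed to `subprocess.run(cmd, check=True)`.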