set_unicharset_properties per.Arial.exp0.box (after reading this issue: https://github.com/tesseract-ocr/tesseract/issues/318, I put Arabic.unicharset and Arabic.xheights in the script_dir path):
set_unicharset_properties -U unicharset -O new_unicharset -X xheights --script_dir=/home/bita/langdata
To test the result, I took a screenshot of one part of my training text and increased its resolution to 300 dpi with GIMP (I tried to make an image without noise), but the accuracy is not good at all.
How can I increase the accuracy? Which font size should I choose when I take the screenshot? The structure of the Persian language is very different from English. For example, the shape of a character changes depending on where it is located in the word (first, middle, last), but in the unicharset only the main character is recognized for all of these forms. Also, the characters are connected within words (something like handwriting in English).
So does Tesseract work for languages like Persian or Arabic?
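To illustrate the point about positional forms: in Unicode, the first/middle/last shapes of a letter exist as separate "presentation form" code points, but text is normally stored with the single base letter and shaped at render time. That may be why only the main character shows up in a unicharset. Below is a minimal Python sketch using only the standard `unicodedata` module. The letter BEH is just an example I picked, and the helper name `base_letter` is hypothetical. It shows Unicode behaviour only, not what Tesseract's training tools do internally.

```python
import unicodedata

# Unicode presentation forms of the Arabic letter BEH (base letter U+0628).
# BEH is only an illustrative choice; other letters behave the same way.
POSITIONAL_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}


def base_letter(ch: str) -> str:
    """Return the text a presentation form maps to under NFKC normalization."""
    return unicodedata.normalize("NFKC", ch)


# Map every positional form back to its base letter.
bases = {name: base_letter(ch) for name, ch in POSITIONAL_FORMS.items()}

for name, ch in POSITIONAL_FORMS.items():
    base = bases[name]
    print(f"{name:9s} U+{ord(ch):04X} -> U+{ord(base):04X} {unicodedata.name(base)}")
```

All four shapes normalize to the same base code point, so a box file or training text written with ordinary Persian/Arabic input contains one character per letter regardless of its position in the word.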