Problem in tesseract for Urdu language


Aniqa Dilawari

Mar 2, 2011, 6:01:26 AM
to tesser...@googlegroups.com
I have trained Tesseract for Urdu using a multipage TIFF image with 20 pages (C.tif, compressed as C.rar) and its box file (C.box).
After training, I ran recognition on the image Urdu4.tif. The output is in outputC4.txt.
Not all of the characters in this file are recognized correctly. For example, at position 2 the recognized id should be 665664 instead of 665663.

How is it possible to find out which characters are not recognized by Tesseract? 
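
One possible way to check this, sketched under assumptions: if a ground-truth list of ids exists for Urdu4.tif (for example, the first field of each line of a hand-checked box file, whose lines have the form "symbol left bottom right top page"), it can be aligned against the ids in the output file, and every position where the two disagree can be listed. The helper names box_symbols and diff_tokens and the sample ids below are hypothetical, and the sketch assumes the recognized ids can be read as a whitespace-separated sequence. It uses only Python's standard difflib module, not a Tesseract feature.

```python
import difflib


def box_symbols(box_text):
    """Return the symbol (first field) of each non-empty box-file line."""
    return [line.split()[0] for line in box_text.splitlines() if line.strip()]


def diff_tokens(expected, recognized):
    """Return (operation, zero-based position, expected, recognized) per mismatch."""
    matcher = difflib.SequenceMatcher(a=expected, b=recognized, autojunk=False)
    return [
        (op, i1, expected[i1:i2], recognized[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


# Hypothetical data: ground truth taken from a box file, output taken from the
# recognizer. Real input would come from the actual files instead.
truth = box_symbols("665662 10 20 30 40 0\n665664 35 20 60 40 0\n665670 65 20 90 40 0\n")
output = "665662 665663 665670".split()

for op, pos, want, got in diff_tokens(truth, output):
    # pos is zero-based, so add 1 to match "position 2" style counting
    print(f"{op} at position {pos + 1}: expected {want}, got {got}")
```

A "replace" entry means a character was misrecognized, "delete" means an expected character is missing from the output, and "insert" means the output contains an extra one.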

outputC4.txt
C.box
C.rar
Urdu4.rar

cust...@gmail.com

Mar 3, 2016, 12:44:41 PM
to tesseract-ocr, aniqa.d...@gmail.com
Aniqa, I need help with Urdu in Tesseract.
Please reply.