Hello,
I am trying to train Tesseract 4.0 with LSTM on Arabic/Hindi digits on Windows. I found that I need to create a box file, so I am using JTessBoxEditor 2.0 to create the TIFF and box files. However, it fails when I use JTessBoxEditor 2.0 to generate the .traineddata file. Note that I chose combine_tessdata.exe as the Tesseract executable, ara.arial.exp0.box as the training data, and "training with existing box" as the training mode.
The output is as follows:
Tesseract Open Source OCR Engine v4.0.0-beta.1-108-gf291 with Leptonica
Page 1
Bad box coordinates in boxfile string! ١ ٤٥٤ ٣١٦٣ ٤٦٣ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٢ ٤١٣ ٣١٦٣ ٤٢٨ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣١٦٣ ٣٩٣ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣١٦٣ ٣٥٠ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣١٦٨ ٣١٤ ٣١٨٥ ٠
Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣١٦٣ ٢٧٣ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٧ ٢١٩ ٣١٦٣ ٢٣٨ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٨ ١٨٠ ٣١٦٣ ٢٠٠ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٩ ١٤٥ ٣١٦٣ ١٥٩ ٣١٩٠ ٠
Bad box coordinates in boxfile string! ٠ ١٠٩ ٣١٦٧ ١١٧ ٣١٧٨ ٠
Bad box coordinates in boxfile string! ١ ٤٥٤ ٣٠١٥ ٤٦٣ ٣٠٤٢ ٠
Bad box coordinates in boxfile string! ٢ ٤١٣ ٣٠١٥ ٤٢٨ ٣٠٤٢ ٠
Bad box coordinates in boxfile string! ٣ ٣٧٣ ٣٠١٥ ٣٩٣ ٣٠٤٢ ٠
Bad box coordinates in boxfile string! ٤ ٣٣٨ ٣٠١٥ ٣٥٠ ٣٠٤٢ ٠
Bad box coordinates in boxfile string! ٥ ٢٩٨ ٣٠٢٠ ٣١٤ ٣٠٣٧ ٠
Bad box coordinates in boxfile string! ٦ ٢٥٨ ٣٠١٥ ٢٧٣ ٣٠٤٢ ٠
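One thing that stands out in the log is that every coordinate field in the box file is written with Arabic-Indic digits, not just the glyph itself. Assuming Tesseract only parses ASCII digits for the coordinates (this is an assumption, not something confirmed by the output above), a minimal Python sketch of a possible workaround would keep the glyph in the first field and rewrite only the numeric fields. The name normalize_box_line is hypothetical and used only for illustration:

```python
# Map Arabic-Indic digits (U+0660..U+0669) to the ASCII digits 0..9.
ARABIC_INDIC_TO_ASCII = {0x0660 + i: str(i) for i in range(10)}


def normalize_box_line(line: str) -> str:
    """Rewrite the coordinate fields of one box line with ASCII digits.

    Assumes the usual box layout: glyph left bottom right top page.
    The glyph (first field) is left untouched, so an Arabic-Indic
    digit being trained stays as it is.
    """
    parts = line.split()
    if len(parts) < 6:
        # Not a standard six-field box line; leave it unchanged.
        return line
    glyph, coords = parts[0], parts[1:]
    coords = [field.translate(ARABIC_INDIC_TO_ASCII) for field in coords]
    return " ".join([glyph] + coords)


# Example with the first line from the log, written with escapes
# so the character order is unambiguous.
sample = "\u0661 \u0664\u0665\u0664 \u0663\u0661\u0666\u0663 \u0664\u0666\u0663 \u0663\u0661\u0669\u0660 \u0660"
print(normalize_box_line(sample))
```

Applying this to each line of ara.arial.exp0.box (read and written as UTF-8) would give a box file whose coordinates are plain ASCII numbers, which may be what the parser expects.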
Could you please tell me what I did wrong or how to fix this error?
Best Regards,
Marwa M. Khan