Hi Shreeshrii,
I ran your command exactly as written (after making sure the tessdata_best directory is present in $HOME
and contains the best ben.traineddata) and hit an error I can't explain.
Here is the log:
find data/ben-ground-truth -name '*.gt.txt' | xargs cat | sort | uniq > "data/ben/all-gt"
combine_tessdata -u /root/tessdata_best/ben.traineddata data/ben/ben
Version string:4.00.00alpha:ben:synth20170629:[1,48,0,1Ct3,3,16Mp3,3Lfys64Lfx64Lrx64Lfx512O1c1]
0:config:size=377, offset=192
17:lstm:size=10605707, offset=569
18:lstm-punc-dawg:size=3154, offset=10606276
19:lstm-word-dawg:size=427618, offset=10609430
20:lstm-number-dawg:size=426, offset=11037048
21:lstm-unicharset:size=6866, offset=11037474
22:lstm-recoder:size=1003, offset=11044340
23:version:size=80, offset=11045343
Extracting tessdata components from /root/tessdata_best/ben.traineddata
Wrote data/ben/ben.config
Wrote data/ben/ben.lstm
Wrote data/ben/ben.lstm-punc-dawg
Wrote data/ben/ben.lstm-word-dawg
Wrote data/ben/ben.lstm-number-dawg
Wrote data/ben/ben.lstm-unicharset
Wrote data/ben/ben.lstm-recoder
Wrote data/ben/ben.version
unicharset_extractor --output_unicharset "data/ben/my.unicharset" --norm_mode 2 "data/ben/all-gt"
Bad box coordinates in boxfile string! কি জানি কেন প্রদ্যুম্নের বার বার মনে আসছিল সেই জীর্ণ পরিচ্ছদপরা
Extracting unicharset from plain text file data/ben/all-gt
Wrote unicharset file data/ben/my.unicharset
merge_unicharsets data/ben/ben.lstm-unicharset data/ben/my.unicharset "data/ben/unicharset"
Loaded unicharset of size 111 from file data/ben/ben.lstm-unicharset
Loaded unicharset of size 76 from file data/ben/my.unicharset
Wrote unicharset file data/ben/unicharset.
PYTHONIOENCODING=utf-8 python3 generate_wordstr_box.py -i "data/ben-ground-truth/24-022.tif" -t "data/ben-ground-truth/24-022.gt.txt" > "data/ben-ground-truth/24-022.box"
Traceback (most recent call last):
  File "generate_wordstr_box.py", line 7, in <module>
    import bidi.algorithm
ModuleNotFoundError: No module named 'bidi'
Makefile:207: recipe for target 'data/ben-ground-truth/24-022.box' failed
make: *** [data/ben-ground-truth/24-022.box] Error 1
I should mention that I double-checked 24-022.gt.txt and 24-022.tif, and both files are valid. Any idea why this is happening, and how can I fix it?
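
In case it helps narrow this down, here is a small check for whether the python3 that make invokes can see the module at all. It is only a sketch: the helper name `module_available` is my own, and I am assuming the `bidi` module is the one provided by the python-bidi package on PyPI.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which interpreter is actually running, since the Makefile calls python3.
    print("interpreter:", sys.executable)
    if module_available("bidi"):
        print("bidi is importable")
    else:
        # Assumption: the module comes from the python-bidi package on PyPI.
        print("bidi is missing; 'python3 -m pip install python-bidi' may be needed")
```

If this reports the module as missing under the same python3, the traceback above would be about my Python environment rather than the .tif or .gt.txt files.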