Hello,
I am currently using Tesseract 3.02 (Windows) to recognize a large number of pages (a few million).
I launch multiple Tesseract processes from Python, by starting several .bat files that each loop over a directory.
Each .bat file calls the tesseract command once per file: For %%A in (%_SourcePath%) Do %_Tesseract% %%A %_OutputPath%%%~nA
Intuitively, I think of the cost of each call as Init_cost + Process_cost, so that the total cost for N files is N * (Init_cost + Process_cost).
Would the API be faster for this task? I assume it would run Init only once, when the new instance is created, so that the total cost would be Init_cost + N * Process_cost.
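To make the comparison concrete, here is a small Python sketch of the two cost models. The function names and the timing values are placeholders I made up for illustration, not measurements of Tesseract; the real Init_cost and Process_cost would have to be measured.

```python
def cost_per_call(n_files, init_cost, process_cost):
    """Total cost when every file pays for initialization (one process per file)."""
    return n_files * (init_cost + process_cost)


def cost_single_init(n_files, init_cost, process_cost):
    """Total cost when initialization is paid once and reused for all files."""
    return init_cost + n_files * process_cost


def init_savings(n_files, init_cost, process_cost):
    """Difference between the two models, which equals (N - 1) * Init_cost."""
    return (cost_per_call(n_files, init_cost, process_cost)
            - cost_single_init(n_files, init_cost, process_cost))


if __name__ == "__main__":
    # Hypothetical values in seconds, chosen only to illustrate the formulas.
    n_files = 1_000_000
    init_cost = 0.5
    process_cost = 2.0

    print("one process per file:", cost_per_call(n_files, init_cost, process_cost))
    print("single init:         ", cost_single_init(n_files, init_cost, process_cost))
    print("saved:               ", init_savings(n_files, init_cost, process_cost))
```

Under this model the saving is (N - 1) * Init_cost, so whether the API is worth it depends on how large Init_cost is compared to Process_cost.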
Many thanks in advance for your useful work :-)