Hi People,
As I read in the documentation, the input for retraining Tesseract is a single line (an image of a line of text with its accompanying ground truth). Is it possible to train on paragraphs instead, since my dataset contains ground truth only paragraph-wise?
Would that help increase accuracy? Do you know of any tools that detect the lines in a paragraph?
I think that if I use OpenCV image processing to split a paragraph into text lines, it will fail for some images. Please suggest a better solution if possible. Thanks in advance.
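To make the concern concrete, here is a minimal sketch of the kind of line splitting I mean: a horizontal projection profile over an already binarized image. It is plain Python rather than OpenCV so that it runs on its own, and `split_lines`, `min_ink`, and the toy `page` are names made up for illustration. It assumes straight, unskewed lines with clear gaps between them.

```python
def split_lines(image, min_ink=1):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    `image` is a binarized image given as a list of rows,
    where 1 means ink and 0 means background (an assumption of this sketch).
    """
    # Horizontal projection profile: amount of ink in each pixel row.
    profile = [sum(row) for row in image]

    lines = []
    start = None  # top row of the line currently being scanned, if any
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # entering a text line
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # leaving a text line
            start = None
    if start is not None:                  # text runs to the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 6x6 "page": a two-row text line, a blank row, then a one-row text line.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(split_lines(page))  # [(1, 3), (4, 5)]
```

On skewed, curved, or tightly spaced text the row sums never drop below the threshold between lines, so neighbouring lines get merged, which is the failure I am worried about.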
Regards,
Krishna Prasad A S