I would like to train Tesseract on some sample images of text that I have.
When training Tesseract on images, should I pre-process them the same way I normally would to improve inference, i.e. convert them to black text on a white background and binarize them via thresholding?
Also, can I reuse the pre-trained data as a starting point for the training? If so, would you still advise pre-processing the images, and in what way?
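For concreteness, here is a minimal sketch of the kind of pre-processing I mean. It is only an illustration under stated assumptions: the image is already a grayscale grid of 0-255 integers, Otsu's method is used as one possible way to pick the threshold, and the polarity is guessed by assuming the background covers more than half of the pixels. The function names `otsu_threshold` and `binarize_black_on_white` are hypothetical and not part of Tesseract.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for a flat list of 0-255 grayscale values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]          # pixels with value <= t
        if w_dark == 0:
            continue
        w_light = total - w_dark   # pixels with value > t
        if w_light == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance (up to a constant factor).
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize_black_on_white(image):
    """Binarize a grayscale image (list of rows) to black text on white.

    Assumption: the background occupies more than half of the pixels, so
    if most pixels end up black the image is treated as inverted.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    binary = [[0 if p <= t else 255 for p in row] for row in image]

    dark = sum(1 for row in binary for p in row if p == 0)
    if dark > len(flat) / 2:
        # Mostly black: assume light text on a dark background and invert.
        binary = [[255 - p for p in row] for row in binary]
    return binary


# Light text on a dark background is flipped to black on white.
sample = [
    [20, 20, 20, 20],
    [20, 230, 230, 20],
    [20, 20, 20, 20],
]
result = binarize_black_on_white(sample)
```

In practice I would do this with an image library rather than plain lists; the sketch is only meant to show the two steps (thresholding, then fixing the polarity) that I am asking about.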
Thank you for your interest.