Best way to train tesseract on images


Nicolas Scotto Di Perto

Oct 11, 2019, 5:28:51 AM

to tesseract-ocr

Hello,

I would like to train Tesseract on some sample images of text that I have.

When training Tesseract on images, should I pre-process them the same way I normally would to improve recognition at inference time, i.e. black text on a white background, binarized via thresholding?
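
To make the question concrete, here is a minimal sketch of the kind of pre-processing I mean. It is plain Python with no image library, and it works on a grayscale image given as a list of rows of 8-bit values. The function names `otsu_threshold` and `binarize` are only illustrative, and both the use of Otsu's method and the "background covers most of the image" polarity heuristic are my own assumptions, not something Tesseract requires.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold (0-255) for a flat list of 8-bit gray values."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(value * count for value, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with value <= t
    sum_dark = 0  # sum of the values of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue
        w_light = total - w_dark
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Binarize a grayscale image (list of rows) to black text on white."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    out = [[255 if p > t else 0 for p in row] for row in image]

    # Assumption: the background covers more area than the text, so if
    # most pixels came out black the polarity is inverted and we flip it.
    black = sum(1 for row in out for p in row if p == 0)
    if black * 2 > len(flat):
        out = [[255 - p for p in row] for row in out]
    return out
```

For example, `binarize([[220, 30, 220], [220, 220, 220]])` maps the single dark pixel to 0 and the rest to 255, and a light-on-dark version of the same image ends up with the same polarity.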

Also, can I reuse the existing pre-trained data as a starting point for the training? If so, would you still advise pre-processing the images, and in what way?