How are the "tessdata_best" models created?

Julian Gilbey

Aug 12, 2019, 5:27:58 PM

to tesseract-ocr

I've been reading the wiki, and it says in the explanation of the training process (in the "Using tesstrain.sh" section):

"For making a general-purpose LSTM-based OCR engine, it is woefully inadequate, but makes a good tutorial demo."

So my two questions are: (1) in what ways is this "woefully inadequate", and (2) how are the tessdata_best models made? Is it just that they are trained for many more iterations in the same way, perhaps using more fonts? Or is there more to it than that?