Hi,
I'm Korean.
I'm studying Tesseract 4.0.
This page is very useful for studying Tesseract 4.0.
However, I am not good at reading English, and I find the Tesseract 4.0 training documentation hard to understand.
In short, I cannot understand the following passage:
Creating Training Data
As with base Tesseract, there is a choice between rendering synthetic training data from fonts, or labelling some pre-existing images (like ancient manuscripts for example). In either case, the required format is still the tiff/box file pair, except that the boxes only need to cover a textline instead of individual characters. 'Newline' boxes with tab as the character must be inserted between textlines to indicate the end-of-line. Multi-word boxes require a different box format, as the space would confuse the parser.
I do not understand this passage. Could you explain it to me? I would also like to see an example of a box file and TIFF pair for Tesseract 4.0.
Thank you.
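
For illustration only, here is a minimal Python sketch of what the quoted passage seems to describe. It assumes the usual box-file row layout (`character left bottom right top page`, with the origin at the bottom-left of the image), assumes every character in a textline simply reuses the bounding box of the whole line, and assumes a row whose character is a tab closes the line. The helper name `line_boxes`, the coordinates, and the 1-pixel-wide tab box are all hypothetical choices, not taken from the Tesseract documentation.

```python
def line_boxes(text, left, bottom, right, top, page=0):
    """Build box-file rows for one textline (hypothetical helper).

    Every character reuses the bounding box of the whole line, since the
    quoted passage says boxes only need to cover a textline.
    """
    rows = [f"{ch} {left} {bottom} {right} {top} {page}" for ch in text]
    # 'Newline' box: the character field is a tab, marking the end of the line.
    # Placing it as a 1-pixel box at the right edge is an assumption.
    rows.append(f"\t {right} {bottom} {right + 1} {top} {page}")
    return rows


if __name__ == "__main__":
    # One textline reading "Hi" that spans x=10..110 and y=20..50 on page 0.
    for row in line_boxes("Hi", 10, 20, 110, 50):
        print(row)
```

Each row printed this way would go into a `.box` file that sits next to a `.tif` image of the same base name. The image itself has to be rendered or scanned separately.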