There is no way to add a single font to an existing traineddata file; training has to be run on all fonts at once.
The box/tiff pairs are not distributed by the Google developers who produced the traineddata (some of the fonts are proprietary).
You can attempt to recreate the training from the source files in the langdata repository. The list of fonts used for each language is in language-specific.sh, which is referenced from tesstrain.sh in tesseract/training.
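As a rough sketch of what "all fonts at one time" means in practice, the snippet below only assembles a tesstrain.sh command line that lists every existing font together with the new one. The font names and directory paths are hypothetical placeholders; the real font list for a language has to be taken from language-specific.sh. The flags shown (--lang, --langdata_dir, --tessdata_dir, --output_dir, --fontlist) are the ones I believe tesstrain.sh accepts, so check them against the script's usage text for your version.

```python
import shlex


def build_tesstrain_command(lang, fonts, langdata_dir, tessdata_dir, output_dir):
    """Assemble (but do not run) a tesstrain.sh invocation for a full retrain.

    Every font must be listed, because a single font cannot be added
    to an existing traineddata file.
    """
    if not fonts:
        raise ValueError("at least one font is required")
    cmd = [
        "tesstrain.sh",
        "--lang", lang,
        "--langdata_dir", langdata_dir,
        "--tessdata_dir", tessdata_dir,
        "--output_dir", output_dir,
        # --fontlist goes last: it takes every following argument as a font name
        "--fontlist",
    ]
    cmd.extend(fonts)
    return cmd


# Hypothetical values: replace with the fonts from language-specific.sh
# and with your own checkout and output locations.
existing_fonts = ["Existing Font A", "Existing Font B"]
new_font = "My New Font"

cmd = build_tesstrain_command(
    lang="mal",
    fonts=existing_fonts + [new_font],
    langdata_dir="./langdata",
    tessdata_dir="./tessdata",
    output_dir="./trained_output",
)

# Print a copy-pasteable shell command; font names with spaces get quoted.
print(shlex.join(cmd))
```

The command is printed rather than executed so that it can be reviewed first; it assumes tesstrain.sh is on the PATH and that all listed fonts are installed on the training machine.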
I have tested concatenating traineddata using the "+" operator. My question is whether there is a way to train one more font into existing trained data, or whether the entire set of box/tiff pairs is needed each time. If it is needed, where can I get the box/tiff training data set for the current Malayalam and English traineddata?