Adding new characters and fine-tuning for a new font: how to preserve accuracy? Merge models?

RuePat07

Apr 10, 2025, 10:47:36 AM

to tesseract-ocr

I have been fine-tuning Tesseract to add an unusual font and some additional characters. However, after fine-tuning, the model hallucinates on the default fonts. Is there a way to preserve the accuracy on the previous fonts and just add this one new font and the new characters?
When I trained with the new font and the added characters, the fine-tuned model did well on both the new symbols and the new font.
When I tried running inference with the two traineddata files combined, as "eng_new+eng" or "eng+eng_new", the results were poor: the newly added characters were mostly recognized incorrectly.
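
For readers who want to reproduce the combined-model run described above, here is a minimal sketch in Python that calls the tesseract command line. The helper names (build_tesseract_command, run_ocr) and the file name page.png are hypothetical and used only for illustration; the model names "eng" and "eng_new" come from the post. It assumes the tesseract binary is on PATH and that both traineddata files are in the tessdata directory in use.

```python
import subprocess


def build_tesseract_command(image_path, languages, tessdata_dir=None, psm=None):
    """Build the argument list for one tesseract call that prints text to stdout.

    `languages` is a list such as ["eng_new", "eng"]; tesseract expects the
    names joined with "+" after the -l option.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", "+".join(languages)]
    if tessdata_dir is not None:
        # Optional: point tesseract at the directory holding the traineddata files.
        cmd += ["--tessdata-dir", tessdata_dir]
    if psm is not None:
        # Optional: page segmentation mode.
        cmd += ["--psm", str(psm)]
    return cmd


def run_ocr(image_path, languages, tessdata_dir=None, psm=None):
    """Run tesseract and return the recognized text (requires tesseract installed)."""
    cmd = build_tesseract_command(image_path, languages, tessdata_dir, psm)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling run_ocr("page.png", ["eng_new", "eng"]) and run_ocr("page.png", ["eng"]) on the same image makes it easier to compare what each model, alone or combined, actually outputs.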

Is there a way to merge these models somehow?

Jamie sturgeon

Apr 11, 2025, 8:38:15 AM

to tesser...@googlegroups.com

Try running inference with a different combination and check your Tesseract settings.
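
One possible way to organize such a trial is to enumerate the language specifications and settings to test before running anything. The sketch below is only an illustration: the function names are hypothetical, and using the page segmentation mode (--psm) as the setting to vary is an assumption, since the reply does not say which settings are meant.

```python
from itertools import permutations, product


def candidate_language_specs(models):
    """List every ordered combination of the given models as a "-l" value.

    Order matters for this sweep because "eng+eng_new" and "eng_new+eng"
    are different specifications on the tesseract command line.
    """
    specs = []
    for size in range(1, len(models) + 1):
        for combo in permutations(models, size):
            specs.append("+".join(combo))
    return specs


def candidate_runs(models, psm_values):
    """Pair each language specification with each page segmentation mode."""
    return list(product(candidate_language_specs(models), psm_values))


if __name__ == "__main__":
    # Example sweep: both models, alone and combined, with PSM 6 and PSM 7.
    for lang_spec, psm in candidate_runs(["eng", "eng_new"], [6, 7]):
        print(f"tesseract page.png stdout -l {lang_spec} --psm {psm}")
```

Each printed line is a command that can be run by hand on a test image (page.png is a placeholder) to see which combination handles both the default fonts and the new characters best.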

