I noticed that Tesseract mistakenly turns Russian vowels with accent marks (ò, à, ...), which are mostly used for educational purposes, into other Russian letters. I also saw that someone with the same problem had created training data (if I understood correctly) for Russian with accents.
The problem is that I cannot find a way to turn it into a traineddata file so that I can test it and later use it in my code. I found the tesstrain Git repository, but was not able to make it work with the data I found.
I honestly don't know whether I am missing something, misunderstanding something, or whether training simply isn't done with these types of files anymore.
If you have any clue, that would help me a lot.
Thank you!
I'm trying to fix this issue. From what I have read, I think I need to re-train the Russian language in Tesseract for it to support accents. I found this folder in langdata, but can't find a way to use it to re-train the Russian language. How can I use the rus_accent folder and its files to easily re-train the Russian language?