Tesseract installation and training for a new font


Tntpker

Dec 24, 2016, 3:23:03 AM
to tesseract-ocr
Hi, I have 2 questions which I hope someone could help me with:

1. How do I correctly install Tesseract so that it works from cmd on Windows? Whenever I try to run it from cmd I keep getting errors. I also tried adding it to the PATH environment variable, but that didn't seem to work either. Tesseract is installed at: C:\Program Files (x86)\Tesseract-OCR.
2. So far I've incorporated Tesseract into my script to detect text in an image, after preprocessing the image to remove almost all of the noise. It's fairly accurate, but I've found that specific letters are consistently misrecognised for the font I am analysing (for example, Tesseract always identifies a g instead of a y for this font). I came across the idea of training Tesseract, and this tool powered by Anyline (http://ocr7.com/) that automatically generates traineddata files for a specific font. So I got the traineddata file for my font from the website, but I have no idea how to use it from my script. From what I've understood, you need to put it in the tessdata folder and then run Tesseract from cmd?
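
To make both questions concrete, here is a minimal sketch of what I think the two steps look like, assuming the script is written in Python. The file names `sample.png` and `result`, the language name `myfont` (standing for a `myfont.traineddata` file copied into the tessdata folder) and the helper names are placeholders for illustration, not anything taken from Tesseract itself. The sketch only locates the executable and builds the command; it does not confirm that the install or the traineddata file is valid.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed install directory, taken from the post; adjust to your machine.
ASSUMED_INSTALL_DIR = Path(r"C:\Program Files (x86)\Tesseract-OCR")


def find_tesseract(install_dir=ASSUMED_INSTALL_DIR):
    """Return the path of the tesseract executable, or None if not found.

    Looks on PATH first, then falls back to the assumed install directory.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    candidate = Path(install_dir) / "tesseract.exe"
    return str(candidate) if candidate.is_file() else None


def build_command(exe, image, output_base, lang):
    """Build the argument list for: tesseract <image> <output_base> -l <lang>.

    `lang` is the traineddata file name without its extension, so
    myfont.traineddata in the tessdata folder is selected with "-l myfont".
    """
    return [exe, str(image), str(output_base), "-l", lang]


if __name__ == "__main__":
    exe = find_tesseract()
    if exe is None:
        print("tesseract was not found on PATH or in the assumed install directory")
    else:
        # Hypothetical file and language names, for illustration only.
        cmd = build_command(exe, "sample.png", "result", "myfont")
        print("Would run:", subprocess.list2cmdline(cmd))
        # Uncomment to run OCR for real; the text is written to result.txt.
        # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems with the space in the install path, which may be one reason a command typed by hand in cmd fails.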

I hope somebody can help me with these two questions. Thanks in advance.