How to use tesseract-ocr for other languages?


hiral

Feb 19, 2009, 11:44:13 AM
to tesseract-ocr
I have run tesseract-ocr for English OCR, both from the command line and
through the Windows GUI.
But I can't work out how to add support for another language to the
tesseract-ocr engine.

Does anyone have exact steps for guidance?

I want to start with Hindi fonts.

Please help.

hiral

Feb 19, 2009, 11:50:31 AM
to tesseract-ocr
I have gone through
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract

but I don't understand how to use it.

I also tried bbtesseract.exe, from this group itself. It works for
English, but I can't work out how to get boxes for Hindi fonts,
or how to edit the .txt file for a given Hindi .tif file.

I have seen that people have done a lot of great work, but I am stuck
getting started with tesseract-ocr for other languages.

Please help.
hiral

Lothar

Feb 19, 2009, 2:32:02 PM
to tesseract-ocr
Copy the language files to your tessdata folder.
The following languages are available at http://code.google.com/p/tesseract-ocr/downloads/list:

- spanish
- dutch
- italian
- french
- english
- german
- fraktur (old german)
- bangla
- portuguese
- vietnamese
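
Once the files are copied, a quick way to see which language codes ended up in tessdata is a small script like the one below. This is only a sketch: the function name is made up for illustration, and it assumes the data files are named "<code>.<suffix>" (for example "eng.unicharset"), which may not hold for every release.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes found in a tessdata folder.

    Hypothetical helper, not part of tesseract itself.
    """
    codes = set()
    for entry in Path(tessdata_dir).iterdir():
        # Assumption: data files are named "<code>.<suffix>".
        if entry.is_file() and "." in entry.name:
            code = entry.name.split(".", 1)[0]
            if code:  # skip hidden files such as ".DS_Store"
                codes.add(code)
    return sorted(codes)
```

Any code this returns should be usable as the value of the "-l" option.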

Then call tesseract with the "-l" command-line option.

Example: "tesseract test.tif result -l eng" for the English language.
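
If you want to drive the same command from a script, something like this should work. It is a minimal sketch, assuming the tesseract executable is on your PATH; the two function names are hypothetical, and the arguments are exactly the ones from the example above.

```python
import subprocess


def build_tesseract_command(image, output_base, lang):
    """Build the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", image, output_base, "-l", lang]


def run_ocr(image, output_base, lang="eng"):
    """Run tesseract and return the path of the text file it writes."""
    # Assumption: "tesseract" is installed and reachable on PATH.
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    # tesseract appends ".txt" to the output base name.
    return output_base + ".txt"
```

Calling run_ocr("test.tif", "result", "eng") runs the same command as the example and leaves the recognised text in result.txt. Swap "eng" for the code of whichever language files you copied into tessdata.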

Hope it helps....

Regards,

Lothar

www.dornieden.org

Debayan Banerjee

Feb 19, 2009, 2:33:20 PM
to tesser...@googlegroups.com