Thai box-tiff

cpx0rpc

Feb 29, 2012, 8:13:22 PM
to tesseract-dev
Hello,
I'm Phakpoom (Patrick) Chinprutthiwong. Could you send me
the .box and .tiff files for the Thai language? My friends and I are
thinking about improving the OCR for Thai, but we can't find
them on the download page. And if it's not too much to ask, I'd like to
know the best way to deal with a language like Thai, which has
multiple characters stacked above and below each other. Is it possible to
make a word list for every possible arrangement of the two characters?

Thank you,
Phakpoom (Patrick) Chinprutthiwong

Pavel Mazniker

Apr 28, 2012, 8:54:48 AM
to tesser...@googlegroups.com
Hi,
 
I am also interested in Thai-language text recognition in complex patterns in images.
 
How can the .box and .tiff files help to improve OCR for Thai?
 
Thanks.
 

 

Pavel Mazniker

May 7, 2012, 1:29:07 AM
to tesser...@googlegroups.com
Hi,

Where can I get the Cube language model params file, tha.cube.lm, for Thai?


When I call tess.Init("tessdata", thai, tesseract::OEM_TESSERACT_CUBE_COMBINED), I get:


Cube ERROR (CubeRecoContext::Load): unable to read cube language model params from /usr/local/share/tessdata/tha.cube.lm

Cube ERROR (CubeRecoContext::Create): unable to init CubeRecoContext object

init_cube_objects(true, &tessdata_manager):Error:Assert failed:in file tessedit.cpp, line 211


So far, only the tha.traineddata file is available to download for Thai.


Thanks in advance.
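
One possible way to avoid the assert while the Cube files are missing is to check for tha.cube.lm before asking for the combined mode, and otherwise request tesseract::OEM_TESSERACT_ONLY. The sketch below is only that check. The helper names (CubeLmPath, HasCubeLm, ChooseEngineMode) are hypothetical and not part of the Tesseract API, it looks only for the one file named in the error above (Cube may need others), and it assumes the Tesseract-only engine can run from tha.traineddata alone. It returns the mode name as a string so that it compiles without the Tesseract headers.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Builds the path that the Cube error message reports:
// <tessdata_dir>/<lang>.cube.lm
// (hypothetical helper, not a Tesseract function)
std::string CubeLmPath(const std::string& tessdata_dir,
                       const std::string& lang) {
  assert(!lang.empty());
  std::string path = tessdata_dir;
  // Add a separator only if the directory does not already end with one.
  if (!path.empty() && path.back() != '/') path += '/';
  return path + lang + ".cube.lm";
}

// Returns true if the Cube language model params file can be opened
// for reading. (hypothetical helper)
bool HasCubeLm(const std::string& tessdata_dir, const std::string& lang) {
  std::ifstream file(CubeLmPath(tessdata_dir, lang).c_str());
  return file.good();
}

// Returns the name of the engine mode to request: the combined mode only
// when the Cube language model file is present, Tesseract-only otherwise.
// Assumption: Tesseract-only mode needs just <lang>.traineddata.
const char* ChooseEngineMode(const std::string& tessdata_dir,
                             const std::string& lang) {
  return HasCubeLm(tessdata_dir, lang) ? "OEM_TESSERACT_CUBE_COMBINED"
                                       : "OEM_TESSERACT_ONLY";
}
```

In a real program the matching tesseract::OcrEngineMode value would be passed to tess.Init in place of the string, for example tesseract::OEM_TESSERACT_ONLY when ChooseEngineMode("/usr/local/share/tessdata", "tha") reports that the file is missing.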