Tesseract with Thai language


pvsk....@gmail.com

Jan 28, 2019, 2:50:30 AM
to tesseract-ocr
Hi,

I am using Tesseract OCR v4 to extract text from a Thai-language image file. On Windows the Thai characters are extracted perfectly, but when I run the same extraction on Ubuntu, the output has spaces between the characters.

Can anyone help me out with this?

Thanks in advance.

KM

易鑫

Jan 29, 2019, 11:34:55 PM
to tesseract-ocr
Please upload your image file so I can try it in my environment.

<pvsk....@gmail.com> wrote on Mon, Jan 28, 2019 at 3:50 PM:
--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/4538ee12-d9ab-4851-a8f3-bdbb8a8f3ffd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Shree Devi Kumar

Jan 30, 2019, 3:06:42 AM
to tesser...@googlegroups.com
> I am able to extract the Thai characters perfectly on Windows environment whereas when I extract the same on Ubuntu I found spaces between the characters in the extracted text.

What are the exact versions of tesseract in both environments?

`tesseract -v`

Also, which traineddata file are you using on each (tessdata, tessdata_best, or tessdata_fast)?
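One way to collect the version information on both machines is sketched below. It is only a rough helper, not part of Tesseract itself: the function names `parse_tesseract_version` and `tesseract_version_output` are made up for this example, and it assumes the `tesseract` executable is on the PATH. Running it on Windows and on Ubuntu and comparing the output should show whether the two installs really match.

```python
import re
import shutil
import subprocess


def parse_tesseract_version(output):
    """Return the version string found in `tesseract -v` output, or None."""
    # Matches lines such as "tesseract 4.0.0-beta.1" or "tesseract v4.0.0.20181030".
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)*\S*)", output)
    return match.group(1) if match else None


def tesseract_version_output():
    """Run `tesseract -v` and return its output, or None if it is not installed."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "-v"], capture_output=True, text=True)
    # Some releases write the version banner to stderr, so read both streams.
    return result.stdout + result.stderr


if __name__ == "__main__":
    out = tesseract_version_output()
    if out is None:
        print("tesseract not found on PATH")
    else:
        print("version:", parse_tesseract_version(out))
        print(out)
```

The full banner is printed as well because it also lists the linked libraries (for example Leptonica), which may differ between the two builds. The script does not detect which traineddata set is installed; that still has to be checked by hand.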




