Tesseract output format: doc or docx

ss.suk...@gmail.com

Mar 22, 2018, 2:47:00 AM
to tesseract-ocr
Can I use tesseract on Ubuntu to get .docx or .doc output (Word format)?

Currently I only get .txt output from tesseract.

Zdenko Podobny

Mar 22, 2018, 3:13:06 AM
to tesser...@googlegroups.com
Tesseract can produce output in txt, pdf, and hOCR (HTML) formats.
Tesseract's focus is to provide an OCR engine, not complex document output such as docx or ods.

Zdenko
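
For anyone scripting this, here is a minimal sketch of requesting the three formats named in the reply from Python. It assumes the `tesseract` executable is on your PATH and that your build accepts more than one output config name in a single run; the helper names and the file names `scan.png` and `scan` are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess

# Output configs named in the reply above; there is no docx or doc config.
SUPPORTED_FORMATS = ("txt", "pdf", "hocr")


def build_tesseract_command(image_path, output_base, formats=("txt",)):
    """Return the argv list for one tesseract run that requests each format."""
    for fmt in formats:
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported output format: {fmt!r}")
    # CLI shape: tesseract <image> <outputbase> [configfile ...]
    return ["tesseract", image_path, output_base, *formats]


def run_tesseract(image_path, output_base, formats=("txt",)):
    """Run tesseract (assumed to be installed); raises if the run fails."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    command = build_tesseract_command(image_path, output_base, formats)
    subprocess.run(command, check=True)
```

Calling `run_tesseract("scan.png", "scan", ["pdf", "hocr"])` would ask for a searchable PDF and an hOCR file in one pass. Getting from there to .docx needs a separate conversion tool, which tesseract itself does not provide.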


