How to convert hOCR to an MS Word .docx file


abdu

May 2, 2018, 7:42:04 PM
to tesseract-ocr
Is there a ready-made program that converts an hOCR file to an MS Word .docx or .doc file?
Thanks in advance.

Zdenko Podobny

May 3, 2018, 2:49:53 AM
to tesser...@googlegroups.com
MS Word ;-)
  1. Rename test.hocr to test.hocr.html.
  2. Open test.hocr.html in a real text editor (e.g. Notepad++) and delete lines 2 and 3; otherwise Word will show an error message.
  3. Open test.hocr.html in Word.

Zdenko
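
Steps 1 and 2 above can also be scripted. The following is only a minimal Python sketch, not part of Tesseract or Word: the function name prepare_hocr_for_word and its drop_lines parameter are hypothetical, and it assumes the file is UTF-8 and that lines 2 and 3 are the ones Word objects to (in typical Tesseract hOCR output these hold the DOCTYPE declaration, but check your own file first).

```python
from pathlib import Path


def prepare_hocr_for_word(hocr_path, drop_lines=(2, 3)):
    """Write a copy of an hOCR file as <name>.html without the given lines.

    hocr_path  -- path to the hOCR file, e.g. "test.hocr"
    drop_lines -- 1-based line numbers to leave out (assumed: lines 2 and 3)
    Returns the path of the new file; the original file is not modified.
    """
    src = Path(hocr_path)
    # Step 1: test.hocr -> test.hocr.html (written as a copy, not a rename)
    dst = src.with_name(src.name + ".html")

    # Step 2: keep every line except the ones listed in drop_lines
    lines = src.read_text(encoding="utf-8").splitlines(keepends=True)
    kept = [
        line
        for number, line in enumerate(lines, start=1)
        if number not in drop_lines
    ]
    dst.write_text("".join(kept), encoding="utf-8")
    return dst
```

Calling prepare_hocr_for_word("test.hocr") would write test.hocr.html next to the original; step 3 (opening the result in Word and saving it as .docx) is still done by hand.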