hocr output file extension in tesseract 3.03

ppo...@upei.ca

May 5, 2014, 2:25:33 PM
to tesser...@googlegroups.com
It looks like the hocr output file extension changed from .html to .hocr (in the install packages available for Ubuntu 14.04). Can I change this back to .html in my hocr config file? If so, what is the parameter name and value?

Thanks,
Paul

zdenko podobny

May 5, 2014, 4:35:33 PM
to tesser...@googlegroups.com
The .hocr extension is hardcoded in the tesseract API[1]. You cannot change it with a parameter.
But you can run something like this:
    tesseract phototest.tif - hocr >phototest.html
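
If you would rather keep the normal file output than redirect stdout, another option is to let tesseract write its default .hocr file and rename it afterwards. A minimal sketch, assuming a POSIX shell; the helper name rename_hocr_to_html is made up for illustration and is not part of tesseract:

```shell
# Hypothetical helper (not part of tesseract): rename every .hocr file
# in the given directory (default: current directory) to .html.
rename_hocr_to_html() {
    dir="${1:-.}"
    for f in "$dir"/*.hocr; do
        [ -e "$f" ] || continue          # glob matched nothing, skip
        mv -- "$f" "${f%.hocr}.html"     # strip .hocr, append .html
    done
}
```

For example, after running `tesseract phototest.tif phototest hocr` (which should leave phototest.hocr on disk in 3.03), calling `rename_hocr_to_html .` would rename it to phototest.html.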


Zdenko