How to OCR a two- or three-column document using Tesseract
Justin erno
Jul 27, 2015, 5:40:45 AM
to tesseract-ocr
Hi all, my goal is to OCR a document whose text is laid out in multiple columns and get the output file in the correct format. Is there any method to identify the columns in a document using Tesseract?
Thanks in advance :)
Helmut Wollmersdorfer
Jul 29, 2015, 3:44:18 AM
to tesseract-ocr, spkm...@gmail.com
On Monday, 27 July 2015 11:40:45 UTC+2, Justin erno wrote:
Hi all, my goal is to OCR a document whose text is laid out in multiple columns and get the output file in the correct format. Is there any method to identify the columns in a document using Tesseract?
I did a two-column document yesterday. Tesseract recognizes the columns automatically. In plain-text output, column 2 appears after column 1.
In hOCR or PDF output you need to deal with the coordinates.
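To illustrate what "dealing with the coordinates" could look like for hOCR, here is a rough Python sketch using only the standard library. It relies on the hOCR convention that each `ocr_line` element carries a `title` attribute containing `bbox x0 y0 x1 y1`. Everything else is my own assumption for illustration: the names `LineCollector` and `reading_order`, the `min_gap` threshold, the simple rule of grouping lines by their left edge, and the hand-written `SAMPLE` markup (it is not real Tesseract output). Real pages with headers, indents or full-width lines will likely need something more careful.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class LineCollector(HTMLParser):
    """Collect (x0, y0, x1, y1, text) for each element with class ocr_line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._bbox = None   # bbox of the line currently being read, if any
        self._depth = 0     # nesting depth inside the current line element
        self._words = []

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            self._depth += 1          # nested element (e.g. a word span)
            return
        attrs = dict(attrs)
        if "ocr_line" in (attrs.get("class") or "").split():
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._depth = 1
                self._words = []

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        self._depth -= 1
        if self._depth == 0:          # the line element itself just closed
            self.lines.append((*self._bbox, " ".join(self._words)))
            self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None and data.strip():
            self._words.append(data.strip())


def reading_order(lines, min_gap=200):
    """Group lines into columns by left edge, then read each column top-down.

    min_gap is a hypothetical pixel threshold: a line whose left edge is at
    least this far right of the current column's left edge starts a new column.
    """
    columns = []  # each entry: [left_edge, [lines in this column]]
    for line in sorted(lines, key=lambda item: item[0]):
        if columns and line[0] - columns[-1][0] < min_gap:
            columns[-1][1].append(line)
        else:
            columns.append([line[0], [line]])
    return [
        text
        for _, column in columns
        for *_, text in sorted(column, key=lambda item: item[1])
    ]


# Hand-written stand-in for hOCR output of a two-column page.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 1000 600'>
 <span class='ocr_line' title='bbox 50 40 450 70'><span class='ocrx_word' title='bbox 50 40 120 70'>Left</span> <span class='ocrx_word' title='bbox 130 40 200 70'>one</span></span>
 <span class='ocr_line' title='bbox 550 42 950 72'><span class='ocrx_word' title='bbox 550 42 640 72'>Right</span> <span class='ocrx_word' title='bbox 650 42 720 72'>one</span></span>
 <span class='ocr_line' title='bbox 50 90 450 120'><span class='ocrx_word' title='bbox 50 90 120 120'>Left</span> <span class='ocrx_word' title='bbox 130 90 200 120'>two</span></span>
 <span class='ocr_line' title='bbox 560 92 950 122'><span class='ocrx_word' title='bbox 560 92 650 122'>Right</span> <span class='ocrx_word' title='bbox 660 92 730 122'>two</span></span>
</div>
"""

collector = LineCollector()
collector.feed(SAMPLE)
collector.close()
ordered = reading_order(collector.lines)
print("\n".join(ordered))
```

Grouping by left edge rather than by line center is a deliberate simplification: short last lines of a paragraph keep the same left edge as the rest of their column, so they stay in the right group. The threshold has to be larger than any paragraph indent and smaller than the distance between column starts.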