to tesseract-ocr
Hi, I have some scanned PDF files that contain a table on each page. Some tables have borders, and some have no borders or lines. I want to extract each table, with its formatting and data, to a Word or Excel file. I am totally new to tesseract-ocr and don't know how to use it from C++ or C#. Proper guidance on detecting tables and saving the output using Tesseract would be highly appreciated. Thanks
Best regards, Adnan
John Muccigrosso
Feb 12, 2017, 2:48:49 PM
You might want to use Tabula instead, provided that the PDFs contain the actual text and numbers and not just images of them.
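If the PDFs turn out to be image-only scans, one possible route is to have Tesseract write word positions (for example `tesseract page.png out tsv`, which produces `out.tsv`) and then group the words into rows and cells yourself. The following is only a rough sketch of that grouping step, not a tested table detector: the helper names (`tsv_words`, `group_rows`, `split_cells`, `tsv_to_csv`) and the pixel thresholds `row_tol` and `col_gap` are made up for illustration and would need tuning per document.

```python
import csv
import io


def tsv_words(tsv_text):
    """Yield (left, top, width, height, text) for each word row in Tesseract TSV."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        text = (row.get("text") or "").strip()
        # Level 5 rows are individual words; other levels are page/block/line boxes.
        if row.get("level") != "5" or not text:
            continue
        yield (
            int(row["left"]),
            int(row["top"]),
            int(row["width"]),
            int(row["height"]),
            text,
        )


def group_rows(words, row_tol=10):
    """Group words whose top edge is within row_tol pixels of the row's first word."""
    rows = []
    for word in sorted(words, key=lambda w: (w[1], w[0])):
        if rows and abs(word[1] - rows[-1][0][1]) <= row_tol:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Order each row left to right.
    return [sorted(r, key=lambda w: w[0]) for r in rows]


def split_cells(row, col_gap=40):
    """Join neighbouring words into one cell unless the gap is at least col_gap pixels."""
    cells = []
    prev_right = None
    for left, _top, width, _height, text in row:
        if prev_right is not None and left - prev_right < col_gap:
            cells[-1] += " " + text
        else:
            cells.append(text)
        prev_right = left + width
    return cells


def tsv_to_csv(tsv_text, row_tol=10, col_gap=40):
    """Convert Tesseract TSV text to CSV text using the assumed thresholds."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in group_rows(tsv_words(tsv_text), row_tol):
        writer.writerow(split_cells(row, col_gap))
    return out.getvalue()
```

The resulting CSV text can be saved to a file and opened in Excel. Fixed pixel thresholds are a crude assumption; skewed scans or tables with narrow column gaps would likely need something smarter.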