Q:
We are trying to build an ontology-based search engine. We are
running into a problem where we need to identify sections such as table headers,
headings, footers, etc. in a PDF document. Is there a way to accomplish this
using PDFTron?
--------
A:
PDFNet could be used to implement structure recognition (or to extract existing structure, if available).
For some background on structure recognition please see:
The PDFGenie technology described in the article is also available as a PDFNet SDK add-on (in 'pdftron.PDF.Convert.ToHtml()' when the SetReflow() option is set).
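
If you decide to implement structure recognition yourself on top of extracted text, the following is a minimal sketch of one common heuristic. It is not PDFNet's or PDFGenie's algorithm, and it makes no PDFNet calls. It assumes you have already extracted text lines with their page index, vertical position, and font size; the `Line` record, the `classify_lines` function, and all thresholds are hypothetical names and values chosen for illustration. Lines that repeat in the top or bottom margin of most pages are labelled as headers or footers, and lines set noticeably larger than the dominant body font are labelled as headings.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Line:
    """Hypothetical record for one extracted text line (not a PDFNet type)."""
    page: int         # 0-based page index
    y: float          # vertical position as a fraction of page height (0 = top, 1 = bottom)
    font_size: float  # font size in points
    text: str


def _normalize(text):
    # Replace digit runs so that "Page 3" and "Page 4" compare equal.
    return re.sub(r"\d+", "#", text.strip().lower())


def _zone(line, margin):
    # Return which page margin the line falls in, if any.
    if line.y <= margin:
        return "header"
    if line.y >= 1 - margin:
        return "footer"
    return None


def classify_lines(lines, margin=0.1, min_page_share=0.6, heading_ratio=1.2):
    """Label each line as 'header', 'footer', 'heading', or 'body'.

    All thresholds are illustrative assumptions and will need tuning.
    """
    if not lines:
        return []

    n_pages = len({ln.page for ln in lines})

    # Count on how many distinct pages each normalized margin text occurs.
    seen = set()
    page_counts = Counter()
    for ln in lines:
        zone = _zone(ln, margin)
        if zone is None:
            continue
        key = (zone, _normalize(ln.text))
        if (ln.page, key) not in seen:
            seen.add((ln.page, key))
            page_counts[key] += 1

    # Estimate the body font size as the size that covers the most characters.
    size_weights = Counter()
    for ln in lines:
        size_weights[ln.font_size] += len(ln.text)
    body_size = size_weights.most_common(1)[0][0]

    # A margin line must repeat on at least two pages to count as header/footer.
    min_pages = max(2, min_page_share * n_pages)

    labels = []
    for ln in lines:
        zone = _zone(ln, margin)
        if zone and page_counts[(zone, _normalize(ln.text))] >= min_pages:
            labels.append(zone)
        elif ln.font_size >= heading_ratio * body_size:
            labels.append("heading")
        else:
            labels.append("body")
    return labels
```

Table headers are harder, since they require detecting the table grid first (for example by clustering the x-positions of text fragments into columns), so this sketch does not attempt them. The labels it produces could then be fed into the search index as per-section fields.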