Convert PDF to HTML in a single page continuous view


Support

Jan 3, 2014, 7:58:31 PM
to pdfne...@googlegroups.com
Q:

We are using pdftron.PDF.Convert.ToHtml to convert a PDF to HTML. The conversion quality is good, but each PDF page is written to a separate HTML file. I am using the code below:
 
string strSourcePath = System.Configuration.ConfigurationSettings.AppSettings["SourcePath"];
string strDestinationPath = System.Configuration.ConfigurationSettings.AppSettings["DestinationPath"];
pdftron.PDF.Convert.HTMLOutputOptions o = new pdftron.PDF.Convert.HTMLOutputOptions();
pdftron.PDF.Convert.ToHtml(strSourcePath, strDestinationPath, o);
 
Is there any way to get a single HTML file using this tool?

--------
A:

The PDFNet conversion module provides no built-in way to merge the pages into a single HTML file. However, you can do so yourself: the output is valid XML, so you can use any XML library to combine the per-page files.
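
As a rough illustration of the XML approach, here is a minimal sketch using LINQ to XML. It assumes each page parses as well-formed XML with a <body> element directly under the root. The class name HtmlPageMerger, the "page" class and the "pageN" ids are hypothetical. It copies only <body> content, so per-page <head> styles, relative resource paths and id collisions between pages would still need handling for real output.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class HtmlPageMerger
{
    // Merges the <body> content of several well-formed (X)HTML pages into one
    // document, wrapping each page in its own <div>. Returns null if no page
    // contains a <body> element.
    public static XDocument Merge(IEnumerable<string> pageMarkup)
    {
        XDocument merged = null;
        XElement mergedBody = null;
        XNamespace ns = XNamespace.None;
        int pageNumber = 0;

        foreach (string markup in pageMarkup)
        {
            pageNumber++;
            XDocument page = XDocument.Parse(markup);

            // Match on the local name so this works with or without the XHTML namespace.
            XElement pageBody = page.Root.Elements()
                .FirstOrDefault(e => e.Name.LocalName == "body");
            if (pageBody == null) continue;

            if (merged == null)
            {
                // Reuse the namespace of the first page for the merged skeleton.
                ns = page.Root.Name.Namespace;
                mergedBody = new XElement(ns + "body");
                merged = new XDocument(
                    new XElement(ns + "html",
                        new XElement(ns + "head",
                            new XElement(ns + "title", "Merged document")),
                        mergedBody));
            }

            // Nodes that already have a parent are cloned when added,
            // so the source page is left unchanged.
            mergedBody.Add(new XElement(ns + "div",
                new XAttribute("id", "page" + pageNumber),
                new XAttribute("class", "page"),
                pageBody.Nodes()));
        }

        return merged;
    }
}
```

The page markup could be read with File.ReadAllText from whatever files the conversion wrote to the destination folder, and the result written out with XDocument.Save.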


Another option may be to use a master HTML page and iframes to inject the individual HTML pages, as shown in the following samples:

 

http://www.pdftron.com/pdfnet/pdf2html/demo/viewer.html?d=/pdf2html/177.progworld&pages=3

http://www.pdftron.com/pdfnet/pdf2html/demo.html

 

You can generate the master HTML page (which references the other HTML pages via iframes) using PDFNet; for example, use PDFDoc.GetPageCount() to get the number of pages.
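
One way the master page might be generated is sketched below. The class name MasterPageBuilder, the file name pattern passed in, and the fixed 1100px iframe height are all assumptions for illustration; the actual names of the per-page files written by the conversion should be checked in the destination folder.

```csharp
using System.Globalization;
using System.Net;
using System.Text;

static class MasterPageBuilder
{
    // Builds a master HTML page containing one iframe per converted page.
    // pageFileFormat is a composite format string in which {0} is replaced
    // by the 1-based page number (a hypothetical naming scheme).
    public static string Build(int pageCount, string pageFileFormat)
    {
        var html = new StringBuilder();
        html.AppendLine("<!DOCTYPE html>");
        html.AppendLine("<html>");
        html.AppendLine("<head>");
        html.AppendLine("<meta charset=\"utf-8\">");
        html.AppendLine("<title>Document</title>");
        // Stack the iframes vertically so the pages read as one continuous view.
        html.AppendLine("<style>iframe { display: block; width: 100%; height: 1100px; border: 0; }</style>");
        html.AppendLine("</head>");
        html.AppendLine("<body>");

        for (int page = 1; page <= pageCount; page++)
        {
            string file = string.Format(CultureInfo.InvariantCulture, pageFileFormat, page);
            string src = WebUtility.HtmlEncode(file);
            html.AppendLine("<iframe src=\"" + src + "\" title=\"Page " + page + "\"></iframe>");
        }

        html.AppendLine("</body>");
        html.AppendLine("</html>");
        return html.ToString();
    }
}
```

The pageCount argument would come from PDFDoc.GetPageCount() on the source document, and the returned string can be saved next to the per-page files with File.WriteAllText.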






