Is it possible to pass the input file for conversion as an InputStream in Java?


rasa...@gmail.com

Jan 18, 2016, 1:31:10 PM
to PDFTron PDFNet SDK
The Convert.wordToPdf methods only accept the input as a filename. Is it possible to introduce a new method that accepts an InputStream instead? PDFDoc.save accepts an OutputStream, so it makes sense to be able to read from an InputStream as well, especially when the data comes from another source such as a URL.

Thanks,

Ryan

Jan 18, 2016, 4:54:10 PM
to pdfne...@googlegroups.com
Yes. There is now a WordToPdf API that takes a filter. For Java, it is the following:
https://www.pdftron.com/pdfnet/docs/PDFNetJava/com/pdftron/pdf/Convert.html#wordToPdf%28com.pdftron.sdf.Doc,%20com.pdftron.filters.Filter,%20com.pdftron.pdf.WordToPDFOptions%29

The latest builds are here:
https://www.pdftron.com/pdfnet/downloads.html

Now, how do you use it with an InputStream? The following code does this. Note that currently everything needs to be loaded into memory. Document formats such as DOCX and PDF require random access to their bytes, so the entire stream must be buffered before conversion.

stream = new FileInputStream(file);
com.pdftron.filters.MemoryFilter memoryFilter = new com.pdftron.filters.MemoryFilter(stream.available(), false); // false = sink
com.pdftron.filters.FilterWriter writer = new com.pdftron.filters.FilterWriter(memoryFilter); // helper filter to allow us to write to buffer
int buf_sz = 1024 * 1024; // set intermediate buffer to 1MiB
byte[] buf = new byte[buf_sz];
int read;
int total_read = 0;
while ((read = stream.read(buf)) != -1) {
    if (read < buf_sz) {
        // last read will (certainly) contain less bytes, so write just those
        for (int i = 0; i < read; ++i) {
            writer.writeUChar(buf[i]);
        }
    } else {
        writer.writeBuffer(buf);
    }
    total_read += read;
}
writer.flush(); // Don't forget to flush!
memoryFilter.setAsInputFilter(); // switch from sink to source
Convert.officeToPdf(pdfdoc, memoryFilter, null);
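
For a stream that does not come from a file (a URL, for example), the buffering step can be done with the JDK alone before anything is handed to PDFNet. The following is only a sketch under assumptions: StreamBuffering and readFully are hypothetical names, not part of PDFNet, and the code uses nothing beyond java.io.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of PDFNet): drains an InputStream into memory.
final class StreamBuffering {
    private StreamBuffering() {}

    // Reads until end of stream, copying only the bytes each read() returned.
    static byte[] readFully(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[chunkSize];
        int read;
        while ((read = in.read(buf)) != -1) {
            // A read may return fewer bytes than the buffer holds at any
            // point, not only at the end, so always honour the count.
            out.write(buf, 0, read);
        }
        return out.toByteArray();
    }
}
```

With the whole document in a byte array, its length is the exact size, which may be a safer value for the MemoryFilter size hint than stream.available(): the JDK documents available() only as an estimate of the bytes readable without blocking, and for network streams it is often smaller than the total. The array could then presumably be written through the FilterWriter with writeBuffer as in the snippet above, though I have not verified that against a specific PDFNet build.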




Ryan

Jun 7, 2017, 2:30:34 PM
to pdfne...@googlegroups.com
Another customer asked for C# code to convert an Office file entirely in memory.

// For demo purpose use FileStream
FileStream fs = new FileStream(input_path + "simple-word_2007.docx", FileMode.Open);
pdftron.Filters.MemoryFilter memoryFilter = new pdftron.Filters.MemoryFilter((int)fs.Length, false); // false = sink
pdftron.Filters.FilterWriter writer = new pdftron.Filters.FilterWriter(memoryFilter); // helper filter to allow us to write to buffer
int bytes_read = 0;
byte[] buf = new byte[10 * 1024]; // 10 KiB buffer
do
{
    bytes_read = fs.Read(buf, 0, buf.Length);
    if (bytes_read < buf.Length)
    {
        // short read (usually the last one): write only the bytes actually read
        for (int i = 0; i < bytes_read; i++)
        {
            writer.WriteUChar(buf[i]);
        }
    }
    else
    {
        writer.WriteBuffer(buf);
    }
} while (bytes_read > 0);
writer.Flush();
memoryFilter.SetAsInputFilter(); // switch from sink to source
PDFDoc pdfdoc = new PDFDoc();
pdftron.PDF.Convert.OfficeToPDF(pdfdoc, memoryFilter, null);
// For demo purpose write back to disk
pdfdoc.Save(output_path + "simple-word_2007.docx.pdf", SDFDoc.SaveOptions.e_linearized);
// But most likely you want to save in memory
byte[] pdfData = pdfdoc.Save(SDFDoc.SaveOptions.e_linearized);
