Docx to PDF conversion: PDF/A support


rama

Nov 7, 2013, 5:57:32 AM
to xdocr...@googlegroups.com
Hi All,

For docx to PDF conversion, XDocReport uses either iText or Apache FOP. Both of these underlying frameworks allow setting the PDF/A conformance level, but unfortunately I could not find any way to set it through XDocReport.

Is it possible to set the PDF/A conformance level using XDocReport? This will decide whether my project uses XDocReport.

Please help me :)

Angelo zerr

Nov 7, 2013, 6:03:53 AM
to xdocr...@googlegroups.com
Hi rama,

First, the PDF converter based on Apache FOP is only a proof of concept for XDocReport, and we have given up on that implementation (FOP uses a lot of memory, performs poorly because it requires an intermediate FO step, and the code is awful because it is based on XSLT).

I am not familiar with the PDF/A conformance level, but if you know how to set it with iText, we could modify the converter.

Regards Angelo



rama

Nov 7, 2013, 6:47:39 AM
to xdocr...@googlegroups.com
Hi Angelo,

See PdfWriter.setPDFXConformance(...), passing either PdfWriter.PDFA1A or PdfWriter.PDFA1B.

Thanks.

Angelo zerr

Nov 7, 2013, 8:22:09 AM
to xdocr...@googlegroups.com
Hi rama,

Many thanks for your explanation.

There is now an IPdfWriterConfiguration interface whose configure( PdfWriter writer ) method you can implement to customize the PdfWriter as you wish.

Here is a sample showing how to use it:

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
        org.apache.poi.xwpf.converter.pdf.PdfOptions options = new PdfOptions();
        options.setConfiguration( new IPdfWriterConfiguration()
        {
            
            @Override
            public void configure( PdfWriter writer )
            {
                writer.setPDFXConformance( PdfWriter.PDFA1A );
            }
        } );
        PdfConverter.getInstance().convert( document, out, options );
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Let me know if this approach works for you; I will then update the wiki to document the new feature.

Regards Angelo



rama

Nov 8, 2013, 3:38:55 AM
to xdocr...@googlegroups.com
Thanks. I don't use the converter directly but through IXDocReport. It is important for me to use the templating engine during the conversion process, and from the examples the only way I can see to use the templating engine is through IXDocReport. However, the IXDocReport.convert method takes fr.opensagres.xdocreport.converter.Options as a parameter, and this Options class is different from PdfOptions and its parent class. Is it possible to provide such a configuration there? Or could you tell me how to use the templating engine with the converters directly?



Angelo zerr

Nov 8, 2013, 3:46:29 AM
to xdocr...@googlegroups.com
If you wish to use org.apache.poi.xwpf.converter.pdf.PdfOptions, you must pass it through fr.opensagres.xdocreport.converter.Options#subOptions, like this:

org.apache.poi.xwpf.converter.pdf.PdfOptions pdfOptions = ...
Options options = Options.getTo(ConverterTypeTo.PDF).via(ConverterTypeVia.XWPF).subOptions(pdfOptions);

IXDocReport report = ...
report.convert(context, options, out);
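
To make the flow concrete, here is a minimal, self-contained sketch of the pattern discussed in this thread: converter-specific options travel inside the generic options as "sub-options", and the converter invokes a configuration callback on the writer before producing output. All types below (FakePdfWriter, WriterConfiguration, FakePdfOptions, FakeOptions, SubOptionsSketch) are hypothetical stand-ins written for illustration; they are not the real XDocReport or iText classes, and the constant values are assumptions.

```java
// Hypothetical stand-in for iText's PdfWriter; it only records the conformance level.
class FakePdfWriter {
    static final int PDFA1A = 3; // illustrative values, assumed for this sketch
    static final int PDFA1B = 4;
    private int conformance = 0; // 0 means "no conformance level set"

    void setPDFXConformance(int level) { this.conformance = level; }
    int getPDFXConformance() { return conformance; }
}

// Mirrors the shape of IPdfWriterConfiguration#configure(PdfWriter) from the thread.
interface WriterConfiguration {
    void configure(FakePdfWriter writer);
}

// Stand-in for the converter-specific options (PdfOptions in the thread).
class FakePdfOptions {
    private WriterConfiguration configuration;

    void setConfiguration(WriterConfiguration configuration) { this.configuration = configuration; }
    WriterConfiguration getConfiguration() { return configuration; }
}

// Stand-in for the generic Options: it carries opaque sub-options to the converter.
class FakeOptions {
    private Object subOptions;

    FakeOptions subOptions(Object subOptions) { this.subOptions = subOptions; return this; }
    Object getSubOptions() { return subOptions; }
}

class SubOptionsSketch {
    // Models the converter: unwrap the sub-options, then let the callback configure the writer.
    static FakePdfWriter convert(FakeOptions options) {
        FakePdfWriter writer = new FakePdfWriter();
        Object sub = options.getSubOptions();
        if (sub instanceof FakePdfOptions) {
            WriterConfiguration cfg = ((FakePdfOptions) sub).getConfiguration();
            if (cfg != null) {
                cfg.configure(writer);
            }
        }
        return writer;
    }

    public static void main(String[] args) {
        FakePdfOptions pdfOptions = new FakePdfOptions();
        pdfOptions.setConfiguration(w -> w.setPDFXConformance(FakePdfWriter.PDFA1B));
        FakeOptions options = new FakeOptions().subOptions(pdfOptions);
        System.out.println(convert(options).getPDFXConformance());
    }
}
```

The point of the design is that the generic options object does not need to know anything about PDF: it only forwards the sub-options, and the PDF-specific converter decides how to apply them.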

Regards Angelo




rama

Nov 8, 2013, 4:39:26 AM
to xdocr...@googlegroups.com
Perfect. When can I expect this fix to be available? I would like to test it with a snapshot version. Is it possible to declare a Maven dependency on a specific snapshot version?



Angelo zerr

Nov 8, 2013, 4:42:53 AM
to xdocr...@googlegroups.com

rama

Nov 8, 2013, 5:42:09 AM
to xdocr...@googlegroups.com
Thanks. I actually need to use XDocReport for docx to PDF and XHTML conversion. I know Docx4J is slower and more memory-hungry than the alternatives, but on my test system Docx4J gives better results in terms of formatting.

iText conversion to PDF: a one-page docx was converted to a two-page PDF, and tabs were not formatted correctly. The footer seems correct; I haven't tried a header yet. Is there any limitation on using an image in the header section?

Docx4J conversion to PDF: the PDF looks much better, but one image was misplaced, and I don't know why.
Docx4J conversion to XHTML: the result is fine, but many HTML closing tags were missing. Is there any option to clean up the HTML during the conversion process?

Do you know where I can find the Docx4J formatting limitations (PDF and XHTML)?



Angelo zerr

Nov 8, 2013, 5:49:32 AM
to xdocr...@googlegroups.com
"but on my test system Docx4J gives better results in terms of formatting."

Please create several issues, each with a simple docx attached, so that we can try to improve our iText converter.

For your docx4j questions, please post on the docx4j forum.

Regards Angelo



