Status: New
Owner: ----
Labels: Type-Defect Priority-Medium
New issue 168 by
Mr.M.McM...@gmail.com: PDF Converter loses formatting
http://code.google.com/p/xdocreport/issues/detail?id=168
The problem is described below, but first, could I ask that you:
1. Provide a list of the PDF converter improvements that you mention
in Issue 159 ("I suggest you to use 1.0.0-SNAPSHOT (use maven for
that or download at hand on the for docx converter because teh
converter is very very lot improved (I'm improving again). The
docx converter 0.9.8 is very bad.").
2. Indicate when 1.0.0-SNAPSHOT will be released - presumably as
version 1.0.0.
This information would be very useful to me in deciding whether to use
XDocReport going forward.
What steps will reproduce the problem?
1. Run FormattingTests.docx through the PDF converter code (e.g. see the
attached modified Java JUnit test and associated docx file).
2. Observe the output in the PDF conversion (see attached pdf file).
What is the expected output?
. The PDF formatting is expected to match the docx exactly. The
following is an analysis of the differences. Note that, in addition to
these, header and footer formatting did not work well at all.
. Tables:
. Row height of less than 1cm is converted to 1cm.
. A table that is narrower than the full page width is centred on
the page.
. Coloured table borders are converted to black.
. Free text:
. The number of characters per line appears to have increased
between docx and pdf. The font size produced in the PDF appears
to be slightly larger than the source. This is difficult to
determine and requires further analysis to confirm what the
exact nature of the difference is.
. Font/style:
. The PDF and DOCX rendering of the different fonts and sizes
differs slightly (as mentioned in the free text section).
. Heading 3 styling renders too small.
. Strikethrough appears as normal text.
. Subscript appears as normal text.
. Superscript appears as normal text.
. Highlighting is lost.
. Bullets:
. Microsoft Word bullets are lost.
. Microsoft Word numbering is lost.
. Microsoft Word multilevel lists lose numbering and indentation
beyond the first item.
Note, however, that all of these bullet representations can be
reproduced as non-Microsoft bullets using normal text, and these
survive the PDF conversion (see example in attached files).
. Tabs:
. Tabs within text are lost.
. Images:
. Text alongside an image results in both the text and the image
being misplaced.
What do you see instead?
. See above.
What version of the product are you using?
. XDocReport 0.9.8
On what operating system?
. Windows XP.
Please provide any additional information below.
Attachments:
DocxProjectWithVelocity2PDF.java 9.9 KB
FormattingTests.docx 22.3 KB
FormattingTests.pdf 13.6 KB